I have a crazy idea. How about we enable configuring a metadata validator per field via MPCONFIG? We could allow loading these as plugins, like exporters, if an internal one does not suffice. (Even with the SPA we still need server-side validation.) These configurations could even be requested as a JSON Schema expression, allowing for client-side validation. Comments?
I like it! How would I write such a field-validator?
I'd say some basics should be built in, like numeric min/max
Then you just use it by configuration
But to enable custom validators, these should use an SPI-loaded Java plugin
Obviously you can use the same tricks again to go from there to JavaScript/Python, but this comes with a performance penalty
They'd be loaded and connected to a field by configuration
It would certainly allow for much more complex validations
Probably for a first step it would be instance wide configuration only, but one should consider a later extension to configuration per collection.
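A minimal sketch of how the plugin loading described above could look, assuming a hypothetical `FieldValidator` service interface (not an existing Dataverse type) and a plain field-to-validator map standing in for the MPCONFIG settings. `java.util.ServiceLoader` is the standard JDK mechanism for SPI plugins; everything else here is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.ServiceLoader;

final class FieldValidators {

    /** Hypothetical SPI contract; not an existing Dataverse interface. */
    interface FieldValidator {
        /** Name used in configuration to attach this validator to a field. */
        String name();

        /** Returns an error message, or empty if the value is acceptable. */
        Optional<String> validate(String value);
    }

    /** A built-in validator of the "numeric min/max" kind. */
    static final class NumericRange implements FieldValidator {
        private final double min;
        private final double max;

        NumericRange(double min, double max) {
            this.min = min;
            this.max = max;
        }

        @Override
        public String name() {
            return "numeric-range";
        }

        @Override
        public Optional<String> validate(String value) {
            try {
                double v = Double.parseDouble(value.trim());
                return (v >= min && v <= max)
                        ? Optional.empty()
                        : Optional.of("value must be between " + min + " and " + max);
            } catch (NumberFormatException e) {
                return Optional.of("not a number: " + value);
            }
        }
    }

    /** Registers built-ins first, then any plugins listed under META-INF/services. */
    static Map<String, FieldValidator> discover() {
        Map<String, FieldValidator> byName = new HashMap<>();
        FieldValidator builtin = new NumericRange(0, 100); // bounds are an arbitrary example
        byName.put(builtin.name(), builtin);
        for (FieldValidator plugin : ServiceLoader.load(FieldValidator.class)) {
            byName.put(plugin.name(), plugin);
        }
        return byName;
    }

    /** Validates a value with whatever validator the configuration assigns to the field. */
    static Optional<String> check(Map<String, String> fieldToValidator, String field, String value) {
        String validatorName = fieldToValidator.get(field);
        if (validatorName == null) {
            return Optional.empty(); // no validator configured for this field
        }
        FieldValidator validator = discover().get(validatorName);
        if (validator == null) {
            return Optional.of("unknown validator: " + validatorName);
        }
        return validator.validate(value);
    }
}
```

A plugin jar would ship its own `FieldValidator` implementation plus a provider file under `META-INF/services`, and the instance-wide configuration would only need to map a field name to the validator's `name()`.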
Could the numeric and format validation logic be outsourced to the JSON Schema of a metadata block? This way, other instances could reuse the validation logic, and some existing features could be used already. Or is this not possible with the current implementation?
This starts to sound like a chicken and egg problem
We could of course write a custom validator that extracts information like this from a JSON Schema
In fact, that would be one of the ideas for these validators, enabling a textbox to be filled with JSON controlled by a schema
Using JSON Schemas as a format for metadata schema definitions in Java (so they become the source, not the target like we're talking about in the JSON schema topic) is a long-standing dream, but goes beyond what I envisioned
Also, this would still need configuration per collection I suppose
The validators I envision would have a Java interface that would be asked to hand out a JSON schema thing, to be included in the JSON schema you can retrieve now via the Dataverse API
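To make the "hand out a JSON schema thing" idea concrete, here is a rough sketch. The `SchemaAwareValidator` interface and `propertiesFor` helper are hypothetical names, and the way fragments get merged into the schema served by the API is an assumption; the keywords `type`, `minimum` and `maximum` are standard JSON Schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class SchemaFragments {

    /** Hypothetical contract: a validator that can also describe itself as JSON Schema. */
    interface SchemaAwareValidator {
        /** Server-side check, which remains the source of authority. */
        boolean isValid(String value);

        /** JSON Schema keywords expressing the same constraint for clients. */
        String jsonSchemaFragment();
    }

    /** Integer range check that reports itself via "minimum" and "maximum". */
    static final class NumericRange implements SchemaAwareValidator {
        private final long min;
        private final long max;

        NumericRange(long min, long max) {
            this.min = min;
            this.max = max;
        }

        @Override
        public boolean isValid(String value) {
            try {
                long v = Long.parseLong(value.trim());
                return v >= min && v <= max;
            } catch (NumberFormatException e) {
                return false;
            }
        }

        @Override
        public String jsonSchemaFragment() {
            return "{\"type\":\"integer\",\"minimum\":" + min + ",\"maximum\":" + max + "}";
        }
    }

    /**
     * Assembles per-field fragments into a "properties" object.
     * Assumes field names need no JSON escaping; a real implementation
     * would use a JSON library instead of string concatenation.
     */
    static String propertiesFor(Map<String, SchemaAwareValidator> byField) {
        return byField.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":" + e.getValue().jsonSchemaFragment())
                .collect(Collectors.joining(",", "{\"properties\":{", "}}"));
    }
}
```

Whether field values appear as numbers or strings in the JSON that clients submit is left open here; if they are strings, the fragment would need a `pattern` rather than `minimum`/`maximum`.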
Custom validators sound fun. Out of curiosity, do you have a specific use case?
Yes!
Do tell.
I'm grabbing the links as we speak
https://data.fz-juelich.de/guide/juelich/data-linking.html
We have this custom metadata field
In our fork, we added a custom metadata type "uri" for it
As the URL type would not be sufficient
With a text field and a custom validator, we could achieve the same but keep upstream compatibility, no fork necessary
Ah, so you want to support both http:// and smb:// for example.
Exactly :-)
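For illustration, the check such a validator would perform on a plain text field might look roughly like this. The class name and the exact set of allowed schemes are assumptions (only http:// and smb:// were mentioned; https is added here as a guess), and a real plugin would read the list from configuration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

final class UriSchemeCheck {

    /** Schemes accepted in this sketch; assumed, not taken from the actual fork. */
    static final Set<String> ALLOWED = Set.of("http", "https", "smb");

    /** True if the value parses as a URI with an allowed scheme and a host. */
    static boolean isAllowed(String value) {
        if (value == null) {
            return false;
        }
        try {
            URI uri = new URI(value.trim());
            String scheme = uri.getScheme();
            return scheme != null
                    && ALLOWED.contains(scheme.toLowerCase(Locale.ROOT))
                    && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false; // not a syntactically valid URI at all
        }
    }
}
```

With this, `smb://fileserver/share/data` is accepted while `ftp://example.org/x` and free text are rejected, without touching the field's data type in the schema.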
Sounds fairly custom but maybe someone can reuse your validator some day. :grinning:
I can immediately envision more features for this
Controlled vocabularies without adding them to the schema
Lookup services
Restriction of URLs
Someone might want to disallow using certain author schemes that are in citation.tsv
#9750 from @luddaniel made it into 6.3 but it would be nice to drop in jar and not wait for a release.
But instead of forking the schema, the validator would bark at something that is forbidden
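A deny-list validator of that kind could be as small as the following sketch. The class name and the idea of passing the forbidden schemes in as a set are assumptions; in practice the set would come from instance configuration.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

final class SchemeDenyList {

    /**
     * Returns an error message if the chosen identifier scheme is on the
     * instance's deny-list (compared case-insensitively), otherwise empty.
     */
    static Optional<String> validate(Set<String> forbidden, String scheme) {
        if (scheme == null) {
            return Optional.empty(); // nothing selected, nothing to reject
        }
        Set<String> normalized = forbidden.stream()
                .map(s -> s.toLowerCase(Locale.ROOT))
                .collect(Collectors.toSet());
        String candidate = scheme.trim();
        if (normalized.contains(candidate.toLowerCase(Locale.ROOT))) {
            return Optional.of("identifier scheme not allowed on this instance: " + candidate);
        }
        return Optional.empty();
    }
}
```

The schema itself stays untouched: every scheme in citation.tsv remains a legal value, and only this instance's validator rejects the ones it does not want.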
Could the custom validators work on guestbook fields? See #10661 opened by @Dimitri Szabo
Also, what's the plan for keeping React in sync with these custom validators?
From a technical viewpoint, these validators would probably hook into Bean Validations.
So it should be possible to use these on anything we want to use them on.
Usually Bean Validators get attached using the decorator pattern, so I don't see why this shouldn't be possible for guest books.
My idea to expose these validators would be to make any of these plugins express themselves as JSON Schema. That way they could be picked up by any client.
As far as I know, the backend will still be the source of authority for any validation, right @Guillermo Portas ?
If I understood it correctly, the frontend will now use the API to retrieve the fields and data types. So aside from including this into the JSON Schema API endpoints, it should be possible to embed these validators in some serialized form into any other API endpoint as well.
It's not like we would remove everything else - we'd keep the data type around, but extend the definition to possible values/ranges/...
For email validation there was a mismatch that was corrected in https://github.com/IQSS/dataverse-frontend/pull/402
Yeah, but that is a duplication of the constraint check. I'm suggesting we enable receiving these constraints as a regex or whatever using the API, based on the custom validator implementation. In addition to being able to hand out a JSON Schema thing, the same interface could be asked to respond with some JavaScript validator, reusable in the SPA and other clients.
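As a sketch of avoiding that duplication: the constraint is defined once as a regex, the backend validates with it, and the same pattern string is serialized for clients. The class, the `dataLink` field name used below and the payload shape are all hypothetical, and the pattern itself is only an example.

```java
import java.util.regex.Pattern;

final class SharedConstraint {

    /** Single definition of the constraint (example pattern, not from the source). */
    static final String PATTERN = "^(https?|smb)://\\S+$";

    private static final Pattern COMPILED = Pattern.compile(PATTERN);

    /** Server-side check, still the source of authority. */
    static boolean isValid(String value) {
        return value != null && COMPILED.matcher(value).matches();
    }

    /** Serialized form an API endpoint might embed so clients reuse the same regex. */
    static String describe(String fieldName) {
        return "{\"field\":\"" + jsonEscape(fieldName)
                + "\",\"pattern\":\"" + jsonEscape(PATTERN) + "\"}";
    }

    /** Minimal escaping for backslashes and quotes; a JSON library would do this properly. */
    private static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

One caveat with this approach: Java and JavaScript regex dialects are not identical, so a plugin would have to stick to syntax both engines understand, or ship a separate client-side expression.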
Reusable in the SPA would be great.
Last updated: Nov 01 2025 at 14:11 UTC