https://dataverse.harvard.edu/openapi now gives an error 500 :sad:
Yes, as predicted by @Jan Range in #9981 :disappointed:
#9981 has been prioritized but it needs a size. Any ideas?
Today we discussed created a spike issue to explore https://github.com/OpenAPITools/openapi-generator as suggested by @Oliver Bertuch
Recommending the pydantic-v1 generator for Python usage. PyDantic is pretty nice for data validation, parsing, and more. I've been using it for many projects including EasyDataverse and DVUploader.
But the point here is that we want to create the spec from the existing JAX-RS Java Code. So that will require Java tooling... :smirk:
Should this topic be moved to #dev?
Ah nvm, I thought this was about SDK generation in general :face_with_peeking_eye:
Sure, moved to #dev
Here's the spike issue: Investigate OpenAPI Generator Tool as alternative for OpenAPI pyDataverse support #10236
I'm planning on pulling it into the sprint, which begins today.
I must be doing something wrong, the example I copied has this...
<inputSpec>${project.basedir}/src/main/resources/api.yaml</inputSpec>
... but I want OpenAPI YAML as output, not input. :thinking:
@Oliver Bertuch have you done this already? Any tips for me? :sweat_smile:
I might have gotten that one wrong - the OpenAPI generator is about generating Java code from the spec, not the other way round
But: I did some digging, looking specifically for generating the OpenAPI YAML file... And here's what I found:
A) Use Swagger and Swagger Annotations. https://github.com/swagger-api/swagger-core/blob/master/modules/swagger-maven-plugin/README.md and https://github.com/openapi-tools/swagger-maven-plugin (not sure what the difference is). There is also https://github.com/stoicflame/enunciate which supports Swagger output as well as generating client code in a variety of languages.
B) Use MicroProfile OpenAPI annotations (these are the ones also useable with Payara /openapi endpoint) and use other plugins: https://geronimo.apache.org/microprofile/openapi.html or https://github.com/smallrye/smallrye-open-api/tree/main/tools/maven-plugin
I think the best shot is the SmallRye Maven Plugin, as it provides code scanning and keeps us compatible with the Payara OpenAPI implementation
Hope this helps @Philip Durbin
Yes! Thanks! I'll poke more at this soon.
(deleted)
(deleted)
(deleted)
I gave the issue a new title. More generic.
Agnostic to which tool we use.
Splendid!
I'm actually working on a different issue though. Hoping to get back to this next week, though someone else is welcome to pick it up, of course.
Hehehe, not me - I'm stuck with loads of other things. But definitely interesting stuff!
No worries. And we can hope that Payara fixes it. :grinning:
Yeah... good luck with that.
"Lasst alle Hoffnung fahren..." :see_no_evil:
I left them a nice note: https://github.com/payara/Payara/issues/6369#issuecomment-1885045621
Hi @Jan Range I was testing the mvn plugin recommended by the amazing @Oliver Bertuch and I was able to generate these:
openapi.json
openapi.yaml
Would you be so kind to give them a look and let us know what do you think?
Best,
Juan
Great work, @Juan Pablo Tosca Villanueva! Which tool did you end up using?
https://github.com/smallrye/smallrye-open-api/tree/main/tools/maven-plugin
awesome stuff
Let's see a PR :grinning:
Thanks @Juan Pablo Tosca Villanueva and @Oliver Bertuch for the great work! I've had a quick look through the yaml file and the only thing missing is linking the schemes to request bodies and responses.
For instance, the path api/v1/datasets/{id}/add already contains the information that multipart/form-data is expected, which is great, but the schema only expects a string although it should be at least jsonData and file.
This is the same for response bodies. For instance, the file retrieval path api/v1/files/{id} should return the DataFile object defined in schemas, but it is not referenced.
Both cases seem to apply to all endpoints defined in this yaml file. Maybe Payara needs to know where to put these objects. I did the same in Django for a couple of endpoints, perhaps there is a similar way in Payara?
I'll analyze it deeply soon and compile what is missing and needs modification. But so far, these seem to be the only issues. All objects are well-defined and that is a good start :blush:
I've found this tutorial, and it looks pretty similar to how Django handles OpenAPI. Maybe it helps to connect the schemes.
@Jan Range I haven't looked closely at the above yet, but I think two stages would be best:
How far would you say the new output helps toward the first stage?
@Philip Durbin sounds great! The new output is definitely helping the first stage a lot already. All endpoints are well-defined and we should be able to utilize it for code generation. We may not have the benefit of schemes, but for our testing, it is sufficient. I will give it a test next week and see how it performs.
I assume that the second stage will require work on the Java code base and thus I agree, that an issue or topic makes sense :smile:
It could even be a Google doc for now... just to jot down some ideas and solicit feedback.
Just created a Doc. Feel free to add points and todo's :blush:
https://docs.google.com/document/d/15knNe6RoN3NdySqhqDAOZa9OdwuoNjuyFE4LZEjUKOE/edit?usp=sharing
This is the branch I am working on https://github.com/IQSS/dataverse/tree/openapi-definition-endpoint if anyone wants to give it a look
/api/info/openapi/json and /api/info/openapi/yaml respectively will provide the requested definition
I need to add some integration tests (hopefully today)
And it will be ready for review
So you're providing API endpoints. Cool.
:smile:
It is actually 1 and I am leaving it open in case one day there are more formats :laughter_tears:
right, very cool
Do we also want a static version in the guides?
I would say no :laughing: with static files, more places that need to be updated but let me know what you think.
Would need to check if the plugin supports output to multiple directories tho
I guess people will be able to check the new endpoint on https://beta.dataverse.org to see what's coming
That is a good idea for a reference
I am not making this case sensitive, do we have anything against it? :laughing:
like you can call /api/info/openapi/JsOn and will work
Asking for the sake of consistency
The code looks fine but in the guides, please document one (sane) way.
IT is up next step is to add to the documentation and changelog
I just left a quick review: https://github.com/IQSS/dataverse/pull/10328#pullrequestreview-1885865582
Thanks! Let me give it a look :smile:
Ok so I merged the suggestions to the changelog and the notes now a couple of things:
"Maybe we should put this in the API changelog under 6.0." I think it is better to leave out of the changelog to make it more concise and clear but if you think it would be helpful we can add it :thinking:
About the files the answer is "I am not sure"
Can you please go ahead and add something to the API Guide? Under native, under info.
Sure that is my next step
awesome
sorry to jump the gun :sweat_smile:
just so excited
Don't worry! It makes me happy to work on something that someone is excited about :sweat_smile:
I'm just channeling enthusiasm from Jan. :grinning:
Do you think this is a good place?
https://guides.dataverse.org/en/latest/api/intro.html#lists-of-dataverse-apis
No, here, please: https://guides.dataverse.org/en/latest/api/native-api.html#info
Also, please put the word "Swagger" in there somewhere in case people search for it. It's the old name, as you may know.
Do you mind if I put this after the instalation server name?
Or do we want to keep a specific chronological order
Sure, sounds good.
@Philip Durbin Do we need this block on every call?
It is repeated in all the calls from the info section :melt:
Hmm, probably not.
I dunno, I guess I would leave it in for consistency.
Well, the thing is in other sections of the guide it is not on every definition
But in the info section is on all of them
I would defer that cleanup until later.
Ok
It's tough. We do want people to see it. They might land in the middle of the page.
Ok It is up
Waiting to be screamed about RST :laughing:
I think the files are indeed not necessary on git
I'm worried the files in git will be stale and confusing to someone looking at them in 5 years.
In theory those files will be updated every time that someone makes a build but yeah it would be pestering people to update those files when they make a change to the API
I removed them and added to gitignore
perfect
I think we are good, the only remaining item would be this?
info.api.openapi.exception.invalid.format and info.api.openapi.exception?
Oh, right. Maybe start with api? api.info.openapi.exception?
Also, do we need to write the file to disk? Can we do it in-memory?
Oh, but what about caching? This will never change within a release so we should cache it, right?
The plugin writes to file
Philip Durbin said:
Oh, right. Maybe start with api? api.info.openapi.exception?
Actually
The standard on the bundle is
#Access.java
access.api.allowRequests.failure.noDataset=Could not find Dataset with id: {0}
#Dataverses.java
dataverses.api.update.default.contributor.role.failure.role.not.found=Role {0} not found.
So i guess we keep the info.api?
I can add the openapi if you want to be more specific
Huh, who added those? :upside_down:
There's also api.prov and api.admin.datasetfield and api.ldninbox
Most of those come from Jim and Stephen :laughing:
I won't die on this hill.
So we leave it as it is? :smiling_devil:
I guess :disappointed:
:see_no_evil:
I am getting it out of draft as soon as Mr Jenkins is happy
So what about caching?
Is the file re-created every time someone hits that API?
The files are created on build with maven
So everytime someone makes a build they will be updated
I'm trying to think of something else that's like that for us.
Doesn't matter, I guess. Sounds fine. Efficient.
At least it solves the current issue for now and we can evaluate other solutions in the future :)
I would love to turn off the /openapi/ with this while payara fixes this on 2028
:laughter_tears:
I just have no idea if this is even possible
Me neither. If we figure out how to we should probably turn it off and document how to turn it on.
Maybe we can ask @Jan Range to spin up your branch and play around with it.
OK all the updates are in
Thanks for looking at this before going into your long weekend :laughter_tears:
Oh sure. Again, I think Jan should try it before we merge.
And I don't think there's a huge rush.
It's a great outcome. I'm glad the library worked.
I wonder what library Payara is using? The same one?
We can also compare to what is on https://dataverse.unc.edu/openapi
I don't think they are using the same
ok
https://github.com/OpenAPITools/openapi-diff
:eyes:
huh, try it, try it
At some point we'll need a new Zulip topic. :grinning:
Did you see Ollama use Discord? :rolling_on_the_floor_laughing:
Ok, now we're really off topic. :grinning:
:eyes:
We can talk about Zulip vs. Discord in #zulip if you like. :grinning:
Oh well that tool doesn't support 3.0
bummer
I wonder if we would be able to generate a full definition with what we have. I have seen other implementations that use SOA and the service layer probably define the response in terms of objects and then that is translated into an endpoint
By full definition do you mean the schema Jan wants? JSON in and JSON out?
Yes
Because I think there is nothing that tells to anything that is scanning what can be our response it is just a "response"
But if you have a service layer and that service layer returns a "Dataverse" object then that dataverse object can be interpreted and translated into a definition
But yeah let's take a step at a time
I think if we were using JAXB and JSONB it might work.
But I say we start a new topic for schema stuff when we're ready. Don't forget than Jan created a doc already: https://docs.google.com/document/d/15knNe6RoN3NdySqhqDAOZa9OdwuoNjuyFE4LZEjUKOE/edit?usp=sharing
I will keep this PR as Draft to keep it safe from Merging but as soon as Jenkins is done I will move it into ready for review and we wait to hear from @Jan Range about what he thinks about the spec and then I will unlock it
Probably it will still be on waiting for review when you come back @Philip Durbin :rolling_on_the_floor_laughing:
Jenkins has been a great guy today!
Sounds good. Recently Jan figured out how to spin up branches on his laptop.
10 builds and he is not catching fire I am proud of our boy!
Me too. Good boy. Here's a treat.
@Juan Pablo Tosca Villanueva just tested your PR and left a comment :-)
Jan Range said:
Juan Pablo Tosca Villanueva just tested your PR and left a comment :-)
Thank you so much for this Jan, if you have a chance could you please
openapi.yaml
run the same test with this file and let me know what is the output?
Of course :-) This is the output I have gotten:
Thanks again, quick question. Was your post on https://github.com/IQSS/dataverse/pull/10328#issuecomment-1949434744 the full log?
Yes, I have posted the full log to the PR. The one you've sent seems to have a few more issues. Could it be that the build I have drawn it from is a different one?
The second file that also provide errors is the one generated from Payara and if the first one provide less errors i think that would be a win but I still have to talk to the team about it. At least re enabling the first one would be a first step solution
The maven plugin we are using to generate these files is this
https://github.com/smallrye/smallrye-open-api/tree/main/tools/maven-plugin
But if we find a substitute to generate the files it would be good
I will keep playing and looking at the docs maybe there is anything that can be done
Great! Definitely a win for the first one. I would try to manually fix the issues raised by the code generation for the first file and see if the result is working. Have a great weekend!
@Juan Pablo Tosca Villanueva I cleaned up the current specs and generated a complete library and some nice-looking docs. The code works well except for the missing schemes. There are still some open points to fix. Shall I compile them and post them on this channel for discussion?
Jan Range said:
Juan Pablo Tosca Villanueva I cleaned up the current specs and generated a complete library and some nice-looking docs. The code works well except for the missing schemes. There are still some open points to fix. Shall I compile them and post them on this channel for discussion?
Sounds great @Jan Range, yeah if you compile them and then we/I can review what can we do to improve on these :smile: Thanks a lot! that looks amazing btw :tada:
Folks do you mind me chiming in and test some stuff?
Oliver Bertuch said:
Folks do you mind me chiming in and test some stuff?
That would be great @Oliver Bertuch
how dare you, Oliver
I think I already found a violation of the MP OA spec :see_no_evil:
We're supposed to put it here at META-INF: https://download.eclipse.org/microprofile/microprofile-open-api-3.1.1/microprofile-openapi-spec-3.1.1.html#_static_openapi_files
OK I found some more basic problems...
Using https://quobix.com/vacuum I saw some errors that will require code changes
At least if we want to stick with the smallrye plugin...
https://github.com/IQSS/dataverse/pull/10363
A lot of the warning https://redocly.com/docs/cli/installation/ and vacuum generate relate to us not using the proper annotations in our API classes... See https://download.eclipse.org/microprofile/microprofile-open-api-3.1.1/microprofile-openapi-spec-3.1.1.html#_annotations for a list of those. Especially @Tag / @Tags seems promising
The tag prop is very useful in grouping functionalities. Within the docs, I have added the first part after api/v1 as a tag to achieve this. In addition, operationId is crucial for function names in a generated library and summary for the names displayed on the left navigation.
If it helps, I have used GPT to generated both operationId and summary for test purposes. Maybe it can help as a starter.
Generating the summaries sounds like a good idea!
Would it help to manage a yaml/json file that holds metadata for each endpoint? Could include the above, the tag and thorough examples/docs. In the code it could simply reference the part by key. In Python I'd image something like this:
metadata = yaml.load(open("metadata.yaml"))
@OAIDescription(
url="api/v1/info/version",
**metadata["api/v1/info/version"],
)
def version():
pass
I am not sure if this is possible within Java, but I think it could help manage everything, since there are a lot of endpoints. What are your throughts?
I will look at your comments later today guys! sorry I was working with @Don Sizemore on the demo instance :sweat_smile:
I'm not sure where to go here.
Personally, I would love to see our great API docs from the guide included here
We have examples and descriptions, which is a great resource
I'm kind of wondering if it would be possible to make the description and examples a part of the spec and include that in a nice way within the Sphinx docs
Or, if that is not possible, it might make sense to migrate the API guide into something more standalone but integrated with the codebase generating the spec
There is https://sphinxcontrib-openapi.readthedocs.io/ which might be valuable...
Is there something we can merge that adds value? Or do we need to figure out more stuff before we merge?
We definitely should straighten out the errors before merging
Having a valid spec is a minimum
Anything else is about making it nicer, fancier, more accessible, better integrated
Yes, definitely. This PR of yours: Suggested changes to OpenAPI PR #10363
We probably should also do some experiments with using the generated file with the /openapi endpoint. Maybe also configure/disable scanning so we avoid the exceptions
So what I was thinking is that it would be a priority to restore the functionality as it was before since it was removed in 6.0 due to this Payara issue.
We can still check the old definition produced by Payara at https://dataverse.unc.edu/openapi for example with https://quobix.com/vacuum/ In my opinion it would be good to have an equivalent or similar as a "starting point".
Sadly Quiobix shows more errors with the one generated with the smallrye plugin than the one previously generated with Payara... and as @Oliver Bertuch mentions part of this may be caused by the improper use of some annotations but I am not sure is feasible at this point to update all the endpoints on the API and go trough the testing with the coverage that we currently have, just as an example yesterday we noticed that an endpoint is broken on develop thanks that we were doing a test with dv-uploader but this is not caught by any of the integration tests
So probably what I think is that find a replacement that is good enough for 6.2 and probably we can work on an improved version for a future 6.x version depending on checking the impact but I would love to hear your opinions
@Juan Pablo Tosca Villanueva do you think we should merge the PR from @Oliver Bertuch, once it's out of draft? Suggested changes to OpenAPI PR #10363
I am looking at it rn :sweat_smile:
So in it's current shape is not working for me :sweat_smile:
A couple of things, if this works we/I would need to additionally remove the created endpoints (?)
@Juan Pablo Tosca Villanueva what means "is not working for me"?
Also with this change, the OpenAPI definition should be reachable with /openapi ? and if that is the case do we need to do something on Payara since it takes over this URL
Happy to light the fire in a Zoom room
Sounds like a great idea :smile:
https://fz-juelich-de.zoom.us/j/69913636453?pwd=clE3SEFzeGd5MzdEbFZGUVFqNGJHQT09
One issue we found in my PR: the order of schema gen and war packaging needs to be reversed. Probably will only need moving the plugin before the WAR plugin.
(Otherwise the files are not copied into the WAR)
Also: although we told Payara not to scan, it still looked at the code... :rolling_eyes:
Sounds like a productive meeting.
We also spoke about generating the "response" part for the spec by using the @APIResponse annotations. Will require busy work, but might be worth it!
I suggest we disable Payara's OpenAPI module completely via https://docs.payara.fish/community/docs/6.2023.8/Technical%20Documentation/MicroProfile/OpenAPI.html#openApi-configuration and add our own endpoint /openapi to return the generated file from the application.
It doesn't look like we can make Payara properly not scan the code, so we will always run into problems.
I am going to bring part of this conversation on slack just to bring awareness about the @APIResponse annotation and probably talk about it during tech hours
There is one other thing we should try: put the YML/JSON in the META-INF at the root of the WAR. It might be the wrong place where we put it now... (Classpath META-INF folder)
Let me know if I can test something with the code gen and doc tools :blush:
I will check adding the API guides to the spec to see how they look in the docs I have posted
@Oliver Bertuch did you notice https://test-dv.readme.io/reference that Jan posted above?
Nope, I didn't!
pretty fancy :monocle:
same thing DataCite uses: https://support.datacite.org/reference
So we could probably ask @Kelly Stathis about it. :grinning:
Definitely one of the reasons why I was suggesting we could migrate the API guide into the code, so it ends up in fancy things like this
It might be a good idea to also try and split the API docs some into a part about dealing with entities and a part about RPC
I have found out that open source projects can apply for a free premium account, which allows them to add more detailed text documentation and recipes. I believe that adding recipes could be particularly useful for certain sections of the current API Guide.
Yep we use ReadMe over at DataCite, happy to answer any questions about the functionality! for what it's worth, I think we are on a paid plan but it might be an older /legacy tier.
@Oliver Bertuch Did you made a PR for your changes from https://github.com/poikilotherm/dataverse/tree/openapi-definition-endpoint ? :sweat_smile:
@Kelly Stathis interesting, thanks!
@Jan Range do you remember what the three things are you wanted? I know it'll be in the recording at https://py.gdcc.io :grinning:
Oh yes! It is https://github.com/IQSS/dataverse/pull/10363 but still on Draft, I would love to add these changes to the configuration and the location of the file and make the changes required on the current endpoints, also I spoke with @Philip Durbin and we are marking this as a flag feature and adding to the patch notes that this may change in the future.
-Multipart request body, tags and operation ids per what I see on the recording
Thanks!
I am not sure how all of these translate to changes to the code and also if it is feasible to add it on time for 6.2
Including testing and QA
Let's add the changes from Oliver's PR and then we can see what are the implications of adding these requests
But I would say at this status that is functional and with the changes from oliver it would be nice to merge for 6.2 and then add the others in another PR for 6.X
For context, we have sprint kickoff tomorrow and this is the last sprint for 6.2
Maybe we don't need it for 6.2. Maybe 6.3 is ok. I believe @Jan Range is the only one using it. And he can use Docker images locally for now, probably.
If we don't need it for 6.2 that is also fine
Still would have to check the implications of these changes because if the multipart or the annotations require updating or changing more than 1 endpoint :sweat_smile: I would dive today a bit on the documentation and see
Philip Durbin schrieb:
Jan Range do you remember what the three things are you wanted? I know it'll be in the recording at https://py.gdcc.io :grinning:
Sorry for the delay @Juan Pablo Tosca Villanueva - It was the request body multiparts (i.e. replacing a file), the missing operation IDs and the base server with variables:
Endpoints with multipart body
/api/v1/access/datafile/{fileId}/auxiliary/{formatTag}/{formatVersion} POST/api/v1/datasets/{id}/add POST/api/v1/datasets/{id}/addFiles POST/api/v1/datasets/{id}/addGlobusFiles POST/api/v1/datasets/{id}/replaceFiles POST/api/v1/datasets/{id}/thumbnail POST/api/v1/files/{id}/replace POSTCurrently only available as
requestBody:
content:
multipart/form-data:
schema:
type: str
Should look like this
requestBody:
content:
multipart/form-data:
schema:
type: object
properties:
jsonData:
$ref: '#/components/schemas/ReplaceFileRequest'
file:
type: string
format: binary
Duplicate endpoints
/api/v1/admin/groups/ip/{groupIdtf} and /api/v1/admin/groups/ip/{groupName}/admin/workflows/{identifier} and /admin/workflows/{id}Note there is a second duplicate endpoint as seen in the vacuum output @Jan Range: /admin/workflows/{identifier} and /admin/workflows/{id}
Thanks @Oliver Bertuch I am may not be awake enough yet :-D
Interesting how vacuum and redocly see the request body schemas not as errors
The validator I have used is from the code gen framework and I guess from a code gen perspective this is an error.
I see :smiley:
For the endpoints you mentioned, we probably will need to add @RequestBody annotations. See examples at https://download.eclipse.org/microprofile/microprofile-open-api-3.1.1/microprofile-openapi-spec-3.1.1.html#_requestbody (It might be enough to use the shorthand @RequestBodySchema, depending on complexity involved)
I'd still suggest we concentrate on making the spec available again, fixing the most crucial things first and maybe add CI actions to ensure it doesn't get broken by accident again. Fixing the response and request body things might be a good thing for a new issue / PR.
The larger we make the scope for the initial PR, the harder it is to get in, especially if we want it in for 6.2
6.2 could be as soon as two weeks: https://groups.google.com/g/dataverse-community/c/JrlK_nkowz4/m/SgMOHLCqCgAJ
Late March or early April
@Jan Range do you want this in 6.2? You're the main user, from what I can tell. :grinning:
I think having a minimal working version in 6.2 would be beneficial if we want to use it for pyDataverse and the docs. I agree with @Oliver Bertuch about making them available and fixing the urgent things mentioned above. Maybe even leave out the multipart stuff for 6.2 as this is something that can be easily fixed manually.
I am happy to help with adding the operationIds and tags to the Java code.
One thought is that as soon as we merge it, it will be available at https://beta.dataverse.org
I think it's better than the current broken /openapi endpoint. We could add a notification in the doc that it is alpha at the moment.
Sorry, which doc? The API Guide?
Sorry I've meant the OpenAPI doc/specs
Ah, right in the YAML. Gotcha.
Are you planning to add something at this point to your PR @Oliver Bertuch ? :smile:
So I was just finally able to update the branch and make it work with the WEB-INF change
Now, for the duplicated endpoints this are the declarations
@GET
@Path("ip/{groupIdtf}")
public Response getIpGroup( @PathParam("groupIdtf") String groupIdtf ) {
and
/**
* Creates or updates the {@link IpGroup} named {@code groupName}.
* @param groupName Name of the group.
* @param dto data of the group.
* @return Response describing the created group or the error that prevented
* that group from being created.
*/
@PUT
@Path("ip/{groupName}")
Sounds like good progress.
:smile:
But how can we fix this or is this really an issue, probably the other endpoint that is duplicated may be the same case
Can we remove duplicates in a future PR?
Sure! but this will result in at least 2 breaking changes AND us making a decision of 1) a new name for the renamed endpoint and B) what would keep the original name
Another solution could be breaking both so people don't make a mistake thinking it is the same one for a reason
Probably will bring this up on dv-tech so everyone is more aware of this naming when creating APIs
yeah
sounds like a new topic
out of scope for the current PR
Should I move it to ready for review? :angel:
:rolling_on_the_floor_laughing:
Sure, unless there's more on your list.
For now, I think that would be enough, re-establish the spec and also be compliant with the standard. We want to fix these issues for sure but I do agree on adding them as issues and then leaving it prioritization
go for it
Also another issue could be a spike to check the annotations for the response
What are the method signatures
Wait I think the solution to this is very easy
You just need to make the two path params be of the same name
The OpenAPI spec always uses the path as a toplevel key
For each of those you can have different method objects (Get, put, post,...)
We are having two equivalent path for the two different methods get and put
So you just need to make these names match
Oh I can test/try that
There is a nice example in the pet store example https://editor.swagger.io/
For /pet/{petId}
It's immediately clear why the linter says we're doing it wrong
Yeah :smile: that definitely works
I will add that to the PR
The fixes for the few duplicate operation ids should be very easy as well - just change the method signature, done. This is no breaking change.
Well this was solved with the solution you pointed to us earlier
With that all the errors left are
image.png
Juan Pablo Tosca Villanueva said:
Well this was solved with the solution you pointed to us earlier
I don't follow...?
I meant that the change of the Java method signatures would solve the duplicated operation ids.
this error
Yeah, that's the matching path params thing
Let me look at the other one
Juan Pablo Tosca Villanueva said:
With that all the errors left are
image.png
I was talking about those errors
The operation Ids are generated from the method names. I bet my as some method names are the same, but they are not the same to Java because the signature is different
Oh gotcha
Testing that 1 sec
That also works :smile:
Woohoo
up to 25%
:rolling_on_the_floor_laughing:
:partying_face::partying_face::partying_face::partying_face:
From 10 :smile:
Not bad!
Bad errors fixed
Great work and great start
I bet the responses and request bodies will be a lot thougher
Now I am praying Mr. Jenkins passes all the tests
lol
Yeah the response will be a lot of work
Changes are up if anyone wants to look at them
https://github.com/IQSS/dataverse/pull/10328
Or here is the spec
Do we want to disable the /openapi endpoint?
I think if there is a way we could think about it
I see pros and cons
what If we do it as a change on the code that is part of the repo and ppl update their installation and this get fixed at some point
Disabling is easy: https://docs.payara.fish/community/docs/Technical%20Documentation/MicroProfile/OpenAPI.html#set-openapi-configuration
Juan Pablo Tosca Villanueva said:
@Jan Range new version just baked for you sir :tada:
:rolling_on_the_floor_laughing:
I disabled it from both server and default config and still get the same results
Did you restart?
Might be cached
I did when I changed the default but i am restarting again
after changing the server
Same
Hmm ok
Bummer
Huh, the setting was re-enabled when I restarted the container
Ok so here is the deal
It works if I restart the container
but if I do a mvn docker:run
it re-enable the setting
The setting will not be permanently saved
You will need to add it to the bootscript
So it gets set when you boot
How would regular installations would handle it
would we need to generate a script of some sort?
Just set the setting
The domain is persisted on a classic install
Now we could even add the endpoint on the regular /openapi :rolling_on_the_floor_laughing:
But I just think than can cause issues
I'm wondering if we could make it an option...
Maybe :man_shrugging: :smile:
Anyway, Jenkins is running the tests but I think everything should be alright
I am going to log out, thanks again Oliver :tada:
Great. Have a good weekend
You too!
Here's an idea we could try: we need a new Application anyway because of the path. https://jakarta.ee/specifications/restful-ws/3.1/apidocs/jakarta.ws.rs/jakarta/ws/rs/core/application
The getClasses() thing could add the openapi endpoint when a feature flag is enabled.
We obviously may not put it in dataverse.api, so it won't be picked up by the regular API
Although I'm not sure what would happen if we add the same class containing the path to both apps :sweat_smile:
What dependencies we need to add for these annotations for the response? I was trying with:
<dependency>
<groupId>org.eclipse.microprofile.openapi</groupId>
<artifactId>microprofile-openapi-api</artifactId>
<scope>provided</scope>
</dependency>
But it seems that it can't resolve the dependencies, not sure if I am missing something :thinking:
Duhhh, it works I just needed to reload the project on IJ
So I made an example using these annotations on one of our simpler endpoints, the version one
@GET
@Path("version")
@Operation(summary = "Get version and build information", description = "Get version and build information")
@APIResponse(responseCode = "200",
description = "Version and build information")
@Tag(name = "version", description = "Version and build information")
public Response getInfo() {
String versionStr = systemConfig.getVersion(true);
String[] comps = versionStr.split("build",2);
String version = comps[0].trim();
JsonValue build = comps.length > 1 ? Json.createArrayBuilder().add(comps[1].trim()).build().get(0) : JsonValue.NULL;
return ok(Json.createObjectBuilder()
.add("version", version)
.add("build", build));
}
And just with that it shaved 34 warnings :rolling_on_the_floor_laughing:
I'm kind of wondering if we should give https://download.eclipse.org/microprofile/microprofile-open-api-3.1.1/microprofile-openapi-spec-3.1.1.html#_oasmodelreader a try. And if it doesn't work, we open an issue at Payara's
I can't help it: I think we need to get rid of the JsonPrinter/JsonReader classes, as described in #6810. It would make it a great deal easier to define the bodies. In addition: not just for us, but also for the sake of the SDK/SPI package, we should have proper DTO objects and schemas for them...
That sounds like a bigger overhaul :rolling_on_the_floor_laughing:
The documentation of the endpoints with the request still requires some work but won't break anything :smiley: (Hopefully)
But yeah it would be nice one day to put that into practice
Yeah, I just asked about that bigger overhaul at https://github.com/IQSS/dataverse/issues/6810#issuecomment-2018652136
I made this test with the multipart @Jan Range
openapi.yaml
The related commit: https://github.com/IQSS/dataverse/pull/10328/commits/fdb85c667fb1df46aee43214ec3dbdfcca86c9d3
Awesome, so it's just annotations.
@Juan Pablo Tosca Villanueva thanks for sending! I am currently sick and will check it asap once recovered :blush:
Jan Range said:
Juan Pablo Tosca Villanueva thanks for sending! I am currently sick and will check it asap once recovered :blush:
Sounds amazing! Thanks @Jan Range I hope you feel better soon! Is the PyDv meeting still happening today or it will be rescheduled? :scream:
Thanks @Juan Pablo Tosca Villanueva :-) I will open the room, but I do not have much to present due to being sick this week.
/info/version:
get:
tags:
- version
summary: Get version and build information
description: Get version and build information
operationId: Info_getInfo
responses:
"200":
description: Version and build information
@Jan Range Sorry it was info/version :sweat_smile:
But that is an example of the tags but this will require individual addition of the tags to each method
This would be another example fully documented which includes the multipart
/access/datafile/{fileId}/auxiliary/{formatTag}/{formatVersion}:
get:
operationId: Access_downloadAuxiliaryFile
parameters:
- name: fileId
in: path
required: true
schema:
type: string
- name: formatTag
in: path
required: true
schema:
type: string
- name: formatVersion
in: path
required: true
schema:
type: string
responses:
"200":
description: OK
content:
'*/*':
schema:
$ref: '#/components/schemas/DownloadInstance'
post:
tags:
- saveAuxiliaryFileWithVersion
summary: Save auxiliary file with version
description: Saves an auxlirary file
operationId: Access_saveAuxiliaryFileWithVersion
parameters:
- name: fileId
in: path
required: true
schema:
format: int64
type: integer
- name: formatTag
in: path
required: true
schema:
type: string
- name: formatVersion
in: path
required: true
schema:
type: string
requestBody:
content:
multipart/form-data:
schema:
type: object
responses:
"200":
description: File saved response
"403":
description: User not authorized to edit the dataset.
"400":
description: File not found based on id.
At least the POST part is fully documented
@Juan Pablo Tosca Villanueva hope you've had a great easter! Looks great, will put the yaml you supplied recently into the code generation. Regarding the tags, these will end up as modules within the software. For instance, when the tag version is given, the method to use the endpoint will look sth like this:
pydataverse.version.get_info()
Since tags correspond to a module, it would make sense to use the parent as the tag. In the above version example, it would make sense to use info as the tag due to it being part of the /info collection of endpoints. I did something similar in my version. You can find it here.
Hi @Jan Range thanks a lot! I hope you also had a good easter, and that you are feeling better now. Please let me know how this works on the code generator with the multipart included. I will check about the tag in a few :smile:.
Ok this is the version generated and checked and you can annotate the entire class so all the endpoints from info now would be tagged with this according to the docs
Looks great! I get a couple of errors, though. These relate mostly to duplicate OperationIds that are easy to fix. Here is an overview of all errors:
Regarding the items attribute in arrays. This can be fixed by supplying the following:
fileSystemDirectory:
type: array
items:
type: float # The type(s) to be expected
It's only the case for two schemes and can be fixed manually for now. Other than that everything looks fine :raised_hands:
This one should have this one fixed, but this may happen again in the future, for context I just updated this branch and this endpoint or method may have been added recently.
openapi.yaml
Regarding the items I am not sure how that would look on our side, at first glance we would have to specify the expected types on the request? If that is so it would be another of those cases that require more granular attention which is way beyond the scope of this PR at this moment :rolling_on_the_floor_laughing:
Perfect! Those are not shown anymore.
Is this array only on a couple of methods or it is across the entire API?
Juan Pablo Tosca Villanueva schrieb:
Regarding the items I am not sure how that would look on our side, at first glance we would have to specify the expected types on the request? If that is so it would be another of those cases that require more granular attention which is way beyond the scope of this PR at this moment :rolling_on_the_floor_laughing:
I understand, but this is completely fine and easy to fix manually. From my view the current state is totally fine.
Juan Pablo Tosca Villanueva schrieb:
Is this array only on a couple of methods or it is across the entire API?
No, its just for two properties. Hence easily managable. On pyDataverse's repo, I could add a workflow that adds the respective item types before generation.
Did Payara fix /openapi upstream? See https://github.com/IQSS/dataverse/issues/9981#issuecomment-2032641856
Jan Range said:
Perfect! Those are not shown anymore.
I wonder why are only these 2 arrays not defining the type :thinking:
probably something we could look down the line
In theory if the /openapi is restored by Payara we could or should remove the created endpoints to provide the spec
It seems like these two are the only properties that have this issue. All other ones have a type specified. Maybe there is a missing type annotation in the source code?
The thing is... most of the annotations are not in the code :rolling_on_the_floor_laughing: that is why documenting all the endpoints will require a bigger effort
So for some reason the plugin is getting the type for most of the arrays
but not on particular for this one
We should also determine if we are going to continue using the plugin as part of our solution. As I understand when provided a file (as we are doing now) the spec will be served and that is why @Oliver Bertuch was requesting it to be on the /WEB-INF directory
If we decide to remove the plugin probably the annotations and the other changes will still be required from this PR
or the spec will be just as broken as before showing all the errors
Like in https://dataverse.unc.edu/openapi
I am going to post this on the PR issue
6.2024.4 is out. Let's try this!
Are we still talking about RC1?
No no official release
Ohhhh boys!!!
I looked this morning and it wasn't in. It's hot of the press!
@Don Sizemore beat me to it :see_no_evil: :racecar:
:hot: :bread:
Ohhhhh boys that is Oliver :tada:
Not sure this is the right topic though... I'll move it for you :wink:
3 messages were moved here from #dev > metadata schemas support by Oliver Bertuch.
Somehow I was typing on this channel and ended up on the other one :see_no_evil:
:rolling_on_the_floor_laughing:
:thinking: What is the difference with Payara and Payara web? I know about Payara micro but the build makes all of them :rolling_on_the_floor_laughing:
Different Jakarta EE profile
full vs web
we've always used full
:thumbs_up:
We kind of _require_ full because EJBs
yeah, and because of JMS, I think
See also https://jakarta.ee/specifications. Yes, JMS Is a part of the full platform only, too.
Do we need an image with this version or there is a possibility to generate one?
Is this the right Zulip topic?
No it's not :sweat_smile:
In #containers there probably is a better topic
#containers > β Payara 6.2024.1 ?
But it is related to this issue :eyes:
Sure. We can unresolve it
OK OK ppl put down your pitchforks and torches
:rolling_on_the_floor_laughing:
Jo JP. Go try mvn -Pct clean package -Dbase.image.tag=rev2 if you wanna give 6.2024. 4 a shot...
Requires a change to the app image, too...
Omg! Let me try this, brb :partying_face:
I only need to change init_3_wait_dataverse_db_host.sh?
Yes
Anything else is contained in rev2 base image
Building
gdcc/dataverse:unstable was found but does not match the specified platform: wanted linux/arm64, actual: linux/amd64" Not Found: 404
:sad:
Oh no... I just pushed my local image...
That is amd64 only
Sorry!!!
I forgot about that
I'll look into pushing a different image later/tomorrow that is actually multiarch
Sounds good I will try again when you have something :smile: thanks!!
I've just pushed a multiarch build. Note that I removed the jattach/wait4x pieces from the PR as we can merge them independent from the Payara upgrade. So you don't need the changes to the db wait script for now.
Amazing! I will check asap and let you know @Oliver Bertuch ! Thanks!
So we're close to knowing if the Payara-provided /openapi is working again? As of this PR Upgrade to Payara 6.2024.4 #10495
Would it be the same tag @Oliver Bertuch ? I am trying with "mvn -Pct clean package -Dbase.image.tag=rev2" and still get the same error
This should be a quick test once we get it running on 6.2024.4 so :fingers_crossed:
Oh it seems I had to delete my existing images beforehand... here we go wish me luck :rolling_on_the_floor_laughing:
It is alive!
But I don't think it is serving the generated OpenAPI files
from WEB-INF
To the left is the one generated by Payara to the right the one from Smallrye
I'm confused. It's serving up both OpenAPI YAMLs now? The Payara builtin one and the SmallRye one?
My PR still has the endpoints I implemented
So I can get both on the same installation
http://localhost:8080/openapi [left]
http://localhost:8080/api/info/openapi/yaml [right]
Great! What does @Jan Range think about this?!? He's our number 1 OpenAPI customer! :crazy:
Well I think we first have to decide what we do :rolling_on_the_floor_laughing:
If loading the OpenAPI spec from the WAR doesn't work, we should probably investigate and potentially create an issue at Payara's
But at this point Payara should be serving the generated api from META-INF right @Oliver Bertuch ?
I'm not sure if we're doing it right and have the file in the right place ....
I will do some testing in a bit, doing some detective work on a test :laughter_tears:
OK back to this :sweat_smile:
So in the containers we have inside the deployment 2 META-INF directories:
Vendors are required to fetch a single document named
openapiwith an extension ofyml,yamlorjson, inside the application moduleβs rootMETA-INFfolder.
https://download.eclipse.org/microprofile/microprofile-open-api-3.1.1/microprofile-openapi-spec-3.1.1.html#_location_and_formats
I suspect this to be /opt/payara/deployments/dataverse/META-INF
You could try by adding a simple test spec file in there and see if it turns up
And maybe disable scanning?
Nice! Sorry for following up so late. I will build and check it, but judging from the screenshots, it looks great! I am so happy that it is integrated now :tada: Great work!
openapi.yaml
This is a sample of the generated file from Payara @Jan Range
Sadly it seems it send us back to square one with the validations :rolling_on_the_floor_laughing:
Will put it through the code gen :smile:
Alright, I have put it through the validation and now there are in total 58 errors. Most of which are related to duplicate operationIds and tags. I will compile all of those to get an overview.
Others are related to parameters defined in the path differ to those given in the parameters object.
@Oliver Bertuch right now the configuration on maven is set to
<openapi.outputDirectory>${project.build.outputDirectory}/META-INF</openapi.outputDirectory>
I am not sure how to move it to /opt/payara/deployments/dataverse/META-INF :sweat_smile: any suggestions?
Is there a way I can work through these errors by myself and open a PR? Or is it a bit more complex to resolve these?
Probably our efforts at this point could be one of the two:
For me I am inclined to move forward 2 but would like to hear what you guys think, going through the errors I see for example Payara generates the operationId based purely on the name of the method while smallRye will attach the name of the class to generate an ID (Don't you love "standards"?)
As part of the solution No 2, we would have to check in a case by case but I think is better to have less dependencies
image.png
These are the duplicated errors, would just have to rename some java methods and that should have no impact on the code
Another set of errors is this:
True, option number 2 would be the cleanest. But wouldn't this require changing the method's name to avoid duplicate operation IDs?
Yeah but those methods are normally never used since they are called mostly by the URL defined by java annotations
So in theory you could rename them with no impact
not sure about the path parameters
The smallrye plugin can be configured what should be used as name
Is this ID important for the generation of client code?
Well having duplicated IDs is marked as an error
Which small rye gets away with because it concatenates the name of the class
but Payara is an evil fish
so it just uses the name of the method
I'm leaning towards option 1. Payara was broken before and it might break again. Also, the spec would be nice to add to the guides or as other downloadable artifacts
So we have a method resetCurationLabelSet on Admin and also on Datasets
Is the smallRye dependency big or changing frequently?
as an example
It's a build time dependency. Doesn't cost us much
Well if payara breaks that would be a reason to leave the custom endpoints
Juan Pablo Tosca Villanueva schrieb:
So we have a method
resetCurationLabelSeton Admin and also on Datasets
Yes, unfortunately these have to be unique even if they live in another module in the client code :-(
Oh wait we're talking about the endpoint?
No but if the justification to leave smallrye is that Payara can break then it would make sense to add a custom endpoint because if it breaks again even if it is serving a custom file created by small rye it wold be... well... broken :laughing:
True that :grinning:
Which of both would require more work to put in?
I could help renaming the methods in Payara's case. For SmallRye I would maybe need to get an idea of how to apply it.
You're right @Juan Pablo Tosca Villanueva Should we disable the normal endpoint?
Otherwise people might end up with the wrong spec...
We could even add our own endpoint in the same place when we disable OpenAPI on the server...
We'd need to modify the spec file anyway with the server deets etc
I just see pros and cons of going way, what makes me nervous about using smallrye as a default option is having a dependency that can lose support at some point... in theory, people could switch to other servers like we could move to an alternative from Payara if we are compliant with the default but if something happens to smallrye then it is a GG
Huh? Smallrye is no small thing... it's the MP implementation used by Redhat and many other Jakarta servers as well es Quarkus.
I honestly doubt it'll just go away
I think the second one could be safer because we could just have an integration test that tests the /openApi, a test like this would have prevented us from upgrading to specific versions of Payara where this could be broken
:shrug:
Tough decision :grinning:
I manually created a WAR with both files on /opt/payara/deployments/dataverse/META-INF sadly they were not served, still ignored. I am making a test with only the YAML file soon
Same :sad:
Maybe it's just Payara being a bug fest. For all of the MP stuff exist TCK checks. Might be worth a look if this part of the spec is tested and how.
Do we have the ability to disable /openapi? The one served by Payara automatically, I mean.
We should be
Ah, great. Let's add that to whatever pull request. Seems useful.
For our containers we probably should put this either in our base images or in the init_2 shellscript as asadmin command
The init_2 thing defines a pre/postboot thing, that should do the trick
Do you want it enabled or disabled in the base image?
For experimentation it's probably best to put it into the app script for now
We can decide later what we want to do
When disabled, we should be able to provide our own /openapi
yeah
I wonder if it would cause a deployment failure if this is not disabled. since the application would try to take over and already established path
I'm not sure who'd win
In spring the application would just not start if there are 2 fighting for the same
not sure on J2EE
Please, Jakarta EE. This isn't 2006.
:see_no_evil:
I wish this was 2006 :rolling_on_the_floor_laughing:
@Juan Pablo Tosca Villanueva wanna try this?
Disable OpenAPI in Payara?
I am making a delicious copy pasta while we speak :spaghetti:
Just edit this line here: https://github.com/IQSS/dataverse/blob/131e76cf77d82b2535556b62f050e869cb96d937/src/main/docker/scripts/init_2_configure.sh#L33
echo "set-openapi-configuration --enabled=false" >> "${DV_POSTBOOT}"
That should do the trick
I hope. Painstakingly typed on my phone
I hope postboot isn't too late... not sure if we need preboot.
And the syntax might be wrong :stuck_out_tongue_closed_eyes:
For now and this test I can disable it from the console :rolling_on_the_floor_laughing:
Bah
I will try that
But just I need to move the endpoint since we declare paths at a class level
Yes, it will require a second jersey config
Ok I am adding it rn otherwise i have to restart the server every time lol
Interestingly enough... You don't even need to disable if you define an @WebServlet("/openapi")
See... that's what I meant - they might have put a safeguard in place to just retract if someone is using that endpoint
I still can leave it open so ppl can define a parameter and get a json version :smiling_devil:
Hrhr
May I suggest not using a parameter?
IIRC the original endpoint supports accept content headers
Should be in the spec :grinning_face_with_smiling_eyes:
Well, the endpoint itself is not on the "API" and would be considered part of the spec
also since it lives outside /api/
I probably could just remove it for the sake of simplicity :shrug:
The JSON format will live :salute:
MPOA Spec says /openapi, default Yaml, support content negotiation
:rolling_on_the_floor_laughing:
Fair enough
Wait
Stupid me
:eyes:
It _is_ a parameter
brah
:rolling_on_the_floor_laughing:
![]()
It is?
Dang
I'm getting old and tired
Content format
The default format of the /openapi endpoint is YAML.
Vendors must also support the JSON format if the request contains an Accept header with a value of application/json, in which case the response must contain a Content-Type header with a value of application/json.
Query parameters
No query parameters are required for the /openapi endpoint. However, one suggested but optional query parameter for vendors to support is format, where the value can be either JSON or YAML, to facilitate the toggle between the default YAML format and JSON format.
It's both. :see_no_evil:
Sorry for the confusion
:thinking:
is "format" the suggested name of the parameter?
Yes
hell yeah, that is the name i had :rolling_on_the_floor_laughing:
Go to the HTML page to have decent formatting
But we should also support the header request
It's not hard to do...:sweat_smile:
There are examples in the main API
No, no, just summarizing
Well our api uses AbstractApiBean this would be an HttpServlet
Working on it i will post shortly
or maybe not that shortly lol
My wife just called so she may pick me up from work soon :sweat_smile:
but maybe** tonight
Please note that using the proper JAX-RS way is probably just requiring you to add a single class file that contains all of it.
AbstractApiBean is a tech debt left over from ancient times
If you do curl -H 'Accept: application/json' https://dataverse.unc.edu/openapi you get JSON out.
What would take priority in case that someone sends on the header application/json but on the parameter yaml
Headr > param > default ?
Error
Format and accept unaligned is a 400
Or make it a 418
Dataverse Easteregg
https://dataverse.unc.edu/openapi?format=json
also valid
huh, interesting!
@Jan Range did you see this? If you prefer JSON over YAML
We are reverse engineering it :rolling_on_the_floor_laughing: I love it
@Philip Durbin both are fine for me
Just an easy going guy
Fun fact, if we go for this solution we don't even need to wait for 2024.4
I updated the PR I am adding some tests for this, just basic testing but I need to add it to integration-test.txt right?
If it's an IT test, yes.
At least we know ahead of time if it breaks :smile:
Right. Hmm, would we have held up the 6.0 release if we knew /openapi was broken? :thinking:
Probably could go with a different minor version
for the payara upgrade
definitely
Saying that... it has been a long time since 6.0 until now for Payara to fix that but at least now we have a solution that works even on the current version
Ok everything is up
Ok tests passed and the thing looks good to me, idk what do you guys think is next for this PR :smile:
Is it served up at /openapi?
Yes
Nice. Did you have to disable the version from Payara?
No need to upgrade to payara or disable the original one
ok, it overwrites what's there?
Yeah it takes over the path :smile:
nice
:pray:
Tests also were added
I'm finally looking at the PR, https://github.com/IQSS/dataverse/pull/10328
Overall, looks good.
@Jan Range, you're still happy with it?
Still happy :smile: If there is enough time, I would test it a last time before merging.
Ok. Do you need it on a server? Or you can test locally?
I would clone the PR and use mvn to start it
Will test after dinner :smile:
Great. You're our easy baby. :grinning: No fussing. :baby:
@Juan Pablo Tosca Villanueva did you see https://quobix.com/vacuum/ doesn't like @Pattern(regexp="^[^:<>;#/\"\\*\\|\\?\\\\]*$" in FileMetadata.java? It's the one error. The rest are warnings.
:thinking:
Is this a new endpoint?
Let me see
"schema pattern should be a ECMA-262 regular expression dialect"
The error comes from here: https://github.com/daveshanley/vacuum/blob/v0.10.0/functions/openapi/schema_type.go#L151
:see_no_evil:
And you really don't want to see the output of the new OpenAPI doctor
:rolling_on_the_floor_laughing:
It's exhausting even looking at all these errors. :dizzy:
I wonder if this is something new implemented on vacuum
I just tested all the older versions of the spec and they show the same error
but they were passing
Oh there was an update 4 days ago
https://github.com/daveshanley/vacuum/releases/tag/v0.10.0
I'm going to change "Getting the OpenAPI Specification" to "Getting the OpenAPI Document" because that's what they call it: https://spec.openapis.org/oas/latest.html#openapi-document
@Jan Range if this shows up in the openapi doc, would you use it?
pattern: "^[^:<>;#/\"\\*\\|\\?\\\\]*$"
I'm asking because https://quobix.com/vacuum/ complains about it. "schema pattern should be a ECMA-262 regular expression dialect" (it's a Java regex).
For more context, this is the regex we use for filenames ("label" in the code). It's on the FileMetadata object.
SyntaxError: Unterminated character class at <eval>:1:15(3)
:thinking:
Where did you get this?
I may have a replacement for the MultiBody btw
https://github.com/pdurbin/vacuum/commit/e25b3b60017cc87799e90cee90f723f0994efea5
I opened an issue: https://github.com/daveshanley/vacuum/issues/503
Actually... I am making more tests brb :sweat_smile:
I modified the schema provided by doctor
Where I included
testfield:
patter: "^[^:<>;#/\"\\*\\|\\?\\\\]*$"
type: string
Which both doctor and vacuum validates
patter instead of pattern?
:upside_down:
Ok I fixed it...
So now we get on doctor: schemaΒ patternΒ should be a ECMA-262 regular expression dialect
https://json-schema.org/understanding-json-schema/reference/regular_expressions
So same as before.
Should SmallRye translate the Java regex into a Javascript regex?
mmmm
it seems what is causing trouble is the \\\\
or escaping \
if you put as value ^[^:<>;#/"\*\|\?\]*$ or "^[^:<>;#/\"\\*\\|\\?\\]*$"
It is a valid value
Should we open an issue? Treat it as something to follow up on? https://docs.datalad.org/projects/dataverse/en/latest/settingup.html#dataverse-limitations is related, by the way.
Yeah let's wait to hear what they have to say
I will make the change for Unirest
Wait to hear on https://github.com/daveshanley/vacuum/issues/503 ?
Yes
Sure, sounds good.
At least until I solve the unirest
Hi @Jan Range :wave: If you have a chance could you test this file? Specifically the multipart items please? :smile:
openapi.yaml
@Philip Durbin I made some changes, it seems that to begin with... we never needed the unirest class in there. I am surprised it wouldn't cause an error since we are not using it.
Let's wait to hear what @Jan Range reports about his testing with the latest changes
At this point I don't think I need to make any other changes so I will go and review Jim's PR
Sounds good. The biggest problem now is that tests are failing. I think I saw something about testSingleRecordOaiSet in Slack.
:eyes:
Here's the thread: https://iqss.slack.com/archives/C010LA04BCG/p1716987433514219
But tests were passing yesterday no?
I am so confused lol
this was merged last week :thinking:
Link me to where tests were passing.
:see_no_evil: I wasn't looking on the details :sweat_smile:
Sorry I normally go there when I do QA
For me the priority is to fix develop. Then merge develop into all PRs.
:see_no_evil:
I can approve as soon as it is done with the tests
:upside_down:
Nice, yes https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/job/PR-10600/1/testReport/ looks good. I always look at the api time. 13 minutes. When the tests don't run, it's only a few seconds.
Thanks for approving!
Alternate fix: https://github.com/IQSS/dataverse/pull/10601
I just went through the doc and ran a validation. These are the open points:
multipart schema must be an objectβ All multipart bodies are currently typed as string. This can also be addressed in preprocessing prior to code generation.items attribute required for arrays in OpenAPI 3.0.X documents- Two array properties do not contain the items key. Also, easy to fix in pre-processing.duplicate-properties - hTTPUpload (HTTPUpload) will collide with httpUpload (HTTPUpload) [line 11325] when converted to field name- Also addressable in pre-processing.Overall looks great to me!
@Philip Durbin What do you think? :looking:
I just saw the reminder again that he is off today :sweat_smile:
We got an answer from vacuum:
The regular expressions supported by vacuum are ECMA-262 compatible only. This is because we use a built in JavaScript engine to compile the regular expression. If it fails - the pattern is no good and does not conform to the standard.
I also asked chat GPT about our regex :rolling_on_the_floor_laughing:
The regular expression ^[^:<>;#/\"\\*\\|\\?\\\\]*$ is not valid according to the ECMA-262 (ECMAScript) standard. Let me explain why:
Character Classes and Negation:
In ECMA-262, character classes (denoted by square brackets) allow you to specify a set of characters that match any one character from that set.
However, the negation (^ inside the brackets) is not allowed within character classes. In other words, you cannot negate the entire set of characters using [^...] in a character class.
Escape Sequences:
Some of the characters in the expression are correctly escaped (e.g., \\, \"), but others are not (e.g., *, |, ?).
In ECMA-262, special characters should be escaped using a single backslash (\). For example, \\, \", and \* are valid escape sequences.
Quantifier Position:
The * quantifier (zero or more occurrences) should be placed after the character set, not directly after the start anchor ^.
Valid: ^[^:<>;#/\"\\*\\|\\?\\\\]*$
Invalid: ^[^:<>;#/\"\\*\\|\\?\\\\]*$
Valid Alternative:
If you want to match any string that does not contain the specified special characters, you can use the following valid expression:
/^[^:<>;#/\"*|?\\]*$/
I guess we will have to talk about ECMA 262 regex on tech hours :thinking:
Is it a bug in SmallRye? Should they be translating a Java regex into a ECMA-262 regex?
I don't think it is a bug
It seems our regex are just for Java which would make sense if we are using only java
but since this is an endpoint used with JSON and JS frameworks probably we should standardize to use ECMA-262 regex
So we should improve our regex. Do you want to create an issue for that?
Sure! :smile:
In that issue we should also mention that people want non-ASCII characters in filenames. DataLad for example: https://docs.datalad.org/projects/dataverse/en/latest/settingup.html#dataverse-limitations
"Dataverse will not accept names like Γnderungen or DΓ©chiffrer, due to the Γ and Γ© in them."
https://github.com/IQSS/dataverse/issues/10607
Branch is updated so let's see Jenkins is being good today :smile:
Seems that all tests passed :smile:
/me looks at https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/job/PR-10328/41/testReport/
Yep, the main thing I look for is that the API tests actually ran. 13 minutes. Good.
approved
merged: https://github.com/IQSS/dataverse/pull/10328
We should do a Merge Party! Great work @Juan Pablo Tosca Villanueva and @Philip Durbin
OMG!!! this definitely needs a party, now looking forward to make improvements :sweat_smile:
I will supply the beer :stuck_out_tongue:
And here it is! https://beta.dataverse.org/openapi
And actually, @Jan Range I know we talked about where you can get the latest openapi YAML file. Is https://beta.dataverse.org/openapi good enough for your purposes? It should usually be available just a few minutes after a PR is merged into the develop branch.
Also there are Integration Tests on this pr so if something breaks we will know :wink:
@Philip Durbin Of course, this is perfect! Totally forgot about the beta instance :smile:
Yeah, me too. JP reminded me. Seems like a good solution, at least for the latest in develop.
I still think that at release time we might want to push that version into the guides or maybe as an artifact of the release. Not sure. Not sure if it would be useful or not, honestly.
But I guess that could be a new Zulip topic. :grinning:
@Jan Range I just left you a comment: https://github.com/IQSS/dataverse/issues/9982#issuecomment-2168206837
@Dimitri Szabo thanks for reminds us about these older issues that are still open: https://github.com/IQSS/dataverse/pull/10328#issuecomment-2168073456
Yes! Thanks @Dimitri Szabo
@Dimitri Szabo do you know the person who opened https://github.com/IQSS/dataverse/issues/9612 ?
@Juan Pablo Tosca Villanueva does https://redocly.com/docs/cli/installation/#docker work for you?
Haven't tested
I was using vacuum mostly for validation :sweat_smile:
we suspect an infinite loop at the component level, because there are LOTS of them, and when trying to display the structure of a response from an endpoint it crashes.```
UNC still has the older version of payara where the spec is there
And we were able to validate that one with vacuum
I'm getting "openapi.yaml does not exist or is invalid."
With redocly?
yeah
docker run --rm -v $PWD:/spec redocly/cli lint openapi.yaml
"does not exist or is invalid." Is kind of anoying error message lol
"Guys we can only affort to put one error message on this application so make it count" :rolling_on_the_floor_laughing:
I guess does not exist probably was meant that the "spec" itself doesn't exist ont he document but it is finding the file :shrug:
I want to try the IBM one
Yeah, I don't know why redocly doesn't work for me. Oh well.
It actually worked for me
< ... 1391 more problems hidden > increase with `--max-problems N`
openapi.yaml: validated in 206ms
β Validation failed with 957 errors and 534 warnings.
run `redocly lint --generate-ignore-file` to add all problems to the ignore file.
This is the one validated from UNC:
< ... 1093 more problems hidden > increase with `--max-problems N`
openapi.yaml: validated in 167ms
β Validation failed with 804 errors and 389 warnings.
run `redocly lint --generate-ignore-file` to add all problems to the ignore file.
Philip Durbin said:
Dimitri Szabo do you know the person who opened https://github.com/IQSS/dataverse/issues/9612 ?
@Philip Durbin sirineREKIK is a former team member, we (@Steven Ferey + @JΓ©rΓ΄me Roucou + me) are in charge now
@Juan Pablo Tosca Villanueva thanks, when I'm back in the office I'll ask you to show me what I'm doing wrong. :upside_down:
Want to get on a zoom call? I just copy pasted your command lol
@luddaniel is it ok if we close https://github.com/IQSS/dataverse/issues/9612 ? It was opened with the old format (generated by Payara). Your team could open a fresh issue if you want for the new format (generated by SmallRye).
docker run \
--rm --tty \
--volume "$PWD:/data:ro" \
ibmdevxsdk/openapi-validator:latest \
--ruleset ruleset.yaml \
openapi.yaml
Summary:
Total number of errors : 502
Total number of warnings : 5567
That is from IBM on UNC's yaml
New openapi:
Summary:
Total number of errors : 650
Total number of warnings : 5824
More errors and warnings with the new openapi? ![]()
I posted on the issue
And the answer is "depends who you ask"
Each tool just throws you a different set of metrics
https://github.com/IQSS/dataverse/issues/9612#issuecomment-2168307243
@Jan Range does some post-processing of the document so we could add something like that but I would just go and fix the problem from the root but that is what we mentioned will require a lot of work and will take time :smile:
Nice comment. Since sirineREKIK left I think I'd prefer fresh issues from a current stakeholder such as @luddaniel
And ideally the issues would be small enough that we can size them to fit in a sprint.
That way, bit by bit, we can resolve the errors over time.
Philip Durbin said:
luddaniel is it ok if we close https://github.com/IQSS/dataverse/issues/9612 ? It was opened with the old format (generated by Payara). Your team could open a fresh issue if you want for the new format (generated by SmallRye).
you can close it :)
2 issues no code, what a great job team :rolling_on_the_floor_laughing:
Closed! Thanks! I see @JΓ©rΓ΄me Roucou left a follow up comment saying things are better. He can display the docs.
Jokes aside we want to get this OpenAPI in a better place for everyone and we will work to do that as soon as we can :smile:
I saw that! :+1:
I'm not sure how we decide which errors to fix first. I guess we'll let the people who use it tell us. :grinning:
We can make a raffle
:rolling_on_the_floor_laughing:
Pick a number between 1 and 5824
:see_no_evil:
1337
There you are :laughing:
We'll beg the community to fix our bugs. That's my usual approach. :sweat_smile:
Last updated: Nov 01 2025 at 14:11 UTC