Stream: geospatial

Topic: ✔ extract lat/long #9331


view this post on Zulip Philip Durbin 🚀 (Mar 28 2023 at 13:16):

@Jan Range Hi! Just curious if you have any data or code for me for #9331. No rush! Thanks!! :heart:

view this post on Zulip Jan Range (Mar 28 2023 at 14:43):

Hi Phil! Just pushed the notebook to colab and git :-)

view this post on Zulip Jan Range (Mar 28 2023 at 14:43):

https://github.com/JR-1991/NetCDF-Example

view this post on Zulip Jan Range (Mar 28 2023 at 14:43):

Will post it in the issue :heart:

view this post on Zulip Philip Durbin 🚀 (Mar 28 2023 at 14:45):

Thanks!!

view this post on Zulip Philip Durbin 🚀 (Apr 03 2023 at 20:58):

@Jan Range thanks for the code! I demo'ed it to @Ana Trisovic and Amber this morning.

Does https://linestrings.com/bbox/#-124.7666666333333,25.066666666666666,-67.058333300000015,-124.7666666333333 look right in terms of the bounding box you extracted?

Screenshot-2023-04-03-at-16-54-05-https-__linestrings.com.png

I'm asking because I'm hacking around with our Java code. It's pretty easy to pull out attributes like this, using the library we added in PR #9152:

netcdfFile.findGlobalAttribute("geospatial_lon_min");

view this post on Zulip Philip Durbin 🚀 (Apr 05 2023 at 14:09):

https://www.ncei.noaa.gov/data/international-comprehensive-ocean-atmosphere/v3/archive/nrt/ICOADS_R3.0.0_1662-10.nc is a nice small file with a bounding box of https://linestrings.com/bbox/#343.68,41.8,353.78,343.68

Only 142KB.

From https://data.noaa.gov/onestop/collections/details/9bd5c743-0684-4e70-817a-ed977117f80c?f=temporalResolution:1%20Minute%20-%20%3C%201%20Hour;dataFormats:NETCDF

Screen-Shot-2023-04-05-at-10.05.39-AM.png

view this post on Zulip Jan Range (Apr 05 2023 at 18:22):

Hi Phil! Just came back from the Netherlands :smile: In my code I specified that when the unit is, e.g., degrees east, I extract the maximum to the "longitude east" field and the minimum to the "longitude west" field. Same for the latitude. That made the most sense to me, or is there a specific convention?

view this post on Zulip Jan Range (Apr 05 2023 at 18:23):

Philip Durbin said:

https://www.ncei.noaa.gov/data/international-comprehensive-ocean-atmosphere/v3/archive/nrt/ICOADS_R3.0.0_1662-10.nc is a nice small file with a bounding box of https://linestrings.com/bbox/#343.68,41.8,353.78,343.68

Only 142KB.

From https://data.noaa.gov/onestop/collections/details/9bd5c743-0684-4e70-817a-ed977117f80c?f=temporalResolution:1%20Minute%20-%20%3C%201%20Hour;dataFormats:NETCDF

Screen-Shot-2023-04-05-at-10.05.39-AM.png

Will check this one with my NB!

view this post on Zulip Jan Range (Apr 05 2023 at 18:23):

Btw did you already present? If so, how was it? :smile:

view this post on Zulip Philip Durbin 🚀 (Apr 05 2023 at 18:42):

Yes! @Ana Trisovic and I discussed the design doc (including this extract lat/long topic) yesterday during the community meeting. I thought it went well!

Now I'm hacking away at the extract lat/long thing. Thanks to Leonid, the code is already well organized so I'm adding a NetcdfFileMetadataExtractor near this existing extractor for FITS (astronomy) files: https://github.com/IQSS/dataverse/blob/v5.13/src/main/java/edu/harvard/iq/dataverse/ingest/metadataextraction/impl/plugins/fits/FITSFileMetadataExtractor.java

view this post on Zulip Jan Range (Apr 05 2023 at 20:16):

Sounds great! I am going to convert the H5Web Viewer to HTML/JS/CSS this week and open a PR for the viewers repo

view this post on Zulip Philip Durbin 🚀 (Apr 05 2023 at 20:18):

Great! We even have a thread going for that if you run into any trouble! Hopefully not! https://dataverse.zulipchat.com/#narrow/stream/376593-geospatial/topic/plot.20arrays.20from.20HDF5

view this post on Zulip Philip Durbin 🚀 (Apr 11 2023 at 19:05):

Whoops! The linestrings URL (and screenshot) above is wrong due to a copy/paste error on my part. (I had 343.68 in there twice!) The correct URL should be https://linestrings.com/bbox/#343.68,41.8,353.78,49.62

Also, check this out, if you subtract 360 from the longitudes, you get the same map: https://linestrings.com/bbox/#-16.320007,41.8,-6.220001,49.62

Screen-Shot-2023-04-11-at-2.59.07-PM.png

I started looking into this because our Solr config doesn't like longitudes that aren't between 180 and -180: Bad X value 343.68 is not in boundary Rect(minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0) input: ENVELOPE(343.68,353.78,49.62,41.8)

This issue is related: https://github.com/IQSS/dataverse/issues/9421

view this post on Zulip Philip Durbin 🚀 (Apr 11 2023 at 19:07):

I got the idea of subtracting 360 from https://stackoverflow.com/questions/34883835/latitude-and-longitude-issue-with-out-of-the-boundary-values but I'm not sure it's the best plan! :sweat_smile:
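
A minimal Java sketch of the subtract-360 idea, assuming that any longitude above 180 comes from a 0:360 domain. The class and method names (`LongitudeShift`, `toSigned`) are hypothetical and are not taken from the Dataverse code base.

```java
// Hypothetical helper for illustration; not the Dataverse implementation.
final class LongitudeShift {

    // Assumption: a value above 180 belongs to the 0:360 domain,
    // so subtracting 360 moves it into -180:180.
    // Values at or below 180 are returned unchanged.
    static double toSigned(double longitude) {
        return longitude > 180.0 ? longitude - 360.0 : longitude;
    }

    public static void main(String[] args) {
        // West and east longitudes of the ICOADS bounding box from the thread.
        System.out.println(toSigned(343.68));
        System.out.println(toSigned(353.78));
        // A value that is already inside -180:180 passes through.
        System.out.println(toSigned(20.0));
    }
}
```

When both corners of a box are above 180, both shift by the same amount, so west stays smaller than east after the conversion.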

view this post on Zulip Ana Trisovic (Apr 11 2023 at 21:25):

Hi All :smiley:

view this post on Zulip Jan Range (Apr 13 2023 at 06:15):

Seems like the range is just shifted from 0-360° to -180°-180° - Hence if you have something like 200° in the "normal" degree interval, you can transform it by subtracting 180° and thus 20° would be the equivalent for the latter range. But I am not too sure, maybe we could ask Amber to be sure?

view this post on Zulip Jan Range (Apr 13 2023 at 06:42):

In the NetCDF files I found i.e. lat is described as a minimum and maximum of either degree west or east. To me it feels a bit more intuitive to think of a single direction rather than two opposite directions where you need to do a conversion. What do you think?

Attached an image that illustrates this. Having a single unit of degree might also help to calculate width and height of the bounding box, if that is necessary.

Bildschirmfoto-2023-04-13-um-08.28.37.png

view this post on Zulip Philip Durbin 🚀 (Apr 13 2023 at 16:04):

I made a draft PR: https://github.com/IQSS/dataverse/pull/9523

view this post on Zulip Philip Durbin 🚀 (Apr 13 2023 at 18:56):

I just deployed that PR to https://dev1.dataverse.org if anyone wants to try uploading some NetCDF files.

view this post on Zulip Ana Trisovic (Apr 13 2023 at 20:37):

Hey! I had the exact same questions and I opened an issue at cf-conventions here: https://github.com/cf-convention/cf-conventions/issues/435 What do you think?

view this post on Zulip Ana Trisovic (Apr 13 2023 at 20:50):

Here is another funny PR https://github.com/pdurbin/dataverse/pull/2

view this post on Zulip Philip Durbin 🚀 (Apr 13 2023 at 21:27):

@Ana Trisovic great! but now that I've made that draft PR above, you can make a PR into that if you want.

view this post on Zulip Philip Durbin 🚀 (Apr 13 2023 at 21:30):

Also, the cf-conventions issue is super interesting.

I guess they call it the "domain".

"Coordinate values in CF files must be stored monotonically increasing or decreasing, so you could extract the first and last coordinate values stored to determine the domain."

"As Karl said, coordinate variables have to be monotonic, so you only need to check the first and last coordinates rather than the whole thing. If one of them is > 180, the domain is 0:360. If one of them is <0, the domain is -180:180. If both are between 0 and 180, the answer is indeterminate."

view this post on Zulip Jan Range (Apr 14 2023 at 13:15):

@Philip Durbin indeed that was a very interesting read, thanks @Ana Trisovic ! So I guess, according to your PR, sticking to [-180, 180] is the plan?

view this post on Zulip Ana Trisovic (Apr 14 2023 at 14:16):

@Jan Range [-180, 180] is more standard than [0, 360], and Dataverse fields (and geo-search, I think) work on [-180, 180].

view this post on Zulip Ana Trisovic (Apr 14 2023 at 14:17):

@Philip Durbin can we extract "the first and last coordinate values" with our Java lib?
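
Once the first and last coordinate values are in hand, the rule quoted above from the cf-conventions issue could be sketched as follows. This is only an illustration: `LongitudeDomain`, `Domain`, and `infer` are hypothetical names, and reading the values out of the file is left out entirely.

```java
// Hypothetical sketch of the rule quoted from cf-conventions issue #435.
final class LongitudeDomain {

    enum Domain { ZERO_TO_360, MINUS_180_TO_180, INDETERMINATE }

    // first and last are the first and last values of a monotonic
    // longitude coordinate variable.
    static Domain infer(double first, double last) {
        // Any value above 180 can only come from the 0:360 domain.
        if (first > 180.0 || last > 180.0) {
            return Domain.ZERO_TO_360;
        }
        // Any negative value can only come from the -180:180 domain.
        if (first < 0.0 || last < 0.0) {
            return Domain.MINUS_180_TO_180;
        }
        // Both values lie between 0 and 180, which is valid in either domain.
        return Domain.INDETERMINATE;
    }

    public static void main(String[] args) {
        System.out.println(infer(343.68, 353.78));
        System.out.println(infer(-124.77, -67.06));
        System.out.println(infer(10.0, 170.0));
    }
}
```

Contradictory input (one value above 180 and the other below 0) is not covered by the quoted rule; this sketch simply reports the first check that matches.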

view this post on Zulip Philip Durbin 🚀 (Apr 14 2023 at 14:20):

Right, we've recently introduced a constraint of -180 to 180 for longitude because we now try to index these values into Solr for the purposes of geospatial search. Solr will complain like this for values that are out of the range it wants:

Bad X value 343.68 is not in boundary Rect(minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0) input: ENVELOPE(343.68,353.78,49.62,41.8)
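
To make the constraint concrete, here is a hedged Java sketch that checks the range before building the `ENVELOPE` string. The argument order (west, east, north, south) is read off the error message above; the class `SolrEnvelope` and its method `of` are hypothetical, and nothing here talks to Solr.

```java
// Hypothetical helper; it only builds a string and does not call Solr.
final class SolrEnvelope {

    // Argument order follows the error message in the thread:
    // ENVELOPE(minX, maxX, maxY, minY), i.e. west, east, north, south.
    static String of(double west, double east, double north, double south) {
        if (west < -180.0 || west > 180.0 || east < -180.0 || east > 180.0) {
            throw new IllegalArgumentException(
                    "Longitude outside -180..180: " + west + ", " + east);
        }
        if (south < -90.0 || south > 90.0 || north < -90.0 || north > 90.0) {
            throw new IllegalArgumentException(
                    "Latitude outside -90..90: " + south + ", " + north);
        }
        return "ENVELOPE(" + west + "," + east + "," + north + "," + south + ")";
    }

    public static void main(String[] args) {
        // The ICOADS box after subtracting 360 from both longitudes.
        System.out.println(of(-16.32, -6.22, 49.62, 41.8));
    }
}
```

Calling `of(343.68, 353.78, 49.62, 41.8)` throws in this sketch, mirroring the value Solr rejected.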

view this post on Zulip Philip Durbin 🚀 (Apr 14 2023 at 14:22):

So far we've only extracted the metadata from global attributes.

(It's interesting from that issue that these values are sometimes considered untrustworthy! "In my experience, when it comes to this issue, it's better to check the data explicitly than infer the answer from metadata. Even when the choice of longitude range was specified by convention, I have run into a lot of data that did not conform to the spec. Plus, it's very easy for attributes recording min & max values to become out of sync with the actual contents of the file. Unless I am certain that nobody has touched the file since it was originally written, I regard them as unreliable." -- https://github.com/cf-convention/cf-conventions/issues/435#issuecomment-1505614364 )

So far the Java library has been pretty capable. So I'll say "maybe," I guess, for pulling out actual data. :happy:

view this post on Zulip Ana Trisovic (Apr 14 2023 at 14:24):

Yeah, I can see that happening when a researcher downloads an official dataset, makes changes in the variables but does not update the metadata :sweat_smile: and then uploads the modified data file to Dataverse

view this post on Zulip Jan Range (Apr 27 2023 at 14:25):

Just opened an issue at EasyDataverse for a NetCDF initializer. Has it been decided which method we are using? There were PRs from both of you. I would then adapt it to EasyDataverse.

view this post on Zulip Philip Durbin 🚀 (Apr 27 2023 at 15:57):

I tried to make a little improvement here: https://github.com/IQSS/dataverse/pull/9541/commits/1ee2ea733356e43c74e58bd3a2b66a1107e59022

view this post on Zulip Philip Durbin 🚀 (Apr 27 2023 at 15:57):

Feedback welcome! :sweat_smile:

view this post on Zulip Ana Trisovic (Apr 27 2023 at 17:34):

Hi Phil I think the PR looks great!

view this post on Zulip Ana Trisovic (Apr 27 2023 at 17:34):

I'm inclined to think we should use this PR as the final one

view this post on Zulip Philip Durbin 🚀 (Apr 27 2023 at 17:59):

If you say so! :sweat_smile:

view this post on Zulip Jan Range (Apr 27 2023 at 21:05):

Added it to EasyDataverse - https://github.com/gdcc/easyDataverse/pull/12

view this post on Zulip Philip Durbin 🚀 (Apr 28 2023 at 13:15):

I just bumped into Jeff Blossom while getting coffee. He's off to teach a workshop on cartography, but he said he'd be happy to look at the tests I wrote for the -180/360 domain thing.

view this post on Zulip Philip Durbin 🚀 (Apr 28 2023 at 18:31):

Wow, the workshop goes all day: https://gis.harvard.edu/event/cartography-workshop-spring-2023

view this post on Zulip Philip Durbin 🚀 (Apr 28 2023 at 18:31):

I'll have to try to catch him next week.

view this post on Zulip Jan Range (Apr 29 2023 at 05:55):

That would be great! I was thinking a bit more about the current implementation. We are currently only addressing the obvious cases, where the degree is > 180, but how do we identify values between 0 and 180 that nevertheless come from the [0, 360] domain? Since files from both domains can occur and only one of them needs the transformation, values that are valid in both may be indeterminate.

view this post on Zulip Jan Range (Apr 29 2023 at 05:55):

image.png

view this post on Zulip Jan Range (Apr 29 2023 at 05:57):

Maybe there is some convention that specifically applies to the units found within the NetCDF file?

view this post on Zulip Philip Durbin 🚀 (Apr 29 2023 at 12:19):

Nice diagram. Do you want to upload it to https://github.com/cf-convention/cf-conventions/issues/435 ? :happy:

view this post on Zulip Jan Range (Apr 30 2023 at 06:08):

Will post it :-)

view this post on Zulip Jan Range (Apr 30 2023 at 06:09):

Do you meet on Monday btw?

view this post on Zulip Jan Range (Apr 30 2023 at 08:19):

Alright, re-reading the comment from sethmcg answered the question. Cases within [0°, 180°] are simply indeterminate. If there is no domain information within the NetCDF file, we have no chance. It's a pity that this is not handled appropriately by the standard.

view this post on Zulip Jan Range (Apr 30 2023 at 08:22):

Is it possible, upon displaying the bounding box, to alert the user that the given bounding box is indeterminate? In addition, maybe offer to display it in both domains, [-180°, 180°] and [0°, 360°]?

view this post on Zulip Philip Durbin 🚀 (Apr 30 2023 at 12:07):

Yep, I just created an agenda/notes doc: https://docs.google.com/document/d/1TC2pqAqdxbAaW10nErsdVIrygiHDJ8-trIW2rqOAss4/edit?usp=sharing

view this post on Zulip Philip Durbin 🚀 (Apr 30 2023 at 12:09):

This is pretty much what I'd like to ask Jeff about, if I can. Show him my unit tests. Ask about edge cases. It seems like a lot of indeterminacy.

view this post on Zulip Philip Durbin 🚀 (May 04 2023 at 21:25):

@Jan Range when you get a moment can you please take a look at this comment? Thanks. https://github.com/IQSS/dataverse/pull/9541#discussion_r1183737225

view this post on Zulip Philip Durbin 🚀 (May 04 2023 at 21:26):

Actually, @Ana Trisovic you're welcome to look too, of course. You've been thinking about this. :happy:

view this post on Zulip Jan Range (May 05 2023 at 04:59):

Just added a comment :-)

view this post on Zulip Philip Durbin 🚀 (May 05 2023 at 11:28):

Thanks! I'm hoping to circle back to this today.

view this post on Zulip Jan Range (May 08 2023 at 11:38):

@Philip Durbin @Ana Trisovic I have to leave today's meeting early. Is it okay for both of you to quickly exchange about the lat/long PR? Also @Ana Trisovic the H5Web part is on my to-do list for this week :smile:

view this post on Zulip Philip Durbin 🚀 (May 08 2023 at 11:39):

Whatever is fine! Exchange quickly with you or without you? You mean to put it first on the agenda, right? :happy:

view this post on Zulip Jan Range (May 08 2023 at 11:44):

Yes, that would be awesome :-)

view this post on Zulip Jan Range (May 08 2023 at 11:45):

I think we are close with the extraction though :raised_hands:

view this post on Zulip Philip Durbin 🚀 (May 08 2023 at 16:02):

My PR was approved. It's in "ready for QA" now: #9541

view this post on Zulip Philip Durbin 🚀 (May 08 2023 at 16:02):

I did fix that typo we saw :sweat_smile:

view this post on Zulip Jan Range (May 08 2023 at 16:19):

I believe linestrings works within 0:360 :sweat_smile: The bottom one is from 0 to 360, which starts at Greenwich

image.png

view this post on Zulip Jan Range (May 08 2023 at 16:22):

Also, if the convention is to go from west to east in a bounding box, then the current implementation visually results in the same bounding box.

image.png

view this post on Zulip Jan Range (May 08 2023 at 16:26):

I'll try to come up with some plotting for this in Python so we can play around with it.

view this post on Zulip Jan Range (May 08 2023 at 17:10):

Got a nice visualisation! Feel free to play around :-)

https://colab.research.google.com/drive/168MMLUQgImdZOza2LOHxz211WGFQD3JB?usp=sharing

view this post on Zulip Philip Durbin 🚀 (May 08 2023 at 17:34):

nice globe

globe.png

view this post on Zulip Jan Range (May 08 2023 at 17:35):

Yea, looks pretty!

view this post on Zulip Ana Trisovic (May 08 2023 at 18:40):

I added a comment in the PR https://github.com/IQSS/dataverse/pull/9541#pullrequestreview-1417286036

view this post on Zulip Ana Trisovic (May 08 2023 at 19:17):

IMG_2792.jpg this is my logic

view this post on Zulip Ana Trisovic (May 08 2023 at 19:18):

The black line is 0 to 360 and the green line is -180 to 180. The value(s) just need to be mapped (from black to green) when they are in the red area. This was not obvious to me at all in the beginning :sweat_smile:

view this post on Zulip Philip Durbin 🚀 (May 08 2023 at 19:19):

All these coffee-stained handwritten diagrams are going in the talk. :happy:

view this post on Zulip Philip Durbin 🚀 (May 09 2023 at 19:04):

#9541 has been merged!

view this post on Zulip Jan Range (May 28 2023 at 10:36):

Have you seen the message from Philip Conzett?

As for the content for bounding boxes, the value for West Longitude must be lower than the value for East Longitude; the value for South Latitude must be lower than the value for North Latitude.
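
The ordering rule in that message can be written down as a small check. This is only a sketch: the class `BoundingBoxCheck` and its method `isOrdered` are hypothetical names, and the check covers ordering only, not the value ranges.

```java
// Hypothetical check of the ordering rule quoted above.
final class BoundingBoxCheck {

    // West must be lower than east, and south must be lower than north.
    static boolean isOrdered(double west, double east, double south, double north) {
        return west < east && south < north;
    }

    public static void main(String[] args) {
        // The ICOADS box from the thread, in the -180:180 domain.
        System.out.println(isOrdered(-16.32, -6.22, 41.8, 49.62));
        // Same box with west and east swapped.
        System.out.println(isOrdered(-6.22, -16.32, 41.8, 49.62));
    }
}
```

Under this rule a box whose west value is numerically larger than its east value (for example one that crosses the 180° meridian) would be reported as not ordered.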

view this post on Zulip Philip Durbin 🚀 (May 30 2023 at 20:09):

Yep. I just replied: https://groups.google.com/g/dataverse-community/c/7IgkXYfiFhc/m/TXmOc9a0AgAJ

view this post on Zulip Notification Bot (Aug 11 2023 at 18:27):

Philip Durbin has marked this topic as resolved.


Last updated: Nov 01 2025 at 14:11 UTC