
Comments (26)

hobu avatar hobu commented on June 12, 2024 6

I don't understand why people are being stubborn about the format.

Because writing a specification that diverse implementation audiences can succeed with is very difficult. Most of the non-geo software world has no clue what WKT is or how to dereference an EPSG code into a coordinate system, and they don't ever care to. GeoParquet aspires to a much wider audience than the spatial-is-special crowd, and it needs implementation buy-in in these other communities to get traction beyond it. Larding up the specification with conveniences like allowing many different coordinate system description formats makes it harder to provide complete implementations and increases the interoperability leakage between those implementations.

I would argue that the spatial-is-special world's two most impactful specifications, Shapefile and GeoJSON, could attribute a lot of their market penetration to the fact they don't provide much guidance in regard to coordinate systems. By not imposing that complexity on implementers, they focused on the part of the interoperability that matters – the geometries. I argue the same thirst exists in the communities that would also implement geoparquet.

from geoparquet.

hobu avatar hobu commented on June 12, 2024 5

add support wkt or wkt2 formats

Which version? There are several (of each).

Do you mean any version? If so, then you've just imposed all of the (varied, inconsistent, and incompatible) history of WKT onto every implementer of the format.
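To illustrate the "which version?" problem, here is a minimal sketch of the first thing a reader accepting "any WKT" would have to do: guess the dialect family from the root keyword. The function name `guess_wkt_family` and the keyword sets are assumptions made for this illustration. The sets are not exhaustive, and the sketch does not tell WKT2:2015 apart from WKT2:2019 or handle vendor variants of WKT1.

```python
import re

# Root keywords commonly seen in each family (illustrative, not exhaustive).
WKT1_ROOTS = {"GEOGCS", "PROJCS", "GEOCCS", "VERT_CS", "COMPD_CS", "LOCAL_CS"}
WKT2_ROOTS = {
    "GEODCRS", "GEODETICCRS", "GEOGCRS", "GEOGRAPHICCRS",
    "PROJCRS", "PROJECTEDCRS", "VERTCRS", "VERTICALCRS",
    "ENGCRS", "ENGINEERINGCRS", "COMPOUNDCRS", "BOUNDCRS",
}


def guess_wkt_family(wkt):
    """Return 'WKT1', 'WKT2', or 'unknown' based only on the root keyword."""
    # WKT allows either '[' or '(' as the opening delimiter.
    match = re.match(r"\s*([A-Za-z_]+)\s*[\[(]", wkt)
    if not match:
        return "unknown"
    root = match.group(1).upper()
    if root in WKT1_ROOTS:
        return "WKT1"
    if root in WKT2_ROOTS:
        return "WKT2"
    return "unknown"


print(guess_wkt_family('GEOGCS["WGS 84",DATUM["WGS_1984"]]'))
print(guess_wkt_family('GEOGCRS["WGS 84",DATUM["World Geodetic System 1984"]]'))
```

Knowing the family is only the start: the reader still needs a parser for that dialect and a parameter database to interpret what it finds.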

The case for PROJJSON is very clear:

  • Casual implementations can extract ID keys from it without writing a parser. This means that CRS interpretation is conveniently opt-in instead of hard-required for readers and writers who might otherwise be simply pushing data along through a pipeline.
  • Its structure can be validated directly with JSON schema. Content validation requires a database of parameters.
  • WKT support is very uneven across many language families that have strong Parquet/Arrow support. IMO this is due to the fact that its grammar is so whacky and the fact that it requires a gigantic database that's managed by an entirely different organization than the one that manages the WKT specification(s). PROJJSON doesn't escape the latter, but the former is conveniently sidestepped in every modern computing language that matters today.
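As a rough illustration of the first bullet, the sketch below pulls the identifier out of a PROJJSON document using only a standard JSON library. The helper name `extract_crs_id` and the trimmed sample document are assumptions for this example. A real PROJJSON document carries many more members, and the top-level `id` is only present if the writer included it.

```python
import json


def extract_crs_id(projjson_text):
    """Return 'AUTHORITY:CODE' from a PROJJSON string, or None if no top-level id exists."""
    crs = json.loads(projjson_text)
    if not isinstance(crs, dict):
        return None
    crs_id = crs.get("id")
    if not isinstance(crs_id, dict):
        return None
    authority = crs_id.get("authority")
    code = crs_id.get("code")
    if authority is None or code is None:
        return None
    return f"{authority}:{code}"


# Heavily trimmed PROJJSON, for illustration only.
sample = json.dumps({
    "type": "GeographicCRS",
    "name": "WGS 84",
    "id": {"authority": "EPSG", "code": 4326},
})

print(extract_crs_id(sample))
```

A pipeline that only moves data along can stop here and pass the identifier on. Interpreting the CRS itself stays opt-in.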

It is a huge deficiency that the geospatial standards community doesn't have a JSON-based CRS format. The impedance caused by the content of WKT not being expressed in any common grammar has been a huge gate-keeping industry deficiency for decades. The OGC CRS SWG is planning to start with PROJJSON to make a CRSJSON, but who knows what that will devolve into. PROJJSON, however, exists, can have its syntax validated with common tools, and can be conveniently parsed.

#wktrantoff


hobu avatar hobu commented on June 12, 2024 5

Not everyone uses projjson or the associated tools. Many people are in the ArcGIS space.

It is easy to install and use PROJ from an ArcPro Conda environment. It works quite well.

Concretely, would this mean that certain geoparquet readers couldn't read certain geoparquet files, if the reader doesn't happen to implement projjson support?

If the specification allows multiple flavors of CRS, most writers will choose vanilla – raw EPSG codes. That means readers will have to go somewhere else to get the parameters those codes describe. Or they will always use the one code that everyone knows and can describe by heart, 4326 😄

The case against PROJJSON so far is:

  • The Java language family has no convenient, zero-effort implementation that people can pull off the shelf to give their software fully capable read, write, and interpretation ability for PROJJSON
  • Neither does Rust
  • Nor does JavaScript, but at least PROJJSON is conveniently JSON

What's missing here is these languages don't have a complete open source implementation of the data model that describes WKT2, which is published in ISO 19162 and OGC 18-010. They're missing because writing one is a ton of detailed, thankless work to implement a complex and necessarily complicated data model. PROJJSON is a very faithful expression of that model in JSON, and @rouault found many interpretation nits and bugs in the specification as he built PROJJSON because of its complexity.

Maybe a transpile of the full PROJ engine to WASM is within reach. Maybe Apache SIS has a full 19162 model ready to go but just needs the PROJJSON i/o built for it. I don't have the answers here, but it seems to me users in those software ecosystems need to strengthen their capabilities to meet the requirement regardless of whether or not geoparquet requires PROJJSON or allows every flavor of WKT to describe the coordinate system of data.

PROJJSON is advantageous because it can meet data readers half way – if users have a full interpretation engine they can use it. If they don't, they can pluck the keys and codes that they know about without writing a custom parser and interpretation engine.


cholmes avatar cholmes commented on June 12, 2024 4

Great discussion everyone - I think I'm going to close this issue soon as we discussed extensively before 1.0, and I think we've gone over most of the points again. I think we can all acknowledge that our choice of PROJJSON was our most 'controversial' choice in the specification, but I don't think we'll revisit that until a '2.0' version of GeoParquet.

And having 'multiple' options (PROJJSON plus WKT2 for example) that impose higher requirements on readers, forcing them to understand both dialects if they want to read any possible GeoParquet format, is not something desired for GeoParquet. Philosophically this is not in line with the choices we've made for this format - we want to make it as easy as possible for implementations to be created without a deep stack of geospatial software behind it.

I do think we should continue to work to encourage and even find funding for software that does not yet understand PROJJSON, especially open source implementations. And I will state that we actively want ESRI to implement GeoParquet fully, and the stubbornness on this particular issue is in service of greater interoperability. But until it's fully implemented it seems fine to me for ESRI to just support lat/long, or to use 'most' of GeoParquet and do their own crs metadata that is WKT2 as a bridge.


urschrei avatar urschrei commented on June 12, 2024 3

Neither does Rust

@hobu We (the georust greater co-prosperity sphere) have good bindings to libproj if it can be used. And if it can't, we'll write a native implementation.


rouault avatar rouault commented on June 12, 2024 2

PROJJSON has no Java implementation or Java binding

If it is not already available in it, it hopefully shouldn't be too hard to add to https://github.com/OSGeo/PROJ-JNI which is a JNI binding of PROJ.

Otherwise https://github.com/rouault/projjson_to_wkt could be quickly ported to Java to convert PROJJSON to WKT2 (@m-mohr ported it to JavaScript), but I'm not sure GeoTools understands WKT2. There might be in progress work regarding WKT2:2019 support in https://github.com/apache/sis


jiayuasu avatar jiayuasu commented on June 12, 2024 1

PROJJSON has no Java implementation or Java binding. This becomes a blocker for Apache Sedona and for any big data system in the Java / Scala world, such as HBase, Trino, Hive and so on.

Currently, we have no way to parse or understand PROJJSON but we can understand CRS WKT using GeoTools.


jiayuasu avatar jiayuasu commented on June 12, 2024 1

Thanks guys for the help. So I guess the solution for us is:

  1. The Sedona side will implement a Java version of https://github.com/rouault/projjson_to_wkt that converts a PROJJSON string to WKT1/WKT2:2019
  2. We will use GeoTools WKT1 for now. When Apache SIS finishes WKT2:2019, we will migrate to WKT2:2019.

But this just solves the reading projjson problem. How about writing a WKT1 / WKT2 string to projjson?


rouault avatar rouault commented on June 12, 2024 1

  1. It converts projjson string <> WKT1/WKT2:2019

projjson_to_wkt has this important warning "Warning: while the export to WKT1 should be syntaxically correct, datum, projection method or parameter names will be the one of WKT2, and thus a number of implementations will in practice fail to understand such WKT1 strings."


rouault avatar rouault commented on June 12, 2024 1

Many people are in the ArcGIS space.

https://www.esri.com/content/dam/esrisites/en-us/media/legal/open-source-acknowledgements/arcgis-pro-3-3-open-source-disclosure.zip has an ArcGIS Pro 3_3 Open Source Disclosure.xlsx file mentioning a "proj_gdal_e.dll" file. Time to make active use of it ;-)


nyalldawson avatar nyalldawson commented on June 12, 2024 1

@achapkowski while you're active in the open source community, mind getting someone at ESRI to comment on OSGeo/gdal#9980 ? Having an open driver for this format benefits everyone, ESRI included. 👍


jiayuasu avatar jiayuasu commented on June 12, 2024 1

Pinging Apache SIS core developer @desruisseaux since he is much more knowledgeable than me on this 😁:

Is PROJJSON support on Apache SIS's roadmap?

Chris, please feel free to close the issue since this is off topic :-)


desruisseaux avatar desruisseaux commented on June 12, 2024 1

Even is correct, Apache SIS supports WKT 1 and WKT 2:2015 (it was the first open source software to support WKT 2 after the ESRI prototype) with work for WKT 2:2019 in progress right now. It also supports GML, which is currently the only format capable of fully supporting the ISO 19111:2007 model. If I understood correctly, PROJJSON doesn't fully cover the ISO 19111 model yet, which is one reason why OGC wants to review it before approving a JSON format. If we want CRSJSON to be a replacement for GML, then it should be at least as capable as GML.

I plan to support OGC CRSJSON in Apache SIS once the specification is advanced enough. Whether SIS will support PROJJSON will depend on how many differences there are. Note that the OGC CRS working group has explicitly stated in their charter that they will avoid any unnecessary difference with PROJJSON.

One correction to what has been said in a previous comment: WKT 2 is not a data model. The model is ISO 19111, and WKT is an encoding of that model. Libraries do not implement a WKT model. They implement ISO 19111, then establish a mapping from WKT elements to that model. This is what both Apache SIS and PROJ C++ API do. One reason for the WKT complexity is that its mapping to ISO 19111 is not straightforward, as WKT makes compromises in an attempt to be more compact and for backward compatibility. The consequence is that trying to understand WKT without prior knowledge of ISO 19111 is confusing. For understanding WKT, ISO 19111 must be read first. If a JSON encoding does a more direct mapping to ISO 19111 elements, it may help to reduce that confusion.

The CRS standardization effort at OGC is led mainly by Roger Lott. My experience in working with him for more than 10 years is that he is very reliable. When he says that he will do something, he really does it, and he is much, much better than me at following the roadmap.


kylebarron avatar kylebarron commented on June 12, 2024

More discussion on PROJJSON was had in #90 and #96


rouault avatar rouault commented on June 12, 2024

If it is not already available in it, it hopefully shouldn't be too hard to add to https://github.com/OSGeo/PROJ-JNI which is a JNI binding of PROJ.

well, I was forgetting that you could also use the GDAL JNI bindings to convert PROJJSON to WKT1 using
https://gdal.org/java/org/gdal/osr/SpatialReference.html#SetFromUserInput(java.lang.String) to import PROJJSON
and https://gdal.org/java/org/gdal/osr/SpatialReference.html#ExportToWkt() to export to WKT, using PROJ underneath. Of course that's a bit of a heavy dependency


paleolimbot avatar paleolimbot commented on June 12, 2024

JNI is a non-starter for many Java libraries in the big data ecosystem, let alone PROJ via JNI. For PROJJSON to be a possibility in that ecosystem somebody would probably need to step up and do the implementation work in Java (as Even noted, it might not be too difficult and there is some readily available prior art to draw from).

In the absence of that, excluding an entire ecosystem seems worse than allowing a widely supported CRS representation into our metadata.


m-mohr avatar m-mohr commented on June 12, 2024

The conversion work from Python to JS was 1 hour of work with ChatGPT. It's likely not much more in Java. If that's too hard to do, then the ecosystem doesn't really seem to want it, I'd say?

Anyway, if we add other encodings, please make them additive only, not a replacement for PROJJSON. Otherwise you also exclude non-WKT2 supporting ecosystems again.

Also, can we clarify whether Java supports WKT1 or 2? That's quite a difference...


rouault avatar rouault commented on June 12, 2024

Also, can we clarify whether Java supports WKT1 or 2? That's quite a difference...

AFAIK GeoTools supports WKT1 only: https://docs.geotools.org/stable/javadocs/org/geotools/api/referencing/doc-files/WKT.html
Apache SIS supports WKT2:2015 (and WKT1), with in-progress work to add WKT2:2019.


achapkowski avatar achapkowski commented on June 12, 2024

Not everyone uses projjson or the associated tools. Many people are in the ArcGIS space.


TomAugspurger avatar TomAugspurger commented on June 12, 2024

Anyway, if we add other encodings, please only additive, not instead of PROJJSON.

Concretely, would this mean that certain geoparquet readers couldn't read certain geoparquet files, if the reader doesn't happen to implement projjson support? I'd worry about that causing an (IMO unnecessary) schism and confusing users and data providers.


m-mohr avatar m-mohr commented on June 12, 2024

Yeah, if the other encodings are not additive. That makes it more difficult for writers though, but I feel like ease of reading is more important than ease of writing?

Ideally everyone would support PROJJSON though.


achapkowski avatar achapkowski commented on June 12, 2024

Not everyone uses projjson or the associated tools. Many people are in the ArcGIS space.

It is easy to install and use PROJ from an ArcPro Conda environment. It works quite well.

You obviously never worked in closed secure environments. Not everyone can pip or conda install stuff.


hobu avatar hobu commented on June 12, 2024

Not everyone uses projjson or the associated tools. Many people are in the ArcGIS space.

It is easy to install and use PROJ from an ArcPro Conda environment. It works quite well.

You obviously never worked in closed secure environments. Not everyone can pip or conda install stuff.

https://anaconda.org/esri/proj4 it seems like Esri is already explicitly supporting PROJ usage?

Anyway, I do not see "Esri doesn't support it (yet)" as a valid argument against it.


jorisvandenbossche avatar jorisvandenbossche commented on June 12, 2024

PROJJSON is advantageous because it can meet data readers half way – if users have a full interpretation engine they can use it. If they don't, they can pluck the keys and codes that they know about without writing a custom parser and interpretation engine.

I think this is an important point that @hobu makes. We actually have an example of that in the spec specifically for OGC:CRS84 (https://github.com/opengeospatial/geoparquet/blob/v1.0.0/format-specs/geoparquet.md#ogccrs84-details), but I think that should apply more in general (with the only requirement that the files were created by a writer that includes those codes).
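A minimal sketch of what "plucking the codes" could look like for a reader that has no CRS engine, working directly on the JSON text of the GeoParquet `geo` file metadata. The function name `describe_column_crs`, the return labels, and the trimmed sample metadata are assumptions for this illustration. It relies on my reading of the 1.0 spec that an absent `crs` key means OGC:CRS84 and an explicit `null` means the CRS is unknown, and it only works when the writer included an `id` in the PROJJSON.

```python
import json


def describe_column_crs(geo_metadata_text, column):
    """Summarize a geometry column's CRS without interpreting PROJJSON."""
    meta = json.loads(geo_metadata_text)
    col = meta["columns"][column]
    if "crs" not in col:
        # Absent key: the spec's default CRS applies.
        return "OGC:CRS84"
    crs = col["crs"]
    if crs is None:
        # Explicit null: CRS is undefined or unknown.
        return "unknown"
    crs_id = crs.get("id") if isinstance(crs, dict) else None
    if isinstance(crs_id, dict) and "authority" in crs_id and "code" in crs_id:
        return f"{crs_id['authority']}:{crs_id['code']}"
    # No identifier to pluck: a full PROJJSON interpreter is needed.
    return "needs-full-parser"


# Trimmed metadata with hypothetical column names, for illustration only.
geo = json.dumps({
    "version": "1.0.0",
    "primary_column": "geom_default",
    "columns": {
        "geom_default": {"encoding": "WKB", "geometry_types": []},
        "geom_utm": {
            "encoding": "WKB",
            "geometry_types": [],
            "crs": {"type": "ProjectedCRS", "name": "WGS 84 / UTM zone 31N",
                    "id": {"authority": "EPSG", "code": 32631}},
        },
        "geom_unknown": {"encoding": "WKB", "geometry_types": [], "crs": None},
    },
})

for name in ("geom_default", "geom_utm", "geom_unknown"):
    print(name, describe_column_crs(geo, name))
```

Readers with a full engine can hand the `crs` object to it instead. This fallback only covers files whose writers included those codes.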


achapkowski avatar achapkowski commented on June 12, 2024

Since proj supports multiple formats. https://proj.org/en/9.4/faq.html

I don't understand why people are being stubborn about the format.


jjimenezshaw avatar jjimenezshaw commented on June 12, 2024

Maybe a transpile of the full PROJ engine to WASM is within reach

That would be great. There is already a build of GDAL, https://github.com/bugra9/gdal3.js , that includes PROJ, so it should be easy to "extract" only the part of PROJ that is needed.
The only missing piece (though not strictly mandatory) is the cURL integration to use the grid files from https://cdn.proj.org . Unfortunately this issue is not moving forward: emscripten-core/emscripten#3270
