Comments (6)
I agree on point 1.
My feeling on 2 is that the primary_column should be restricted to be a top-level column, for a couple of reasons:
- As we are designing a geo-format, it feels natural that the geographic information is available at the top level.
- We are at the 0.1 stage of the spec, so I think it is best to have this restriction now and review it later. It will make it easier to make headway with implementations.
- How does this interact with https://github.com/geopandas/geo-arrow-spec/? We don't want to add flexibility to the GeoParquet spec that makes it hard to implement in the linked GeoArrow spec.
from geoparquet.
Thanks for the great feedback!
For 1. I think the column path makes good sense.
For 2. I lean towards restricting the primary geometry column to be top-level, so that conversion to GeoJSON / shapefile is clear and straightforward to implement. And I suppose making primary_column optional makes sense, but I feel like it'd be good to have something nudging people towards defining it if possible. But I certainly see the usefulness of allowing big Parquet datasets that just have a nested geospatial value to be compliant without making them say 'this is a geo file'.
Call 11/7
For the first version (1.0.0) we want to limit geometry columns to the top level only. Very few geospatial packages would be able to understand nested geometry columns. But if someone has a use case for nested geometry columns, we can potentially add support in the future.
The repetition of a geometry column must be optional or required (not repeated).
We need to update the spec's description of geometry columns to state explicitly that groups are not supported and that the repetition must be required or optional.
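To make the rule from the call concrete, here is a minimal sketch of how a reader or validator might check it. The `SchemaNode` class and `check_geometry_column` function are hypothetical names invented for illustration; they model a Parquet schema in plain Python and are not part of the GeoParquet spec or of any Parquet library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaNode:
    """Hypothetical, simplified model of one node in a Parquet schema."""
    name: str
    repetition: str  # "required", "optional", or "repeated"
    # Non-empty only for groups (nested structures).
    children: List["SchemaNode"] = field(default_factory=list)


def check_geometry_column(schema: List[SchemaNode], column: str) -> List[str]:
    """Return a list of problems; an empty list means the column passes."""
    top_level = {node.name: node for node in schema}
    if column not in top_level:
        # Columns nested inside a group are not visible at the top level.
        return [f"'{column}' is not a top-level column"]
    node = top_level[column]
    problems = []
    if node.children:
        problems.append(f"'{column}' is a group, not a primitive column")
    if node.repetition not in ("required", "optional"):
        problems.append(f"'{column}' has repetition '{node.repetition}'")
    return problems


# Illustrative schema: a top-level geometry plus a repeated group that
# holds a nested geometry (assumed column names).
schema = [
    SchemaNode("geometry", "required"),
    SchemaNode("floors", "repeated", [SchemaNode("footprint", "optional")]),
]

print(check_geometry_column(schema, "geometry"))
print(check_geometry_column(schema, "footprint"))
```

Under these assumptions, the top-level `geometry` column passes, while a geometry nested inside the repeated `floors` group is rejected because it is not reachable as a top-level column.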
I think it is the right decision for v1.
But I also wonder if there are many geospatial packages that support multiple geometry columns? I would think most that don't support nesting / repetition would also ignore all the columns besides "primary_column", and then nesting / repetition of additional geometry columns should not matter :).
We do have several customers who use repeated geometry columns. Typically, the primary geometry column is a top-level required column, and it is broken into parts, which are stored as nested and/or repeated columns. What I remember:
- a building, with its individual floors as a repeated geometry,
- a linestring path, with a repeated struct containing the vertices of that linestring along with additional data columns (think of the M/Z dimensions on steroids, where you can have many columns of arbitrary types for each vertex).
In these cases the primary geometry column is non-nested and non-repeated, but there are other columns that are nested inside a repeated struct.
Yeah, I can imagine this will be something that is revisited. From a writer's perspective, given that Parquet is capable of representing repeated and group fields, it is somewhat odd that a "geo" extension would restrict that. I guess we are anticipating the needs of readers in adding this restriction - but it may turn out to be unnecessarily restrictive.
> But I also wonder if there are many geospatial packages that support multiple geometry columns?
GeoPandas supports this, and it seems R sf does as well (https://cran.r-project.org/web/packages/sf/vignettes/sf6.html#how-does-sf-deal-with-secondary-geometry-columns). PostGIS supports this as well (https://gis.stackexchange.com/questions/176263/can-a-postgis-table-or-view-have-two-geometry-columns).
I know that GDAL also supports this in their OGR data model and C API, but it depends on the bindings to GDAL whether it's actually supported (I know that the python bindings right now will only return a single (first) geometry column).
I can certainly see the use case of repeated (list/array type) geometry columns. I also assume that databases (like BigQuery) that have both a proper array type and geometry/geography type will typically not limit combining those two in a repeated geometry type?