Comments (4)
You can pass the -explodecollections parameter to ogr2ogr to convert multi-part geometries to single part in the output. However, this won't update the area obviously.
I wonder though, could you explain why you want to explode the geometries?
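For reference, a minimal sketch of what that ogr2ogr call could look like, built as an argument list from Python. The file names, the GPKG output format, and the `geometry` column name are assumptions for illustration; `GEOM_POSSIBLE_NAMES` is the CSV driver open option that tells GDAL which column holds the WKT. The command is only assembled and printed here, not executed.

```python
import shlex


def build_explode_command(src, dst, geom_column="geometry"):
    """Return an ogr2ogr argument list that explodes multi-part geometries.

    src, dst and geom_column are hypothetical values chosen for this sketch.
    """
    return [
        "ogr2ogr",
        "-f", "GPKG",                                 # output format (assumed)
        "-explodecollections",                        # one feature per part
        "-oo", f"GEOM_POSSIBLE_NAMES={geom_column}",  # WKT column in the CSV
        dst,
        src,
    ]


cmd = build_explode_command("buildings.csv", "buildings.gpkg")
print(shlex.join(cmd))
```

To actually run it, the list can be handed to `subprocess.run(cmd, check=True)` on a machine with GDAL installed. As noted above, any area attribute is copied unchanged to every exploded part.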
from open-buildings.
You can pass the -explodecollections parameter to ogr2ogr to convert multi-part geometries to single part in the output. However, this won't update the area obviously.
Ah, good to know. But yeah, this feels like it needs a bit more customization than what you can do with GDAL out of the box.
I wonder though, could you explain why you want to explode the geometries?
It's really just for this particular Google buildings dataset. It's distributed as CSV with WKT geometries, and a small percentage of the geometries are multipolygons (certainly less than 1%, perhaps even less than 0.1%?). The dataset was clearly made by computer vision people who don't understand geospatial, and in my experience a number of tools work better if you have all of one geometry type. Yes, shapefile munges them together, so most 'can deal', but it feels far cleaner to have exactly one geometry type - especially with these buildings, it makes sense to me that each building would be one row.
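A quick way to check what share of rows are multipolygons is to tally the leading WKT keyword per row. This is only a sketch on a tiny in-memory sample: the column names `area_in_meters` and `geometry` and the sample rows are assumptions, not taken from the real files.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the assumed layout: an area column plus a WKT column.
SAMPLE_CSV = '''area_in_meters,geometry
12.5,"POLYGON((0 0, 1 0, 1 1, 0 0))"
20.0,"POLYGON((5 5, 6 5, 6 6, 5 5))"
50.0,"MULTIPOLYGON(((0 0, 1 0, 1 1, 0 0)), ((2 2, 4 2, 4 4, 2 2)))"
8.0,"POLYGON((9 9, 10 9, 10 10, 9 9))"
'''


def count_geometry_types(csv_file, geom_column="geometry"):
    """Count rows per WKT geometry type (the keyword before the first '(')."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        wkt = row[geom_column].lstrip()
        counts[wkt.split("(", 1)[0].strip().upper()] += 1
    return counts


counts = count_geometry_types(io.StringIO(SAMPLE_CSV))
multi_share = counts["MULTIPOLYGON"] / sum(counts.values())
print(counts, multi_share)
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper.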
But as I mentioned in #53 (comment) it'd be much nicer to just have a clean library that compares read and write performance from any major format to another. I'd not even include 'csv' in that, and it wouldn't need to do any exploding of geometries.
You can pass the -explodecollections parameter to ogr2ogr to convert multi-part geometries to single part in the output. However, this won't update the area obviously.
Ah, good to know. But yeah, this feels like it needs a bit more customization than what you can do with GDAL out of the box.
It depends... you don't need to do customizations but without them the area will have to be recalculated for all rows... which is a bit less efficient with that low a percentage of exploded rows.
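To illustrate the "only touch the exploded rows" idea, here is a rough pure-Python sketch that splits a 2D MULTIPOLYGON WKT string into POLYGON strings and adjusts the area only when a row really has more than one part. Two loud assumptions: the WKT is plain 2D text without `EMPTY` or `Z`/`M` markers, and the original area is split in proportion to each part's planar (shoelace) area in coordinate units, which is a stand-in for a proper geodesic recalculation rather than a replacement for it. All function names are hypothetical.

```python
def split_multipolygon(wkt):
    """Return the POLYGON WKT strings contained in a 2D (MULTI)POLYGON."""
    text = wkt.strip()
    if not text.upper().startswith("MULTIPOLYGON"):
        return [text]
    body = text[text.index("(") + 1 : text.rindex(")")]
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "(":
            if depth == 0:
                start = i  # opening parenthesis of one member polygon
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                parts.append("POLYGON " + body[start : i + 1])
    return parts


def ring_area(ring_text):
    """Unsigned shoelace area of one ring given as 'x y, x y, ...'."""
    pts = [tuple(map(float, p.split()[:2])) for p in ring_text.split(",")]
    twice = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
    return abs(twice) / 2.0


def planar_area(polygon_wkt):
    """Planar area of a POLYGON: exterior ring minus any holes."""
    inner = polygon_wkt[polygon_wkt.index("(") + 1 : polygon_wkt.rindex(")")]
    rings = [r.strip().strip("()") for r in inner.split("),")]
    areas = [ring_area(r) for r in rings]
    return areas[0] - sum(areas[1:])


def explode_row(wkt, area):
    """Yield (polygon_wkt, area) pairs; single-part rows keep their area."""
    parts = split_multipolygon(wkt)
    if len(parts) == 1:
        return [(parts[0], area)]
    weights = [planar_area(p) for p in parts]
    total = sum(weights)
    return [(p, area * w / total) for p, w in zip(parts, weights)]


multi = "MULTIPOLYGON (((0 0, 1 0, 1 1, 0 1, 0 0)), ((2 2, 4 2, 4 4, 2 4, 2 2)))"
exploded = explode_row(multi, 50.0)
untouched = explode_row("POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))", 7.0)
```

Since well under 1% of rows are multipolygons, nearly every row takes the early return and keeps its stored area.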
I wonder though, could you explain why you want to explode the geometries?
It's really just for this particular Google buildings dataset. It's distributed as CSV with WKT geometries, and a small percentage of the geometries are multipolygons (certainly less than 1%, perhaps even less than 0.1%?). The dataset was clearly made by computer vision people who don't understand geospatial, and in my experience a number of tools work better if you have all of one geometry type. Yes, shapefile munges them together, so most 'can deal', but it feels far cleaner to have exactly one geometry type - especially with these buildings, it makes sense to me that each building would be one row.
OK. I always do the other way around: if there is a mixture, I convert everything to multipolygon so it can be stored in one table/file.
FYI: pyogrio automatically converts all geometries to MultiPolygons if you save a GeoDataFrame with both Polygons and MultiPolygons.
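The "promote everything to multipolygon" direction can be sketched on raw WKT as well. This is not how pyogrio does it internally, just a hypothetical string-level helper that assumes plain 2D WKT without `Z`/`M` markers.

```python
def promote_to_multipolygon(wkt):
    """Wrap a 2D POLYGON WKT string as a one-part MULTIPOLYGON.

    Anything that is not a POLYGON (including MULTIPOLYGON) is returned as-is.
    """
    text = wkt.strip()
    if not text.upper().startswith("POLYGON"):
        return text
    rest = text[len("POLYGON"):].strip()
    if rest.upper() == "EMPTY":
        return "MULTIPOLYGON EMPTY"
    # "((ring))" becomes "(((ring)))": one extra level of parentheses.
    return f"MULTIPOLYGON ({rest})"


promoted = promote_to_multipolygon("POLYGON ((0 0, 1 0, 1 1, 0 0))")
unchanged = promote_to_multipolygon("MULTIPOLYGON (((0 0, 1 0, 1 1, 0 0)))")
print(promoted)
```

After this step every row has the same geometry type, so the data fits in one table or file without a mixed-type column.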
But as I mentioned in #53 (comment) it'd be much nicer to just have a clean library that compares read and write performance from any major format to another. I'd not even include 'csv' in that, and it wouldn't need to do any exploding of geometries.
I'm not sure I'll get to it, at least not in the short term, but if you're interested, you can find some other benchmarks involving geo operations I did in the past here: https://github.com/geofileops/geobenchmark
It depends... you don't need to do customizations but without them the area will have to be recalculated for all rows... which is a bit less efficient with that low a percentage of exploded rows.
Yeah, I just meant you can't do an easy one-liner from ogr2ogr that does it all in one. And agreed, a second run just to recalculate area won't make the comparison great. I think it's fine for it to not explode rows, the other two options just enabled this all in one pass.
OK. I always do the other way around: if there is a mixture, I convert everything to multipolygon so it can be stored in one table/file.
Yeah, that's the practical way to do things, given the state of geospatial data formats (shapefile still being widely used) and the state of the tools. With this I was working towards distributing data in a 'better' way, and it just strikes me it's better to be able to differentiate between multi-polygons and polygons. If this was 'facilities' that could include multiple buildings in each then a multipolygon makes sense. If it's supposed to be every building, but some are squeezed into a single geometry then that makes less sense.
FYI: pyogrio automatically converts all geometries to MultiPolygons if you save a GeoDataFrame with both Polygons and MultiPolygons.
Cool - good to know. I forget what tool I was working with but there was one that was barfing if you just threw this dataset at it.
I'm not sure I'll get to it, at least not in the short term, but if you're interested, you can find some other benchmarks involving geo operations I did in the past here: https://github.com/geofileops/geobenchmark
Oh nice! I'll check it out.