Comments (15)
Just a small followup. After your question I decided to write up the explanation of algorithms here http://martinfleischmann.net/line-simplification-algorithms/
from topojson.
Btw: this was the end of my summer of code. Tomorrow back to work..
Thanks for the detailed feedback. Very much appreciated. As you have noticed, I started to push the last bits of the open issues.
In the coming week I'll distill the above-mentioned points into separate issues and create a new release candidate (v1.0rc9) based on the current code base.
In the period after that I will continue to improve the manual and the inline docstrings. No new features will be introduced, and if no apparent bugs appear the v1.0 release will be the same as v1.0rc9.
Hi @mattijn, I just came across this repo. Really good work!
In GeoPandas we had some discussion and attempts to add direct IO to TopoJSON (geopandas/geopandas#645, geopandas/geopandas#610). While it is simple to read TopoJSON through Fiona, we don't have a way of saving to it now.
Instead of the PR mentioned above, I would like to see it done using your package, so I am checking with you to see whether it is a safe (stable) option yet. https://github.com/calvinmetcalf/topojson.py seems to be unmaintained and offers only a fraction of the options.
If you need a hand pushing the last bit, let me know, I might find a moment.
Thanks @martinfleis, for your comment. There are still a few open issues to solve before releasing a 1.0 version. A bit of pushing is definitely helpful. Potential integration with geopandas might be that push! :)
Besides the defined issues, there are two things to be aware of to find out if this repo is useful for integration:
1. Speed.
Computing a topology can be cost-intensive. This whole repo is more or less built around the shared_paths function, but that is also the current bottleneck (here in code), because the shared_paths function cannot do broadcasting. I tried using pygeos (it's documented in this issue), but the bottleneck seems to be on the GEOS side.
I never compared this repo to the unmaintained https://github.com/calvinmetcalf/topojson.py repo in terms of speed and correctness, but the starting principles differ. This repo builds on shapely and numpy, whereas the other uses pure Python only.
2. Toposimplify
To be honest: I don't really understand the units of the toposimplify parameter. The required value is very different for geographic data in degrees than for data in meters. There is no issue yet, but in the (slowly appearing) documentation page, I mentioned this:
I don't really know what the current input value means, but I do know that there is currently NO option to use a %-value (like in mapshaper.org).
It would be a great contribution if you can make the toposimplify setting work using a percentage as input!
This would also make it possible to have an improved interactive experience, since the appropriate toposimplify value is almost never the same from one dataset to the next.
So to answer your question: it's close to stable, but not yet ready.
Thank you!
I don't really understand the units of the toposimplify parameter.
For the Douglas-Peucker algorithm, epsilon is in the same units as the geometry, which is why there is such a big difference between degrees and meters. I like this animation from Wikipedia, which makes clear what epsilon is.
Epsilon in this case is half the width of the blue buffer around the line connecting the two end points. If a point falls within the buffer (i.e. its distance from the line connecting the end points is smaller than epsilon), it gets deleted.
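To make the role of epsilon concrete, here is a minimal sketch of Douglas-Peucker in plain Python. It is not what shapely or GEOS run internally; the function names and the sample line are made up for illustration, and the distance is measured to the segment between the end points, which is one common variant.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b; all are (x, y) tuples."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Drop every point that lies within epsilon of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point furthest from the segment first-last.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        # Everything in between is inside the buffer: keep only the ends.
        return [first, last]
    # Otherwise keep the furthest point and recurse on both halves.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right


line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
# Same shape with coordinates 1000 times larger, e.g. km converted to m.
scaled = [(x * 1000, y * 1000) for x, y in line]

print(douglas_peucker(line, 0.5))         # the small bump at (1, 0.1) is removed
print(len(douglas_peucker(scaled, 0.5)))  # same epsilon, nothing is removed
```

The same epsilon of 0.5 removes a point from the first line but nothing from the rescaled one, which is the degrees-versus-meters effect in a nutshell.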
For the Visvalingam-Whyatt algorithm, epsilon is different: it is an areal value (in squared units of the geometry), so it will be larger. VW constructs triangles from consecutive points along the line and removes those points whose triangle (the one with the point at its apex) has an area smaller than epsilon.
I hope it is clear.
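The area-based epsilon of Visvalingam-Whyatt can be sketched in the same way. Again this is only an illustration in plain Python with made-up function names, not the implementation used by any particular library, and it recomputes all areas on every pass for readability rather than speed.

```python
def triangle_area(a, b, c):
    """Area of the triangle a-b-c; all are (x, y) tuples."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def visvalingam_whyatt(points, epsilon):
    """Repeatedly drop the point whose triangle area is smallest and below epsilon."""
    pts = list(points)
    while len(pts) > 2:
        # One triangle per interior point, built with its two neighbours.
        areas = [
            triangle_area(pts[i - 1], pts[i], pts[i + 1])
            for i in range(1, len(pts) - 1)
        ]
        smallest = min(areas)
        if smallest >= epsilon:
            break
        # areas[k] belongs to pts[k + 1], because the end points have no triangle.
        del pts[areas.index(smallest) + 1]
    return pts


line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
# Coordinates 1000 times larger make every triangle area 1,000,000 times larger.
scaled = [(x * 1000, y * 1000) for x, y in line]

print(visvalingam_whyatt(line, 0.5))         # drops (1, 0.1), whose triangle area is 0.1
print(visvalingam_whyatt(line, 2.5))         # also drops (2, 0)
print(len(visvalingam_whyatt(scaled, 0.5)))  # same epsilon, nothing is removed
```

Because the threshold is an area, rescaling the coordinates changes the meaningful range of epsilon quadratically rather than linearly.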
In both cases, the values depend on the actual units, which makes it complicated to guess a default value. That is likely the reason why shapely's simplify does not have a default and the user has to specify it. That might be a good approach for topojson as well.
Computing a topology can be cost-intensive
I'll need to explore your code to say anything meaningful on speed during topology creation.
One speedup could be possible here, if we used the vectorized pygeos version of simplify instead of looping through shapely geometries.
Lines 529 to 541 in 238fb29
GeoPandas is close to the 0.8 release, so I would like to include toposimplify (geopandas/geopandas#1387) in 0.9. There's some time to release topojson 1.0 in the meantime.
there is currently NO option to use a %-value (like in mapshaper.org)
The % in mapshaper is the percentage of retained points. I don't think there's any way to do that using GEOS and simplification. You would have to implement the algorithm yourself and use the percentage as a stopping point during its iteration.
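As a rough sketch of what "percentage as a stopping point" could look like, the following plain-Python example removes the least significant point (smallest triangle area, as in Visvalingam-Whyatt) until a target share of points remains. This is an assumption about how such an option might work, not mapshaper's actual implementation; the function name and the percent parameter are hypothetical.

```python
def triangle_area(a, b, c):
    """Area of the triangle a-b-c; all are (x, y) tuples."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def simplify_to_percentage(points, percent):
    """Keep roughly `percent` percent of the points, never fewer than the two ends."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    target = max(2, round(len(points) * percent / 100))
    pts = list(points)
    while len(pts) > target:
        # Same removal rule as Visvalingam-Whyatt, but the loop stops on a
        # point count instead of an area threshold.
        areas = [
            triangle_area(pts[i - 1], pts[i], pts[i + 1])
            for i in range(1, len(pts) - 1)
        ]
        del pts[areas.index(min(areas)) + 1]
    return pts


line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
print(simplify_to_percentage(line, 60))  # 3 of the 5 points remain
```

The result no longer depends on the units of the coordinates, which is what makes a percentage attractive as an interactive control.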
That is great @martinfleis! Really nice explanation of differences between the two epsilons. Love it.
I created #73, which provides an example of the time-expensive bottleneck I mentioned above.
@martinfleis, as you might have noticed, I've released version 1.0rc10. I said before that there would be no other releases after rc9, but after all your feedback and much appreciated suggestions I have made quite a few changes that do not yet deserve a full 1.0 release. Any feedback is still very welcome!
@mattijn I've seen, the performance gain is great. What are the remaining bottlenecks now? I did not have a chance to actually see what has changed.
Compared to rc9, I've worked my way through Extract, Join, Cut and Dedup to fix bottlenecks in each of them. In Hashmap I've made sure that it works OK with all the changes made, but I did not fix or change bottlenecks there. Based on some timings for larger files (it varies a bit by file), most of the time is now spent in the _backward_arcs function within the Hashmap class.
If no big issues appear I'll release 1.0 at the beginning of July and then probably wait until shapely 2.0 has arrived and see what can be done better with the new functionality in there.
I'll try to find a moment to play with the master in the following days.
wait till shapely 2.0 has arrived and see what can be done better with the new functionalities in there.
It will be essentially what is in pygeos now, so you can already play with that (I know you already did). If you have pygeos in your environment, you can use geopandas master, which uses pygeos geometries under the hood (https://geopandas.readthedocs.io/en/latest/install.html#using-the-optional-pygeos-dependency), so you can work directly with the arrays.
I've checked with geopandas master including pygeos and all still works:
import geopandas as gpd
import topojson as tp
print(f'geopandas: {gpd.__version__}\ntopojson: {tp.__version__}')
gpd.options.use_pygeos = True
gdf = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
topo = tp.Topology(gdf.query('continent == "Africa"')).toposimplify(4)
topo.to_svg()
topo.to_gdf().head(1)
geopandas: 0.7.0+79.g9f4c995
topojson: 1.0rc10
   geometry                                            id    pop_est   continent  name      iso_a3  gdp_md_est
0  POLYGON ((33.90369 -0.95000, 39.20219 -4.67675...  None  53950935  Africa     Tanzania  TZA     150600.0
Regarding pygeos related things to investigate:
- It's probably necessary to change a few things related to the STRtree.
- It will be interesting to investigate whether it is possible to use GEOSRelate (DE-9IM, boundary-boundary) once it has landed in pygeos (this could already be tested with current shapely).
- Since geometries in pygeos/shapely 2.0 are immutable, maybe we can rely on that and use numpy.asarray() representations as a view.
Indeed, everything should work, there's a compatibility layer in GeoPandas.
Probably its necessary to change a few things related to the STRtree
Look at the new query and especially query_bulk options. Both are backwards compatible, so they work even without pygeos and are super fast (with pygeos).
Released v1.0 on PyPI; the GitHub tag v1.0 reflects this.