Giter Site home page Giter Site logo

Release topojson v1.0 about topojson HOT 15 CLOSED

mattijn avatar mattijn commented on May 24, 2024
Release topojson v1.0

from topojson.

Comments (15)

martinfleis avatar martinfleis commented on May 24, 2024 3

Just a small followup. After your question I decided to write up the explanation of algorithms here http://martinfleischmann.net/line-simplification-algorithms/

from topojson.

mattijn avatar mattijn commented on May 24, 2024 2

Btw: this was the end of my summer of code. Tomorrow back to work..

from topojson.

mattijn avatar mattijn commented on May 24, 2024 1

Thanks for the detailed feedback. Very much appreciated. As you have noticed I started to push the last bits of open issues.

In coming week I'll distill the above mentioned points in separate issues, and create a new release candidate version (v1.0rc9), based on the current code base.

In the period after will continue to improve the documentation of the manual and inline docstrings. No new features will be introduced and if no apparent bugs appear the v1.0 release will be the same as the v1.0rc9.

from topojson.

martinfleis avatar martinfleis commented on May 24, 2024

Hi @mattijn, I just came across this repo. Really good work!
In GeoPandas we had some discussion and attempts to have a direct io to Topojson (geopandas/geopandas#645, geopandas/geopandas#610). While it is simple to read through Fiona, we don't have a way of saving to Topojson now.

Instead of the PR mentioned above, I would like to see it done using your package, so I am checking with you to see whether it is a safe (stable) option yet. https://github.com/calvinmetcalf/topojson.py seem to be unmaintained and with a fraction of options.

If you need a hand pushing the last bit, let me know, I might find a moment.

from topojson.

mattijn avatar mattijn commented on May 24, 2024

Thanks @martinfleis, for your comment. There are still a few open issues to solve before releasing a 1.0 version. A bit of pushing is definitely helpful. Potential integration with geopandas might be that push! :)

Besides the defined issues, there are two things to be aware of to find out if this repo is useful for integration:
1. Speed.
Computing a topology can be cost-intensive. This whole repo is more or less build around the shared_paths function. But there is also the current bottleneck (here in code). It is because the shared_paths function cannot do broadcasting. I tried using pygeos (its documented in this issue), but the bottleneck seems to be on the GEOS side.

I never compared this repo to the unmaintained https://github.com/calvinmetcalf/topojson.py repo in terms of speed and correctness, but the start principles differs. This repo builds on shapely and numpy, where the other is solely using python.

2. Toposimplify
To be honest: I don't really understand the units of the toposimplify parameter. The required value is much different for geographic data in degrees or in meters. There is no issue yet, but in the (slowly appearing) documentation page, I mentioned this:

I don’t really know what the current input value means, but I do know that there is currently NO option to use a %-value (like in mapshaper.org).

It would be a great contribution if you can make the toposimplifiy setting work using percentage as input!

This would also make it possible to have an improved interactive experience, since the exact toposimplify value is almost never equal.

So to answer your question, its close to stable, but not yet ready.

from topojson.

martinfleis avatar martinfleis commented on May 24, 2024

Thank you!

I don't really understand the units of the toposimplify parameter.

For Douglas-Pecker algorithm, epsilon is in the same units as geometry, that is why there is so big difference between degrees and meters. I like this animation from Wikipedia, which is clear on what is epsilon.

image

Epsilon in this case is the half of the width of the blue buffer around the line connecting furthest points. If the point happens to be within the buffer (i.e. its distance from the line connecting furthest points is smaller than epsilon), it gets deleted.

For Visvalingam-Whyatt algorithm, epsilon is different, aerial value, so it will be larger. VW constructs triangles between the points along the line and removes those points, which are associated (are on the top of) with the triangle which area is smaller than epsilon.

I hope it is clear.

In both cases, values are depending on the actual units. It is then complicated to guess default value. That is likely the reason why shapely's simplify does not have default and user has to specify it. That might be good way for topojson as well.

Computing a topology can be cost-intensive

I'll need to explore your code to say anything meaningful on speed during topology creation.

One speedup could be possible in here, if we used vectorized pygeos version of simplify instead of loop through shapely.

topojson/topojson/ops.py

Lines 529 to 541 in 238fb29

if package == "shapely":
if input_as == "array":
list_arcs = []
for ls in linestrings:
coords_to_simp = ls[~np.isnan(ls)[:, 0]]
simple_ls = geometry.LineString(coords_to_simp)
simple_ls = simple_ls.simplify(epsilon, preserve_topology=False)
list_arcs.append(np.array(simple_ls).tolist())
elif input_as == "linestring":
for idx, ls in enumerate(linestrings):
linestrings[idx] = ls.simplify(epsilon, preserve_topology=False)
list_arcs = linestrings

GeoPandas is close to 0.8 release, so I would like include toposimplify (geopandas/geopandas#1387) in 0.9. There's some time to release topojson 1.0 in the meantime.

from topojson.

martinfleis avatar martinfleis commented on May 24, 2024

there is currently NO option to use a %-value (like in mapshaper.org)

The % is mapshaper is the number of retained points. I don't think there's any option how to do that using GEOS and simplification. You would have to implement the algorithm by yourself and have percentage as a stop point during its iteration.

from topojson.

mattijn avatar mattijn commented on May 24, 2024

That is great @martinfleis! Really nice explanation of differences between the two epsilons. Love it.

I created #73, which provides an example of the time-expensive bottleneck I mentioned above.

from topojson.

mattijn avatar mattijn commented on May 24, 2024

@martinfleis, as you might have noticed, I've released version 1.0rc10. I said before that there will be no other releases after rc9, but after all your feedback + much appreciated suggestions I have made quite some changes that do not yet deserve a full 1.0 release. Any feedback still much welcome!

from topojson.

martinfleis avatar martinfleis commented on May 24, 2024

@mattijn I've seen, the performance gain is great. What are the remaining bottlenecks now? I did not have a chance to actually see what has changed.

from topojson.

mattijn avatar mattijn commented on May 24, 2024

Compared to rc9 I've worked my way through Extract, Join, Cut and Dedup to fix bottlenecks in each of them. In Hashmap I've made sure that it works OK with all changes made, but I did not fix or change bottlenecks there. With some timings for larger files (it varies a bit by file) most of the time is now spend in the _backward_arcs function within the Hashmap class.

If no big issues appear I'll release 1.0 beginning of July and then probably wait till shapely 2.0 has arrived and see what can be done better with the new functionalities in there.

from topojson.

martinfleis avatar martinfleis commented on May 24, 2024

I'll try to find a moment to play with the master in the following days.

wait till shapely 2.0 has arrived and see what can be done better with the new functionalities in there.

It will be essentially what is in pygeos now, so you can already play with that (I know you already did). If you have pygoes in your environment, you can use geopandas master which uses pygeos geometry under the hood (https://geopandas.readthedocs.io/en/latest/install.html#using-the-optional-pygeos-dependency), so you can work directly with the arrays.

from topojson.

mattijn avatar mattijn commented on May 24, 2024

I've checked with geopandas master including pygeos and all still works:

import geopandas as gpd
import topojson as tp

print(f'geopandas: {gpd.__version__}\ntopojson:  {tp.__version__}')
gpd.options.use_pygeos = True

gdf = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
topo = tp.Topology(gdf.query('continent == "Africa"')).toposimplify(4)

topo.to_svg()
topo.to_gdf().head(1)
geopandas: 0.7.0+79.g9f4c995
topojson:  1.0rc10

africa

geometry id pop_est continent name iso_a3 gdp_md_est
0 POLYGON ((33.90369 -0.95000, 39.20219 -4.67675... None 53950935 Africa Tanzania TZA 150600.0

Regarding pygeos related things to investigate:

  • Probably its necessary to change a few things related to the STRtree
  • It will be interesting to investigate if it is possible to use GEOSRelate (DE-9IM, boundary-boundary) once its landed in pygeos (this could be test with current shapely already).
  • Since geometries in pygeos/shapely 2.0 are immutable, maybe can use these and use numpy.asarray() representations for a view.

from topojson.

martinfleis avatar martinfleis commented on May 24, 2024

Indeed, everything should work, there's a compatibility layer in GeoPandas.

Probably its necessary to change a few things related to the STRtree

Look at new query and especially query_bulk options. Both are backwards compatible, so they work even without pygeos and are super fast (with pygeos).

from topojson.

mattijn avatar mattijn commented on May 24, 2024

Released v1.0 on PyPi and GitHub tag v1.0 reflects this.

from topojson.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.