Comments (4)

DanRunfola commented on June 3, 2024

Hmmm - that's interesting. I'll dig into this a little further and see if I can get to the root of it.

My guess is that the simplification is producing invalid topology for those shapes, so they're getting caught in the final-stage cleaning for the product. But let me take a closer look at those cases; I'm going to flag this as "needs further research" and get into it as soon as is feasible.

from geoboundaries.

DanRunfola commented on June 3, 2024

Just looping back on this - the issue is the ordering of events in the CGAZ simplification.
By swapping the order so that we clean geometry before simplifying, all releases have the same total number of shapes.

That said, that total is smaller than anticipated - i.e. the lengths are:

[['ADM0', '10', 198],
['ADM0', '25', 198],
['ADM0', '50', 198],
['ADM0', '75', 198],
['ADM0', '100', 198],
['ADM1', '10', 3291],
['ADM1', '25', 3291],
['ADM1', '50', 3291],
['ADM1', '75', 3291],
['ADM1', '100', 3291],
['ADM2', '10', 112961],
['ADM2', '25', 112961],
['ADM2', '50', 112961],
['ADM2', '75', 112961],
['ADM2', '100', 112961]]
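
As a quick sanity check, the list above can be tested for consistency programmatically. This is a minimal sketch, assuming the second field in each row labels the simplification level of a release; the helper names are made up for illustration and are not part of the geoBoundaries build.

```python
from collections import defaultdict

# Rows copied from the comment above: [ADM level, release label, number of shapes].
counts = [
    ["ADM0", "10", 198], ["ADM0", "25", 198], ["ADM0", "50", 198],
    ["ADM0", "75", 198], ["ADM0", "100", 198],
    ["ADM1", "10", 3291], ["ADM1", "25", 3291], ["ADM1", "50", 3291],
    ["ADM1", "75", 3291], ["ADM1", "100", 3291],
    ["ADM2", "10", 112961], ["ADM2", "25", 112961], ["ADM2", "50", 112961],
    ["ADM2", "75", 112961], ["ADM2", "100", 112961],
]


def totals_per_level(rows):
    """Collect the distinct shape totals seen for each ADM level."""
    by_level = defaultdict(set)
    for level, _release, n_shapes in rows:
        by_level[level].add(n_shapes)
    return dict(by_level)


def inconsistent_levels(rows):
    """Return ADM levels whose shape total differs between releases."""
    return sorted(
        level for level, totals in totals_per_level(rows).items() if len(totals) > 1
    )


# An empty list means every release agrees within each ADM level.
print(inconsistent_levels(counts))
```

An empty result only confirms that the releases agree with each other; it says nothing about whether the shared total is the expected one.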

So, I'm digging in to the cleaning to see what we're losing, why, and if we can pull it back. Will update as I learn more; going to aim for 3.1 for a fix once I know the issue.

DanRunfola commented on June 3, 2024

Ok - I think I have this figured out. Essentially, the cause is the one identified earlier - we were doing geometry cleaning after simplification, which resulted in some lost polygons. What I'm going to recommend we push is a CGAZ build in which we do the opposite - clean our geometry first, then simplify second. That will result in the consistent values noted above.

As a second note, we do see some meaningful differences in administrative zones across our products. They are generally as would be expected, but I think we need to create a page on the website somewhere illustrating these differences.

As an example, here is Ireland ADM2:

[['CGAZ', 50431],
['SSCU', 50885],
['SSCGS', 50431],
['HPSCU', 50885],
['HPSCGS', 50438]]

As expected, CGAZ and SSCGS have the same number of administrative zones, as CGAZ is built with SSCGS (our standardized product). The other differences are more meaningful:
HPSCU - 50885 - Every ADM zone, the "Truth" from the perspective of the country.
HPSCGS - 50438 - ADM zones from HPSCU that fall within the United States definition of Ireland's borders - smaller than HPSCU as expected.
You can see the difference between HPSCU and HPSCGS in the image below - red is HPSCU only (not in HPSCGS):
[image]

So, I think that makes sense. The next difference is from HPSCGS to SSCGS - we lose 7 zones when we simplify HPSCGS. That shouldn't be happening, i.e. it should be just like HPSCU vs SSCU.
[['SSCGS', 50431],
['HPSCGS', 50438]]
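
The gaps between products can be worked out directly from the Ireland ADM2 counts listed earlier. This is a minimal sketch; the `zones_lost` helper is a hypothetical name used only for illustration.

```python
# Ireland ADM2 counts copied from the list earlier in this comment.
ireland_adm2 = {
    "CGAZ": 50431,
    "SSCU": 50885,
    "SSCGS": 50431,
    "HPSCU": 50885,
    "HPSCGS": 50438,
}


def zones_lost(counts, source, derived):
    """Number of zones present in `source` but missing from `derived`."""
    return counts[source] - counts[derived]


# Expected: clipping to the standardized borders removes zones.
print(zones_lost(ireland_adm2, "HPSCU", "HPSCGS"))   # 447
# Expected: simplifying the unclipped product loses nothing.
print(zones_lost(ireland_adm2, "HPSCU", "SSCU"))     # 0
# Unexpected: simplifying the clipped product loses zones.
print(zones_lost(ireland_adm2, "HPSCGS", "SSCGS"))   # 7
# CGAZ inherits whatever SSCGS contains.
print(zones_lost(ireland_adm2, "SSCGS", "CGAZ"))     # 0
```

The only gap that is not explained by the product definitions is the 7 zones lost between HPSCGS and SSCGS.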

I think that difference may be due to an oversimplification occurring in the build of SSCGS - specifically, we are applying " -simplify keep-shapes percentage=25%" to SSCGS, which is already built from the SSCU products, so it shouldn't be simplified again.

That error is then propagating to the CGAZ product, resulting in some ADMs being inadvertently dropped.

So - in conclusion, two code changes prompted by this:

  1. In the final CGAZ build, clean before simplification. This may result in some topology issues in the final product, so we'll need to revisit eventually, but should retain accuracy.
  2. In the SSCGS build, drop the second (duplicate) simplification.
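
To illustrate why the ordering in change 1 matters, here is a toy sketch. It is not the geoBoundaries build code: zones are modelled as plain vertex lists, and `simplify`, `is_valid`, `clean`, and `simplify_all` are hypothetical stand-ins for the real simplification and cleaning steps.

```python
# Toy model only: a zone is a list of (x, y) ring vertices.
# All helpers here are hypothetical stand-ins for the real build steps.

def simplify(ring, step=2):
    """Crude simplification: keep every `step`-th vertex."""
    return ring[::step]


def is_valid(ring):
    """A ring needs at least three distinct vertices to enclose any area."""
    return len(set(ring)) >= 3


def clean(zones):
    """Drop zones whose geometry is invalid."""
    return {name: ring for name, ring in zones.items() if is_valid(ring)}


def simplify_all(zones):
    """Apply the toy simplification to every zone."""
    return {name: simplify(ring) for name, ring in zones.items()}


zones = {
    "large": [(0, 0), (2, 0), (4, 0), (4, 2), (4, 4), (2, 4), (0, 4), (0, 2)],
    "small": [(5, 5), (6, 5), (6, 6), (5, 6)],
}

old_order = clean(simplify_all(zones))  # simplify first, then clean
new_order = simplify_all(clean(zones))  # clean first, then simplify

print(sorted(old_order))  # the small zone collapses and is then dropped
print(sorted(new_order))  # both zones are retained
print(is_valid(new_order["small"]))  # but the small zone is now degenerate
```

In this toy case the new order keeps every zone, at the cost of leaving a degenerate geometry in the output, which mirrors the topology caveat noted in change 1.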

I'll leave this open until the next version - targeting this fix for 3.1.

DanRunfola commented on June 3, 2024

Ok! Just confirming that fixed it. This is now fixed in the dev build (uploading now to geoboundaries.org/data/dev/), and will be rolled into the formal 3.1 release later this month.

Thanks for the report!
