
peterstace / simplefeatures


Simple Features is a pure Go implementation of the OpenGIS Simple Feature Access Specification

License: MIT License

Go 99.64% Dockerfile 0.04% C 0.11% Shell 0.10% Makefile 0.11%
postgis spatial-analysis gis go golang geometry library 2d libgeos opengis

simplefeatures's Introduction

Simple Features


Simple Features is a 2D geometry library that provides Go types that model geometries, as well as algorithms that operate on them.

It's a pure Go implementation of the OpenGIS Consortium's Simple Feature Access Specification (which can be found here). This is the same specification that GEOS, JTS, and PostGIS implement, so the Simple Features API will be familiar to developers who have used those libraries before.


Geometry Types

Type Description
Point Point is a single location in space.
MultiPoint MultiPoint is a collection of points in space.
LineString LineString is a curve defined by linear interpolation between a set of control points.
MultiLineString MultiLineString is a collection of LineStrings.
Polygon Polygon is a planar surface geometry that bounds some area. It may have holes.
MultiPolygon MultiPolygon is a collection of Polygons (with some constraints on how the Polygons interact with each other).
GeometryCollection GeometryCollection is an unconstrained collection of geometries.
Geometry Geometry holds any type of geometry (Point, MultiPoint, LineString, MultiLineString, Polygon, MultiPolygon, or GeometryCollection). It's the type that the Simple Features library uses when it needs to represent geometries in a generic way.
Envelope Envelope is an axis-aligned bounding box typically used to describe the spatial extent of other geometric entities.

Marshalling and Unmarshalling

Simple Features supports the following external geometry representation formats:

Format Example Description
WKT POLYGON((0 0,0 1,1 1,1 0,0 0)) Well Known Text is a human readable format for storing geometries. It's often the lowest common denominator geometry format, and is useful for integration with other GIS applications.
WKB <binary> Well Known Binary is a machine readable format that is efficient for computers to use (both from a processing and storage space perspective). WKB is a good choice for transferring geometries to and from PostGIS and other databases that support geometric types.
GeoJSON {"type":"Polygon","coordinates":[[[0,0],[0,1],[1,1],[1,0],[0,0]]]} GeoJSON represents geometries in a similar way to WKB, but is based on the JSON format. This makes it ideal to use with web APIs or other situations where JSON would normally be used.
TWKB <binary> Tiny Well Known Binary is a multi-purpose compressed binary format for serialising vector geometries into a stream of bytes. It emphasises minimising the size of the serialised representation. It's a good choice when space is at a premium (e.g. for storage within a web token).

Geometry Algorithms

The following algorithms are supported:

Miscellaneous Algorithms Description
Area Finds the area of the geometry (for Polygons and MultiPolygons).
Centroid Finds the centroid of the geometry.
ConvexHull Finds the convex hull of the geometry.
Distance Finds the shortest distance between two geometries.
Envelope Finds the smallest axis-aligned bounding-box that surrounds the geometry.
ExactEquals Determines if two geometries are structurally equal.
Length Finds the length of the geometry (for LineStrings and MultiLineStrings).
PointOnSurface Finds a point that lies inside the geometry.
Relate Calculates the DE-9IM intersection matrix describing the relationship between two geometries.
Simplify Simplifies a geometry using the Ramer–Douglas–Peucker algorithm.
Set Operations Description
Union Joins the parts from two geometries together.
Intersection Finds the parts of two geometries that are in common.
Difference Finds the parts of a geometry that are not also part of another geometry.
SymmetricDifference Finds the parts of two geometries that are not in common.
Named Spatial Predicates Description
Equals Determines if two geometries are topologically equal.
Intersects Determines if two geometries intersect with each other.
Disjoint Determines if two geometries have no common points.
Contains Determines if one geometry contains another.
CoveredBy Determines if one geometry is covered by another.
Covers Determines if one geometry covers another.
Overlaps Determines if one geometry overlaps another.
Touches Determines if one geometry touches another.
Within Determines if one geometry is within another.
Crosses Determines if one geometry crosses another.

GEOS Wrapper

A GEOS CGO wrapper is also provided, giving access to functionality not yet implemented natively in Go. The wrapper is implemented in a separate package, meaning that library users who don't need this additional functionality don't need to expose themselves to CGO.

Examples

The following examples show some common operations (errors are omitted for brevity).

WKT

Encoding and decoding WKT:

// Unmarshal from WKT
input := "POLYGON((0 0,0 1,1 1,1 0,0 0))"
g, _ := geom.UnmarshalWKT(input)

// Marshal to WKT
output := g.AsText()
fmt.Println(output) // Prints: POLYGON((0 0,0 1,1 1,1 0,0 0))

WKB

Encoding and decoding WKB directly:

// Marshal as WKB
coords := geom.Coordinates{XY: geom.XY{1.5, 2.5}}
pt := geom.NewPoint(coords)
wkb := pt.AsBinary()
fmt.Println(wkb) // Prints: [1 1 0 0 0 0 0 0 0 0 0 248 63 0 0 0 0 0 0 4 64]

// Unmarshal from WKB
fromWKB, _ := geom.UnmarshalWKB(wkb)
fmt.Println(fromWKB.AsText()) // POINT(1.5 2.5)
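
To make the byte listing above concrete, the following sketch decodes it by hand using only the standard library. It illustrates the little-endian WKB layout of a 2D point (one byte-order byte, a 4-byte geometry type, then two 8-byte floats) and is not how the library itself parses WKB. The helper name decodeWKBPoint is hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// decodeWKBPoint is a hypothetical helper (not part of simplefeatures). It
// decodes a 21-byte little-endian 2D WKB point: a byte-order byte, a uint32
// geometry type (1 = Point), then X and Y as float64 values.
func decodeWKBPoint(wkb []byte) (x, y float64, err error) {
	if len(wkb) != 21 {
		return 0, 0, errors.New("expected 21 bytes for a 2D point")
	}
	if wkb[0] != 1 {
		return 0, 0, errors.New("only little-endian WKB is handled in this sketch")
	}
	if typ := binary.LittleEndian.Uint32(wkb[1:5]); typ != 1 {
		return 0, 0, fmt.Errorf("unexpected geometry type %d", typ)
	}
	x = math.Float64frombits(binary.LittleEndian.Uint64(wkb[5:13]))
	y = math.Float64frombits(binary.LittleEndian.Uint64(wkb[13:21]))
	return x, y, nil
}

func main() {
	// The same bytes that the WKB example above prints.
	wkb := []byte{1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 248, 63, 0, 0, 0, 0, 0, 0, 4, 64}
	x, y, err := decodeWKBPoint(wkb)
	fmt.Println(x, y, err) // Prints: 1.5 2.5 <nil>
}
```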

Encoding and decoding WKB for integration with PostGIS:

db, _ := sql.Open("postgres", "postgres://...")

db.Exec(`
    CREATE TABLE my_table (
        my_geom geometry(geometry, 4326),
        population double precision
    )`,
)

// Insert our geometry and population data into PostGIS via WKB.
coords := geom.Coordinates{XY: geom.XY{-74.0, 40.7}}
nyc := geom.NewPoint(coords)
db.Exec(`
    INSERT INTO my_table
    (my_geom, population)
    VALUES (ST_GeomFromWKB($1, 4326), $2)`,
    nyc, 8.4e6,
)

// Get the geometry and population data back out of PostGIS via WKB.
var location geom.Geometry
var population float64
db.QueryRow(`
    SELECT ST_AsBinary(my_geom), population
    FROM my_table LIMIT 1`,
).Scan(&location, &population)
fmt.Println(location.AsText(), population) // Prints: POINT(-74 40.7) 8.4e+06

GeoJSON

Encoding and decoding GeoJSON directly:

// Unmarshal geometry from GeoJSON.
raw := `{"type":"Point","coordinates":[-74.0,40.7]}`
var g geom.Geometry
json.NewDecoder(strings.NewReader(raw)).Decode(&g)
fmt.Println(g.AsText()) // Prints: POINT(-74 40.7)

// Marshal back to GeoJSON.
enc := json.NewEncoder(os.Stdout)
enc.Encode(g) // Prints: {"type":"Point","coordinates":[-74,40.7]}

Geometries can also be part of larger structs:

type CityPopulation struct {
    Location   geom.Geometry `json:"loc"`
    Population int           `json:"pop"`
}

// Unmarshal geometry from GeoJSON.
raw := `{"loc":{"type":"Point","coordinates":[-74.0,40.7]},"pop":8400000}`
var v CityPopulation
json.NewDecoder(strings.NewReader(raw)).Decode(&v)
fmt.Println(v.Location.AsText()) // Prints: POINT(-74 40.7)
fmt.Println(v.Population)        // Prints: 8400000

// Marshal back to GeoJSON.
enc := json.NewEncoder(os.Stdout)
enc.Encode(v) // Prints: {"loc":{"type":"Point","coordinates":[-74,40.7]},"pop":8400000}

simplefeatures's People

Contributors

albertteoh, doctorloki, elitegoblin, greg-bell, missinglink, peterstace, pjsoftware, sameeraaperera


simplefeatures's Issues

Implement ST_Area for all types

This is to match the behaviour of PostGIS. It's currently only implemented for Polygon and MultiPolygon. Other types should give 0 (and for GeometryCollection, areas should be summed).
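
A minimal sketch of the behaviour described above, using hypothetical stand-in types rather than the library's own: non-areal types report 0, polygons use the shoelace formula (exterior minus holes), and a collection sums the areas of its children.

```go
package main

import (
	"fmt"
	"math"
)

// XY and the geometry types below are hypothetical stand-ins, not the
// library's own types.
type XY struct{ X, Y float64 }

// Geometry is anything that can report an area.
type Geometry interface{ Area() float64 }

type Point struct{ XY }

// Polygon holds closed rings: the first is the exterior, the rest are holes.
type Polygon struct{ Rings [][]XY }

type GeometryCollection struct{ Geoms []Geometry }

// ringArea applies the shoelace formula to a closed ring.
func ringArea(ring []XY) float64 {
	var sum float64
	for i := 0; i+1 < len(ring); i++ {
		sum += ring[i].X*ring[i+1].Y - ring[i+1].X*ring[i].Y
	}
	return math.Abs(sum) / 2
}

// Area is 0 for a Point, as it would be for any other non-areal type.
func (Point) Area() float64 { return 0 }

// Area is the exterior ring's area minus the area of each hole.
func (p Polygon) Area() float64 {
	var area float64
	for i, r := range p.Rings {
		if i == 0 {
			area += ringArea(r)
		} else {
			area -= ringArea(r)
		}
	}
	return area
}

// Area sums the areas of all child geometries.
func (c GeometryCollection) Area() float64 {
	var total float64
	for _, g := range c.Geoms {
		total += g.Area()
	}
	return total
}

func main() {
	outer := []XY{{0, 0}, {4, 0}, {4, 4}, {0, 4}, {0, 0}}
	hole := []XY{{1, 1}, {2, 1}, {2, 2}, {1, 2}, {1, 1}}
	gc := GeometryCollection{Geoms: []Geometry{
		Point{XY{1, 2}},
		Polygon{Rings: [][]XY{outer, hole}},
	}}
	fmt.Println(gc.Area()) // Prints: 15
}
```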

Length calculation

Should be implemented for:

  • Line
  • LineString
  • LinearRing
  • MultiLineString

Return an error for non-implemented combinations

Some methods are only partially implemented (i.e. not implemented for all pairs of geometry inputs). Currently these panic. E.g. the Intersection method. Instead, they should return an error if they're not implemented. The intention is that the error would be removed once all combinations of inputs are tested.

Lazy Validation

  • Validation currently occurs when geometry types are created.
  • Validation is currently slow (n^2 or worse in some cases - there are other tickets to solve this).
  • Some use cases of the library don't really need validation. E.g. no geometry functionality is used. It's just unmarshalling stuff and then accessing it.

Idea:

  • Make validation 'lazy', and only run it inside methods where it's needed.
  • We may wish to cache the validation state. In which case geometry objects will need to become mutable (so pointer types will need to implement the main interface).
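
One way the idea above could look, as a sketch only: the validation result is cached with sync.Once, which is why the type is used through a pointer. The LineString type, the checkLineString rule and the validations counter are all hypothetical and exist purely for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sync"
)

type XY struct{ X, Y float64 }

// LineString is a hypothetical stand-in. It is used via a pointer so that the
// cached validation result is shared rather than copied.
type LineString struct {
	pts         []XY
	once        sync.Once
	err         error
	validations int // counts how often the check actually ran (demo only)
}

// NewLineString performs no validation up front.
func NewLineString(pts []XY) *LineString { return &LineString{pts: pts} }

// NumPoints needs no validation, so it never triggers the check.
func (ls *LineString) NumPoints() int { return len(ls.pts) }

// Validate runs the check at most once and caches the outcome.
func (ls *LineString) Validate() error {
	ls.once.Do(func() {
		ls.validations++
		ls.err = checkLineString(ls.pts)
	})
	return ls.err
}

// checkLineString stands in for the real rules: it only requires at least two
// distinct points.
func checkLineString(pts []XY) error {
	for _, p := range pts {
		if p != pts[0] {
			return nil
		}
	}
	return errors.New("line string needs at least two distinct points")
}

// Length depends on validity, so it validates lazily before computing.
func (ls *LineString) Length() (float64, error) {
	if err := ls.Validate(); err != nil {
		return 0, err
	}
	var sum float64
	for i := 0; i+1 < len(ls.pts); i++ {
		sum += math.Hypot(ls.pts[i+1].X-ls.pts[i].X, ls.pts[i+1].Y-ls.pts[i].Y)
	}
	return sum, nil
}

func main() {
	ls := NewLineString([]XY{{0, 0}, {3, 4}})
	fmt.Println(ls.NumPoints(), ls.validations) // Prints: 2 0
	length, err := ls.Length()
	fmt.Println(length, err, ls.validations) // Prints: 5 <nil> 1
}
```

The trade-off is that every method relying on validity has to return an error (or otherwise handle the invalid case), since construction no longer guarantees it.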

LineString/LineString intersection

It currently uses a pairwise comparison on each line segment within the two line strings. We should be able to use a sweep line algorithm instead.

Configurable serialisation type

AnyGeometry currently has WKT hardcoded in both the scanner and valuer. It should be able to be set up so that the valuer and scanner can be used with WKT or WKB (or any combination).

Possible approaches are flags in the object itself. Or alternatively could have different types for each serialisation type.

Add Coordinates methods

Each concrete type should get a niladic Coordinates method that returns either Coordinates, []Coordinates, [][]Coordinates, or [][][]Coordinates depending on the type.

Better Line/Line intersection

As part of the convex hull work, we now have an 'orientation' function that finds the orientation (ccw vs cw vs collinear) for a sequence of 3 points.

That could be used to give a simpler line-to-line intersection test, e.g. similar to the approach used here. The current approach to line/line intersection is fairly ad hoc, with lots of edge cases.
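
As an illustration of the orientation-based idea, here is a sketch of the textbook segment intersection test built on an orientation function. It is not the library's implementation; all names are hypothetical, and plain float64 arithmetic is used, so near-degenerate inputs may be misclassified.

```go
package main

import (
	"fmt"
	"math"
)

type XY struct{ X, Y float64 }

type orientation int

const (
	collinear orientation = iota
	counterClockwise
	clockwise
)

// orient classifies the turn made when travelling p -> q -> r using the sign
// of the 2D cross product.
func orient(p, q, r XY) orientation {
	cross := (q.X-p.X)*(r.Y-p.Y) - (q.Y-p.Y)*(r.X-p.X)
	switch {
	case cross > 0:
		return counterClockwise
	case cross < 0:
		return clockwise
	default:
		return collinear
	}
}

// onSegment reports whether r lies within the bounding box of segment pq. It
// is only meaningful when p, q and r are already known to be collinear.
func onSegment(p, q, r XY) bool {
	return math.Min(p.X, q.X) <= r.X && r.X <= math.Max(p.X, q.X) &&
		math.Min(p.Y, q.Y) <= r.Y && r.Y <= math.Max(p.Y, q.Y)
}

// segmentsIntersect reports whether segment ab shares any point with cd.
func segmentsIntersect(a, b, c, d XY) bool {
	o1, o2 := orient(a, b, c), orient(a, b, d)
	o3, o4 := orient(c, d, a), orient(c, d, b)
	// General case: each segment's endpoints lie on different sides of the other.
	if o1 != o2 && o3 != o4 {
		return true
	}
	// Collinear cases: an endpoint of one segment lies on the other segment.
	return (o1 == collinear && onSegment(a, b, c)) ||
		(o2 == collinear && onSegment(a, b, d)) ||
		(o3 == collinear && onSegment(c, d, a)) ||
		(o4 == collinear && onSegment(c, d, b))
}

func main() {
	fmt.Println(segmentsIntersect(XY{0, 0}, XY{2, 2}, XY{0, 2}, XY{2, 0})) // Prints: true
	fmt.Println(segmentsIntersect(XY{0, 0}, XY{1, 0}, XY{0, 1}, XY{1, 1})) // Prints: false
}
```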

PostGIS comparison tests

We want a way to automatically check the behaviour of the library against PostGIS behaviour. If there are any bugs detected, then we can use the generated test cases to add new unit tests (and fixes). There are a few steps to do this:

  • Parse the existing library for strings that can be used as WKT/WKB/GeoJSON. This will form the initial test corpus.
  • Create additional Geometries for the test corpus randomly or via enumeration.
  • Write test harnesses that use the test corpus to check the behaviour of the library against the behaviour of PostGIS.

It may make sense to have this as a separate package so that it's not run with the existing tests (especially if the corpus takes a long time to run through).

GeoJSON FeatureCollections

  • Add types to represent GeoJSON Features and GeoJSON FeatureCollections
  • The types should be able to be marshaled and unmarshaled.

IsClosed calculation

The method signature should be IsClosed() bool (same as the existing LineString).

Should be implemented for all concrete types:

Type | Empty Behaviour | Non-empty Behaviour | Notes
Point | false | true |
LineString | false | true if and only if the start and endpoints are the same | Already implemented
Polygon | false | Delegate to the exterior ring |
MultiPoint | false | true if and only if all child Points are closed (delegate) |
MultiLineString | false | true if and only if all child LineStrings are closed (delegate) |
MultiPolygon | false | true if and only if all child Polygons are closed (delegate) |
GeometryCollection | false | true if and only if all child geometries are closed (delegate) |

It should also be implemented for the Geometry type, and should just delegate to its enclosed geometry.

We will also want to add PostGIS and libgeos reference implementation checks. If the behaviour between the two reference implementations differ, then we will need to decide on the behaviour that we want to follow.

Nested holes not checked in polygon validation

The following polygon is invalid:

postgres=# SELECT ST_IsValid(ST_GeomFromText(
'POLYGON(
    (0 0,5 0,5 5,0 5,0 0),
    (1 1,4 1,4 4,1 4,1 1),
    (2 2,3 2,3 3,2 3,2 2)
)'));
NOTICE:  Holes are nested at or near point 2 2
 st_isvalid
------------
 f
(1 row)

But isn't reported as invalid by simplefeatures.

Naming Overhaul

The naming of most construction-related functions is very verbose, often with many different verbose variants depending on input arguments.

Propose renaming using the following convention:

New{{Resource}}{{InputCode}}

Where {{Resource}} is Line, Point, Coordinate, Scalar etc. And {{InputCode}} is F (float64), C (coordinates), XY (XY type), or empty indicates the 'natural' input type (e.g. NewMultiLineString would accept []LineString).

Remove IsValid

Background Context

When multiple geometries are loaded (either as multiple PostGIS rows or from multiple places in a GeoJSON object), any of them being invalid will cause the entire batch to be rejected from being loaded. This is because geometry constructors return an error when invalid geometries are loaded, and the most common batch logic is to fail the whole batch if a single element fails to load.

IsValid was added to address this problem. The idea being that the Geometries could be loaded with validation disabled, and then IsValid called on each element.

Problem

There is a major problem with that approach that makes it less than ideal:

  • It causes geometries to exist that are invalid. The methods on geometries all assume an implicit invariant that they are valid. So they may panic (or worse -- infinite loop consuming all CPU or memory resources).

Solution

I'm proposing to remove IsValid, and update the documentation to recommend the following approach instead:

  • Use json.RawMessage in the JSON unmarshal target type for Geometries. Then call geom.UnmarshalGEOJSON(field) manually on each geometry.

  • Use []byte as the target field when scanning geometry columns. Then call geom.UnmarshalWKB(field) manually on each geometry.

  • If any geometries are found to be invalid, then the user of the library has the option of skipping over that geometry and trying the next one.
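
The JSON half of this recommendation could be sketched as follows. To keep the example standard-library only, the hypothetical parsePoint function stands in for the library's validating GeoJSON unmarshal call; the record type and its field names are also assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// point and parsePoint are hypothetical stand-ins for a validating geometry
// unmarshal function. They only accept GeoJSON Points with two coordinates.
type point struct {
	Type        string    `json:"type"`
	Coordinates []float64 `json:"coordinates"`
}

func parsePoint(raw json.RawMessage) (point, error) {
	var p point
	if err := json.Unmarshal(raw, &p); err != nil {
		return point{}, err
	}
	if p.Type != "Point" || len(p.Coordinates) != 2 {
		return point{}, errors.New("invalid point geometry")
	}
	return p, nil
}

// record defers geometry parsing by capturing the raw JSON bytes.
type record struct {
	Name string          `json:"name"`
	Geom json.RawMessage `json:"geom"`
}

// loadValid decodes the batch, then parses each geometry individually so that
// one invalid geometry is skipped instead of failing the whole batch.
func loadValid(data []byte) (names []string, skipped int, err error) {
	var recs []record
	if err := json.Unmarshal(data, &recs); err != nil {
		return nil, 0, err
	}
	for _, r := range recs {
		if _, err := parsePoint(r.Geom); err != nil {
			skipped++
			continue
		}
		names = append(names, r.Name)
	}
	return names, skipped, nil
}

func main() {
	data := []byte(`[
		{"name":"a","geom":{"type":"Point","coordinates":[1,2]}},
		{"name":"b","geom":{"type":"Point","coordinates":[1]}},
		{"name":"c","geom":{"type":"Point","coordinates":[3,4]}}
	]`)
	names, skipped, err := loadValid(data)
	fmt.Println(names, skipped, err) // Prints: [a c] 1 <nil>
}
```

The same shape applies to database rows: scan into []byte, then unmarshal each value separately.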

EqualsExact (non-spatial) method

It would be useful for the Geometry interface to have an Equals method that just checks if the representation is exactly the same (similar to the behaviour of reflect.DeepEqual).

It would essentially just check that the types are exactly the same, and then iterate over all coordinates comparing them.

Add Constructors that Use XY instead of C

It would be useful to have the following XY variants of existing C constructors:

  • NewLineXY
  • NewLineStringXY
  • NewMultiLineStringXY
  • NewMultiPointXY
  • NewPolygonXY

The following types don't have C constructors at all:

  • MultiPolygon

This should have constructors both in the C and XY variants.

Must for checking constructors

Some constructors panic if their inputs are bad (e.g. a constructor taking a float will panic if the input is NaN or Inf). These sorts of constructors should be prefixed with Must and have the behaviour clearly documented. A non-panicking error-returning variant should also exist, without the Must prefix.

Remove Line type

The Line type is just a special case of the LineString type, however it is completely re-implemented.

We'll need something to replace the Line type, because it is used in some of the algorithms. However this doesn't need to complicate the API by being exported.

Improve LineString storage efficiency

The LineString type currently stores a list of Lines. In effect, this causes the memory required for a LineString to be double compared to the approach of just storing the control points.

We should just store the control points instead.

Some particular gotchas to be careful of:

  • Much of the library internally reads from the lines slice directly. This isn't ideal -- the library should use the external interface of LineString just as any other client would (unless there is a good reason not to).
  • LineStrings can contain adjacent coincident control points. So the naive approach of constructing a Line from adjacent control points will not always work.
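
A sketch of how segments could be derived on demand from stored control points, assuming hypothetical Line and LineString types. It skips adjacent coincident control points, which addresses the second gotcha above.

```go
package main

import "fmt"

type XY struct{ X, Y float64 }

// Line is a hypothetical segment between two distinct points.
type Line struct{ A, B XY }

// LineString stores only its control points in this sketch.
type LineString struct{ pts []XY }

// Lines derives the segments on demand. Adjacent coincident control points
// are skipped because they would otherwise yield zero-length segments.
func (ls LineString) Lines() []Line {
	var lines []Line
	for i := 0; i+1 < len(ls.pts); i++ {
		if ls.pts[i] == ls.pts[i+1] {
			continue
		}
		lines = append(lines, Line{ls.pts[i], ls.pts[i+1]})
	}
	return lines
}

func main() {
	ls := LineString{pts: []XY{{0, 0}, {1, 0}, {1, 0}, {1, 1}}}
	fmt.Println(len(ls.pts), len(ls.Lines())) // Prints: 4 2
}
```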

Convex Hull failure case

The current convex hull implementation doesn't work correctly for the following (randomly generated) input:

MULTIPOINT(
(0.5326521145582077 0.5485877685877577),
(0.3621088142676205 0.5659955147732555),
(0.3858468357723894 0.3782674382646396),
(0.42894512378502714 0.46383436182460924),
(0.47475298844919056 0.3891140656169353),
(0.5068343483791182 0.44323186954969124),
(0.5704711402594045 0.572169994958013),
(0.3729267858810107 0.6136299003606884),
(0.6483512013657207 0.6362081037667116),
(0.417255810423433 0.44728479764485624)
)
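
For cross-checking, the input above can be run through an independent hull algorithm. The sketch below uses Andrew's monotone chain, which is not necessarily the algorithm the library uses. All names are hypothetical and duplicate points are not handled specially. For this input it yields a hull with 5 vertices.

```go
package main

import (
	"fmt"
	"sort"
)

type XY struct{ X, Y float64 }

// issuePoints is the MULTIPOINT from the report above.
var issuePoints = []XY{
	{0.5326521145582077, 0.5485877685877577},
	{0.3621088142676205, 0.5659955147732555},
	{0.3858468357723894, 0.3782674382646396},
	{0.42894512378502714, 0.46383436182460924},
	{0.47475298844919056, 0.3891140656169353},
	{0.5068343483791182, 0.44323186954969124},
	{0.5704711402594045, 0.572169994958013},
	{0.3729267858810107, 0.6136299003606884},
	{0.6483512013657207, 0.6362081037667116},
	{0.417255810423433, 0.44728479764485624},
}

// cross is positive when o -> a -> b makes a counter-clockwise turn.
func cross(o, a, b XY) float64 {
	return (a.X-o.X)*(b.Y-o.Y) - (a.Y-o.Y)*(b.X-o.X)
}

// convexHull returns the hull vertices in counter-clockwise order, without
// repeating the first vertex (Andrew's monotone chain).
func convexHull(pts []XY) []XY {
	sorted := append([]XY(nil), pts...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].X != sorted[j].X {
			return sorted[i].X < sorted[j].X
		}
		return sorted[i].Y < sorted[j].Y
	})
	if len(sorted) < 3 {
		return sorted
	}
	var hull []XY
	// Lower hull: sweep left to right, dropping points that make a right turn.
	for _, p := range sorted {
		for len(hull) >= 2 && cross(hull[len(hull)-2], hull[len(hull)-1], p) <= 0 {
			hull = hull[:len(hull)-1]
		}
		hull = append(hull, p)
	}
	// Upper hull: sweep right to left, never popping into the lower hull.
	lower := len(hull) + 1
	for i := len(sorted) - 2; i >= 0; i-- {
		p := sorted[i]
		for len(hull) >= lower && cross(hull[len(hull)-2], hull[len(hull)-1], p) <= 0 {
			hull = hull[:len(hull)-1]
		}
		hull = append(hull, p)
	}
	return hull[:len(hull)-1] // the last point repeats the first
}

func main() {
	fmt.Println(len(convexHull(issuePoints))) // Prints: 5
}
```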

OmitInvalid Constructor Option

It would be convenient if there were an option to TransformXY that would allow any subresults to be omitted if the transform resulted in them having invalid geometry.

Remove LinearRing type

The LinearRing type is never used on its own, only as part of the internal structure of Polygons. We could instead make the polygons out of line strings and check that they're closed, and remove linear rings entirely. This will remove all of the special-casing everywhere for linear rings.

Implement Length for all types

Length is currently only implemented for Line, LineString, and MultiLineString. To match PostGIS behaviour, it can also be implemented (in the Geometry type) for other geometries (which should return 0).

Additional Envelope Methods

Additional methods are:

[ ] Center() XY
[ ] Covers(Envelope) bool
[x] Intersects(Envelope) bool
[ ] Area() float64
[ ] Width() float64
[ ] Height() float64
[ ] ExpandBy(x float64, y float64) Envelope
[ ] Distance(Envelope) float64

Also we should rename the Union, Extend and IntersectsPoint methods to the following more standard names (which better match the terminology used by JTS and GEOS):

[ ] ExpandToIncludePoint(XY) Envelope
[ ] ExpandToIncludeEnvelope(Envelope) Envelope
[ ] Contains(XY) bool
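
A sketch of how several of the listed methods might be implemented, assuming a hypothetical Envelope that stores its minimum and maximum corners. Empty envelopes and negative ExpandBy amounts are deliberately ignored here.

```go
package main

import (
	"fmt"
	"math"
)

type XY struct{ X, Y float64 }

// Envelope is a hypothetical non-empty axis-aligned box with Min <= Max.
type Envelope struct{ Min, Max XY }

func (e Envelope) Center() XY {
	return XY{(e.Min.X + e.Max.X) / 2, (e.Min.Y + e.Max.Y) / 2}
}

func (e Envelope) Width() float64  { return e.Max.X - e.Min.X }
func (e Envelope) Height() float64 { return e.Max.Y - e.Min.Y }
func (e Envelope) Area() float64   { return e.Width() * e.Height() }

// Covers reports whether o lies entirely within e (boundaries included).
func (e Envelope) Covers(o Envelope) bool {
	return e.Min.X <= o.Min.X && e.Min.Y <= o.Min.Y &&
		o.Max.X <= e.Max.X && o.Max.Y <= e.Max.Y
}

// ExpandBy grows the envelope by x on the left and right, and by y on the
// bottom and top.
func (e Envelope) ExpandBy(x, y float64) Envelope {
	return Envelope{XY{e.Min.X - x, e.Min.Y - y}, XY{e.Max.X + x, e.Max.Y + y}}
}

// Distance is the shortest distance between the two boxes, or 0 if they
// touch or overlap.
func (e Envelope) Distance(o Envelope) float64 {
	dx := math.Max(0, math.Max(o.Min.X-e.Max.X, e.Min.X-o.Max.X))
	dy := math.Max(0, math.Max(o.Min.Y-e.Max.Y, e.Min.Y-o.Max.Y))
	return math.Hypot(dx, dy)
}

func main() {
	e := Envelope{XY{0, 0}, XY{4, 2}}
	far := Envelope{XY{7, 6}, XY{8, 7}}
	fmt.Println(e.Center(), e.Area(), e.Distance(far)) // Prints: {2 1} 8 5
}
```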

PointOnSurface calculation

First step is to experiment with PostGIS to determine which types of geometry PointOnSurface is defined for. After that, we have to work out the algorithm that is used (ideally we match the exact same algorithm as PostGIS and libgeos). The algorithm for Polygons can be found here: https://gis.stackexchange.com/questions/76498/how-is-st-pointonsurface-calculated

The signature of the function should be PointOnSurface() Point. If it's defined for all (or almost all) concrete geometry types, then we'll want to have Geometry implement it as well.

We will also want libgeos and PostGIS reference implementation checks added.

Constructor option to disable validation

Sometimes it's desirable to load invalid geometries, however this is prevented by the validation checks in the constructor functions.

We can add an option which will disable validations.

Boundary() should give back specific geometry types

Where possible, the boundary method on each geometry type should give back a specific geometry rather than the generic Geometry.

E.g. The Boundary() of a Polygon should be a MultiLineString rather than a Geometry.

Improve Validation Performance

Validation is currently slow for some types. It's important for validation to be fast, because it is a common operation (performed by default when any geometry is created). We want to:

  • Add performance tests for each type that performs validation.
  • Identify any types where validation has a sub-optimal complexity (either time complexity or space complexity).
  • Implement algorithms that have optimal complexity.

Once we are using the right algorithms, the validation can be profiled and constants reduced.

Simplify Constructor Options

There are currently 2 constructor options:

  • Disable All Validations
  • Disable Expensive Validations

The logic for working out which validations to perform can get a bit tricky, and the semantics around what counts as an "expensive" validation also feel fuzzy.


Steps to implement this ticket:

  • Remove the DisableExpensiveValidations constructor option.
  • Remove the skipExpensiveValidations field of the ctorOptionSet type.
  • Remove the doExpensiveValidations and doCheapValidations methods on the ctorOptionSet type.
  • For each concrete geometry, skip all validation steps by checking the skipAllValidations flag on the ctorOptionSet struct (maybe rename the flag to skipValidations).
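
The end state after these steps could look roughly like the sketch below. The ctorOptionSet type and the skipValidations flag come from the steps above; the functional-option shape of ConstructorOption, the LineString type and the validation rule are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

type XY struct{ X, Y float64 }

// LineString is a hypothetical stand-in for a concrete geometry.
type LineString struct{ pts []XY }

// ctorOptionSet keeps a single flag once the "expensive" distinction is gone.
type ctorOptionSet struct{ skipValidations bool }

// ConstructorOption mutates the option set. This functional-option shape is
// an assumption.
type ConstructorOption func(*ctorOptionSet)

// DisableAllValidations is the only remaining option.
func DisableAllValidations(o *ctorOptionSet) { o.skipValidations = true }

func newOptionSet(opts []ConstructorOption) ctorOptionSet {
	var set ctorOptionSet
	for _, opt := range opts {
		opt(&set)
	}
	return set
}

// NewLineString runs every validation step unless the flag is set.
func NewLineString(pts []XY, opts ...ConstructorOption) (LineString, error) {
	if !newOptionSet(opts).skipValidations {
		if len(pts) < 2 {
			return LineString{}, errors.New("line string needs at least two points")
		}
	}
	return LineString{pts: pts}, nil
}

func main() {
	_, err := NewLineString([]XY{{0, 0}})
	fmt.Println(err != nil) // Prints: true
	_, err = NewLineString([]XY{{0, 0}}, DisableAllValidations)
	fmt.Println(err != nil) // Prints: false
}
```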

Remove EmptySet type

EmptySet can currently be Point, LineString, or Polygon. These should just be part of their respective types.

This is better in line with the behaviour of PostGIS, and allows a nicer interface in some cases (e.g. when returning a geometry and a flag, could instead just return a flag).

TransformXY method

Add a new TransformXY(func(XY) XY, ...ConstructorOption) method on the Geometry interface.

Later on when we support measure and 3D, we can have similar TransformXYZ, TransformXYM and TransformXYZM methods.
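
A minimal sketch of what TransformXY could look like on one concrete type. The LineString type and its validation rule are hypothetical, and the ...ConstructorOption parameter from the proposed signature is omitted for brevity. The transformed points are passed back through the constructor, because a transform can produce an invalid geometry.

```go
package main

import (
	"errors"
	"fmt"
)

type XY struct{ X, Y float64 }

// LineString is a hypothetical stand-in for a concrete geometry.
type LineString struct{ pts []XY }

// NewLineString validates that there are at least two distinct points.
func NewLineString(pts []XY) (LineString, error) {
	for _, p := range pts {
		if p != pts[0] {
			return LineString{pts: pts}, nil
		}
	}
	return LineString{}, errors.New("line string needs at least two distinct points")
}

// TransformXY applies fn to every control point and re-validates the result.
func (ls LineString) TransformXY(fn func(XY) XY) (LineString, error) {
	out := make([]XY, len(ls.pts))
	for i, p := range ls.pts {
		out[i] = fn(p)
	}
	return NewLineString(out)
}

func main() {
	ls, _ := NewLineString([]XY{{0, 0}, {1, 2}})
	scaled, err := ls.TransformXY(func(p XY) XY { return XY{p.X * 2, p.Y * 2} })
	fmt.Println(scaled.pts, err) // Prints: [{0 0} {2 4}] <nil>
}
```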

Improve LineString's IsSimple method

It currently uses an O(n^2) algorithm to detect self intersection (pairwise line segment comparison). We should be able to use a sweep line style algorithm instead.

Overhaul error system for easier programmatic interrogation

In some scenarios we would like to know whether an error is specific to a geometry (for example, an invalid geometry). However, there is currently no way to tell whether an error comes from a geometry-specific problem. For instance, with the struct below, I am unable to tell whether the error from json.Unmarshal() is geometry-specific.

// Fields are exported so that encoding/json can populate them.
type Place struct {
    Name     string           `json:"name"`
    Geometry geom.AnyGeometry `json:"geometry"`
}

For this reason, we have to work around it by using separate structs:

type Place struct {
    Name     string           `json:"name"`
    Geometry geom.AnyGeometry `json:"-"`
}

type ValidGeom struct {
    Geometry geom.AnyGeometry `json:"geometry"`
}

// First pass: decode everything except the geometry.
var place Place
if err := json.Unmarshal(data, &place); err != nil {
    return fmt.Errorf("name is invalid: %w", err)
}

// Second pass: decode only the geometry, so any error is geometry-specific.
var validGeom ValidGeom
if err := json.Unmarshal(data, &validGeom); err != nil {
    return fmt.Errorf("geom is invalid: %w", err)
}
