arctern-io / arctern
License: Apache License 2.0
I get the following error while running the spark tests: run_st_transform(spark_session)
file path: GIS/spark/pyspark/example/gis/spark_udf_ex.py
ERROR 1: PROJ: proj_create_from_database: Open of /home/liangliu/anaconda3/envs/zgis_dev/share/proj failed
terminate called after throwing an instance of 'std::runtime_error*'
Describe the solution you'd like
Add build environment Dockerfile
in postgis:
select st_npoints(st_geomfromtext('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))'));
select st_npoints(st_geomfromtext('POLYGON ((1 2, 3 4, 5 6, 1 2))'));
select st_npoints(st_geomfromtext('POLYGON ((1 1, 3 1, 3 3, 1 3, 1 1))'));
select st_npoints(st_geomfromtext('MULTIPOINT(0 0, 7 7)'));
select st_npoints(st_geomfromtext('GEOMETRYCOLLECTION(POINT(1 1), LINESTRING( 1 1 , 2 2, 3 3))'));
select st_npoints(st_geomfromtext('POINT EMPTY'));
results
5
4
5
2
4
0
in arctern:
results
0
0
0
0
0
1
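For reference, the expected PostGIS counts can be reproduced with a naive coordinate-pair counter. `npoints` below is a hypothetical helper written only for this comparison (it is neither arctern nor PostGIS code) and it only handles simple 2D WKT:

```python
import re

# Matches one "x y" coordinate pair in a 2D WKT string.
_COORD_PAIR = re.compile(r'-?\d+(?:\.\d+)?\s+-?\d+(?:\.\d+)?')

def npoints(wkt: str) -> int:
    """Count coordinate pairs in a simple 2D WKT string (illustration only)."""
    return len(_COORD_PAIR.findall(wkt))
```

On the inputs above this reproduces the PostGIS results (5, 4, 5, 2, 4, 0), so the arctern outputs look like the parser never reaches the coordinate lists.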
I found some differences in arctern's parsing rules for WKT strings: some input that postgis rejects with an error is accepted by arctern.
I tested the ST_IsValid function in arctern:
def run_st_tmp(spark):
    register_funcs(spark)
    input = []
    input.extend([('POINT (1 8 2 4 )kdjff',)])
    input.extend([('POLYGON ((1 1,1 2,2 2,2 1,1 1)),((dkjfkjd0 0,1 -1,3 4,-2 3,0 0))',)])
    df = spark.createDataFrame(data=input, schema=['geos']).cache()
    df.createOrReplaceTempView("t1")
    spark.sql("select ST_IsValid_UDF(geos) from t1").show(100, 0)
I got the following results:
+--------------------+
|ST_IsValid_UDF(geos)|
+--------------------+
| true |
| true |
+--------------------+
Our ST_IsValid implementation first calls OGRGeometryFactory::createFromWkt, but createFromWkt's input checking is weak, so it produces incorrect results.
I also looked at the implementations of the other functions. None of them performs an IsValid check before calling the gdal API. The gdal API documentation says:
`Geometry validity is not checked. In case you are unsure of the validity of the input geometries, call IsValid() before, otherwise the result might be wrong.`
So here are two suggestions:
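As one concrete hardening step, a stricter parse could reject WKT with unconsumed trailing text before it ever reaches GDAL. The sketch below is a hypothetical pre-check, not arctern code; it only catches trailing garbage and unbalanced parentheses, not every malformed input (for example 'POINT (1 8 2 4 )' without the suffix would still pass):

```python
def wkt_fully_consumed(wkt: str) -> bool:
    """True only if the WKT body is one balanced paren group with nothing
    but whitespace after it, or a '<TYPE> EMPTY' literal."""
    s = wkt.strip()
    i = s.find('(')
    if i < 0:
        return s.upper().endswith('EMPTY')
    depth = 0
    for j in range(i, len(s)):
        if s[j] == '(':
            depth += 1
        elif s[j] == ')':
            depth -= 1
            if depth == 0:
                # Reject anything left over after the balanced group.
                return s[j + 1:].strip() == ''
    return False  # unbalanced parentheses
```

This check alone would catch both test inputs above: the first has trailing 'kdjff', and the second has a second paren group dangling after the first balanced one.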
postgis
SELECT ST_isSimple('POLYGON ((1 2, 3 4, 5 6, 1 2))'::geometry);
result:
t (means true)
In arctern, the result is false.
The following warnings are printed when I run the unit tests:
[ RUN ] geometry_test.test_ST_Area
Warning 1: OGR_G_Area() called against non-surface geometry type.
Warning 1: OGR_G_Area() called against non-surface geometry type.
Warning 1: OGR_G_Area() called against non-surface geometry type.
[ OK ] geometry_test.test_ST_Area (1 ms)
[ RUN ] geometry_test.test_ST_Centroid
[ OK ] geometry_test.test_ST_Centroid (0 ms)
[ RUN ] geometry_test.test_ST_Length
Warning 1: OGR_G_Length() called against a non-curve geometry type.
Warning 1: OGR_G_Length() called against a non-curve geometry type.
Warning 1: OGR_G_Length() called against a non-curve geometry type.
Warning 1: OGR_G_Length() called against a non-curve geometry type.
Warning 1: OGR_G_Length() called against a non-curve geometry type.
Warning 1: OGR_G_Length() called against a non-curve geometry type.
[ OK ] geometry_test.test_ST_Length (0 ms)
So I suggest checking the geometry type before calling ST_Area and ST_Length.
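A sketch of such a guard, dispatching on the WKT type keyword: `wkt_type`, `is_surface`, and `is_curve` are hypothetical helpers, and the two type sets are assumptions that follow the "non-surface" / "non-curve" wording of the warnings above:

```python
# Types for which OGR_G_Area / OGR_G_Length are meaningful (assumed sets).
SURFACE_TYPES = {'POLYGON', 'MULTIPOLYGON', 'CURVEPOLYGON', 'MULTISURFACE'}
CURVE_TYPES = {'LINESTRING', 'MULTILINESTRING', 'CIRCULARSTRING',
               'COMPOUNDCURVE', 'MULTICURVE'}

def wkt_type(wkt: str) -> str:
    """Extract the leading geometry-type keyword from a WKT string."""
    return wkt.strip().split('(')[0].split()[0].upper()

def is_surface(wkt: str) -> bool:
    return wkt_type(wkt) in SURFACE_TYPES

def is_curve(wkt: str) -> bool:
    return wkt_type(wkt) in CURVE_TYPES
```

With a check like this, ST_Area could return 0 (or NULL) for a linestring without ever triggering the OGR warning, and likewise for ST_Length on surfaces.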
ST_Union_Aggr_UDF throws an exception when a multipolygon is combined with other geometries.
arctern test code :
def run_st_union(spark):
    register_funcs(spark)
    test_data1 = []
    test_data1.extend([('MULTIPOINT (1 1,3 4)',)])
    test_data1.extend([('LINESTRING (1 1,1 2,2 3)',)])
    test_data1.extend([('MULTILINESTRING ((1 1,1 2),(2 4,1 9,1 8))',)])
    test_data1.extend([('MULTILINESTRING ((1 1,3 4))',)])
    test_data1.extend([('POLYGON ((1 1,1 2,2 2,2 1,1 1))',)])
    test_data1.extend([('MULTIPOLYGON ( ((1 1,1 2,2 2,2 1,1 1)),((0 0,1 -1,3 4,-2 3,0 0)) )',)])  # topologyEX
    union_aggr_df1 = spark.createDataFrame(data=test_data1, schema=['geos']).cache()
    union_aggr_df1.createOrReplaceTempView("union_aggr1")
    rs = spark.sql("select ST_Union_Aggr_UDF(geos) from union_aggr1").show(100, 0)
postgis sql :
drop table if exists test_union;
create table test_union (geos geometry);
insert into test_union values
('MULTIPOINT (1 1,3 4)'),
('LINESTRING (1 1,1 2,2 3)'),
('MULTILINESTRING ((1 1,1 2),(2 4,1 9,1 8))'),
('MULTILINESTRING ((1 1,3 4))'),
('POLYGON ((1 1,1 2,2 2,2 1,1 1))'),
('MULTIPOLYGON (((1 1,1 2,2 2,2 1,1 1)),((0 0,1 -1,3 4,-2 3,0 0)) )')
;
select st_astext(st_union(geos)) from test_union;
arctern result :
ERROR 1: TopologyException: Input geom 1 is invalid: Self-intersection at or near point 1.8 1 at 1.8 1
ERROR 10: Pointer 'hGeom' is NULL in 'OGR_G_ExportToWkt'.
terminate called after throwing an instance of 'std::runtime_error'
what(): gdal error code = 6
postgis result :
GEOMETRYCOLLECTION(LINESTRING(2 4,1 9,1 8),POLYGON((2 1.5,2 1,1.8 1,1 -1,0 0,-2 3,3 4,2 1.5)))
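The crash-avoiding control flow can be sketched independently of any geometry library. Below, toy set-valued "geometries" stand in for real ones, and `is_valid` / `union` / `repair` are caller-supplied placeholders for the real GEOS/OGR calls; the point is to validate each input up front so one bad geometry yields a clear error (or a repair) instead of a TopologyException aborting the whole fold:

```python
def union_aggr(geoms, is_valid, union, repair=None):
    """Fold geometries into one, validating each input first."""
    acc = None
    for i, g in enumerate(geoms):
        if not is_valid(g):
            if repair is None:
                raise ValueError(f'input geom {i} is invalid')
            g = repair(g)  # e.g. a make-valid / buffer(0) style fix
        acc = g if acc is None else union(acc, g)
    return acc

# Toy usage: geometries are sets of grid cells; empty sets are "invalid".
result = union_aggr([{1, 2}, {2, 3}, {5}],
                    is_valid=lambda s: len(s) > 0,
                    union=lambda a, b: a | b)
```

PostGIS succeeds on the input above because it tolerates (or repairs) the self-intersecting multipolygon during union; arctern passes it straight to the union call and aborts.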
our sql:
select st_envelope_udf(geos) as geos from test_envelope
input:
{"geos": "POLYGON EMPTY"}
{"geos": "LINESTRING EMPTY"}
{"geos": "POINT EMPTY"}
{"geos": "MULTIPOLYGON EMPTY"}
{"geos": "MULTILINESTRING EMPTY"}
{"geos": "MULTIPOINT EMPTY"}
{"geos": "GEOMETRYCOLLECTION EMPTY"}
result:
{"geos":"POINT (0 0)"}
{"geos":"POINT (0 0)"}
{"geos":"POINT (0 0)"}
{"geos":"POINT (0 0)"}
{"geos":"POINT (0 0)"}
{"geos":"POINT (0 0)"}
{"geos":"POINT (0 0)"}
in POSTGIS
sqls:
select st_astext(st_envelope('POLYGON EMPTY'::geometry));
select st_astext(st_envelope('LINESTRING EMPTY'::geometry));
select st_astext(st_envelope('POINT EMPTY'::geometry));
select st_astext(st_envelope('MULTIPOLYGON EMPTY'::geometry));
select st_astext(st_envelope('MULTILINESTRING EMPTY'::geometry));
select st_astext(st_envelope('MULTIPOINT EMPTY'::geometry));
select st_astext(st_envelope('GEOMETRYCOLLECTION EMPTY'::geometry));
result:
POLYGON EMPTY
LINESTRING EMPTY
POINT EMPTY
MULTIPOLYGON EMPTY
MULTILINESTRING EMPTY
MULTIPOINT EMPTY
GEOMETRYCOLLECTION EMPTY
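The PostGIS behaviour above can be mimicked by special-casing empty inputs before computing a bounding box. This is a simplified, hypothetical sketch, not arctern code (real PostGIS additionally degenerates point and vertical/horizontal line envelopes to POINT/LINESTRING results):

```python
import re

def envelope_wkt(wkt: str) -> str:
    """Envelope of an empty geometry is that same empty geometry,
    never POINT (0 0); otherwise return the bounding-box polygon."""
    s = wkt.strip()
    if s.upper().endswith('EMPTY'):
        return s
    coords = re.findall(r'(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)', s)
    xs = [float(x) for x, _ in coords]
    ys = [float(y) for _, y in coords]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    return f'POLYGON (({x0} {y0},{x0} {y1},{x1} {y1},{x1} {y0},{x0} {y0}))'
```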
This code is broken with Arrow 0.15:
error: ‘using element_type = struct arrow::ArrayData’ {aka ‘struct arrow::ArrayData’} has no member named ‘GetValues’
55 | vertices_x_ = (uint32_t*)x_array->data()->GetValues<uint8_t>(1);
I encountered the following problem when compiling with "make -j10":
/GIS/cpp/src/render/utils/my_zlib_compress.h:1:33: fatal error: stb/stb_image_write.h: No such file or directory.
However, compiling passes when using "make".
sql: select st_area_udf(geos) as my_area from test_area
data: {"geos": "LINESTRING (77.29 29.07,77.42 29.26,77.27 29.31,77.29 29.07)"}
result: {"my_area":0.01750000000000007}
expected: 0.0
My guess is that in this case the linestring was treated as a polygon.
sql:
select st_union_aggr_udf(myshape) from (select st_polygonfromenvelope_udf(a,c,b,d) as myshape from polygontable)
polygon.json
In postgis, the distance to an empty geometry is empty:
postgis:
SELECT ST_distance('POINT EMPTY'::geometry,'POINT(1 2)'::geometry);
result:
postgres=# SELECT ST_distance('POINT EMPTY'::geometry,'POINT(1 2)'::geometry);
 st_distance
-------------

(1 row)
In arctern, the result is 0.
Describe the solution you'd like
Add Pytest in Jenkins CI
GIS test :
wkt_arrow_array = {MULTIPOLYGON ( ((0 0, 1 4, 1 0,0 0)), ((0 0,1 0,0 1,0 0)) ) }
zilliz::gis::ST_Buffer(wkt_arrow_array,0)
output : POLYGON ((0 0,0 1,0.2 0.8,1 4,1 0,0 0))
postgis test :
select st_astext(st_buffer('MULTIPOLYGON ( ((0 0, 1 4, 1 0,0 0)), ((0 0,1 0,0 1,0 0)) )'::geometry,0))
output : POLYGON((0 0,0 1,0.2 0.8,1 4,1 0,0 0))
Describe the solution you'd like
Add build stage in Jenkins CI
Location of incorrect documentation
https://github.com/zilliztech/arctern/blob/conda/doc/Build-Conda-Package.md
Suggested fix for documentation
Update build Conda package document
Describe the solution you'd like
Add conda build and upload in Jenkins CI
Describe the solution you'd like
Enable cpplint & clang-format & pylint checks
GIS test :
wkt_arrow_array1 = { POLYGON ((0 0,0 1,1 1,1 0,0 0))}
wkt_arrow_array2 = { MULTIPOLYGON ( ((0 0, 0 2, 2 3,2 0,0 0)) )}
zilliz::gis::ST_Overlaps(wkt_arrow_array1,wkt_arrow_array2)
output : true
postgis test :
select st_overlaps('POLYGON ((0 0,0 1,1 1,1 0,0 0))'::geometry,'MULTIPOLYGON ( ((0 0, 0 2, 2 3,2 0,0 0)) )'::geometry);
output : false
ST_PrecisionReduce is wanted in the current version; check whether it can be done with gdal 3.0.4, and if not, try to implement it with boost.
in postgis:
select st_isvalid('POINT (30)');
select st_isvalid('POINT (,)');
select st_isvalid('POINT (a b)');
select st_isvalid('MULTIPOINT ()');
select st_isvalid('MULTIPOINT (,)');
select st_isvalid('POINT(1 2 3 4 5 6 7)');
select st_isvalid('LINESTRING(1 1)');
select st_isvalid('MULTIPOINT(1 1, 2 2');
All of these return an ERROR when executed in psql; in arctern, all of them return FALSE.
in arctern the results for the following specific data are all false:
select st_equals_udf(left, right) as geos from test_equals
in postgis, the following sqls all return true:
select st_equals('LINESTRING (0 0, 10 10)'::geometry, 'LINESTRING (0 0, 5 5, 10 10)'::geometry);
select st_equals('LINESTRING (10 10, 0 0)'::geometry, 'LINESTRING (0 0, 5 5, 10 10)'::geometry);
select st_equals('LINESTRING(0 0, 1 1)'::geometry, 'LINESTRING(1 1, 0 0)'::geometry);
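What makes these pairs equal in PostGIS is spatial equality, not textual equality: the extra vertex is collinear and the traversal direction is irrelevant. A toy canonicalization for 2D linestrings shows the idea; `_coords` and `canonical_linestring` are hypothetical helpers (for brevity, any collinear interior vertex is treated as redundant, without a betweenness check):

```python
import re

def _coords(wkt):
    """Parse the (x, y) pairs out of a simple 2D WKT string."""
    return [(float(x), float(y))
            for x, y in re.findall(r'(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)', wkt)]

def canonical_linestring(wkt):
    pts = _coords(wkt)
    # Drop interior vertices collinear with their neighbours.
    kept = [pts[0]]
    for i in range(1, len(pts) - 1):
        a, b, c = kept[-1], pts[i], pts[i + 1]
        cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
        if cross != 0:
            kept.append(b)
    kept.append(pts[-1])
    # Direction-insensitive: pick the lexicographically smaller orientation.
    return tuple(min(kept, kept[::-1]))
```

Comparing canonical forms makes all three pairs above compare equal, which suggests arctern is comparing something closer to the raw vertex lists.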
sql:
select st_isvalid_udf(null)
This raises an exception; however, judging from GeoSpark's behavior, no exception should be raised here.
log:
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "/usr/local/bin/spark/python/lib/pyspark.zip/pyspark/worker.py", line 577, in main
eval_type = read_int(infile)
File "/usr/local/bin/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 837, in read_int
raise EOFError
EOFError
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:484)
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:99)
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:49)
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:437)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:726)
at org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anon$1.hasNext(InMemoryRelation.scala:132)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1370)
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1297)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1361)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1185)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:360)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:311)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:127)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:441)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:444)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.UnsupportedOperationException: Unsupported data type: null
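Whatever the binding layer ends up doing, a null guard at the Python UDF boundary would give the GeoSpark-like behaviour. The wrapper below is a hypothetical sketch, not arctern's code (the inner `st_isvalid` body is only a stand-in):

```python
def null_safe(udf):
    """Wrap a scalar geometry UDF so NULL input yields NULL output
    instead of crashing the Python worker."""
    def wrapper(wkt):
        if wkt is None:
            return None  # NULL in, NULL out
        return udf(wkt)
    return wrapper

@null_safe
def st_isvalid(wkt):
    # Stand-in body; the real implementation would parse the WKT.
    return wkt.strip() != ''
```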
I got an exception when running my test code below:
from pyspark.sql import SparkSession
from zilliz_pyspark import register_funcs
def run_st_union(spark):
    test_df = spark.read.json("/xxx/st_union.json").cache()
    test_df.createOrReplaceTempView("st_union")
    register_funcs(spark)
    spark.sql("select ST_Union_Aggr_UDF(geos) from (select ST_PolygonFromEnvelope_UDF(a,c,b,d) as geos from st_union) as foo").show(100, 0)
#main here.
st_union.json looks like:
{"a": 13.9, "c": 82.2, "b": 19.1, "d": 83.4}
{"a": 10.1, "c": 91.9, "b": 19.7, "d": 98.3}
{"a": 16.1, "c": 93.3, "b": 16.6, "d": 94.0}
{"a": 11.0, "c": 88.3, "b": 18.7, "d": 98.2}
{"a": 13.9, "c": 82.2, "b": 19.1, "d": 83.4}
{"a": 12.0, "c": 81.5, "b": 16.2, "d": 90.6}
{"a": 10.4, "c": 87.5, "b": 11.7, "d": 92.2}
{"a": 15.5, "c": 88.7, "b": 18.6, "d": 98.4}
{"a": 14.8, "c": 83.0, "b": 16.9, "d": 85.6}
{"a": 10.8, "c": 83.9, "b": 16.5, "d": 84.4}
{"a": 12.5, "c": 80.8, "b": 14.8, "d": 97.1}
The message is:
ERROR 1: TopologyException: Input geom 0 is invalid: Self-intersection at or near point 14.899999999999999 95.099999999999994 at 14.899999999999999 95.099999999999994
20/02/29 15:43:16 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 3)
postgis test :
sql :
drop table t1;
create table t1 (a real,c real,b real,d real);
insert into t1 values
(10.1,91.9,19.7,98.3),
(16.1,93.3,16.6,94.0),
(11.0,88.3,18.7,98.2),
(13.9,82.2,19.1,83.4),
(12.0,81.5,16.2,90.6),
(10.4,87.5,11.7,92.2),
(15.5,88.7,18.6,98.4),
(14.8,83.0,16.9,85.6),
(10.8,83.9,16.5,84.4),
(12.5,80.8,14.8,97.1)
;
select st_astext(st_union(geo)) from (select st_makeEnvelope(a,c,b,d) as geo from t1) as foo;
result :
POLYGON((16.8999996185303 83.4000015258789,19.1000003814697 83.4000015258789,19.1000003814697 82.1999969482422,16.2000007629395 82.1999969482422,16.2000007629395 81.5,14.8000001907349 81.5,14.8000001907349 80.8000030517578,12.5 80.8000030517578,12.5 81.5,12 81.5,12 83.9000015258789,10.8000001907349 83.9000015258789,10.8000001907349 84.4000015258789,12 84.4000015258789,12 88.3000030517578,11.6999998092651 88.3000030517578,11.6999998092651 87.5,10.3999996185303 87.5,10.3999996185303 91.9000015258789,10.1000003814697 91.9000015258789,10.1000003814697 98.3000030517578,15.5 98.3000030517578,15.5 98.4000015258789,18.6000003814697 98.4000015258789,18.6000003814697 98.3000030517578,19.7000007629395 98.3000030517578,19.7000007629395 91.9000015258789,18.7000007629395 91.9000015258789,18.7000007629395 88.3000030517578,16.2000007629395 88.3000030517578,16.2000007629395 85.5999984741211,16.8999996185303 85.5999984741211,16.8999996185303 83.4000015258789))
Describe the solution you'd like
Make cpplint check the copyright information
Describe the solution you'd like
Add Cpplint & Clang-format & Clang-tidy for GIS
I found that the WKT form does not specify the coordinate system, which means that a WKT string can be converted to a spatial object in any coordinate system. This may be an issue to consider, since arctern's current interfaces are defined in terms of WKT.
I did the following tests to verify the above view:
select st_distance('LINESTRING (11 2,3 4)'::geometry,'POLYGON ((0 0,0 1,3 3,1 0,0 0))'::geometry) ; -- sql1
select st_distance('LINESTRING (11 2,3 4)'::geography,'POLYGON ((0 0,0 1,3 3,1 0,0 0))'::geography) ; -- sql2
select st_distance('LINESTRING (11 2,3 4)'::geography,'POLYGON ((0 0,0 1,3 3,1 0,0 0))'::geometry) ; -- sql3
The results are :
sql1 : 0.970142500145332
sql2 : 107417.14877794
sql3 : 107417.14877794 (just same as sql2)
You can see that the sql1 and sql2 results are different.
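The gap is a unit difference: geometry distance is planar and expressed in degrees, while geography distance is measured on the ellipsoid in meters. A rough reproduction with a spherical-earth haversine (the radius below is an assumption; PostGIS uses a spheroid, so its numbers differ slightly):

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean earth radius; PostGIS uses a spheroid

def planar(p, q):
    """Planar distance between two (lon, lat) pairs -- in degrees."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def haversine_m(p, q):
    """Great-circle distance between two (lon, lat) pairs -- in meters."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

# POINT (1 1) against its nearest point (0 1) on LINESTRING (0 0, 0 1):
deg = planar((1, 1), (0, 1))          # 1.0: the 'geometry' flavour, in degrees
meters = haversine_m((1, 1), (0, 1))  # roughly 1.11e5: the 'geography' flavour
```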
Therefore, I tried to add extra information to the WKT string to avoid possible ambiguity caused by the above phenomenon.
Here is my test SQL statement (I chose POINT and LINESTRING to avoid possible errors):
SELECT st_distance(
ST_Transform(ST_GeomFromText('POINT (1 1)',4326),3857),
ST_Transform(ST_GeomFromText('LINESTRING (0 0,0 1)',4326),3857)
); -- sql4
select st_distance('POINT(1 1)'::geography,'LINESTRING(0 0,0 1)'::geography); -- sql5
The results are :
sql4 : 111319.490793272
sql5 : 111302.64933943
The results of sql4 and sql5 are close to each other. I am not sure whether the small difference is error introduced by the coordinate transformation, but it does verify that adding the extra information avoids the ambiguity described above.
Note: all tests are in postgis.
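One concrete way to carry that extra information is PostGIS's EWKT form, which prefixes the SRID to the WKT (e.g. 'SRID=4326;POINT (1 1)'). A minimal parser sketch, with `parse_ewkt` as a hypothetical helper:

```python
def parse_ewkt(text):
    """Split extended WKT into (srid, wkt); srid is None for plain WKT."""
    s = text.strip()
    if s.upper().startswith('SRID='):
        head, wkt = s.split(';', 1)
        return int(head[5:]), wkt.strip()
    return None, s
```

With the SRID attached, an interface can refuse to mix geometries from different coordinate systems instead of silently interpreting degrees as planar units.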
We need to check if the GIS project conforms to good integration practices
View: https://docs.pytest.org/en/latest/goodpractices.html#goodpractices
In our st_envelope_udf function the result is always 'POINT (0 0)'. Actually, postgis returns a different (type-preserving) result for each empty geometry type.
postgis:
select st_astext(st_envelope('POLYGON EMPTY'::geometry));
result:
st_astext
POLYGON EMPTY
Describe the solution you'd like
Deploy Arctern cluster with Docker compose
Is your feature request related to a problem? Please describe.
When I am using Arctern, I cannot query the current Arctern system information (e.g. the version).
Describe the solution you'd like
Add system information interface for Arctern
GIS test :
wkt_arrow_array = {MULTIPOLYGON ( ((0 0, 1 4, 1 0,0 0)), ((0 0,1 0,0 1,0 0)) )}
zilliz::gis::ST_Centroid(wkt_arrow_array)
output : POINT (0.6 1.13333333333333)
postgis test :
select st_astext(st_centroid('MULTIPOLYGON ( ((0 0, 1 4, 1 0,0 0)), ((0 0,1 0,0 1,0 0)) )'::geometry));
output : POINT(0.6 1.13333333333333)
Describe the solution you'd like
Loading conda environment
My test code :
from osgeo import ogr
p0 =ogr.CreateGeometryFromWkt('POINT (1 8)')
p1 =ogr.CreateGeometryFromWkt('MULTIPOINT (1 1,3 4)')
p2 =ogr.CreateGeometryFromWkt('LINESTRING (1 1,1 2,2 3)')
p3 =ogr.CreateGeometryFromWkt('MULTILINESTRING ((1 1,1 2),(2 4,1 9,1 8))' )
p4 =ogr.CreateGeometryFromWkt('MULTILINESTRING ((1 1,3 4))')
p5 =ogr.CreateGeometryFromWkt('POLYGON ((1 1,1 2,2 2,2 1,1 1))')
p6 =ogr.CreateGeometryFromWkt('POLYGON ((1 1,1 2,2 2,2 1,1 1)),((0 0,1 -1,3 4,-2 3,0 0))')
p7 =ogr.CreateGeometryFromWkt('POLYGON ((1 1,1 2,2 2,2 1,1 1),(0 0,1 -1,3 4,-2 3,0 0))')
p8 =ogr.CreateGeometryFromWkt('MULTIPOLYGON (((1 1,1 2,2 2,2 1,1 1)),((0 0,1 -1,3 4,-2 3,0 0)) )')
p9 =ogr.CreateGeometryFromWkt('POINT EMPTY')
p10=ogr.CreateGeometryFromWkt('LINESTRING EMPTY')
p11=ogr.CreateGeometryFromWkt('POLYGON EMPTY')
p12=ogr.CreateGeometryFromWkt('MULTIPOINT EMPTY')
p13=ogr.CreateGeometryFromWkt('MULTILINESTRING EMPTY')
p14=ogr.CreateGeometryFromWkt('MULTIPOLYGON EMPTY')
p15=ogr.CreateGeometryFromWkt('GEOMETRYCOLLECTION EMPTY')
p16=ogr.CreateGeometryFromWkt('CIRCULARSTRING (0 2, -1 1,0 0, 0.5 0, 1 0, 2 1, 1 2, 0.5 2, 0 2)')
p17=ogr.CreateGeometryFromWkt('COMPOUNDCURVE(CIRCULARSTRING(0 2, -1 1,0 0),(0 0, 0.5 0, 1 0),CIRCULARSTRING( 1 0, 2 1, 1 2),(1 2, 0.5 2, 0 2))')
p18=ogr.CreateGeometryFromWkt('GEOMETRYCOLLECTION ( LINESTRING ( 90 190, 120 190, 50 60, 130 10, 190 50, 160 90, 10 150, 90 190 ), POINT(90 190) ) ')
p19=ogr.CreateGeometryFromWkt('MULTICURVE ((5 5, 3 5, 3 3, 0 3), CIRCULARSTRING (0 0, 0.2 1, 0.5 1.4), COMPOUNDCURVE (CIRCULARSTRING (0 0,1 1,1 0),(1 0,0 1)))')
p20=ogr.CreateGeometryFromWkt('CURVEPOLYGON(CIRCULARSTRING(0 0, 4 0, 4 4, 0 4, 0 0),(1 1, 3 3, 3 1, 1 1))')
p21=ogr.CreateGeometryFromWkt('CURVEPOLYGON(COMPOUNDCURVE(CIRCULARSTRING(0 0,2 0, 2 1, 2 3, 4 3),(4 3, 4 5, 1 4, 0 0)), CIRCULARSTRING(1.7 1, 1.4 0.4, 1.6 0.4, 1.6 0.5, 1.7 1) )')
p22=ogr.CreateGeometryFromWkt('MULTISURFACE(CURVEPOLYGON(CIRCULARSTRING(0 0, 4 0, 4 4, 0 4, 0 0),(1 1, 3 3, 3 1, 1 1)),((10 10, 14 12, 11 10, 10 10),(11 11, 11.5 11, 11 11.5, 11 11)))')
p23=ogr.CreateGeometryFromWkt('MULTISURFACE Z (CURVEPOLYGON Z (CIRCULARSTRING Z (-2 0 0, -1 -1 1, 0 0 2, 1 -1 3, 2 0 4, 0 2 2, -2 0 0), (-1 0 1, 0 0.5 2, 1 0 3, 0 1 3, -1 0 1)), ((7 8 7, 10 10 5, 6 14 3, 4 11 4, 7 8 7)))')
p24=ogr.CreateGeometryFromWkt('MULTISURFACE (CURVEPOLYGON (CIRCULARSTRING (-2 0, -1 -1, 0 0, 1 -1, 2 0, 0 2, -2 0), (-1 0, 0 0.5, 1 0, 0 1, -1 0)), ((7 8, 10 10, 6 14, 4 11, 7 8)))')
p25=ogr.CreateGeometryFromWkt('POLYHEDRALSURFACE (((0 0,0 0,0 1,0 0)),((0 0,0 1,1 0,0 0)),((0 0,1 0,0 0,0 0)),((1 0,0 1,0 0,1 0)))')
p26=ogr.CreateGeometryFromWkt('TRIANGLE ((1 2,4 5,7 8,1 2))')
p27=ogr.CreateGeometryFromWkt('TIN ( ((0 0, 0 0, 0 1, 0 0)), ((0 0, 0 1, 1 1, 0 0)) )')
isValid0 =p0.IsValid()
isValid1 =p1.IsValid()
isValid2 =p2.IsValid()
isValid3 =p3.IsValid()
isValid4 =p4.IsValid()
isValid5 =p5.IsValid()
isValid6 =p6.IsValid()
isValid7 =p7.IsValid()
isValid8 =p8.IsValid()
isValid9 =p9.IsValid()
isValid10=p10.IsValid()
isValid11=p11.IsValid()
isValid12=p12.IsValid()
isValid13=p13.IsValid()
isValid14=p14.IsValid()
isValid15=p15.IsValid()
isValid16=p16.IsValid()
isValid17=p17.IsValid()
isValid18=p18.IsValid()
isValid19=p19.IsValid()
isValid20=p20.IsValid()
isValid21=p21.IsValid()
isValid22=p22.IsValid()
isValid23=p23.IsValid()
isValid24=p24.IsValid()
isValid25=p25.IsValid()
isValid26=p26.IsValid()
isValid27=p27.IsValid()
isValid0
isValid1
isValid2
isValid3
isValid4
isValid5
isValid6
isValid7
isValid8
isValid9
isValid10
isValid11
isValid12
isValid13
isValid14
isValid15
isValid16
isValid17
isValid18
isValid19
isValid20
isValid21
isValid22
isValid23
isValid24
isValid25
isValid26
isValid27
test result :
>>> isValid0
True
>>> isValid1
True
>>> isValid2
True
>>> isValid3
True
>>> isValid4
True
>>> isValid5
True
>>> isValid6
True
>>> isValid7
False
>>> isValid8
False
>>> isValid9
True
>>> isValid10
True
>>> isValid11
True
>>> isValid12
True
>>> isValid13
True
>>> isValid14
True
>>> isValid15
True
>>> isValid16
True
>>> isValid17
True
>>> isValid18
True
>>> isValid19
True
>>> isValid20
True
>>> isValid21
True
>>> isValid22
False
>>> isValid23
True
>>> isValid24
True
>>> isValid25
False
>>> isValid26
False
>>> isValid27
False
for data below:
{"left": "POLYGON ((40 21, 40 22, 40 23, 40 21))", "right": "POLYGON ((2 2, 9 2, 9 9, 2 9, 2 2))"}
{"left": "POINT(1 3)", "right": "LINESTRING (0 0, 10 10)"}
{"left": "POINT(-1 4)", "right": "LINESTRING (0 0, 10 10)"}
{"left": "POINT(10 1)", "right": "LINESTRING (0 0, 10 10)"}
{"left": "POINT(7 9)", "right": "LINESTRING (0 0, 10 10)"}
in arctern:
{"ST_Intersection_UDF(left, right)":"POLYGON EMPTY"}
{"ST_Intersection_UDF(left, right)":"POINT EMPTY"}
{"ST_Intersection_UDF(left, right)":"POINT EMPTY"}
{"ST_Intersection_UDF(left, right)":"POINT EMPTY"}
{"ST_Intersection_UDF(left, right)":"POINT EMPTY"}
in postgis:
GEOMETRYCOLLECTION EMPTY
GEOMETRYCOLLECTION EMPTY
GEOMETRYCOLLECTION EMPTY
GEOMETRYCOLLECTION EMPTY
GEOMETRYCOLLECTION EMPTY
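If arctern wants to match PostGIS here, a post-processing step could normalize every empty result to GEOMETRYCOLLECTION EMPTY. A hypothetical sketch mirroring the outputs above (not arctern code):

```python
def normalize_empty(wkt: str) -> str:
    """Collapse any empty intersection result to GEOMETRYCOLLECTION EMPTY,
    regardless of the input geometry types; pass non-empty results through."""
    s = wkt.strip()
    return 'GEOMETRYCOLLECTION EMPTY' if s.upper().endswith('EMPTY') else s
```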
Describe the solution you'd like
Add pylint for GIS
Describe the bug
The version of libprotobuf is 2.6.1 in the system environment, but it is 3.11.0 in the conda environment. When I execute the unit tests, the program reports an error.
Steps/Code to reproduce behavior
[libprotobuf FATAL google/protobuf/stubs/common.cc:87] This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.11.0). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "/build/mir-O8_xaj/mir-0.26.3+16.04.20170605/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)
[2020-02-17T13:26:45.455Z] terminate called after throwing an instance of 'google::protobuf::FatalException'
[2020-02-17T13:26:45.455Z] what(): This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.11.0). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "/build/mir-O8_xaj/mir-0.26.3+16.04.20170605/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)
Expected behavior
Execute the unit tests and return correct results in docker
Environment details
Describe the solution you'd like
Add Jenkins CI for GIS
Describe the solution you'd like
Change base docker image in GPU version build environment Dockerfile
GIS test :
wkt_arrow_array1 = { POLYGON ((0 0,4 0,4 4,0 4,0 0))}
wkt_arrow_array2 = { POINT (4 0)}
zilliz::gis::ST_Contains(wkt_arrow_array1,wkt_arrow_array2)
output : false
postgis test :
select st_contains('POLYGON ((0 0,4 0,4 4,0 4,0 0))'::geometry,'POINT (4 0)'::geometry);
output : false
The following design issues need to be discussed:
geospark test :
spark.sql("SELECT ST_Area(ST_GeomFromWKT('LINESTRING (0 0, 1 0, 1 1, 0 0)'))").show(1,0)
output : 0
spark.sql("SELECT ST_Area(ST_GeomFromWKT('MULTIPOLYGON ( ((0 0, 1 4, 1 0,0 0)), ((0 0,1 0,0 1,0 0)) ) '))").show(1,0)
output : 1.5
interface changes:
Describe the solution you'd like
Add Unittest in Jenkins CI
If the input is not a valid geometry, like 'Im not polygon', ST_IsValid will crash and throw an exception with this error message:
unknown file: Failure
C++ exception with description "gdal error code = 3" thrown in the test body.
This is the test code, and it throws the exception:
arrow::StringBuilder string_builder;
std::shared_ptr<arrow::Array> polygons;
string_builder.Append("my is not polygon");
string_builder.Finish(&polygons);
auto valid_mark = ST_IsValid(polygons);
geospark test :
spark.sql("SELECT ST_Length(ST_GeomFromWKT('POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))'))").show(1,0)
output : 4.0
spark.sql("SELECT ST_Length(ST_GeomFromWKT('MULTIPOLYGON ( ((0 0, 1 4, 1 0,0 0)))'))").show(1,0)
output : 9.123105625617661
spark.sql("SELECT ST_Length(ST_GeomFromWKT('MULTIPOLYGON ( ((0 0, 0 4, 4 4, 4 0, 0 0)), ((0 0, 0 1, 4 1, 4 0, 0 0)) )'))").show(1,0)
output : 26.0
GIS test :
wkt_arrow_array = {POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0)) ,MULTIPOLYGON ( ((0 0, 1 4, 1 0,0 0)) ) , MULTIPOLYGON ( ((0 0, 0 4, 4 4, 4 0, 0 0)), ((0 0, 0 1, 4 1, 4 0, 0 0)) )}
zilliz::gis::ST_Area(wkt_arrow_array)
output : 0 , 0 , 0
postgis test
output : 0 , 0 , 0
Is your feature request related to a problem? Please describe.
Describe the solution you'd like
Add pod tolerations to Jenkins slave pods
Describe alternatives you've considered
Additional context
Currently we only need to focus on the column-based UDF interfaces.
In the CHECK_GDAL macro, a std::runtime_error exception is thrown if gdal returns an error. Would Python catch this exception? What happens in pyspark when C++ code throws an exception?