
Python package to retrieve data from Databank Ondergrond Vlaanderen (DOV)

Home Page: https://pydov.readthedocs.io/en/latest/

License: MIT License

Python 78.74% Jupyter Notebook 21.26%
python package data-access water lifewatch oscibio

pydov's Introduction

pydov

Badges: CI · Documentation Status · Project Status: Active (the project has reached a stable, usable state and is being actively developed) · DOI · pyOpenSci

pydov is a Python package to query and download data from Databank Ondergrond Vlaanderen (DOV). It is hosted on GitHub and development is coordinated by Databank Ondergrond Vlaanderen (DOV). DOV aggregates data about soil, subsoil and groundwater of Flanders and makes them publicly available. Interactive and human-readable extraction and querying of the data is provided by a web application, whereas the focus of this package is to support machine-based extraction and conversion of the data.

To get started, see the documentation at https://pydov.readthedocs.io.

Please note that downloading DOV data with pydov is governed by the same disclaimer that applies to the other DOV services. Be sure to consult it when using DOV data with pydov.

Installation

You can install pydov stable using pip:

pip install pydov

Or clone the git repository and install with python setup.py install to get the latest snapshot from the master branch.

To contribute to the code, make sure to install the package and all of the development dependencies listed in the requirements_dev.txt file. First, clone the git repository. We advise using a Python development environment, for example conda or virtualenv. Activate the (conda/virtualenv) environment and install the package in development mode:

pip install -e .[devs]

Need more detailed instructions? Check out the installation instructions and the development guidelines.

Quick start

Read the quick start from the docs or jump straight in:

from pydov.search.boring import BoringSearch
from pydov.util.location import Within, Box

from owslib.fes2 import PropertyIsGreaterThan

boringsearch = BoringSearch()

dataframe = boringsearch.search(
    query=PropertyIsGreaterThan(propertyname='diepte_tot_m', literal='550'),
    location=Within(Box(107500, 202000, 108500, 203000))
)

The resulting dataframe contains information on the boreholes (boringen) within the provided bounding box (as defined by the location argument) with a depth greater than 550 m:

>>> dataframe
                                         pkey_boring     boornummer         x         y  mv_mtaw  start_boring_mtaw gemeente  diepte_boring_van  diepte_boring_tot datum_aanvang uitvoerder  boorgatmeting  diepte_methode_van  diepte_methode_tot boormethode
0  https://www.dov.vlaanderen.be/data/boring/1989...  kb14d40e-B777  108015.0  202860.0      5.0                5.0     Gent                0.0              660.0    1989-01-25   onbekend          False                 0.0               660.0    onbekend
1  https://www.dov.vlaanderen.be/data/boring/1972...  kb14d40e-B778  108090.0  202835.0      5.0                5.0     Gent                0.0              600.0    1972-05-17   onbekend          False                 0.0               600.0    onbekend

Documentation

Full documentation of pydov can be found on our ReadTheDocs page.

Contributing

You do not need to be a code expert to contribute: there are several ways to contribute to this project. Have a look at the contributing page.

Meta

  • We welcome contributions including bug reports.
  • License: MIT
  • Citation information can be found on Zenodo.
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
  • Also note that downloading DOV data with pydov is governed by the same disclaimer that applies to the other DOV services. Be sure to consult it when using DOV data with pydov.

pydov's People

Contributors

guillaumevandekerckhove, johanvdw, jorissynaeve, jorisvandenbossche, kpaenen, marleenvd, meisty, peterdesmet, pjhaest, rebot, roel, stijnvanhoey, sweco-begilt


pydov's Issues

Provide documentation on data column origin

Currently, the data documentation lists the DOV schema origin for each variable. The source should be updated to the actual source used (WFS or XML), to clarify the data-transfer effort of a given query.

add ci to pydov

  • both Travis (pip) and appveyor (conda), also testing it works within the osgeo4w context
  • including pypi automatic releases
  • deploying docs on GitHub Pages / Read the Docs when successful
  • tests for both Python 2.(7) and 3.(5)
  • link with coveralls

No documentation in xsd for grondwaterlichaam, regime and grondwatersysteem

In FilterDataTypes.xsd no documentation is provided for the following items:

<xs:element name="grondwaterlichaam" type="GrondwaterlichaamEnumType" minOccurs="0">
  <xs:annotation><xs:documentation/></xs:annotation>
</xs:element>
<xs:element name="grondwatersysteem" type="GrondwatersysteemEnumType" minOccurs="0">
  <xs:annotation><xs:documentation/></xs:annotation>
</xs:element>
<xs:element name="regime" type="interpretatie:RegimeEnumType">
  <xs:annotation><xs:documentation>regime</xs:documentation></xs:annotation>
</xs:element>

Cannot use fields from a subtype as return fields.

  • PyDOV version: master
  • Python version: 3.6
  • Operating System: Windows 10

Description

Specifying a field from a subtype as return field gives an error if the resulting dataframe is non-empty.

What I Did

import pydov.search.boring
from owslib.fes import PropertyIsEqualTo

# 'query' here is any valid attribute filter, e.g. a PropertyIsEqualTo instance
bs = pydov.search.boring.BoringSearch()

bs.search(query=query, return_fields=('pkey_boring',))
                                         pkey_boring
0  https://www.dov.vlaanderen.be/data/boring/2004...

bs.search(query=query, return_fields=('pkey_boring', 'boormethode'))
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Projecten\PyDov\pydov_git\pydov\search\boring.py", line 114, in search
    columns=Boring.get_field_names(return_fields))
  File "C:\Users\rhbav33\python_virtualenvs\3.6_dev\lib\site-packages\pandas\core\frame.py", line 364, in __init__
    data = list(data)
  File "C:\Projecten\PyDov\pydov_git\pydov\types\abstract.py", line 467, in to_df_array
    result = item.get_df_array(return_fields)
  File "C:\Projecten\PyDov\pydov_git\pydov\types\abstract.py", line 524, in get_df_array
    return_fields=return_fields)
  File "C:\Projecten\PyDov\pydov_git\pydov\types\abstract.py", line 386, in get_field_names
    raise InvalidFieldError("Unknown return field: '%s'" % rf)
pydov.util.errors.InvalidFieldError: Unknown return field: 'boormethode'

AppVeyor build broken for Python 2.7

Recent AppVeyor builds for Python 2.7 fail with:

pip install --no-cache-dir --ignore-installed -r requirements_dev.txt
ERROR: To modify pip, please run the following command:
C:\Miniconda\python.exe -m pip install --no-cache-dir --ignore-installed -r requirements_dev.txt

Multiple coveralls reports for single PR

This is an overflow of information. Having the information at the PR level is useful to evaluate the test-writing effort when someone adds new features, but having 4 reports for each change is overkill.

The issue is reported on GitHub, and the coveralls documentation provides a potential solution, but the current attempt was without success. Other people experience similar issues. Moreover, the response rate on https://github.com/lemurheavy/coveralls-public/issues seems rather low...

I would actually attempt a switch to codecov, as it provides this feature out of the box.

https://docs.codecov.io/v4.3.6/docs/merging-reports

Find a solution to reuse monkeypatches across types

Currently we have to copy/paste all the monkeypatch functions between the tests of different types and change the location of the data, e.g.:

def mp_remote_wfs_feature(monkeypatch):
    """Monkeypatch the call to get WFS features.

    Parameters
    ----------
    monkeypatch : pytest.fixture
        PyTest monkeypatch fixture.

    """
    def __get_remote_wfs_feature(*args, **kwargs):
        with open('tests/data/types/boring/wfsgetfeature.xml',
                  'r') as f:
            data = f.read()
            if type(data) is not bytes:
                data = data.encode('utf-8')
        return data

    monkeypatch.setattr(
        'pydov.util.owsutil.wfs_get_feature',
        __get_remote_wfs_feature)

We should think of a way to only specify the base path once and be able to reuse the monkeypatches between different types.
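One way to avoid the duplication is a small factory that builds the monkeypatch function from the data path, so each type's test module only states its own path once. A minimal sketch (the `build_wfs_feature_patch` helper and its usage are illustrative, not existing pydov code):

```python
def build_wfs_feature_patch(datafile):
    """Build a monkeypatch function that serves `datafile` as the
    remote WFS GetFeature response.

    datafile : str
        Path to the per-type test data, e.g.
        'tests/data/types/boring/wfsgetfeature.xml'.
    """
    def _patch(monkeypatch):
        def __get_remote_wfs_feature(*args, **kwargs):
            # Read the canned response and make sure it is bytes,
            # mirroring what the real remote call returns.
            with open(datafile, 'r') as f:
                data = f.read()
            if not isinstance(data, bytes):
                data = data.encode('utf-8')
            return data

        monkeypatch.setattr(
            'pydov.util.owsutil.wfs_get_feature',
            __get_remote_wfs_feature)
    return _patch
```

Each type's test module could then do e.g. `mp_remote_wfs_feature = pytest.fixture(build_wfs_feature_patch('tests/data/types/boring/wfsgetfeature.xml'))`, keeping the data location in a single place per type.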

XML string parsing error

XML can contain non-ASCII characters that fail upon encoding (see below).
For example: u'Societé Belge des Bétons'
Check the boring type and search for
{'start_boring_mtaw': 61.0, 'boorgatmeting': '{UNRESOLVED}', 'uitvoerder': '{UNRESOLVED}', 'boornummer': 'kb34d93e-B183', 'pkey_boring': 'https://www.dov.vlaanderen.be/data/boring/1928-031159', 'mv_mtaw': '{UNRESOLVED}', 'diepte_boring_van': '{UNRESOLVED}', 'y': 177156.1, 'x': 234685.7, 'datum_aanvang': datetime.date(1928, 1, 2), 'diepte_boring_tot': 15.0, 'gemeente': 'Bilzen'}

File "D:\_wd\pydov\types\abstract.py", line 59, in typeconvert
    return str(x).strip()
UnicodeEncodeError: 'ascii' codec can't encode characters in position 6-7: ordinal not in range(128)
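A sketch of a version-agnostic fix for this `typeconvert` case: on Python 2, `str()` implicitly encodes unicode values to ASCII, which is what raises the error; using the interpreter's text type avoids the implicit encoding. Names are illustrative, not the actual pydov implementation:

```python
# -*- coding: utf-8 -*-
import sys

def typeconvert_string(x):
    """Convert a value to a stripped text string without assuming ASCII.

    On Python 2, str(u'Societé Belge des Bétons') would try to encode
    the value as ASCII and raise UnicodeEncodeError; unicode() does not.
    """
    if sys.version_info[0] < 3:
        text_type = unicode  # noqa: F821 -- Python 2 only
    else:
        text_type = str
    return text_type(x).strip()
```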

XML parsing from downloaded file

The current code for interpretations applies to XML parsing from the web service.
Later on, we should verify which adjustments are necessary to parse downloaded XML files in which 'boringen' and 'interpretations' can both be present (currently, the type does not define the full root path in the XML of mixed data).

Dutch or English object names?

For groundwater we have now used an English term for our object: DovGroundwater.

I'm not sure we should translate things - it often leads to errors (e.g. filter instead of screen). Most fields in the XML are also in Dutch (meetnet, ...), so I think it is better to stick to Dutch.

My proposal: if we are using our own standards we stick to Dutch. If we can use international standards (waterml, inspire, ...) we can use those versions.

Missing values in XML export

It looks like some values are missing in XML export, but are present in the UI.

Listing those here so we can pass them (grouped) to the DOV development team.

grondwater observaties: detectie

Providing a guide of contribution

As mentioned by @pjhaest, we should write some guidance on how new users can contribute, providing good practices, choices, ... Things to mention:

  • scope of this package (what should be in and what not)
  • how to contribute (fork, pull request,...) with guidance
  • advice on how to use docstrings -> numpy doc style?
  • advice and routines on how to render the documentation as a readthedocs (which can be on github pages)
  • ...

Search query attributes

@Roel you mentioned earlier that the search query can make use of the attributes defined in the wfs schema. Ok so far. With the 'DOV-verkenner' there are more options available to search for. Are these also available for pydov? If so, could you post an attribute table to the docs?

Example XML file in repo

As the URL-based XML service is not yet available at the moment, it would be useful to have an example file of the XML format in the repo. That way, we can already provide and test the conversion functionality.

Caching of certain XML files is broken

  • PyDOV version: caching
  • Python version: 3.6
  • Operating System: Windows 10

Description

Caching of certain XML files (I assume ones with funny characters) is broken. This leads to errors when trying to reuse cached data.

The problem is twofold: for some reason certain XML files cannot be saved in the cache, and instead an empty file is created, causing trouble when trying to reuse this 'cached' data.

We should fix:

  • the root cause why certain XML's can't be saved
  • the fact that, when saving fails, an empty file is created
  • empty files shouldn't be considered valid cache
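The second and third points could be addressed by writing cache files atomically and treating empty files as cache misses. A minimal sketch (function names are illustrative, not pydov's actual cache API):

```python
import os
import tempfile

def save_to_cache(path, data):
    """Write `data` (bytes) to `path` atomically.

    Writing to a temporary file first and renaming it afterwards ensures
    that a failed save never leaves an empty file behind in the cache.
    """
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or '.')
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)
        os.replace(tmp_path, path)  # atomic on the same filesystem
    except Exception:
        os.unlink(tmp_path)
        raise

def is_valid_cache(path):
    """Treat missing and empty files alike as cache misses."""
    return os.path.isfile(path) and os.path.getsize(path) > 0
```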

What I Did

import pydov.search.boring
from owslib.fes import PropertyIsEqualTo

query = PropertyIsEqualTo('boornummer', 'B/9-000014')
bs = pydov.search.boring.BoringSearch()

bs.search(query=query)
                                         pkey_boring  boornummer         x  \
0  https://www.dov.vlaanderen.be/data/boring/1995...  B/9-000014  197535.0   
          y  mv_mtaw  start_boring_mtaw gemeente  diepte_boring_van  \
0  187210.0    22.61              22.61    Diest                0.0   
   diepte_boring_tot datum_aanvang      uitvoerder  boorgatmeting  \
0              350.0    1995-01-01  Peeters-Ramsel          False   

bs.search(query=query)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Projecten\PyDov\pydov_git\pydov\search\boring.py", line 114, in search
    columns=Boring.get_field_names(return_fields))
  File "C:\Users\rhbav33\python_virtualenvs\3.6_dev\lib\site-packages\pandas\core\frame.py", line 364, in __init__
    data = list(data)
  File "C:\Projecten\PyDov\pydov_git\pydov\types\abstract.py", line 468, in to_df_array
    result = item.get_df_array(return_fields)
  File "C:\Projecten\PyDov\pydov_git\pydov\types\abstract.py", line 532, in get_df_array
    self._parse_xml_data()
  File "C:\Projecten\PyDov\pydov_git\pydov\types\boring.py", line 191, in _parse_xml_data
    tree = etree.fromstring(xml)
  File "src\lxml\etree.pyx", line 3230, in lxml.etree.fromstring (src\lxml\etree.c:81056)
  File "src\lxml\parser.pxi", line 1871, in lxml.etree._parseMemoryDocument (src\lxml\etree.c:121236)
  File "src\lxml\parser.pxi", line 1759, in lxml.etree._parseDoc (src\lxml\etree.c:119912)
  File "src\lxml\parser.pxi", line 1125, in lxml.etree._BaseParser._parseDoc (src\lxml\etree.c:114159)
  File "src\lxml\parser.pxi", line 598, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\etree.c:107724)
  File "src\lxml\parser.pxi", line 709, in lxml.etree._handleParseResult (src\lxml\etree.c:109433)
  File "src\lxml\parser.pxi", line 638, in lxml.etree._raiseParseError (src\lxml\etree.c:108287)
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Document is empty, line 1, column 1

example notebooks - using latest pydov version

Just a small comment:
In the notebooks you do

# Insert current tree in the sys path to be able to import local copy of 'pydov'
sys.path.insert(0, '../../')

You can also use editable (develop) mode to install:

pip install -e .

That way you are sure you always use the current version, without having to edit your path.

xsd scheme documentation description mistake

When checking the XSD schemas of DOV, I bumped into a duplicate description. I think betrouwbaarheid has a wrong description:

file xsd/kern/gwmeetnet/FilterDataTypes.xsd:

...
<xs:element name="zoet" type="generiek:JNOEnumType" minOccurs="0" default="O">
  <xs:annotation>
    <xs:documentation>omgerekend naar zoet water (ja/nee/onbekend)</xs:documentation>
  </xs:annotation>
</xs:element>
<xs:element name="betrouwbaarheid" type="generiek:BetrouwbaarheidEnumType" minOccurs="0" default="onbekend">
  <xs:annotation>
    <xs:documentation>omgerekend naar zoet water (ja/nee/onbekend)</xs:documentation>
  </xs:annotation>
</xs:element>
...

XML boringen

It was indicated that the XML for boringen etc. is already in production, so a few small questions:

  • There is a problem retrieving the XML in Python (while I do get the XML as a download in my browser). Johan already reported this problem #5 . Or do additional things still need to be defined in Python for this? See below for the result of a query in Python.
  • If I understand correctly, the first step remains the selection of the boringen from the web service for a bounding box, via a POST request with the result in JSON?
    Should the request then be made to the URL below, or to the WFS? And what are the required parameters for the bounding box, since these are listed under Request Payload (?) ?
    ...www.dov.vlaanderen.be/zoeken-ocdov/proxy-boring/boring/search?maxresults=100

XML in Python:

url  = 'https://www.dov.vlaanderen.be/data/boring/1981-010840.xml'
r = requests.get(url)
r.text
u'<!DOCTYPE HTML>\r\n<html>\r\n<head>\r\n    <link rel="icon" sizes="192x192"\r\n          href="//dij151upo6vad.cloudfront.net/latest/icons/app-icon/icon-highres-precomposed.png">\r\n    <meta http-equiv="content-type" content="text/html;charset=utf-8"/>\r\n    <title>DOV Portaal</title>\r\n    <script type="text/javascript" language="javascript">\r\n        appVersion = "v1.7.0";\r\n    </script>\r\n    <!-- Name defined in the module xml. -->\r\n    <!-- cfr https://groups.google.com/group/google-web-toolkit/browse_thread/thread/71b17949f9a7c333https://groups.google.com/group/google-web-toolkit/browse_thread/thread/71b17949f9a7c333  -->\r\n    <!-- before your module(*.nocache.js) loading  -->\r\n    <!--[if lt IE 9]>\r\n    <script src="https://html5shim.googlecode.com/svn/trunk/html5.js"></script>\r\n    <![endif]-->\r\n    <!--[if IE 7]>\r\n    <link rel="stylesheet" href="edovboringen/css/font-awesome-ie7.css">\r\n    <![endif]-->\r\n    <!-- your module(*.nocache.js) loading  -->\r\n    <script type="text/javascript" language="javascript" src="portaalclient/portaalclient.nocache.js"></script>\r\n</head>\r\n\r\n<body>\r\n<!-- OPTIONAL: include this if you want history support -->\r\n<iframe src="javascript:\'\'" id="__gwt_historyFrame" tabIndex="-1"\r\n        style="position: absolute; width: 0; height: 0; border: 0"></iframe>\r\n</body>\r\n</html>\r\n'

Compatibility with the pastas Python package

Pastas is an open-source framework for the analysis of hydrological time series, http://pastas.readthedocs.io/en/latest/.

Not a priority, but interesting to log here that it could be worthwhile to provide compatibility in the future. By providing a mapping to the pastas data model, the modelling tools developed in pastas would become applicable. As the modelling part is out of scope for this package, both packages are complementary.

(getting this from earlier correspondence from Pieter Jan Haest)

Use cases

Do we allow for use cases in this repo?
If so, where to put them: in the docs or some other folder?

Update name and description

I would call this package:

pydov

It's shorter than dov-pydownloader, sounds more official (only one pydov package) and easily applicable to R: rdov.

I would also update the description to:

A python package to retrieve data from Databank Ondergrond Vlaanderen (DOV)

Rather than "A python package to extract data from the DOV web application", as you don't really extract data from the application, but from the webservices. Also retrieve implies more of a request/response than extract.

Add coordinates to interpretations

During the August code sprint it became apparent that users do not always need 'boring' data alongside interpretations. If only interpretations are searched, the users need access to the coordinates as well.
This can be worked around by adding the WFS fields to the 'return_fields' argument, but the column name then differs. Therefore, it is best to add these WFS data to the dataframe for all possible queries.

branch poc_boring

  • PyDOV version: branch poc_boring
  • Python version: 2.7.5 OSGeo4W distribution for QGIS 2.14 ltr
  • Operating System: Windows

Description

Testing branch poc_boring yielded some initial dependency issues with the OSGeo4W Python distribution:

  • The owslib module is not up to date with the one tested by Roel: schema.py (owslib/feature/) and other methods are missing.

  • Therefore, I downloaded the current master from GitHub and replaced this in OSGeo4W\apps\qgis-ltr\python\owslib

Then some warning messages pop up, which I don't know if we should care about too much at the moment?

  • following openURL on line 54 of owslib\feature\common.py:
C:\OSGEO4~1\apps\Python27\Lib\site-packages\urllib3\util\ssl_.py:339: SNIMissingWarning: 
An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS 
is not available on this platform. This may cause the server to present an incorrect TLS certificate, 
which can cause validation failures. 
You can upgrade to a newer version of Python to solve this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning

C:\OSGEO4~1\apps\Python27\Lib\site-packages\urllib3\util\ssl_.py:137: InsecurePlatformWarning: 
A true SSLContext object is not available. 
This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. 
You can upgrade to a newer version of Python to solve this. 
For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
  • or after calling get_fields() in the boring_search.py examples
return self.__wfs.contents[self._layer] on line 104 of pydov/search.py returns:
C:\OSGEO4~1\apps\qgis-ltr\python\owslib\iso.py:116: FutureWarning: the .identification 
and .serviceidentification properties will merge into .identification being a list of properties.  
This is currently implemented in .identificationinfo.  Please see
https://github.com/geopython/OWSLib/issues/38 for more information
  FutureWarning)
  • the pkey_boring link is not printed correctly, but it is ok in the dataframe, no worries

Overall +1! More testing with new queries will follow.

AppVeyor build failing

The AppVeyor build is currently failing because it only installs requirements_dev.txt, which no longer includes -r requirements.txt.

Shall I install the requirements.txt in AppVeyor manually too or do we add the (contents of) requirements.txt to requirements_dev as before?

move `_parse_xml_data` to generic types abstract

_parse_xml_data is currently duplicated among different classes (GrondwaterFilter, Boring). We could move this to the abstract class. If alterations are required for certain classes, overriding the method is still possible.
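A sketch of the idea, with the field mapping as the only per-type difference (the `xml_fields` convention is illustrative, not pydov's actual internals):

```python
import xml.etree.ElementTree as etree

class AbstractDovType(object):
    """Sketch: the generic XML parsing lives on the abstract class.

    The `xml_fields` mapping (field name -> XPath) is illustrative and
    not pydov's actual field definition format.
    """
    xml_fields = {}

    def __init__(self):
        self.data = {}

    def _parse_xml_data(self, xml):
        # Generic implementation shared by all types; a subclass only
        # overrides this method when its XML layout deviates.
        tree = etree.fromstring(xml)
        for name, xpath in self.xml_fields.items():
            node = tree.find(xpath)
            self.data[name] = None if node is None else node.text

class Boring(AbstractDovType):
    # Only the field mapping differs between types.
    xml_fields = {'boormethode': './/boormethode'}
```

With this, GrondwaterFilter would only declare its own `xml_fields` and drop its copy of `_parse_xml_data`.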

Coveralls build unreliable

The last Coveralls build of master dates from a few weeks ago; did we break something?

Other branches do have more recent builds, but not reliably for all commits/PRs.

First example/idea of the aimed functionality, setup

As a first sketch of the functionality, the user would do something like this (naming should be improved to better fit the naming in the groundwater domain):

import dov_downloader as dov
dov.download(list_of_wells).subset_period('2000', '2007').to_csv('name_file.csv')

(in words: download my list of wells, filter that specific period and write everything into a csv-file)

Basically, there are 3 parts in this setup:

  1. download, i.e. the extraction part: downloading data based on a list of stations; this part could be extended towards more powerful download_**** functions, e.g. download_from_boundingbox(), download_from_aquifer(), ... These extensions of the regular download will always require some additional service calls, but will end up with a list of stations and use the download function.
  2. subset_*, i.e. the filter part: this should provide some straightforward functions to filter the downloaded data set. When using pandas DataFrames as the basic data type to store the data (see further), a lot of options will be available.
  3. to_***, i.e. the conversion part: the data is stored or exported to a new file format that could be useful for the user. to_csv/to_excel are examples that are already available, but the advantage of this package would be in more domain-specific export functionalities, e.g. to_modflow(), to_menyanthes(), to_swap().

As we're dealing with time series, using pandas DataFrames as the underlying datatype provides a lot of built-in options. When needed, we can make a new class inheriting from pd.DataFrame to handle some additional metadata. Multiple stations can be handled with a MultiIndex as column headers. With the row labels as a DatetimeIndex, all the data handling options from pandas, like resampling (daily/monthly/... mean values) and slicing, are available.

Since we will have the XML format as such (always a complete time series) as the stable data source, I would propose an xml_to_df conversion function that converts the XML to a pandas DataFrame as a basic function, in direct relation to the other basic functionality, download. These two functions (xml_to_df and download) could be the first milestone to implement. Then, more advanced download and export functions can be created on top of this.
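A minimal sketch of such an xml_to_df function, assuming a hypothetical record layout (the real DOV XML schema will differ):

```python
import xml.etree.ElementTree as etree

import pandas as pd

def xml_to_df(xml, recordpath, columns):
    """Convert a DOV-style XML document into a pandas DataFrame.

    recordpath selects the repeating element (one row per match) and
    `columns` maps column names to child tag names. The tag names used
    here are illustrative only.
    """
    tree = etree.fromstring(xml)
    rows = []
    for record in tree.findall(recordpath):
        rows.append({name: record.findtext(tag)
                     for name, tag in columns.items()})
    return pd.DataFrame(rows, columns=list(columns))
```

From there, setting the date column as a DatetimeIndex gives the resampling and slicing options mentioned above.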

caching feature

Using the package, XML files need to be downloaded by the user. As we should not expect users to be fully aware of when XML downloads are needed (versus pure WFS requests), we can avoid repeated downloads by providing caching functionality:

  • XML files stored as files in a cache folder
  • before requesting an XML file, check the cache folder for an existing local XML file
  • check its age; if older than X weeks, redownload the file

(this is a package-wide functionality, used by the different modules, basically a wrapper around pydov.types.abstract.AbstractDovType#_get_xml_data)
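A minimal sketch of this wrapper, with the age check in weeks (the `download` callable stands in for the actual HTTP request; all names are illustrative):

```python
import os
import time

def get_xml(url, cachedir, download, max_age_weeks=2):
    """Return the XML document for `url`, using a local file cache.

    download : callable
        Performs the actual HTTP request and returns bytes; passed in
        here to keep the sketch self-contained.
    """
    # Key the cache file on a filesystem-safe version of the URL.
    fname = os.path.join(
        cachedir, url.replace('/', '_').replace(':', '_'))
    max_age = max_age_weeks * 7 * 24 * 3600

    if os.path.isfile(fname) and os.path.getsize(fname) > 0:
        age = time.time() - os.path.getmtime(fname)
        if age <= max_age:
            # Fresh enough: serve from the cache folder.
            with open(fname, 'rb') as f:
                return f.read()

    # Missing, empty or stale: redownload and refresh the cache.
    data = download(url)
    with open(fname, 'wb') as f:
        f.write(data)
    return data
```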

Next development steps

(cf. a minimal translation of the discussion at the face-to-face meeting on 2017-10-24)

In terms of functionalities, I want to...

  1. ...download data (borehole data, water level,...) for location X (and derived locations, e.g. polygon,...).
  2. ...download data for all locations that provide a kind of observation (e.g. all filters measuring arsenic, all interpretations with a certain characteristic).
  3. ...download data from ... to ... (dates)

The currently existing services (WFS, XML and the REST service of the application) can support these requests. As the REST service is not guaranteed to be stable (for the moment), we will primarily focus on the WFS/XML services.

Scenario 1 can be supported by the WFS/XML. For scenario 2 this depends on whether the WFS contains the specific information. Period information is included in the WFS, so it is searchable, but some work will be needed to hide this from the package user.

We will focus on scenario 1 for the moment, using the WFS for the location based searching and downloading/parsing the XML.

Translating to main classes

  • The general capabilities (general DOVObject, name to be defined)

    • parsing of the XML download
    • location based search/handling functionalities
  • Main handling classes to work on and inherit from the DOVObject

    • DovGroundwater -> focus on time series of levels and observations
    • DovBoreholes -> focus on interpretations
    • (DovConePenetrationTest -> focus on interpretations; inherit on BoreHoles)

Data representation

The package scope stops at a table representation of the required data in a Pandas data.frame (In a later stage we can add custom translators to other common groundwater formats).

  • More details (deeper into the XML tree) will result in more columns, as we denormalize the data.
  • Logical entities (e.g. observations versus levels) are split into individual tables.
  • Aggregated/derived fields useful for searching are not always useful to keep in the table; we exclude these from the resulting data.frame (e.g. 'are there interpretations'). Comment fields are kept ;-)

Meta-organisation

  • CI: both Travis (pip) and AppVeyor (conda), also testing that it works within the osgeo4w context
  • unit tests: py.test
    • we'll add a data directory with example XML to make sure unit tests on XML handling can run offline, and provide additional tests to check that the XML format is still the same server-side
    • code coverage: using coveralls
  • documentation: sphinx docs, hosted on GitHub Pages (deployed by Travis CI)
  • add code of conduct
  • support for at least 2.7 and 3.5

We always use pull requests to add new features, but you're allowed to merge your own pull requests

@Roel @johanvdw @pjhaest @marleenvd feel free to comment/...

caching in database

as discussed in the august code sprint it could be useful to implement an enhancement of the caching where downloaded xmls are stored in a postgres DB, and read from there.

maxfeatures of GeoServer

Can you use OWSLib to read the maximum number of features you can request, without iterating with your GetFeature query?
Under WFS version 1.1.0 I do find a reference to <ows:Constraint name="DefaultMaxFeatures">, but this doesn't appear anywhere in the WFS for DOV?
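The WFS GetCapabilities document does expose such operation constraints when the server declares them. A sketch that looks for DefaultMaxFeatures in a WFS 1.1.0 capabilities document, returning None when it is absent (as seems to be the case for DOV):

```python
import xml.etree.ElementTree as etree

OWS = '{http://www.opengis.net/ows}'  # OWS namespace used by WFS 1.1.0

def default_max_features(capabilities_xml):
    """Return the DefaultMaxFeatures constraint advertised in a WFS
    1.1.0 GetCapabilities document, or None when the server does not
    declare one.
    """
    tree = etree.fromstring(capabilities_xml)
    for constraint in tree.iter(OWS + 'Constraint'):
        if constraint.get('name') == 'DefaultMaxFeatures':
            # Constraints carry their value in ows:Value (or
            # ows:DefaultValue, depending on the OWS version).
            value = constraint.find(OWS + 'Value')
            if value is None:
                value = constraint.find(OWS + 'DefaultValue')
            if value is not None and value.text:
                return int(value.text)
    return None
```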

osgeo4w compatibility

As a subset of the target audience will be installing Python and GIS tools with osgeo4w, it is good to take this into account:

  • add CI tests in the osgeo4w context.
  • document the installation for this target audience
