This project forked from thinkingmachines/geowrangler

A Python package for wrangling geospatial datasets

Home Page: https://geowrangler.thinkingmachin.es/

License: MIT License

Geowrangler

Tools for wrangling with geospatial data

Overview

Geowrangler is a Python package for geodata wrangling. It helps you build data transformation workflows that have no out-of-the-box solutions in other geospatial libraries.

We surveyed our past geospatial projects to extract these solutions for our own work, and we hope they will be useful for others as well.

Our audience is researchers, analysts, and engineers delivering geospatial projects.

We welcome your comments, suggestions, bug reports and code contributions to make Geowrangler better.

Modules

  • Grid Tile Generation
  • Geometry Validation
  • Vector Zonal Stats
  • Raster Zonal Stats
  • Area Zonal Stats
  • Distance Zonal Stats
  • Demographic and Health Survey (DHS) Processing Utils
  • Geofabrik (OSM) Data Download
  • Ookla Data Download

Check this page for more details about our Roadmap

Installation

pip install geowrangler

Documentation

The documentation for the package is available here

Development

Development Setup

If you want to learn more about Geowrangler and explore its inner workings, you can set up a local development environment. You can run geowrangler's Jupyter notebooks to see how the different modules are built and how they work.

Pre-requisites

  • OS: Linux, MacOS, Windows Subsystem for Linux (WSL) on Windows

  • Requirements:

    • python 3.7 or higher
    • virtualenv, venv or conda for python environment virtualization
    • poetry for dependency management

Github Repo Fork

If you plan to make contributions to geowrangler, we encourage you to create your fork of the Geowrangler repo.

This will then allow you to push commits to your forked repo and then create a Pull Request (PR) from your repo to the main geowrangler repo for approval by geowrangler's maintainers.

Development Installation

We recommend creating a virtual python environment via virtualenv or conda for your development environment. Please see the relevant documentation for more details on installing these python environment managers. Afterward, continue the geowrangler setup instructions below.

Set-up with virtualenv

First, install libgeos (version >=3.8 is required for building pygeos/shapely) with the following commands, which use apt and so apply to Debian/Ubuntu systems.

See libgeos documentation for installation details on other systems.

sudo apt update # updates package info
sudo apt install build-essential # installs GCC
sudo apt install libgeos-dev

Next, set up your Python environment with virtualenv, then install the pre-commit hooks and the necessary Python libraries by running the following commands.

Remember to replace <your-github-id> in the github url below with your GitHub ID to clone from your fork.

git clone https://github.com/<your-github-id>/geowrangler.git
cd geowrangler
virtualenv -p /usr/bin/python3.9 .venv
source .venv/bin/activate
pip install pre-commit "poetry>=1.2.0"
pre-commit install
poetry install

You're all set! Run the tests to make sure everything was installed properly.

Whenever you open a new terminal, you can cd <your-local-geowrangler-folder> and run poetry shell to activate the geowrangler environment.

Set-up with conda

Run the following commands to set up a conda environment and install geos.

conda create -y -n geowrangler-env python=3.9 # replace geowrangler-env if you prefer a different env name
conda deactivate # important to ensure libs from other envs aren't used
conda activate geowrangler-env
conda install -y geos

Then run the following to install the pre-commit hooks and the Python libraries.

cd geowrangler # cd into your geowrangler local directory
pip install pre-commit "poetry>=1.2.0" # quotes keep the shell from treating >= as a redirect
pre-commit install
poetry install

You're all set! Run the tests to make sure everything was installed properly.

Whenever you open a new terminal, run conda deactivate && conda activate geowrangler-env to deactivate any conda env, and then activate the geowrangler environment.

Jupyter Notebook Development

The code for the geowrangler python package resides in Jupyter notebooks located in the notebooks folder.

Using nbdev, we generate the python modules residing in the geowrangler folder from code cells in jupyter notebooks marked with an #export comment. A #default_exp <module_name> comment at the first code cell of each notebook directs nbdev to put the code in a module named <module_name> in the geowrangler folder.
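As a rough illustration of this convention, the code cells of a notebook might look like the sketch below. The module name `example_module` and the function `scale_values` are hypothetical and chosen only for illustration; they are not part of geowrangler.

```python
# --- first code cell of the notebook ---
#default_exp example_module
# (hypothetical module name; nbdev would place the exported code in
#  geowrangler/example_module.py)

# --- a later code cell, marked for export ---
#export
def scale_values(values, factor):
    """Multiply each value by a factor (hypothetical helper for illustration)."""
    return [v * factor for v in values]

# --- a cell without the #export marker is not copied into the module ---
print(scale_values([1, 2, 3], 2))
```

Only the cell carrying the `#export` comment would end up in the generated module; the last cell stays in the notebook.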

See the nbdev cli documentation for more details on the commands to generate the package as well as the documentation.

Running notebooks

Run the following to open the Jupyter notebooks in the notebooks folder.

poetry run jupyter lab

Generating and viewing the documentation site

The quickest way to generate and view the documentation site on your local machine is to use Docker. The following assumes that you have already set up Docker on your system.

poetry run nbdev_build_docs --mk_readme False --force_all True
docker-compose up jekyll

Alternatively, if you don't want to use Docker, you can install jekyll to view the documentation site locally.

nbdev converts notebooks within the notebooks/ folder into a jekyll site.

From this jekyll site, you can then create a static site.

To generate the docs, run the following.


poetry run nbdev_build_docs --mk_readme False --force_all True
cd docs && bundle i && cd ..

To run the jekyll site, run the following.

cd docs
bundle exec jekyll serve

Running tests

We use pytest as our test framework. To run all tests and generate a coverage report, run the following.

poetry run pytest --cov --cov-config=.coveragerc -n auto

To run a single test or a single test file, run one of the following.

# for a single test function
poetry run pytest tests/test_grids.py::test_create_grids
# for a single test file
poetry run pytest tests/test_grids.py
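The `file::function` form above is pytest's node ID syntax for selecting one test. As a minimal sketch of what such a test file could contain, the example below uses a hypothetical helper `count_cells` and a hypothetical test `test_count_cells`; neither is taken from geowrangler's actual test suite.

```python
import math


def count_cells(width, height, cell_size):
    """Hypothetical helper: number of square cells needed to cover a box."""
    return math.ceil(width / cell_size) * math.ceil(height / cell_size)


def test_count_cells():
    # pytest collects functions whose names start with "test_"
    assert count_cells(10, 10, 5) == 4
    # partial cells at the edges still count as whole cells
    assert count_cells(10, 10, 4) == 9
```

If this were saved as `tests/test_example.py` (a hypothetical path), the single test could be selected with `poetry run pytest tests/test_example.py::test_count_cells`.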

Contributing

Please read CONTRIBUTING.md and CODE_OF_CONDUCT.md before contributing.

Development Notes

For more details regarding our development standards and processes, please see our wiki.
