Home Page: https://woudc.org

License: Other


WOUDC Data Registry

Overview

WOUDC Data Registry is a platform that manages ozone and ultraviolet radiation data in support of the World Ozone and Ultraviolet Radiation Data Centre (WOUDC), one of six World Data Centres as part of the Global Atmosphere Watch programme of the WMO.

Installation

Requirements

Dependencies

Dependencies are listed in requirements.txt and are installed automatically during installation.

Installing woudc-data-registry

# setup virtualenv
python3 -m venv --system-site-packages woudc-data-registry
cd woudc-data-registry
source bin/activate

# clone woudc-extcsv and install
git clone https://github.com/woudc/woudc-extcsv.git
cd woudc-extcsv
pip install -r requirements.txt
python setup.py install
cd ..

# clone codebase and install
git clone https://github.com/woudc/woudc-data-registry.git
cd woudc-data-registry
python setup.py build
python setup.py install
# for PostgreSQL backends
pip install -r requirements-pg.txt


# set system environment variables
cp default.env foo.env
vi foo.env  # edit database connection parameters, etc.
. foo.env

# create database
make ENV=foo.env createdb

# drop database
make ENV=foo.env dropdb

# show configuration
woudc-data-registry admin config

# initialize model (database tables)
woudc-data-registry admin registry setup

# initialize search engine
woudc-data-registry admin search setup

# load core metadata

# fetch WMO country list
mkdir data
curl -o data/wmo-countries.json https://www.wmo.int/cpdb/data/membersandterritories.json
woudc-data-registry admin init -d data/

# cleanups

# re-initialize model (database tables)
woudc-data-registry admin registry teardown
woudc-data-registry admin registry setup

# re-initialize search engine
woudc-data-registry admin search teardown
woudc-data-registry admin search setup

# drop database
make ENV=foo.env dropdb

Running woudc-data-registry

TIP: autocompletion can be made available in some shells via:

eval "$(_WOUDC_DATA_REGISTRY_COMPLETE=source woudc-data-registry)"

Core Metadata Management

# list all instances of foo (where foo is one of:
#  project|dataset|contributor|country|station|instrument|deployment)
woudc-data-registry <foo> list
 e.g.
woudc-data-registry contributor list

# show a specific instance of foo with a given registry identifier
woudc-data-registry <foo> show <identifier>
 e.g.
woudc-data-registry station show 023
woudc-data-registry instrument show ECC:2Z:4052:002:OzoneSonde

# add a new instance of foo (contributor|country|station|instrument|deployment)
woudc-data-registry <foo> add <options>
 e.g.
woudc-data-registry deployment add -s 001 -c MSC:WOUDC
woudc-data-registry contributor add -id foo -n "Contributor name" -c Canada -w IV -u https://example.org -e [email protected] -f foouser -g -75,45

# update an existing instance of foo with a given registry identifier
woudc-data-registry <foo> update -id <identifier> <options>
 e.g.
woudc-data-registry station update -id 023 -n "New station name"
woudc-data-registry deployment update -id 018:MSC:WOUDC --end-date 'Deployment end date'

# delete an instance of foo with a given registry identifier
woudc-data-registry <foo> delete <identifier>
 e.g.
woudc-data-registry deployment delete 018:MSC:WOUDC

# for more information about options on operation (add|update):
woudc-data-registry <foo> <operation> --help
 e.g.
woudc-data-registry instrument update --help

Data Processing

# ingest directory of files (walks directory recursively)
woudc-data-registry data ingest /path/to/dir

# ingest single file
woudc-data-registry data ingest foo.dat

# ingest without prompting for permission
woudc-data-registry data ingest foo.dat -y

# verify directory of files (walks directory recursively)
woudc-data-registry data verify /path/to/dir

# verify single file
woudc-data-registry data verify foo.dat

# verify core metadata only
woudc-data-registry data verify foo.dat -l

# ingest with only core metadata checks
woudc-data-registry data ingest /path/to/dir -l
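
When ingest runs are scripted, it can help to assemble the command line in one place. The following is a minimal Python sketch and is not part of woudc-data-registry: the helper names `build_ingest_command` and `run_ingest` and their parameters are hypothetical, and the sketch only combines options that appear in this README (`-l`, `-y`, and the `-r` option described under Report Generation).

```python
import subprocess
from typing import List, Optional


def build_ingest_command(path: str,
                         report_dir: Optional[str] = None,
                         core_only: bool = False,
                         skip_prompts: bool = False) -> List[str]:
    """Assemble a `woudc-data-registry data ingest` command (hypothetical helper)."""
    command = ['woudc-data-registry', 'data', 'ingest', path]
    if core_only:
        command.append('-l')  # only core metadata checks
    if skip_prompts:
        command.append('-y')  # do not prompt for permission
    if report_dir is not None:
        command.extend(['-r', report_dir])  # directory that receives the reports
    return command


def run_ingest(path: str, **options) -> int:
    """Run the assembled command and return the process exit code."""
    return subprocess.run(build_ingest_command(path, **options)).returncode


example = build_ingest_command('/path/to/dir', report_dir='/path/to/reports',
                               core_only=True, skip_prompts=True)
print(' '.join(example))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces. How to interpret the exit code is left to the caller, since this README does not document it.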

UV Index Generation

# Tear down and generate the entire uv_index_hourly table
woudc-data-registry product uv-index generate /path/to/archive/root


# Only generate uv_index_hourly records within a year range
woudc-data-registry product uv-index update -sy <start-year> -ey <end-year> /path/to/archive/root

Total Ozone Generation

# Tear down and generate the entire totalozone table
woudc-data-registry product totalozone generate /path/to/archive/root

OzoneSonde Generation

# Tear down and generate the entire ozonesonde table
woudc-data-registry product ozonesonde generate /path/to/archive/root

Report Generation

The woudc-data-registry data ingest command accepts a -r/--report option that takes a path to a directory. When that option is provided, an operator report and a run report are written to that directory automatically while the files are being processed.

woudc-data-registry data ingest /path/to/dir -r /path/to/reports/location

The run report is named run_report. It contains a series of blocks, one per contributor in a processing run, in the following format:

<contributor acronym>
<status>: <filepath>
<status>: <filepath>
<status>: <filepath>
...

Where <status> is either Pass or Fail, depending on how the file reported in that line fared in processing.
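
A run report in this format can be read back with a few lines of Python. The sketch below is illustrative only and is not shipped with woudc-data-registry: the function name `parse_run_report` is hypothetical, the sample contributor and file paths are made up, and it assumes that any non-empty line that is not a `Pass:` or `Fail:` line starts a new contributor block.

```python
from typing import Dict, Iterable, List, Tuple

STATUSES = ('Pass', 'Fail')


def parse_run_report(lines: Iterable[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (status, filepath) pairs by contributor acronym."""
    results: Dict[str, List[Tuple[str, str]]] = {}
    contributor = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank separator lines, if present
        status, sep, filepath = line.partition(': ')
        if sep and status in STATUSES:
            if contributor is None:
                raise ValueError('status line before any contributor: ' + line)
            results[contributor].append((status, filepath))
        else:
            contributor = line  # assumption: this line is a contributor acronym
            results.setdefault(contributor, [])
    return results


# Made-up sample in the format described above.
sample = """MSC
Pass: /path/to/dir/a.csv
Fail: /path/to/dir/b.csv
"""
report = parse_run_report(sample.splitlines())
failed = [path for status, path in report['MSC'] if status == 'Fail']
print(failed)  # prints: ['/path/to/dir/b.csv']
```

For a real report, pass an open file object instead of `sample.splitlines()`.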

The operator report is a more in-depth error log in CSV format, with a filename like operator-report-<date>.csv. Operator reports contain one line per error or warning that happened during the processing run. The operator report is meant to be a human-readable log which makes specific errors easy to find and diagnose.
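
Because the operator report is CSV, it can also be summarized programmatically. The following Python sketch counts rows per distinct value of one column. It rests on assumptions not stated in this README: that the file starts with a header row, and that a column such as `Type` exists. The function name `count_by_column` and the sample columns and rows are invented for illustration, so check the header of an actual report before relying on any column name.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_by_column(report: TextIO, column: str) -> Counter:
    """Count operator-report rows per distinct value of `column`."""
    reader = csv.DictReader(report)  # assumption: first row is a header
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f'column {column!r} not found in {reader.fieldnames}')
    return Counter(row[column] for row in reader)


# Made-up sample; the real column names may differ.
sample = io.StringIO(
    'Type,Message,Filename\n'
    'Error,Missing table,a.csv\n'
    'Warning,Empty field,a.csv\n'
    'Error,Bad date,b.csv\n'
)
counts = count_by_column(sample, 'Type')
print(counts['Error'], counts['Warning'])  # prints: 2 1
```

To summarize a real report, open the operator-report-<date>.csv file with `newline=''` and pass the file object in place of `sample`.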

Development

# install dev requirements
pip install -r requirements-dev.txt

Building the Documentation

# build local copy of https://woudc.github.io/woudc-data-registry
cd docs
make html
python -m http.server  # view on http://localhost:8000/

Running Tests

# run tests like this:
cd woudc_data_registry/tests
python test_data_registry.py

# or this:
python setup.py test

# measure code coverage
coverage run --source=woudc_data_registry -m unittest woudc_data_registry.tests.test_data_registry
coverage report -m

Code Conventions

Bugs and Issues

All bugs, enhancements and issues are managed on GitHub.

Contact

Contributors

ahurka, danielwaiforssell, kngai, tomkralidis, victoriarspada
