This project forked from nansencenter/dapper

Data Assimilation with Python: a Package for Experimental Research (DAPPER)

License: MIT License

Python 63.23% Makefile 0.13% Fortran 7.33% Jupyter Notebook 29.31%

dapper's Introduction

DAPPER is a set of templates for benchmarking the performance of data assimilation (DA) methods. The tests provide experimental support and guidance for new developments in DA. Example diagnostics:

[Figure: EnKF - Lorenz'63]

The typical set-up is a twin experiment, where you

  • specify a
    • dynamic model*
    • observational model*
  • use these to generate a synthetic
    • "truth"
    • and observations thereof*
  • assess how different DA methods perform in estimating the truth, given the above starred (*) items.
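The twin-experiment recipe above can be sketched in plain NumPy. This is only an illustration and does not use DAPPER's own classes: the damped-rotation model, the noise levels, and the names `kalman_filter` and `rmse` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Dynamic model (*): a slowly damped 2D rotation, x_{k+1} = M x_k + q_k.
theta = 0.1
M = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
Q = 0.01 * np.eye(2)   # model noise covariance

# Observational model (*): observe the first component only, y_k = H x_k + r_k.
H = np.array([[1.0, 0.0]])
R = 0.1 * np.eye(1)    # observation noise covariance

# Generate the synthetic "truth" (xx) and observations thereof (*) (yy).
K = 500
xx = np.zeros((K + 1, 2))
yy = np.zeros((K + 1, 1))
xx[0] = rng.multivariate_normal(np.zeros(2), np.eye(2))
for k in range(1, K + 1):
    xx[k] = M @ xx[k - 1] + rng.multivariate_normal(np.zeros(2), Q)
    yy[k] = H @ xx[k] + rng.multivariate_normal(np.zeros(1), R)


def kalman_filter(yy, M, Q, H, R, x0, P0):
    """Return the analysis means of a standard (linear) Kalman filter."""
    x, P = x0.copy(), P0.copy()
    means = np.zeros((len(yy), len(x0)))
    means[0] = x
    for k in range(1, len(yy)):
        # Forecast step
        x = M @ x
        P = M @ P @ M.T + Q
        # Analysis step
        S = H @ P @ H.T + R
        gain = P @ H.T @ np.linalg.inv(S)
        x = x + gain @ (yy[k] - H @ x)
        P = (np.eye(len(x)) - gain @ H) @ P
        means[k] = x
    return means


def rmse(estimate, truth):
    """Root-mean-square error over all times and state components."""
    return np.sqrt(np.mean((estimate - truth) ** 2))


# Assess two "methods": the Kalman filter, and a baseline that always guesses 0.
kf_means = kalman_filter(yy, M, Q, H, R, np.zeros(2), np.eye(2))
baseline = np.zeros_like(xx)

rmse_kf = rmse(kf_means[1:], xx[1:])
rmse_clim = rmse(baseline[1:], xx[1:])
print(f"RMSE  Kalman filter: {rmse_kf:.3f}   zero-mean baseline: {rmse_clim:.3f}")
```

The DA methods only ever see the starred items (`M`, `Q`, `H`, `R`, and `yy`); the truth `xx` is used solely for scoring.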

Pros: DAPPER enables the numerical investigation of DA methods through a variety of typical test cases and statistics. It (a) reproduces numerical benchmark results reported in the literature, and (b) facilitates comparative studies, thus promoting the (a) reliability and (b) relevance of the results. DAPPER is (c) open source, written in Python, and (d) focuses on readability; this promotes the (c) reproduction and (d) dissemination of the underlying science, and makes it easy to adapt and extend. In summary, it is well suited for teaching and fundamental DA research.

Cons: In a trade-off with the above advantages, DAPPER makes some sacrifices of efficiency and flexibility (generality). I.e. it is not designed for the assimilation of real data in operational models (e.g. WRF).

Getting started: Read, run, and understand the scripts example_{1,2,3}.py. There is no unified documentation, but the code is reasonably well commented, including docstrings. Alternatively, see the tutorials folder for an intro to DA.

Installation

Tested on Linux/MacOS/Windows

  1. Prerequisite: Python 3.5+ (suggest setting it up with Anaconda).
  2. Download, extract the DAPPER folder, and cd into it.
    $ pip install -r requirements.txt
  3. To test the installation, run:
    $ python example_1.py

Methods

References provided at bottom

Method name                           Literature RMSE results reproduced
EnKF [1]                              Sakov and Oke (2008), Hoteit (2015)
EnKF-N                                Bocquet (2012), (2015)
EnKS, EnRTS                           Raanes (2016a)
iEnKS [2]                             Sakov (2012), Bocquet (2012), (2014)
LETKF, local & serial EAKF            Bocquet (2011)
Sqrt. model noise methods             Raanes (2015)
Particle filter (bootstrap) [3]       Bocquet (2010)
Optimal/implicit Particle filter [3]  "
NETF                                  Tödter (2015), Wiljes (2017)
Rank histogram filter (RHF)           Anderson (2010)
Extended KF                           Raanes (2016b)
Optimal interpolation                 "
Climatology                           "
3D-Var

1: Stochastic, DEnKF (i.e. half-update), ETKF (i.e. sym. sqrt.). Serial forms are also available.
Tuned with inflation and "random, orthogonal rotations".
2: Includes (as particular cases) EnRML, iEnKS, iEnKF.
Also supports MDA forms, the bundle version, and "EnKF-N"-type inflation.
3: Resampling: multinomial (including systematic/universal and residual).
The particle filter is tuned with "effective-N monitoring", "regularization/jittering" strength, and more.
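
As a rough illustration of footnote 3, the sketch below computes the effective ensemble size and applies systematic resampling when it drops too low. The function names, the `N / 2` threshold, and the toy weights are assumptions for this example; they are not taken from da_methods.py.

```python
import numpy as np


def effective_N(w):
    """Effective ensemble size, 1 / sum(w_i^2), for normalized weights w."""
    return 1.0 / np.sum(w ** 2)


def systematic_resample(w, rng):
    """Return the indices of the particles picked by systematic resampling."""
    N = len(w)
    # A single uniform draw, shifted to N evenly spaced points in [0, 1).
    points = (rng.random() + np.arange(N)) / N
    cumsum = np.cumsum(w)
    cumsum[-1] = 1.0  # guard against round-off in the last bin
    return np.searchsorted(cumsum, points)


rng = np.random.default_rng(0)
N = 100
particles = rng.standard_normal((N, 3))  # toy ensemble, one particle per row
w = rng.random(N) ** 4                   # deliberately uneven weights
w /= w.sum()

N_eff = effective_N(w)
if N_eff < N / 2:                        # hypothetical monitoring threshold
    idx = systematic_resample(w, rng)
    particles = particles[idx]
    w = np.full(N, 1 / N)                # weights are uniform after resampling

print(f"N_eff before resampling check: {N_eff:.1f} of {N}")
```

With systematic resampling, particle i is selected either floor(N w_i) or floor(N w_i) + 1 times, which gives it lower sampling noise than plain multinomial resampling.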

To add a new method: Just add it to da_methods.py, using the others in there as templates. Remember: DAPPER is a set of templates (not a framework); do not hesitate to make your own scripts and functions (instead of squeezing everything into standardized configuration files).
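
To give a feel for the core of such a method, here is a stand-alone sketch of a stochastic (perturbed-observations) EnKF analysis step. It is not the interface used in da_methods.py: the function name `enkf_analysis`, its signature, and the row-per-member ensemble layout are assumptions, and inflation and rotations are left out.

```python
import numpy as np


def enkf_analysis(E, y, H, R, rng):
    """Stochastic EnKF update with a linear observation operator.

    E: (N, M) forecast ensemble, one member per row.
    y: (p,) observation.
    H: (p, M) observation operator.
    R: (p, p) observation error covariance.
    """
    N = E.shape[0]
    A = E - E.mean(axis=0)              # state anomalies, (N, M)
    Y = A @ H.T                         # observed anomalies, (N, p)
    C_yy = Y.T @ Y / (N - 1) + R        # innovation covariance, (p, p)
    C_xy = A.T @ Y / (N - 1)            # state-obs cross-covariance, (M, p)
    gain = C_xy @ np.linalg.inv(C_yy)   # Kalman gain, (M, p)
    # Each member assimilates its own perturbed copy of the observation.
    D = rng.multivariate_normal(np.zeros(len(y)), R, size=N)  # (N, p)
    innovations = y + D - E @ H.T       # (N, p)
    return E + innovations @ gain.T


# Toy usage: 3 state variables, only the first one observed.
rng = np.random.default_rng(0)
N, M = 200, 3
E = rng.standard_normal((N, M))
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.01]])
y = np.array([2.0])

Ea = enkf_analysis(E, y, H, R, rng)
print("prior mean:    ", E.mean(axis=0).round(2))
print("posterior mean:", Ea.mean(axis=0).round(2))
```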

Models

Model         Linear?  Phys.dim.  State len  # Lyap≥0  Implementer
Lin. Advect.  Yes      1d         1000 *     51        Evensen/Raanes
Lorenz63      No       0d         3          2         Sakov
Lorenz84      No       0d         3          2         Raanes
Lorenz95      No       1d         40 *       13        Raanes
LorenzUV      No       2x 1d      256 + 8 *  ≈60       Raanes
Quasi-Geost   No       2d         129²≈17k   ≈140      Sakov

*: flexible; set as necessary

To add a new model: Make a new dir: DAPPER/mods/your_model. Add the following files:

  • core.py to define the core functionality and documentation of your dynamical model. Typically this culminates in a step(x, t, dt) function.
    • The model step operator (and the obs operator) must support 2D-array (i.e. ensemble) and 1D-array (single realization) input. See the core.py file in mods/Lorenz63 and mods/Lorenz95 for typical implementations, and mods/QG for how to parallelize the ensemble simulations.
    • Optional: To use the (extended) Kalman filter, you will need to define the model linearization. Note: this only needs to support 1D input (single realization).
  • demo.py to visually showcase a simulation of the model.
  • Files that define a complete Hidden Markov Model ready for a twin experiment (OSSE). For example, this will plug in the step function you made previously as in Dyn['model'] = step. For further details, see examples such as DAPPER/mods/Lorenz63/{sak12,boc12}.py.
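
A minimal sketch of what such a core.py might contain is given below, using Lorenz'63 with a fourth-order Runge-Kutta integrator. The helper name `dxdt` and the choice of integrator are assumptions for illustration, as is the convention that ensemble members are stored as rows; consult mods/Lorenz63/core.py for the actual implementation.

```python
import numpy as np

sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0  # standard Lorenz'63 parameters


def dxdt(x):
    """Lorenz'63 tendency. The state index is the LAST axis of x,
    so this works for a single state (3,) and for an ensemble (N, 3)."""
    X, Y, Z = x[..., 0], x[..., 1], x[..., 2]
    d = np.empty_like(x)
    d[..., 0] = sigma * (Y - X)
    d[..., 1] = X * (rho - Z) - Y
    d[..., 2] = X * Y - beta * Z
    return d


def step(x, t, dt):
    """Advance x from time t to t + dt with 4th-order Runge-Kutta.
    The model is autonomous, so t is accepted but not used."""
    k1 = dt * dxdt(x)
    k2 = dt * dxdt(x + k1 / 2)
    k3 = dt * dxdt(x + k2 / 2)
    k4 = dt * dxdt(x + k3)
    return x + (k1 + 2 * (k2 + k3) + k4) / 6


# The same function handles 1D (single realization) and 2D (ensemble) input.
x0 = np.array([1.0, 1.0, 1.0])
E0 = x0 + 0.1 * np.random.default_rng(0).standard_normal((5, 3))

x1 = step(x0, 0.0, 0.01)   # shape (3,)
E1 = step(E0, 0.0, 0.01)   # shape (5, 3)
```

Indexing with `x[..., i]` rather than `x[i]` is what lets one function serve both input shapes without branching.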

Other reproductions

As mentioned above, DAPPER reproduces literature results. There are also plenty of results in the literature that DAPPER does not reproduce. Typically, this means that the published results are incorrect.

A list of experimental settings that can be compared with literature papers can be obtained using GNU find:

    $ find . -iname "[a-z]*[0-9].py" | grep mods

Some of these files contain settings that have been used in several papers.

Additional features

  • Progressbar
  • Tools to manage and display experimental settings and stats
  • Visualizations, including
    • liveplotting (during assimilation)
    • intelligent defaults (axis limits, ...)
  • Diagnostics and statistics with
    • Confidence interval on time series (e.g. RMSE) averages with
      • automatic correction for autocorrelation
      • significant digits printing
  • Parallelisation:
    • (Independent) experiments can run in parallel; see example_3.py
    • Forecast parallelisation is possible since the (user-implemented) model has access to the full ensemble; see example in mods/QG/core.py.
    • Analysis parallelisation over local domains; see example in da_methods.py:LETKF()
    • Also, numpy does a lot of parallelization when it can. However, as it often has significant overhead, this has been turned off (see tools/utils.py) in favour of the above forms of parallelization.
  • Gentle failure system to allow execution to continue if an experiment fails.
  • Classes that simplify treating:
    • time sequences (Chronology/Ticker), with consistency checks
    • random variables (RandVar): Gaussian, Student-t, Laplace, Uniform, ..., as well as support for custom sampling functions.
    • covariance matrices (CovMat): provides input flexibility/overloading and lazy evaluation, which facilitate the use of non-diagonal covariance matrices (whether sparse or full).
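
To illustrate why the confidence intervals listed above need a correction for autocorrelation, here is one simple way to do it. It assumes the series behaves like an AR(1) process and uses only the lag-1 correlation; the function name `mean_with_uncertainty` is hypothetical, and DAPPER's own estimator may well differ.

```python
import numpy as np


def mean_with_uncertainty(xx):
    """Return (mean, standard error) of a 1D time series,
    inflating the standard error to account for lag-1 autocorrelation."""
    xx = np.asarray(xx, dtype=float)
    N = len(xx)
    mu = xx.mean()
    var = xx.var(ddof=1)
    # Lag-1 autocorrelation estimate.
    dev = xx - mu
    a = np.sum(dev[:-1] * dev[1:]) / np.sum(dev ** 2)
    a = np.clip(a, 0.0, 0.99)  # ignore negative correlation; avoid blow-up
    # Under the AR(1) assumption, N correlated samples are worth roughly
    # N * (1 - a) / (1 + a) independent ones.
    N_eff = N * (1 - a) / (1 + a)
    return mu, np.sqrt(var / N_eff)


# Synthetic AR(1) series standing in for, e.g., an RMSE time series.
rng = np.random.default_rng(1)
K, phi = 5000, 0.9
series = np.empty(K)
series[0] = 0.0
for k in range(1, K):
    series[k] = phi * series[k - 1] + rng.standard_normal()
series += 1.0  # shift so that the true mean is 1

mu, err = mean_with_uncertainty(series)
naive = series.std(ddof=1) / np.sqrt(K)
print(f"mean = {mu:.2f} +/- {err:.2f}   (naive standard error: {naive:.3f})")
```

The naive standard error treats all samples as independent and is therefore too optimistic for strongly correlated series such as this one.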

Alternative projects

DAPPER is aimed at research and teaching (see discussion on top). Examples of limitations:

  • It is not suited for very big models (>60k unknowns).
  • Time-dependent error covariances and changes in the lengths of the state/obs vectors are not supported (although the models f and Obs may otherwise be time-dependent).
  • Non-uniform time sequences are not fully supported.

Also, DAPPER comes with no guarantees/support. Therefore, if you have an operational (real-world) application, you should look into one of the alternatives, sorted by approximate project size.

Name          Developers              Purpose (approximately)
DART          NCAR                    Operational, general
ERT*          Statoil                 Operational, history matching (Petroleum)
JEDI          JCSDA (NOAA, NASA, ++)  Operational, general (in development?)
OpenDA        TU Delft                Operational
EMPIRE        Reading (Met)           Operational
SANGOMA       Conglomerate**          Unify DA research
Verdandi      INRIA                   Biophysical DA
PDAF          Nerger                  Operational and research
PyOSSE        Edinburgh, Reading      Earth-observation DA
MIKE          DHI                     Oceanographic. Commercial?
OAK           Liège                   Oceanographic
Siroco        OMP                     Oceanographic
FilterPy      R. Labbe                Engineering, general intro to Kalman filter
DASoftware    Yue Li, Stanford        Matlab, large-scale
Pomp          U of Michigan           R, general state-estimation
PyIT          CIPR                    Real-world petroleum DA (?)
Datum*        Raanes                  Matlab, personal publications
EnKF-Matlab*  Sakov                   Matlab, personal publications and intro
EnKF-C        Sakov                   C, light-weight EnKF, off-line
IEnKS code*   Bocquet                 Python, personal publications
pyda          Hickman                 Python, personal publications

*: Has been inspirational in the development of DAPPER.

**: Liege/CNRS/NERSC/Reading/Delft

References

  • Sakov (2008) : Sakov and Oke. "A deterministic formulation of the ensemble Kalman filter: an alternative to ensemble square root filters".
  • Anderson (2010): "A Non-Gaussian Ensemble Filter Update for Data Assimilation"
  • Bocquet (2010) : Bocquet, Pires, and Wu. "Beyond Gaussian statistical modeling in geophysical data assimilation".
  • Bocquet (2011) : Bocquet. "Ensemble Kalman filtering without the intrinsic need for inflation".
  • Sakov (2012) : Sakov, Oliver, and Bertino. "An iterative EnKF for strongly nonlinear systems".
  • Bocquet (2012) : Bocquet and Sakov. "Combining inflation-free and iterative ensemble Kalman filters for strongly nonlinear systems".
  • Bocquet (2014) : Bocquet and Sakov. "An iterative ensemble Kalman smoother".
  • Bocquet (2015) : Bocquet, Raanes, and Hannart. "Expanding the validity of the ensemble Kalman filter without the intrinsic need for inflation".
  • Tödter (2015) : Tödter and Ahrens. "A second-order exact ensemble square root filter for nonlinear data assimilation".
  • Raanes (2015) : Raanes, Carrassi, and Bertino. "Extending the square root method to account for model noise in the ensemble Kalman filter".
  • Hoteit (2015) : "Mitigating Observation Perturbation Sampling Errors in the Stochastic EnKF"
  • Raanes (2016a) : Raanes. "On the ensemble Rauch-Tung-Striebel smoother and its equivalence to the ensemble Kalman smoother".
  • Raanes (2016b) : Raanes. "Improvements to Ensemble Methods for Data Assimilation in the Geosciences".
  • Wiljes (2017) : Acevedo, de Wiljes, and Reich. "Second-order accurate ensemble transform particle filters".

Further references are given in the code.

Contributors

Patrick N. Raanes, Colin Grudzien, Maxime Tondeur, Remy Dubois

If you use this software in a publication, please cite as follows.

@misc{raanes2018dapper,
  author = {Patrick N. Raanes and others},
  title  = {nansencenter/DAPPER: Version 0.8},
  month  = December,
  year   = 2018,
  doi    = {10.5281/zenodo.2029296},
  url    = {https://doi.org/10.5281/zenodo.2029296}
}

Powered by

Python Numpy Pandas Jupyter
