ecmwf / ecpoint-calibrate

Interactive GUI (developed in Python) for calibration and conditional verification of numerical weather prediction model outputs.

License: GNU General Public License v3.0

Python 38.90% Shell 0.24% GLSL 0.23% CSS 0.91% JavaScript 50.72% HTML 0.15% MATLAB 8.84%

python meteorology weather-forecast ecmwf calibration metview decision-trees

ecpoint-calibrate's Introduction

ecPoint-Calibrate

ecPoint-Calibrate is software that uses conditional verification tools to compare numerical weather prediction (NWP) model outputs against point observations and, in this way, anticipate sub-grid variability and identify biases at grid scale. It provides a dynamic and user-friendly environment to post-process NWP model parameters (such as precipitation, wind, and temperature) and produce probabilistic products for geographical locations (anywhere in the world, and up to medium-range forecasts).

The development of this project was sponsored by the "ECMWF Summer of Weather Code (ESoWC)" project (@esowc_ecmwf).

Build with Docker

docker build -f Dockerfile.core -t ecmwf/ecpoint-calibrate-core:dev .

Deploy new versions of the Docker containers

./deploy.sh

Create a production AppImage

yarn dist

The AppImage won't work on modern machines unless you manually add the --no-sandbox Electron option and repackage it.

Install `appimagetool`

sudo wget https://github.com/AppImage/AppImageKit/releases/download/continuous/appimagetool-x86_64.AppImage -O /usr/local/bin/appimagetool
sudo chmod +x /usr/local/bin/appimagetool

Repackage the AppImage

cd pkg
./ecPoint-Calibrate-0.30.0.AppImage --appimage-extract

This extracts the image into the squashfs-root directory. Open squashfs-root/AppRun and add the --no-sandbox argument to the exec lines, e.g. exec "$BIN" --no-sandbox

Then repackage:

appimagetool squashfs-root ecPoint-Calibrate-0.30.0.AppImage

Python Backend

We need metview-batch from conda-forge, so we unfortunately have to use conda together with poetry.

Creating the environment

conda create --name ecpoint_calibrate_env --file conda-linux-64.lock
conda activate ecpoint_calibrate_env
poetry install

Activating the environment

conda activate ecpoint_calibrate_env

Updating the environment

Poetry (strongly preferred)

Adding a new package with poetry will update the poetry lockfile.

poetry add $DEP

Conda

You should very rarely need to add a new conda dependency.

conda-lock -k explicit --conda mamba
mamba update --file conda-linux-64.lock
poetry update

Run tests

First activate the conda env, then run pytest.

Electron Frontend

You'll need Node.js v14.5.0.

Installing deps

yarn

Run the app

yarn start

Run tests

npm run test

ecpoint-calibrate's Issues

Integrate computations step with the backend

Currently the backend runs all the computations known to us, since they have been hard-coded. The backend should run only those computations that have been specified by the user.

Rename Accumulated Solar Radiation to 24H Solar Radiation

Use ReactJS Datepicker for specifying start and end date ranges

Ref: https://reactdatepicker.com

Check sections Date Range, and Month dropdown.

Integrate React Date Picker

Stream stdout logs from Python subprocess to the Electron client

Use ZeroRPC streams.
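
One way to approach the Python side of this is a generator that yields each log line as soon as the child process writes it. The sketch below uses only the standard library and does not show the ZeroRPC wiring; the function name `stream_stdout` and the child command are hypothetical stand-ins, not names from the project.

```python
import subprocess
import sys
from typing import Iterator, List


def stream_stdout(cmd: List[str]) -> Iterator[str]:
    """Yield each stdout line of `cmd` as soon as the child process emits it."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave errors with normal log lines
        text=True,
        bufsize=1,  # line-buffered on the reading side
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield line.rstrip("\n")
    # Leaving the `with` block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


if __name__ == "__main__":
    # Hypothetical child process standing in for the computation job.
    child = [sys.executable, "-u", "-c", "print('step 1'); print('step 2')"]
    for log_line in stream_stdout(child):
        print(log_line)
```

A generator like this is the kind of object a streaming RPC method could hand back to the Electron client one line at a time, instead of buffering the whole log until the job ends.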

Remove the (ill-conceived) concept of computation reference

The indicated reference is actually the predictand.

Force end date to be always greater than start date

After selecting the start-date, all dates before it should be disabled in the end-date calendar.
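
The rule can be expressed as a single predicate that the calendar applies to every candidate end date. This is a minimal sketch in Python rather than the React code the UI would need, and `is_selectable_end_date` is a hypothetical name. It assumes the strict reading of the title (the end date must be later than the start date); if same-day ranges should be allowed, `>` becomes `>=`.

```python
from datetime import date


def is_selectable_end_date(start: date, candidate: date) -> bool:
    """True only when `candidate` is strictly later than the chosen start date."""
    return candidate > start


if __name__ == "__main__":
    start = date(2018, 9, 25)
    # Dates on or before the start date would be disabled in the end-date calendar.
    print(is_selectable_end_date(start, date(2018, 9, 24)))
    print(is_selectable_end_date(start, date(2018, 9, 26)))
```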

Refactor GeopointsLoader to use Pandas DataFrames instead of pure lists

Find a way to automatically install eccodes Python bindings

This will probably involve some tinkering with setuptools.

Add validation check to make sure the predictors list contains the predictand

Checkmarks on card header in Computation components don't work

Show the computed ASCII table once the processing is done

Show it in a new modal
Optionally allow the user to download the ASCII file

Convert 2D numpy arrays in DecisionTree to Pandas DataFrames

This is done so as to match column names in the computed DecisionTree matrix with the input predictors.

Create snap for packaging the GUI app

Ref: http://snapcraft.io

Dynamically probe the FS and disable selection of dates missing in the DB

Ref: #10

Do not spawn the Flask process via the Electron runtime

Use Docker Compose instead.

Do not hardcode ASCII table headers

Generate ASCII tables using Pandas.
The headers must respect the dynamic computations framework implemented in #13.

Migrate to Pipenv

Hopefully, get rid of the setup.py that exists for no good reason. Also remove the (.*-)?requirements.txt files.

Allow user to specify the forecast error to compute

Implement component to check Forecast Error (FE) and Forecast Error Ratio (FER) computations.
Remove outdated MDL code, and use SemanticUI React.
Add the values to the computations Redux state.
Integrate with backend

Reorganize product management material

We have stuff scattered around in pm and Meeting directories. Clean this up, and communicate this to other collaborators.

Write an attrs class for computation fields and errors and link it to Parameters

See https://github.com/onyb/ecPoint-PyCal/blob/master/core/processor/models.py

Write unittests for GribLoader and GeopointLoader

Visual makeover of ecPoint-PyCal

We want to make the splash screen of the software much more attractive.

Update ECMWF logo
Add a background animation to the splash screen
- See https://codepen.io/ste-vg/pen/Gqakbo
Think about an ecPoint-PyCal logo
???

Add Docker support

Why?

The current way of installing ecPoint-PyCal is via a snap. While this is very convenient, both the installation and the execution of the software require the user to be in the sudoers list.

Docker is a good solution to tackle this problem, provided we figure out a way to forward the display server to the host.

Note: The most complex step would probably be installation of metview inside the container.

Create separate models for predictand, predictors, computations, and parameters

Currently we use only Parameters model for all information coming from the app.

Replace ecCodes with Metview-Python framework

This issue involves rewriting all the loaders (for GRIB, NetCDF, Geopoints) used in the project, with the Python bindings for Metview.

Metview will probably be using the C / F90 version of ecCodes package internally, but this fact should be hidden from the user.

The most important aspect of this issue is to use an efficient replacement for the codes_grib_find_nearest function from ecCodes, that supports use of fieldsets. Write a unit test for the same.

Allow user to exclude computed fields from the list of predictors

For example, to add Convective Precipitation Ratio (CPR), one must also add an accumulated field for Convective Precipitation. However, the latter is normally not useful while running predictions with the decision tree model.

Restructure folder contents to prevent snapcraft from copying in .git folders

Currently, Snapcraft is unnecessarily copying media assets (slideshows, notes, PDFs) and the .git folder, which makes the snapped artifact HUGE.

The objective is to restructure the files in a way that allows us to easily include them in the snapcraft.yaml file (think about a src directory..).

Bump Pandas version to 0.24.0

Why?

The newer version of Pandas is faster and provides pre-compiled wheel packages that speed up the installation process significantly.

What's stopping us?

Newer versions of Pandas have a bug in the justification of the string repr, on which we rely for (reading from) / (writing to) ascii files.

Update 25/09/2018

The bug was recently fixed in https://github.com/pandas-dev/pandas/pull/22505/files which is a candidate for 0.24.0. We can't fix the issue until a new release is cut.

Allow reordering computed fields in threshold splits table

Ref: http://clauderic.github.io/react-sortable-hoc

Nice to have: Allow the reordering on the split sheet header directly.

Refactor Predictands and Predictors logic

s/Predictant/Predictand/g
Indicate that predictand should always be singular.
Split database component into Predictand and Predictors components.
Add a radio field for specifying if predictand is an accumulated or instantaneous variable.
Move Predictand Errors from Step 2 to Step 1, in the Predictand component.
Move accumulation field from Parameters component to Predictand component.

Display the number of predictors selected for post-processing in the computations view

To display it beside the text "8 computations" like:

8 computations in total; 5 selected for post-processing.

Disable navigation while computation is in progress

Encode the DB structure in a config file

Structure of the database

Forecasts: /vol/ecpoint/ecPoint_DB/Forecasts/[origin]/[predictand & predictors]/[date-time]/
Observations: /vol/ecpoint/ecPoint_DB/Observations/[predictand]/[type]/[date]/

Name of the files

Forecasts: [variable]_[date]_[time]_[step]
Observations: [variable]_[type]_[date]_[end-of-period]

The structure of the database is IMPORTANT because it will be interrogated by the calibration software.
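
As a rough illustration, the layout above could be captured as format templates that the calibration code fills in. The template strings mirror the directory and file patterns listed in this issue; the constant and function names, the field names, and the example values are all hypothetical, and the real format of each placeholder (for instance how [date-time] is written) is not specified here.

```python
from pathlib import PurePosixPath

# Root taken from the issue text; everything below it is built from templates.
DB_ROOT = PurePosixPath("/vol/ecpoint/ecPoint_DB")

# Hypothetical template names; placeholders follow the bracketed fields above.
FORECAST_DIR = "Forecasts/{origin}/{variable}/{date_time}"
FORECAST_FILE = "{variable}_{date}_{time}_{step}"
OBSERVATION_DIR = "Observations/{predictand}/{obs_type}/{date}"
OBSERVATION_FILE = "{variable}_{obs_type}_{date}_{end_of_period}"


def forecast_path(**fields: str) -> PurePosixPath:
    """Build the full path of one forecast file from its field values."""
    return DB_ROOT / FORECAST_DIR.format(**fields) / FORECAST_FILE.format(**fields)


def observation_path(**fields: str) -> PurePosixPath:
    """Build the full path of one observation file from its field values."""
    return DB_ROOT / OBSERVATION_DIR.format(**fields) / OBSERVATION_FILE.format(**fields)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from a real database.
    print(forecast_path(origin="model_a", variable="tp", date_time="2018010100",
                        date="20180101", time="00", step="12"))
    print(observation_path(predictand="tp", variable="tp", obs_type="typeA",
                           date="20180101", end_of_period="12"))
```

Keeping the patterns in one place means a change to the database layout only touches the templates, not every loader that reads from it.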

Implement deserializer for geopoint files

Spec: https://software.ecmwf.int/wiki/display/METV/Geopoints

Allow user to input meta data for computations

In the Computations tab, we would like the user to provide information about the different computed variables, that'll be useful in post-processing steps. See the Predictors.csv file for more information about the required meta data.

Identify the selected predictand code

Take UI inspiration from predictor codes, but it's not necessary to probe the filesystem.

Blocks #31.

Display available predictors and predictands in the Database component

See here.

Add validation checks for moving to the next Step

Do not let the user add computations (Step 2) without validating the input parameters (Step 1)
Do not let the user launch computations (Step 3) without validating the computations (Step 2)

Allow downloading a snapshot of the decision tree state

Allow a choice between FE and FER, but not both

It doesn't make sense for user to compute both the Forecast Error and Forecast Error Ratio at the same time.

On the UI, it should be enough to simply convert the checkbox to a radio. However, we may need to perform some refactoring in the computations framework.
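
On the backend, the either-or choice could be modelled as an enumeration so that a single value selects the computation. The sketch below is only an illustration: the formulas are assumptions that this issue does not state (FE taken as observation minus forecast, FER as that difference divided by the forecast), and the names `ErrorType`, `predictand_error`, and `min_forecast` are hypothetical.

```python
from enum import Enum
from typing import Optional


class ErrorType(Enum):
    """Exactly one of these is selected, mirroring a radio button in the UI."""
    FE = "Forecast Error"
    FER = "Forecast Error Ratio"


def predictand_error(
    kind: ErrorType,
    forecast: float,
    observation: float,
    min_forecast: float = 1.0,
) -> Optional[float]:
    """Compute the selected error for one forecast/observation pair.

    Assumed definitions (not confirmed by the issue text):
      FE  = observation - forecast
      FER = (observation - forecast) / forecast
    For FER, forecasts below `min_forecast` are skipped (None) so that the
    division never sees a zero or near-zero denominator.
    """
    if kind is ErrorType.FE:
        return observation - forecast
    if forecast < min_forecast:
        return None
    return (observation - forecast) / forecast


if __name__ == "__main__":
    print(predictand_error(ErrorType.FE, forecast=2.0, observation=3.0))
    print(predictand_error(ErrorType.FER, forecast=2.0, observation=3.0))
```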

Allow users to define units in the computations UI

Units will mainly serve two purposes:

Documentation
Assert the fact that some scaling factor was consciously applied to the computations. The user is responsible for making sure that the units correspond well to the scaled values.

The units should be displayed for each computation in the metadata section of the ASCII table.

Allow user to define threshold splits for the (naive) decision tree

Currently, we rely on a WeatherTypes.csv file to obtain the threshold splits that are sent as input to the decision tree. The objective is to give the user a spreadsheet-like interface with prepopulated predictor names.

Resources:

https://nadbm.github.io/react-datasheet

Internal note: Corresponds to the Module 2 script in MATLAB.

Improve subprocess management

The Python process, which does the heavy lifting (sometimes spawning parallel processes), must be the main process. The Electron process must therefore be either:

spawned as a subprocess of Python
managed independently by an external shell script

Add new operation type for computations to calculate ratio wrt reference field

For example, Convective Precipitation Ratio (CPR), where CPR = Accumulated Convective Precipitation / Accumulated Total Precipitation.

The field operation should be called Ratio Field and must take as input a field from another computation.
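
A minimal sketch of such an operation, working element-wise on two already-computed fields held as plain Python sequences. The function name `ratio_field` is hypothetical, and returning `None` where the reference field is zero is an assumption; the issue does not say how those points should be handled.

```python
from typing import List, Optional, Sequence


def ratio_field(
    numerator: Sequence[float],
    reference: Sequence[float],
) -> List[Optional[float]]:
    """Element-wise ratio of one computed field to a reference field.

    Points where the reference is zero yield None (an assumed convention).
    """
    if len(numerator) != len(reference):
        raise ValueError("fields must have the same number of points")
    return [n / r if r != 0 else None for n, r in zip(numerator, reference)]


if __name__ == "__main__":
    # Illustrative values: accumulated convective and total precipitation.
    convective = [1.0, 0.0, 2.0]
    total = [4.0, 0.0, 2.0]
    print(ratio_field(convective, total))  # CPR at each point
```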

List of presentational improvements until the Processing step

Summary of things to fix up to the Processing step of ecPoint-PyCal, as detailed by @FatimaPillosu. I'm using this issue to share my progress on each of them.

General feedback

Create a new section called Model Data that contains two sections: Predictand and Predictors. Move the whole section "Select the data type to load" to the very top, and put it in the joint section "Model Data". This is because uploading GRIB applies to both the predictand and the predictors, not only to the predictors. See mockup below:

Input Parameters

Select directory of the predictand (rainfall, temperature, etc.) ➜ Select the directory that contains the predictand.
Code ➜ Predictand Code (given by the name of the directory that contains the predictand).
Predictand error to compute ➜ Type of error that will be computed
Enter accumulation (in hours) of the parameter to post-process ➜ Enter an accumulation period, in hours, for the predictand (e.g. insert the value "24" for a 24-hourly accumulation period)
- Allow accumulation periods other than 6, 12, and 24 as well.
Select a minimum value to consider, to prevent division by zero. Chosen value must be consistent with units of computed values (defined later) ➜ Enter a minimum value for the predictand to prevent dividing by zero (e.g. insert the value "1" for 1 mm/12h). The chosen value must be consistent with the units in which the predictand is represented (1 mm/12h or 0.001 m/12h).
- This sentence should be written before the blank space where the user inserts the value.

Observations

Select observations ➜ Observational data
Move this section to the very top, so it would be clear from the beginning which parameter is going to be post-processed.
Select directory containing the observations ➜ Select the directory that contains the observations
- Ani's remark: Not sure if this makes things any better. Shorter sentences are usually more likely to be read, and software often goes for syntactically incomplete sentences to maximize the meaning communicated per word.

Predictors

Select directory containing the predictors ➜ Select the directory that contains the model data for the calculation of the predictors
Code ➜ Predictors short names (given by the name of the directory that contains the predictors)
Select the data type to load ➜ Select the model data type

Parameters

Nuke this whole section and move it to the new Model Data.
Move the date picker to Model Data and call it Calibration Period, since it has nothing to do with observations.
Modified field for Spin-up (see mockup)
Modified field for Leadtime (see mockup)

Computations

Rename this section to Predictors Selection or Predictors Computation.

Processing

Rename this section to Calculations and ASCII Table Generation.
- Ani's remark: Too long
Add more comments to let the user know why we are not using that particular forecast (e.g. t+0 is within the spin-up window). Example below:
Do not hard-code list of predictors in the processing logs.
Provide a way to retrieve the ASCII file generated inside the Docker container.

Upgrade Electron runtime to v5

Use the latest and the greatest Electron runtime along with Chromium 73+, Node.js 12, and newer V8. This could be a breaking change.

Implement a data loader for NetCDF files

Investigate the maturity of MetviewPython's handling of NetCDF files.
Write a loader for .nc files using MetviewPython.
Implement unit tests.
Clean up the legacy NetCDF parser that used dataframes.
Update the frontend to choose NetCDF files as input.

Handle errors in Database component

Reproducing the bug: selecting a non-standard database path for predictors crashes the UI.

TODO: The Database component must make sure that the predictors path indeed contains GRIB/NetCDF data.

Implement high performance methods to compute nearest gridpoints

Calculate the distance between two points on a globe using the Haversine formula.
Provide a non-optimized naive implementation using generic interpolation.
Provide an implementation optimized for GRIB data.
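
For reference, the Haversine distance and a brute-force nearest-gridpoint search can be sketched as follows. This is a plain linear scan over candidate points, not the GRIB-optimized method the issue asks for; the function names and the mean Earth radius of 6371 km are choices made for this sketch.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_gridpoint(
    lat: float,
    lon: float,
    grid: Iterable[Tuple[float, float]],
) -> Tuple[float, float]:
    """Return the (lat, lon) grid point closest to the given location."""
    return min(grid, key=lambda p: haversine_km(lat, lon, p[0], p[1]))


if __name__ == "__main__":
    # Illustrative grid points only.
    grid = [(51.5, -1.0), (40.0, 0.0)]
    print(nearest_gridpoint(51.4, -0.9, grid))
```

The scan costs one distance evaluation per grid point for every observation, which is why an implementation that exploits the regular structure of the grid is worth having for large fields.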
