
Python framework for short-term ensemble prediction systems.

License: BSD 3-Clause "New" or "Revised" License

Python 97.22% Cython 2.78%

nowcasting optical-flow advection forecast-verification stochastic-processes weather-radar precipitation hydrology rainfall rainfall-prediction

pysteps's Introduction

pysteps - Python framework for short-term ensemble prediction systems

What is pysteps?

Pysteps is an open-source and community-driven Python library for probabilistic precipitation nowcasting, i.e. short-term ensemble prediction systems.

The aim of pysteps is to serve two different needs. The first is to provide a modular and well-documented framework for researchers interested in developing new methods for nowcasting and stochastic space-time simulation of precipitation. The second aim is to offer a highly configurable and easily accessible platform for practitioners ranging from weather forecasters to hydrologists.

The pysteps library supports standard input/output file formats and implements several optical flow methods as well as advanced stochastic generators to produce ensemble nowcasts. In addition, it includes tools for visualizing and post-processing the nowcasts and methods for deterministic, probabilistic, and neighbourhood forecast verification.

Quick start

Use pysteps to compute and plot a radar extrapolation nowcast in Google Colab with this interactive notebook.

Installation

The recommended way to install pysteps is with conda from the conda-forge channel:

$ conda install -c conda-forge pysteps

More details can be found in the installation guide.

Usage

Have a look at the gallery of examples to get a good overview of what pysteps can do.

For a more detailed description of all the available methods, check the API reference page.

Example data

A set of example radar data is available in a separate repository: pysteps-data. More information on how to download and install them is available here.

Contributions

We welcome contributions!

For feedback, suggestions for developments, and bug reports please use the dedicated issues page.

For more information, please read our contributors guidelines.

Reference publications

The overall library is described in

Pulkkinen, S., D. Nerini, A. Perez Hortal, C. Velasco-Forero, U. Germann, A. Seed, and L. Foresti, 2019: Pysteps: an open-source Python library for probabilistic precipitation nowcasting (v1.0). Geosci. Model Dev., 12 (10), 4185–4219, doi: 10.5194/gmd-12-4185-2019 (https://doi.org/10.5194/gmd-12-4185-2019).

While the more recent blending module is described in

Imhoff, R.O., L. De Cruz, W. Dewettinck, C.C. Brauer, R. Uijlenhoet, K-J. van Heeringen, C. Velasco-Forero, D. Nerini, M. Van Ginderachter, and A.H. Weerts, 2023: Scale-dependent blending of ensemble rainfall nowcasts and NWP in the open-source pysteps library. Q J R Meteorol Soc., 1-30, doi: 10.1002/qj.4461.

Contributors

pysteps's People

aperezhortal luigijr aereinha cvelascof savelovme herrmannv rubenimhoff savelov tjniemi jzanetti rprudden fangyh09 kmuehlbauer alhridoy mattiabalestra juanpablosimarro mostamndi cycle13 gain9999 pkars mazhao86 iluckyyang aoe-khkhan huangynj kkyong77 ecasellas patrik-benacek fox91 jayapudashine jselzler alexanderhucheerful tang662019 afansgh hardupnow fagan2888 jctw2008 xrosliang tanpinsiang pandasambit15 maz2198 spc2019 diversoft jaykimbravekjh jaybravekjhkim jleinonen arpa-simc bradyrx jiaobf andrewcbennett cchwala jpolz heygrance china1885 wolfidan meteoswiss-mdr xingge0130 agile-lee sxjscience kwonil-kim xianwuxue-noaa yumeone babetoduarte tkokkone mjalava lindgrv zhang-shibao ladc kisshua victorwangshuang aitaten wdewettin ggraeler helvecioneto nebuchadnezzarr gerritholl mincrt esmailghaemi xwtang mpvginde leabeusch fourmia ocean2045 martinstam l4fl4m3 viniciusgcjr zzqzack lorypack ritvje endinlee isaaccad nathalierombeek tsmsalper iamasam eastwind2000 estebanmontandon ai-app aizhan87 timschmi95 kyuhee-shin lauesbri

pysteps's Issues

Add tests to pysteps modules

Currently, pysteps is mostly tested by running the examples. It would be a good idea to add scripts that test individual functions. A good and simple testing framework is pytest.

By doing so, we can test the library after any change that we want to commit.

One of the main advantages of implementing these tests is that we can set up a continuous integration (CI) service, such as Travis CI, to build and test projects hosted on GitHub. This is done, for example, in Py-ART.
Many CI servers integrate with GitHub, so after each commit the tests can be run under different environments and the results are visible in the commits tab on GitHub (see the green checks in Py-ART for an example). Pull requests can also be tested automatically before merging.

I created a branch with a script to test the interfaces as an example: https://github.com/aperezhortal/pysteps/tree/tests/test

To run these tests, execute pytest in the test folder (the pytest package is needed).

The output looks like this:

============================= test session starts ==============================
platform linux -- Python 3.6.7, pytest-4.0.1, py-1.7.0, pluggy-0.8.0

test_interfaces.py . [100%]

=========================== 1 passed in 1.25 seconds ===========================
Process finished with exit code 0

Basemap EOL

Basemap will be replaced by Cartopy and "All new software development should try to use Cartopy whenever possible". Pysteps currently uses Basemap. If there are no major barriers, pysteps should probably aim to switch to Cartopy to avoid using deprecated modules.

While not all Basemap features are yet implemented in Cartopy, I've been using it for over a year without major issues, including for visualizing radar products.

A possible bug in D -= V_inc if D_prev is None

If D_prev is None and t == 0,
D -= V_inc (line 151) is not right; it should be D -= V_inc / n_iter.

pysteps/pysteps/extrapolation/semilagrangian.py

Lines 135 to 155 in dde2b13

    
    for t in range(num_timesteps):
        if n_iter > 0:
            for k in range(n_iter):
                XYW = xy_coords + D - V_inc / 2.0
                XYW = [XYW[1, :, :], XYW[0, :, :]]
                interpolate_motion(XYW, V_inc)
                D -= V_inc
                interpolate_motion(xy_coords + D, V_inc)
        else:
            if t > 0 or D_prev is not None:
                XYW = xy_coords + D
                XYW = [XYW[1, :, :], XYW[0, :, :]]
                interpolate_motion(XYW, V_inc)
            D -= V_inc
        XYW = xy_coords + D
        XYW = [XYW[1, :, :], XYW[0, :, :]]

ModuleNotFoundError: No module named 'pysteps.motion._vet'

Hi! Many thanks for making STEPS available.

I've successfully created the Conda environment and run python setup.py install. But when I tried to import pysteps, I encountered the following problem:

(pysteps) wcwoo@wcwoo-VirtualBox:~/pysteps$ python
Python 3.6.7 | packaged by conda-forge | (default, Nov 21 2018, 03:09:43) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pysteps
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wcwoo/pysteps/pysteps/__init__.py", line 15, in <module>
    from . import motion
  File "/home/wcwoo/pysteps/pysteps/motion/__init__.py", line 3, in <module>
    from .interface import get_method
  File "/home/wcwoo/pysteps/pysteps/motion/interface.py", line 5, in <module>
    from pysteps.motion.vet import vet
  File "/home/wcwoo/pysteps/pysteps/motion/vet.py", line 30, in <module>
    from pysteps.motion._vet import _warp, _cost_function
ModuleNotFoundError: No module named 'pysteps.motion._vet'
>>>

I'm using Ubuntu LTS 18.04. Anything I've missed?

Keep option for probability matching when not using mask

For my experiments I would need to use probability matching also when the mask_method is None.
The probability matching should be unconditional so that it sets the right amount of zeros.
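
One reading of "unconditional" here is that the matching is applied over all grid points rather than only inside a precipitation mask. A minimal rank-based sketch under that assumption follows. It works on flat lists, and the function name is made up for this example; it is not the pysteps implementation.

```python
def probability_match(values, target):
    """Give ``values`` the empirical distribution of ``target``.

    Both inputs are flat sequences of equal length. The i-th smallest
    element of ``values`` is replaced by the i-th smallest element of
    ``target``, so the output contains exactly as many zeros as ``target``.
    """
    if len(values) != len(target):
        raise ValueError("inputs must have the same length")
    # Indices of ``values`` ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    target_sorted = sorted(target)
    matched = [0.0] * len(values)
    for rank, idx in enumerate(order):
        matched[idx] = target_sorted[rank]
    return matched


forecast = [0.3, 1.2, 0.1, 2.5, 0.7, 0.2]   # smoothed field, no exact zeros
observed = [0.0, 0.0, 0.0, 4.0, 1.0, 0.0]   # reference field with four zeros
matched = probability_match(forecast, observed)
```

Because every rank is mapped, the four smallest forecast values become zeros, which is how the matching sets the right amount of zeros without needing a mask.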

Interface module for FFT methods

The user cannot currently choose the FFT method. The available methods are searched in a fixed order, and the same sequence of try-except clauses is repeated in several modules (cascade.decomposition, noise.fftgenerators, noise.utils and motion.darts). A more sensible approach, which would also let the user choose the FFT method, would be to implement an interface module for the different methods.

Example scripts not working

The following scripts are not working correctly after the changes in the extrapolation interfaces:

my_first_nowcast_with_pysteps.py
run_deterministic_nowcast.py
run_ensemble_nowcast.py

They need to be updated to be compatible with the latest changes in pysteps.

Segfaults when using pyfftw

Parallel computation of STEPS ensembles occasionally crashes when using pyfftw. The following errors appear in the system log:

[1708862.459601] python[32200]: segfault at 0 ip 00007fa7e771e5b0 sp 00007fa7c9f30598 error 4 in pyfftw.cpython-37m-x86_64-linux-gnu.so[7fa7e7314000+4f1000]

This occurs in an Anaconda environment with Python 3.7.1 and pyfftw 0.11.1 when n_ens_members>1 and num_workers>1.

Lucas-Kanade Robustness

If there is no precipitation data on a map, the Lucas-Kanade method throws a ValueError with the following details:

File "/python3.5/site-packages/pysteps/motion/lucaskanade.py", line 245, in _ShiTomasi_features_to_track
    raise ValueError("Shi-Tomasi found no good feature to be tracked.")
ValueError: Shi-Tomasi found no good feature to be tracked.

It is understandable that a motion field cannot be computed on an empty map but I see that more as a regular case rather than an exception/error. Could the API be modified slightly to make sure that an error is not raised?

Implement ODIM HDF5 exporter

Exporter for writing nowcasts in the ODIM HDF5 format needs to be implemented.

ValueError: X contains non-finite values

I ran the example at https://pysteps.readthedocs.io/en/latest/auto_examples/plot_noise_generators.html#sphx-glr-auto-examples-plot-noise-generators-py

Error occurs at
Fp = initialize_param_2d_fft_filter(R)

[BUG] CRPS not defined in get_method function in verification interface

Variable CRPS is not defined in get_method() function of pysteps.verification.interface module. Trying to call function with name='crps' raises NameError.

Steps to reproduce:

In [1]: import pysteps

In [2]: pysteps.verification.get_method('crps', 'probabilistic')

NameError Traceback (most recent call last)
in ()
----> 1 pysteps.verification.get_method('crps', 'probabilistic')

~/Nowcasting/pysteps/pysteps/verification/interface.py in get_method(name, type)
142
143 if name in ["crps"]:
--> 144 return CRPS
145 elif name in ["reldiag"]:
146 return reldiag_init, reldiag_accum, reldiag_compute

NameError: name 'CRPS' is not defined

pysteps documentation

There are a couple of matters concerning documentation that I would like to discuss here before the V1.0.0 release.

First, currently the documentation is hosted on github pages: https://pysteps.github.io/pysteps/refmanual/.
This means that the documentation needs to be manually updated.
One alternative to this approach would be to use readthedocs, allowing the automatic update of the documentation after every new push to github. It also allows versioning, meaning that we'll be able to support the documentation for different releases.

Secondly, all methods currently need to be listed in the source file of the documentation (as in here). A different approach would be to list them directly in the __init__.py file of each module, as done in Py-ART, for example here. I would argue that this is easier to maintain, since the list for the documentation autosummary is kept in the same place as the code itself.

Any thoughts on these two points?

_generate_path function cannot convert paths appropriately. Generalisation could be helpful

pysteps/pysteps/io/archive.py

Line 100 in fcf2bcd

def _generate_path(date, root_path, path_fmt):

Pysteps source file snippet

"mch_rzc": {
    "root_path": "/Users/mohitanand/Documents/projects/masters_project/data",
    "path_fmt": "%Y/Meteo_Swiss_Images_%m",
    "fn_pattern": "meteoswiss.radar.precip.%Y%m%d%H%M",
    "fn_ext": "gif",
    "importer": "mch_gif",
    "timestep": 5,
    "importer_kwargs": {
        "product": "RZC",
        "unit": "mm/h",
        "accutime": 5
    }

The code to read the files

import pysteps as stp
from pysteps import rcparams
from datetime import datetime

ds = stp.rcparams["data_sources"]["mch_rzc"]
starttime = datetime(2016,1,1)
importer = stp.io.get_method(ds.importer, "importer")
input_files = stp.io.find_by_date(starttime, ds.root_path, ds.path_fmt,ds.fn_pattern, ds.fn_ext, ds.timestep, 0, 1)
R, _, metaradar = stp.io.read_timeseries(input_files, importer, **ds.importer_kwargs)

Error

path /Users/mohitanand/Documents/projects/masters_project/data/2016/Meteo_Swiss_Images_%m not found.
path /Users/mohitanand/Documents/projects/masters_project/data/2016/Meteo_Swiss_Images_%m not found.

The %m directive is not being converted to the month value.
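
One possible generalisation is to pass the whole path format through strftime in a single call, so that directives embedded inside a directory name are expanded too. This is a sketch of the idea, not the current _generate_path code; posixpath is used only to keep the example output platform-independent.

```python
import posixpath
from datetime import datetime


def generate_path(date, root_path, path_fmt):
    """Expand every strftime directive in ``path_fmt`` and join with the root.

    Sketch only: formatting the whole pattern at once means directives
    embedded in a directory name, such as "Images_%m", are expanded too.
    """
    return posixpath.join(root_path, date.strftime(path_fmt))


path = generate_path(datetime(2016, 1, 1), "/data", "%Y/Meteo_Swiss_Images_%m")
```

For the date in the report, this yields a directory ending in 2016/Meteo_Swiss_Images_01.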

More efficient FFT computation

The current implementation of the cascade decomposition uses fft2 and ifft2. These implement the generic Fourier transform that can take complex-valued inputs. For real-valued inputs, which is always the case in pysteps, we should use the optimized rfft2 and irfft2 with roughly half of the computational cost and memory usage.
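
A small NumPy illustration of the difference, using a random array as a stand-in for a precipitation field. For real input, rfft2 returns only the non-redundant half of the spectrum along the last axis.

```python
import numpy as np

rng = np.random.default_rng(0)
field = rng.random((64, 64))          # real-valued stand-in for a precipitation field

full = np.fft.fft2(field)             # complex spectrum, shape (64, 64)
half = np.fft.rfft2(field)            # non-redundant half, shape (64, 33)

# The half spectrum equals the first 33 columns of the full one ...
same_coefficients = np.allclose(full[:, :33], half)
# ... and the inverse needs the original shape to recover the field.
recovered = np.fft.irfft2(half, s=field.shape)
```

Any filter that is multiplied with the spectrum has to be cropped to the same (64, 33) shape, so switching transforms also touches the code that builds the bandpass filters.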

More sensible treatment for missing input files

Handling of missing input files could be done in a more sensible or better-documented way. Currently io.archive.find_by_date returns None if a file for the requested time stamp is not found. If io.readers.read_timeseries is given None as a file name, it returns a field of NaN values, which can cause computations to fail silently. In addition, this behavior is not documented at all.
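
One option is to fail early with a descriptive error before any field is read. The helper below is hypothetical and assumes the file names and their time stamps are available as two parallel lists.

```python
from datetime import datetime


def check_input_files(filenames, timestamps):
    """Raise a descriptive error if any requested file name is None."""
    missing = [t for fn, t in zip(filenames, timestamps) if fn is None]
    if missing:
        stamps = ", ".join(t.strftime("%Y-%m-%d %H:%M") for t in missing)
        raise FileNotFoundError("no input file found for: " + stamps)
    return list(filenames)


times = [datetime(2016, 1, 1, 0, 0), datetime(2016, 1, 1, 0, 5)]
complete = check_input_files(["a.gif", "b.gif"], times)
```

Whether a missing file should be an error or a documented NaN field is a design decision; the point of the sketch is only that the choice becomes explicit.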

Plotting arguments

The plotting routines in pySTEPS are amazing, but it would be cool if one could pass further **kw plotting arguments (e.g. when I calculated the UV-differences between DARTS and LucasKanade, I coloured the vectors in red).

pysteps/pysteps/visualization/motionfields.py

Line 8 in c12287f

def quiver(UV, geodata=None, step=15):

Changes in the extrapolator interface

The current extrapolation methods implement the following interface:
extrapolate(extrap, precip, velocity, num_timesteps, outval=np.nan, **keywords)
where extrap is an extrapolator object returned by an initialization function which is not needed if we want to run a simple nowcast.

We can change the interface and make it compatible with the V0.2 interface:
extrapolate(precip, velocity, num_timesteps, outval=np.nan, extrap=None, **keywords)
where the extrapolator is an optional keyword argument.

The nowcast module should be updated accordingly.

Specify dependency package versions in environment.yml

Currently the conda environment file (environment.yml) does not specify which versions of dependencies should be installed. It would be a good idea to specify package versions at least for required dependencies to prevent issues arising from different versions used.

I suggest we use current (latest) stable versions as our required versions, unless there's a reason to use an older version. As of writing this (Sept. 2018), the following versions are available from conda:

git 2.18.0
dask 0.19.1
numpy 1.15.1
matplotlib 2.2.3
opencv 3.4.2
pillow 5.2.0
pyproj 1.9.5.1
scipy 1.1.0

Using pysteps-data for tests and examples

Currently we have a set of sample data in a separate repository, pysteps-data. This is well motivated by the necessity to keep the main library repository as slim as possible.

The problem is that more and more we need sample data within the library, as in the case of the tests run with Travis or the sphinx-gallery that we are preparing and will be eventually built by Read the Docs.

To avoid having to include data in the library, we could follow the approach of wradlib. Essentially, wradlib uses a shell script to install the library through Travis. This script also clones the wradlib-data repository and sets an environment variable WRADLIB_DATA containing the path to the data (here). Finally, a util method is used to get the path to the data directory when needed, for example when running a test that imports a file (here).

Personally, I think we should adopt something similar so that we can store all our sample data in one place and fetch them only when needed (tests, sphinx-gallery,...).

What do you think?
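
A sketch of what the util method could look like on the pysteps side. PYSTEPS_DATA is a hypothetical variable name chosen by analogy with WRADLIB_DATA, and the function name is made up for this example.

```python
import os


def get_data_path(relative_path="", env_var="PYSTEPS_DATA"):
    """Return a path inside the sample-data directory.

    ``PYSTEPS_DATA`` is a hypothetical variable name chosen by analogy
    with WRADLIB_DATA.
    """
    root = os.environ.get(env_var)
    if root is None:
        raise RuntimeError("%s environment variable is not set" % env_var)
    return os.path.join(root, relative_path)
```

Tests and gallery scripts would then call get_data_path("...") instead of hard-coding a location, and the CI script only has to clone the data repository and export the variable.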

pysteps has no module named io

I am new to pysteps and installed it in a conda environment. I am trying to run one of the example scripts.


import matplotlib.pylab as plt
import numpy as np
import pysteps as stp

#Get the two last observations

#read two consecutive radar fields 
filenames = ("sample_mch_radar_composite_00.gif","sample_mch_radar_composite_01.gif")
R = []
for fn in filenames:
    R_, _, metadata = stp.io.import_mch_gif(fn)
    R.append(R_)
    R_ = None
R = np.stack(R)

Traceback (most recent call last):

  File "<ipython-input-8-fd2533396aec>", line 2, in <module>
    R_, _, metadata = stp.io.import_mch_gif(fn)

AttributeError: module 'pysteps' has no attribute 'io'

Running this code raises an error stating that pysteps has no attribute io when I try to load images with io.import_mch_gif(). Importing pysteps itself raised no error, and I am running the script from a directory outside the pysteps folder.

Speedup of boolean summation

Dear pysteps Team
It looks like one can speed-up the skill score calculation by converting the booleans to floats before the summation (roughly a factor of 30...).
I hope this helps!

    H = sum(H_idx.astype(float)) # hits
    M = sum(M_idx.astype(float)) # misses
    F = sum(F_idx.astype(float)) # false alarms
    R = sum(R_idx.astype(float)) # correct rejections

pysteps/pysteps/verification/detcatscores.py

Line 40 in 3d8fda8

H = sum(H_idx).astype(float) # hits
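
Python's built-in sum iterates over a NumPy array one element at a time, so another way to get the same counts is to use a NumPy reduction. The sketch below uses toy data and np.count_nonzero; it makes no claim about the size of the speed-up, which depends on the array size and NumPy version.

```python
import numpy as np

pred = np.array([1, 1, 0, 0, 1, 0, 1, 0], dtype=bool)
obs = np.array([1, 0, 1, 0, 1, 0, 0, 0], dtype=bool)

H_idx = np.logical_and(pred, obs)      # forecast yes, observed yes

# Python's built-in sum iterates over the array element by element;
# the NumPy reductions below do the same count in compiled code.
hits_builtin = sum(H_idx)
hits_numpy = np.count_nonzero(H_idx)
misses = np.count_nonzero(~pred & obs)
false_alarms = np.count_nonzero(pred & ~obs)
correct_rejections = np.count_nonzero(~pred & ~obs)
```

The four counts always add up to the number of forecast-observation pairs, which is a cheap sanity check for any contingency-table code.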

Which flow is better?

I found that the procedure for computing optical flow is slow, so I tried another way (a neural network) to compute it.
Which result looks better and more reasonable, the LK method or the new method? The results are:
This repo's LK method: [images for Time=t0 and Time=t1 not preserved]

New method: [images for Time=t0 and Time=t1 not preserved]

NetCDF exporter for nowcasts

The NetCDF exporter (pysteps.io.exporters.write_nowcast_netCDF) for writing the nowcast files is incomplete. Please check that the produced files conform to the CF standard and have all the required metadata.

Composite Radar Images - Support for more Importers

Hello,

A great library overall! Thanks for making this open-source.

One question about the supported radar image types: Currently the library supports radar formats such as bom_rf3, fmi_pgm and few more. Can there be efforts to provide a more comprehensive set of importers so that the library can nowcast over almost any kind of radar image?

We have tried feeding the following types of radar images to the library and I don't think there is a 100% match with the supported image types. That is why the nowcast results looked a bit odd.

Thanks!

Support gpu version of AR(2)

Hi,
I implemented a GPU version of autoregression that supports image-level and patch-level (23x23, for example) autoregression.
https://github.com/Fangyh09/Autoregression.Pytorch

Change names of top-level modules

Naming of the modules in the top-level pysteps directory is inconsistent. Suggested changes:

Replace "advection" with "extrapolation", because the submodules in this directory implement extrapolation methods.
Replace "optflow" with more generic "motion", since this module can contain more general motion-related functionality (such as temporal evolution of the motion field).

Improve parametric noise generator

Currently the parametric noise generator method initialize_param_2d_fft_filter() only uses one spectral slope beta:

pysteps/pysteps/noise/fftgenerators.py

Lines 48 to 131 in 288774c

    
    def initialize_param_2d_fft_filter(X, **kwargs):
        """Takes a 2d input field and produces a fourier filter by using the Fast
        Fourier Transform (FFT).

        Parameters
        ----------
        X : array-like
          Two-dimensional square array containing the input field. All values are
          required to be finite.

        Optional kwargs
        ---------------
        win_type : string
           Optional tapering function to be applied to X.
           Default : flat-hanning
        model : string
            The parametric model to be used to fit the power spectrum of X.
            Default : power-law
        weighted : bool
            Whether or not to apply the sqrt(power) as weight in the polyfit() function.
            Default : True

        Returns
        -------
        F : array-like
          A two-dimensional array containing the parametric filter.
          It can be passed to generate_noise_2d_fft_filter().
        """
        if len(X.shape) != 2:
            raise ValueError("the input is not two-dimensional array")
        if np.any(~np.isfinite(X)):
            raise ValueError("X contains non-finite values")
        if X.shape[0] != X.shape[1]:
            raise ValueError("a square array expected, but the shape of X is (%d,%d)" % \
                             (X.shape[0], X.shape[1]))

        # defaults
        win_type = kwargs.get('win_type', 'flat-hanning')
        model    = kwargs.get('model', 'power-law')
        weighted = kwargs.get('weighted', True)

        L = X.shape[0]

        X = X.copy()
        if win_type is not None:
            X -= X.min()
            tapering = build_2D_tapering_function((L, L), win_type)
        else:
            tapering = np.ones_like(X)

        if model.lower() == 'power-law':
            # compute radially averaged PSD
            psd = _rapsd(X*tapering)

            # wavenumbers
            if L % 2 == 0:
                wn = np.arange(0, int(L/2)+1)
            else:
                wn = np.arange(0, int(L/2))

            # compute spectral slope Beta
            if weighted:
                p0 = np.polyfit(np.log(wn[1:]), np.log(psd[1:]), 1, w=np.sqrt(psd[1:]))
            else:
                p0 = np.polyfit(np.log(wn[1:]), np.log(psd[1:]), 1)
            beta = -p0[0]

            # compute 2d filter
            if L % 2 == 1:
                XC,YC = np.ogrid[-int(L/2):int(L/2)+1, -int(L/2):int(L/2)+1]
            else:
                XC,YC = np.ogrid[-int(L/2):int(L/2), -int(L/2):int(L/2)]
            R = np.sqrt(XC*XC + YC*YC)
            R = fft.fftshift(R)
            F = R**(-beta)
            F[~np.isfinite(F)] = 1
        else:
            raise ValueError("unknown parametric model %s" % model)

        return F

State of the art methods usually include two spectral slopes and a scale break that need to be fit to the 1d power spectrum.
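
As a rough illustration of the two-slope idea, the sketch below fits separate log-log slopes on either side of a scale break. It is a simplification: the break wavenumber is supplied by the caller rather than estimated, the two segments are fitted independently, and the function name is made up for this example.

```python
import numpy as np


def fit_two_slopes(wavenumbers, psd, k_break):
    """Fit separate log-log slopes below and above a fixed scale break.

    Simplification: the break wavenumber is given rather than estimated,
    and the two segments are fitted independently.
    """
    wn = np.asarray(wavenumbers, dtype=float)
    logk, logp = np.log(wn), np.log(np.asarray(psd, dtype=float))
    low = wn <= k_break
    high = wn >= k_break
    beta1 = -np.polyfit(logk[low], logp[low], 1)[0]
    beta2 = -np.polyfit(logk[high], logp[high], 1)[0]
    return beta1, beta2


# Synthetic spectrum: slope 2 up to wavenumber 10, slope 3 beyond it.
k = np.arange(1, 101, dtype=float)
psd = np.where(k <= 10, k ** -2.0, 10.0 * k ** -3.0)
beta1, beta2 = fit_two_slopes(k, psd, k_break=10)
```

On this noise-free synthetic spectrum the fit recovers the two slopes used to build it. A full implementation would also need to estimate the break and keep the fitted spectrum continuous across it.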

Issue with config module and pip install

The default configuration parameters are now hardcoded in the config module. This means that when a user installs a new version of pysteps using pip, the user-defined configuration will be overwritten.

To keep the user-defined configuration unchanged across library updates, an approach similar to the one used in Matplotlib can be used to maintain the configuration files.
Matplotlib uses matplotlibrc configuration files to customize all kinds of properties, such as figure size and dpi, line width, color and style, axes, axis and grid properties, and text and font properties.
This lets the user customize the default behavior of the library, with the advantage that this behavior is preserved when the user updates the library.

As a proof of concept, I adapt this idea to pysteps in the following fork:

The datasources in the config module are now defined in a pystepsrc JSON file

A JSON file resembles the definition of a Python dictionary and allows many data types to be defined.

When pysteps is imported, it looks for a pystepsrc file in the following locations, in this order:
- $PWD/pystepsrc
- $PYSTEPSRC if it is a file
- $PYSTEPSRC/pystepsrc
- $HOME/.pysteps/pystepsrc if $HOME is defined.
- Lastly, it looks inside the library in pysteps/pystepsrc for a system-defined copy.
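
The lookup order listed above can be sketched as follows. The function names are made up for this example, and the environment is passed in as a plain mapping so the logic can be tried without touching the real one.

```python
import os


def candidate_rc_paths(environ, package_dir):
    """List the pystepsrc locations to try, in order of priority."""
    paths = [os.path.join(environ.get("PWD", os.getcwd()), "pystepsrc")]
    rc = environ.get("PYSTEPSRC")
    if rc:
        paths.append(rc)                              # $PYSTEPSRC as a file
        paths.append(os.path.join(rc, "pystepsrc"))   # $PYSTEPSRC as a directory
    home = environ.get("HOME")
    if home:
        paths.append(os.path.join(home, ".pysteps", "pystepsrc"))
    paths.append(os.path.join(package_dir, "pystepsrc"))  # packaged default
    return paths


def find_rc_file(environ, package_dir):
    """Return the first candidate that exists as a regular file, else None."""
    for path in candidate_rc_paths(environ, package_dir):
        if os.path.isfile(path):
            return path
    return None
```

In practice find_rc_file(os.environ, package_dir) would be called once at import time, with package_dir pointing at the installed pysteps directory.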

This is the example default configuration file that is included with the library.

Usage example:

from pysteps import rcparams
ds = rcparams["bom"]

Here ds is an AttrDict object, which is compatible with the current config module implementation and so maintains backwards compatibility.

This functionality was implemented in the pysteps/__init__.py file.

Things that are still missing or maybe are nice to include

The global parameters defined in pysteps_config.py are not yet defined, but they can be easily implemented.
If the user-defined file is loaded and some of the parameters were not defined in it, the default ones defined in the package are not used. The implementation should first load the package defaults and then override them with the ones that the user defines.
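
The "defaults first, then user overrides" behaviour could be sketched as a recursive dictionary merge. The keys below are illustrative and not the real pystepsrc schema, and merge_params is a hypothetical helper.

```python
import json


def merge_params(defaults, user):
    """Return ``defaults`` updated recursively with the values in ``user``."""
    merged = dict(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_params(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative contents; the keys are not the real pystepsrc schema.
package_defaults = json.loads('{"bom": {"timestep": 6, "fn_ext": "nc"}}')
user_settings = json.loads('{"bom": {"fn_ext": "nc4"}}')
params = merge_params(package_defaults, user_settings)
```

The user file only needs to contain the entries that differ; everything else falls back to the packaged defaults.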

Upload package to pypi and create anaconda packages

For the new release, we can upload the Python package to the Python Package Index (https://pypi.org/) so that this version can be installed simply with pip install pysteps.

Also, since installing the package has some issues on platforms other than Linux, we can create conda packages for each OS. That way, we can configure the conda package to use the Anaconda compilers and avoid cross-platform compatibility issues.

Axis tick labels with lon-lat lines

The current version of cartopy (0.16.0) can only show axis tick labels for latitudes and longitudes for two projections. Otherwise, it throws an exception:

TypeError: Cannot label gridlines on a plot. Only PlateCarree and Mercator plots are currently supported.

This works with basemap. Until this issue is fixed in cartopy, I suggest implementing a temporary workaround (e.g. implement a function that sets the axis labels manually).

Implement the STEPS model in the spectral domain

The AR(2) models and the noise generation in STEPS can be implemented in the spectral domain. See

S. Pulkkinen, V. Chandrasekar and A.-M. Harri, Stochastic Spectral Method for Radar-Based Probabilistic Precipitation Nowcasting, Journal of Atmospheric and Oceanic Technology, doi: 10.1175/JTECH-D-18-0242.1.

I implemented the above method in the spectral branch. Here is the output of the old version of nowcasts.steps.forecast by using the FMI data (grid size 760x1226 pixels) with 24 ensemble members and 12 threads:

Starting nowcast computation.
Computing nowcast for time step 1... 8.15 seconds.
Computing nowcast for time step 2... 8.23 seconds.
Computing nowcast for time step 3... 8.11 seconds.
Computing nowcast for time step 4... 8.05 seconds.
Computing nowcast for time step 5... 7.90 seconds.
Computing nowcast for time step 6... 8.04 seconds.
Computing nowcast for time step 7... 8.17 seconds.
Computing nowcast for time step 8... 8.11 seconds.
Computing nowcast for time step 9... 8.09 seconds.
Computing nowcast for time step 10... 7.94 seconds.
Computing nowcast for time step 11... 7.94 seconds.
Computing nowcast for time step 12... 8.07 seconds.

The memory usage was around 26% (on a computer with 50 Gb memory). Using the spectral implementation, I got the following results:

Starting nowcast computation.
Computing nowcast for time step 1... 4.48 seconds.
Computing nowcast for time step 2... 4.66 seconds.
Computing nowcast for time step 3... 4.65 seconds.
Computing nowcast for time step 4... 4.54 seconds.
Computing nowcast for time step 5... 4.50 seconds.
Computing nowcast for time step 6... 4.48 seconds.
Computing nowcast for time step 7... 4.51 seconds.
Computing nowcast for time step 8... 4.43 seconds.
Computing nowcast for time step 9... 4.46 seconds.
Computing nowcast for time step 10... 4.48 seconds.
Computing nowcast for time step 11... 4.58 seconds.
Computing nowcast for time step 12... 4.52 seconds.

Memory usage was reduced to 16%.
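
For reference, the average time per step and the resulting speed-up can be computed directly from the two logs above. The lists are copied from those logs and nothing else is assumed.

```python
# Per-time-step timings in seconds, copied from the two logs above.
classical = [8.15, 8.23, 8.11, 8.05, 7.90, 8.04, 8.17, 8.11, 8.09, 7.94, 7.94, 8.07]
spectral = [4.48, 4.66, 4.65, 4.54, 4.50, 4.48, 4.51, 4.43, 4.46, 4.48, 4.58, 4.52]

mean_classical = sum(classical) / len(classical)
mean_spectral = sum(spectral) / len(spectral)
speedup = mean_classical / mean_spectral

print("mean per step: %.2f s -> %.2f s (speedup %.2fx)"
      % (mean_classical, mean_spectral, speedup))
```

This works out to roughly 8.07 s versus 4.52 s per time step, i.e. the spectral version is about 1.8 times faster in this run.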

There are still rough edges and not everything is working. The only noise generation method implemented in the spectral domain is the non-parametric one. Once all bugs have been fixed and the other noise generators have been implemented in the spectral domain, I would recommend using the new version as the default choice. The classical STEPS model should still be kept in pysteps for educational purposes (and also for the localized versions where the localization is done in the spatial domain).

Implement a data model

Pysteps operates on 2d grids that contain metadata (such as the grid definition, units, applied transformations and temporal resolution). Currently this information is carried in a separate metadata dictionary or as separate arguments (see e.g. pysteps.nowcasts.steps.forecast) in a very inconsistent way. It would be a good idea to store all this information into the same object.

Possible solution: use xarray (http://xarray.pydata.org/en/stable) that allows storing metadata into attributes, as well as using named columns and integration with dask.

Why is the AR(2) model used on the Lagrangian persistence component instead of the original data?

At line 204, why are the images backward-extrapolated first, before the autoregression is applied?

pysteps/pysteps/nowcasts/sprog.py

Lines 190 to 207 in df090c0

    
    extrap_kwargs = extrap_kwargs.copy()
    extrap_kwargs['xy_coords'] = xy_coords

    # advect the previous precipitation fields to the same position with the
    # most recent one (i.e. transform them into the Lagrangian coordinates)
    res = list()

    def f(R, i):
        return extrapolator_method(R[i, :, :], V, ar_order - i,
                                   "min",
                                   **extrap_kwargs)[-1]

    for i in range(ar_order):
        if not DASK_IMPORTED:
            R[i, :, :] = f(R, i)
        else:
            res.append(dask.delayed(f)(R, i))

pysteps 0.2

Since our first release in August, there have been a number of improvements and bug fixes that would justify a new release, in my opinion.

Conditioning of rank histograms

pysteps/pysteps/verification/ensscores.py

Line 167 in 40572e2

mask_nz = np.logical_or(X_o >= X_min, np.any(X_f >= X_min, axis=1))

The current implementation of rank histograms is not optimal if we set the threshold X_min to higher values (e.g. 10 mm/h).
In such cases, the condition for ignoring pairs of observations and forecasts is not restrictive enough.
This is especially visible when all the M ensemble members except one are equal to 0. If the observation is 0, it is randomly assigned to one of the first M-1 bins. If the observation is larger than the only ensemble member that is different from 0 (which occurs often), it is added to bin M+1. The probability of falling in the Mth bin is therefore very low. In addition, the histogram is flat for all bins up to M-1 (due to the random assignment), which is a bit misleading.
I wonder whether this random assignment also affects the rank histograms for lower values of the X_min threshold.
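
To make the condition concrete, the sketch below applies the mask from the line quoted above to three made-up samples with four ensemble members each. The values and the extra `n_exceed` count are only illustrative.

```python
import numpy as np

X_min = 10.0
X_o = np.array([0.0, 15.0, 0.0])  # observations
X_f = np.array([
    [0.0, 0.0, 0.0, 12.0],  # only one member above the threshold
    [0.0, 0.0, 0.0, 12.0],  # same forecast, but a wet observation
    [0.0, 0.0, 0.0, 0.0],   # everything below the threshold
])

# condition used in the quoted line: keep a sample if the observation
# or at least one ensemble member reaches the threshold
mask_nz = np.logical_or(X_o >= X_min, np.any(X_f >= X_min, axis=1))

# number of members that reach the threshold in each sample
n_exceed = np.sum(X_f >= X_min, axis=1)

print(mask_nz)   # True, True, False
print(n_exceed)  # 1, 1, 0
```

The first sample is kept even though the observation and all but one member are zero, which is the situation described above.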

Bias Forecast

Hi All
I am not sure about the definition of Bias Score at

pysteps/pysteps/verification/detcatscores.py

Line 92 in 013997f

B = H + (1 - s)*FA/s

Bias score (frequency bias) should be equal to B = (H + FA)/ (H + M)
as per http://www.cawcr.gov.au/projects/verification/

Carlos
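
Whether the two expressions disagree depends on what H, FA and s stand for. If H is the hit rate, FA the false alarm rate and s the base rate, they are algebraically the same quantity. That reading is an assumption about the code, not something stated in this thread. The sketch below checks this numerically with made-up contingency table counts.

```python
# contingency table counts (made-up numbers)
hits, false_alarms, misses, correct_negatives = 30, 20, 10, 140
n = hits + false_alarms + misses + correct_negatives

# frequency bias from raw counts: (hits + false alarms) / (hits + misses)
bias_counts = (hits + false_alarms) / (hits + misses)

# the same quantity from rates (assumed meaning of H, FA and s)
H = hits / (hits + misses)                               # hit rate
FA = false_alarms / (false_alarms + correct_negatives)   # false alarm rate
s = (hits + misses) / n                                  # base rate
bias_rates = H + (1 - s) * FA / s

print(bias_counts, bias_rates)
```

If H and FA in the code are raw counts instead, the two formulas do differ and the count-based form applies.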

Pysteps User reference

The current documentation only includes the developer reference, which describes all the modules. This provides a low-level description of the pysteps modules.

The documentation is missing a user reference that introduces how to use the library, step by step.

These are examples of User guides:

https://matplotlib.org/users/index.html
https://docs.wradlib.org/en/stable/index.html (Check user guide section)

As a starting point, we can use the examples already included in the library. Here are some tools that can help us convert those examples into documentation:

Use example files: sphinx-gallery; preview
Use jupyter notebooks: sphinx-nbexamples; preview
Use jupyter notebooks: nbsphinx

linking pysteps under the GPL3 license

It has been pointed out that there may be an issue regarding the GPL3 license under which pysteps is currently published. This concerns whether or not non-GPL software can legally link to a GPL library (as explained in this Wikipedia page):

if one releases a GPL-licensed entity to the public, there is an issue regarding linking: namely, whether a proprietary program that uses a GPL library is in violation of the GPL.

In essence, this boils down to whether or not using pysteps can be considered "derivative work", which the GPL clearly requires to be released under the same license.

Apparently this is a controversial matter and different opinions exist.

I would like to see if anyone has comments or opinions on the issue. Maybe somebody with more experience with open-source licenses could give us some advice? Should we consider using a different license?

I personally would prefer not to limit the use of pysteps from within non-GPL software, but this is of course open to discussion.

Blending with NWP data

Hi all,

With version 0.2 coming up, I was wondering whether there are any plans to implement blending with NWP data in upcoming releases?

Thanks!

utils/interface.py fails with default dependencies

pysteps/pysteps/utils/interface.py

Line 102 in 58ef4a7

methods_objects["pyfftw_fft"] = fft.get_method("pyfftw")

The utils/interface.py causes the program to fail when only the default dependencies are installed:

Traceback (most recent call last):
  File "run_ensemble_nowcast.py", line 85, in <module>
    reshaper = stp.utils.get_method(adjust_domain)
  File "/home/ned/anaconda3/envs/pysteps/lib/python3.6/site-packages/pysteps/utils/interface.py", line 102, in get_method
    methods_objects["pyfftw_fft"] = fft.get_method("pyfftw")
  File "/home/ned/anaconda3/envs/pysteps/lib/python3.6/site-packages/pysteps/utils/fft.py", line 45, in get_method
    raise MissingOptionalDependency("pyfftw is required but it is not installed")
pysteps.exceptions.MissingOptionalDependency: pyfftw is required but it is not installed

This is a bug, as pyfftw is in fact an optional package. We'll need to fix this before the next release. Not sure what would be the best way to address this issue, any thoughts @pulkkins?

Part of the problem might be that we currently have two get_method() in utils: one in utils/interface.py and one in the utils/fft.py module. For consistency with the other modules, only the get_method() in utils/interface.py should be kept.
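
One possible direction, sketched under assumptions: check which optional packages can be imported, and raise only when a missing one is explicitly requested. The function names below are hypothetical and do not reflect the actual pysteps interface.

```python
import importlib.util


def available_fft_backends():
    """Return the names of FFT backends whose packages can be imported."""
    backends = ["numpy"]  # assumed to be a mandatory dependency
    for name in ("scipy", "pyfftw"):
        # find_spec returns None when the package is not installed
        if importlib.util.find_spec(name) is not None:
            backends.append(name)
    return backends


def get_fft_backend(name):
    """Fail only when a missing optional backend is explicitly requested."""
    if name not in available_fft_backends():
        raise ImportError(f"{name} is required but it is not installed")
    return name


print(available_fft_backends())
```

With this pattern, asking for an unrelated method no longer triggers the pyfftw lookup, because nothing is imported until a backend is requested by name.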

Use specific exception types instead of generic Exception

Generic Exception classes are used in many places where a more specific exception type would be appropriate. Use suitable built-in Python exception classes, or implement custom ones.
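
A short sketch of both options. The `check_field` helper is hypothetical, and the exception class is redefined here only so that the example is self-contained.

```python
class MissingOptionalDependency(Exception):
    """Raised when an optional package is needed but not installed."""


def check_field(shape):
    """Validate the shape of an input field (hypothetical helper)."""
    # a built-in type such as ValueError says more than a bare Exception
    if len(shape) != 2:
        raise ValueError(
            f"expected a two-dimensional field, got {len(shape)} dimensions"
        )
    return True


try:
    check_field((10, 10, 3))
except ValueError as err:
    # callers can now catch this specific failure without hiding others
    print("invalid input:", err)
```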

release 1.0.0

We need to start working on the release that will be referenced in the publication we are preparing.

There are in my opinion three main aspects that need to be addressed:

Name:
How should we name this release? Simply pysteps 0.3? Another option could be to name it pysteps 1.0.
This would be in my opinion appropriate given that it will be the version described in the paper.

Priority changes yet to be included:
Are there important features that we need to focus on for the next release? Issues that we must solve?
Reorganizing SPROG in a separate method could be one example.

Change log:
Start putting together all your main contributions since our last release 0.2.

Why use np.roll(g[:-1], j) instead of zero padding in autoregression?

For line 131 at

pysteps/pysteps/timeseries/autoregression.py

Lines 128 to 134 in 4a14e00

    
    g = np.hstack([[1.0], gamma])
    G = []
    for j in range(p):
        G.append(np.roll(g[:-1], j))
    G = np.array(G)
    phi_ = np.linalg.solve(G, g[1:].flatten())

G.append(np.roll(g[:-1], j))

Why use np.roll instead of zero padding?

Yule-Walker equations: http://www2.econ.osaka-u.ac.jp/~tanizaki/class/2014/model_analysis1/08.pdf P109
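
To see what the `np.roll` construction produces, the sketch below compares it with the symmetric Toeplitz matrix of the Yule-Walker equations, whose entry (i, j) is the autocorrelation at lag |i - j|. The helper names and the autocorrelation values are made up for illustration. This is an observation about the snippet as quoted, not a statement about how pysteps should solve the system.

```python
import numpy as np


def yw_matrix_roll(gamma):
    """Matrix built as in the quoted snippet (circular shifts of g[:-1])."""
    p = len(gamma)
    g = np.hstack([[1.0], gamma])
    return np.array([np.roll(g[:-1], j) for j in range(p)])


def yw_matrix_toeplitz(gamma):
    """Symmetric Toeplitz matrix with entries g[|i - j|]."""
    p = len(gamma)
    g = np.hstack([[1.0], gamma])
    lag = np.abs(np.arange(p)[:, None] - np.arange(p)[None, :])
    return g[lag]


gamma2 = np.array([0.8, 0.5])       # lag-1 and lag-2 autocorrelations
gamma3 = np.array([0.8, 0.5, 0.3])  # lags 1 to 3

print(np.array_equal(yw_matrix_roll(gamma2), yw_matrix_toeplitz(gamma2)))
print(np.array_equal(yw_matrix_roll(gamma3), yw_matrix_toeplitz(gamma3)))
```

For order 2 the circular shift and the Toeplitz form coincide, so the roll is a compact way to write the matrix. For order 3 and above they differ, because the roll wraps the last element around to the front.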

Drop the requirement for square-shaped inputs

Currently DARTS and STEPS require the inputs to have a square shape. However, the FFT can be computed from inputs of any shape, and therefore this requirement could be relaxed. This would require implementing more general versions of the bandpass filters and noise generators.
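
As a rough sketch of what a more general filter could look like, the example below builds a power-law filter on a rectangular grid from the per-axis FFT frequencies. The function name and the choice of wavenumber units (cycles per pixel) are assumptions made for illustration, not the pysteps implementation.

```python
import numpy as np


def power_law_filter(shape, beta):
    """Hypothetical power-law filter R**(-beta) for a field of any 2d shape."""
    ny, nx = shape
    ky = np.fft.fftfreq(ny)[:, None]  # cycles per pixel along rows
    kx = np.fft.fftfreq(nx)[None, :]  # cycles per pixel along columns
    R = np.sqrt(kx ** 2 + ky ** 2)
    R[0, 0] = 1.0  # avoid division by zero; the filter value there becomes 1
    return R ** (-beta)


field = np.random.default_rng(0).normal(size=(64, 96))
F = power_law_filter(field.shape, beta=2.0)

# apply the filter in the spectral domain and transform back
filtered = np.fft.ifft2(np.fft.fft2(field) * F).real
print(filtered.shape)
```

The only shape-dependent step is the wavenumber grid, so nothing in this sketch requires the two dimensions to be equal.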

gcc compilation problem on pip install

Hello

I'm struggling with the 'pip install' approach to getting pysteps installed on my Mac (OS Mojave 10.14.2). I've tried:

pip install numpy
(which was successful)
pip install git+https://github.com/pySTEPS/pysteps
after progressing some way, it fails with the following message:
:
:
creating build/temp.macosx-10.9-x86_64-3.7/pysteps/motion
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch x86_64 -g -I/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/numpy/core/include -I/Library/Frameworks/Python.framework/Versions/3.7/include/python3.7m -c pysteps/motion/_vet.c -o build/temp.macosx-10.9-x86_64-3.7/pysteps/motion/_vet.o -fopenmp
clang: error: unsupported option '-fopenmp'
error: command 'gcc' failed with exit status 1

Any thoughts as to why this fails?

Thanks

Could you explain the extrapolation line "XYW = xy_coords + D - V_inc / 2.0" in more detail?

I am a bit confused by this line. Could you explain it a bit more? Thanks!

pysteps/pysteps/extrapolation/semilagrangian.py

Line 131 in b98bb89

XYW = xy_coords + D - V_inc / 2.0
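
One reading of this line, offered as an interpretation rather than the authors' explanation: `xy_coords + D` is the position reached so far along the backward trajectory, and subtracting `V_inc / 2.0` moves to the midpoint of the current step, where the velocity is then sampled again. The 1D toy sketch below illustrates that midpoint iteration for a single time step. The function name is hypothetical and `np.interp` stands in for the 2D interpolation.

```python
import numpy as np


def backward_step(x, velocity, n_iter=3):
    """Departure points of one backward time step in a 1d velocity field."""
    v_inc = np.zeros_like(x)
    for _ in range(n_iter):
        # sample the velocity halfway along the current estimate of the path
        x_mid = x - v_inc / 2.0
        v_inc = np.interp(x_mid, x, velocity)
    return x - v_inc


x = np.arange(10, dtype=float)
velocity = np.full_like(x, 2.0)  # uniform motion of 2 pixels per step
print(backward_step(x, velocity))
```

For a uniform velocity the iteration changes nothing and every point simply moves back by 2 pixels. The midpoint only matters when the velocity varies in space.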

Edge vectors

Adding pre-defined velocity vectors before the interpolation:
Uli suggested adding COSMO-derived wind velocity vectors at the four edges of the domain (where there is no radar coverage), in order to force the interpolation towards the wind field of the model. I suggest providing an additional optional kwarg

edge_vec : array-like
        Wind vectors in 2d array (rows: x,y,u,v) obtained from model output (e.g. COSMO) to
        stabilise interpolation at image edges with no radar coverage.

which is then appended to the declustered motion vector array:

    # decluster sparse motion vectors
    x, y, u, v = declustering(x0, y0, u, v, decl_grid, min_nr_samples)


    # Append edge vectors if provided
    if edge_vec is not None:
        x = np.concatenate([x, edge_vec[0, :]])
        y = np.concatenate([y, edge_vec[1, :]])
        u = np.concatenate([u, edge_vec[2, :]])
        v = np.concatenate([v, edge_vec[3, :]])

    # kernel interpolation
    X, Y, UV = interpolate_sparse_vectors(x, y, u, v, domain_size,
                                          epsilon=kernel_bandwidth, nchunks=nchunks)

pysteps/pysteps/optflow/lucaskanade.py

Line 144 in c12287f

Any script to evaluate different methods?

Hi,
I implemented different optical flow methods, and I want to compare the results.
Is there any script to evaluate the results?

Make code pep8 compliant

Many parts of the source code do not follow the good coding practices recommended for Python (pep8 style guide).

Although this requires some effort, it will increase the readability of the package library.
A good summary of the standard can be found here.

Some good candidates to tackle first are:

Whitespace and newlines, Indentation, line-length & code wrapping

Always use 4 spaces for indentation (don’t use tabs)
Write UTF-8
Max line-length: 79 characters
Always indent wrapped code for readability
2 blank lines before top-level function and class definitions
1 blank line before class method definitions
Use blank lines in functions sparingly
Avoid extraneous whitespace
Don’t use whitespace to line up assignment operators (=, :)
Spaces around = for assignment
No spaces around = for default parameter values
Spaces around mathematical operators, but group them sensibly
Multiple statements on the same line are discouraged
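
A small made-up example that follows several of the rules above (4-space indentation, two blank lines between top-level functions, spaces around assignment and operators, no spaces around `=` in default parameter values):

```python
def scale_field(field, factor=1.0, offset=0.0):
    """Return field * factor + offset."""
    scaled = field * factor + offset
    return scaled


def clip_negative(values):
    """Replace negative values with zero."""
    return [max(value, 0.0) for value in values]


print(scale_field(2.0, factor=3.0, offset=1.0))
print(clip_negative([-1.0, 2.0]))
```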

Useful tools

IDEs : Many development environments (like PyCharm) show warnings for non-compliant code.
pylint : code analysis tool.
pep8 : code analysis tool.
autopep8 : automatically formats Python code to conform to the PEP 8 style guide.

    for t in range(num_timesteps):
        if n_iter > 0:
            for k in range(n_iter):
                XYW = xy_coords + D - V_inc / 2.0
                XYW = [XYW[1, :, :], XYW[0, :, :]]

                interpolate_motion(XYW, V_inc)
                D -= V_inc
                interpolate_motion(xy_coords + D, V_inc)
        else:
            if t > 0 or D_prev is not None:
                XYW = xy_coords + D
                XYW = [XYW[1, :, :], XYW[0, :, :]]

                interpolate_motion(XYW, V_inc)

            D -= V_inc

        XYW = xy_coords + D
        XYW = [XYW[1, :, :], XYW[0, :, :]]

    def initialize_param_2d_fft_filter(X, **kwargs):
        """Takes a 2d input field and produces a fourier filter by using the Fast
        Fourier Transform (FFT).

        Parameters
        ----------
        X : array-like
            Two-dimensional square array containing the input field. All values are
            required to be finite.

        Optional kwargs
        ---------------
        win_type : string
            Optional tapering function to be applied to X.
            Default : flat-hanning
        model : string
            The parametric model to be used to fit the power spectrum of X.
            Default : power-law
        weighted : bool
            Whether or not to apply the sqrt(power) as weight in the polyfit() function.
            Default : True

        Returns
        -------
        F : array-like
            A two-dimensional array containing the parametric filter.
            It can be passed to generate_noise_2d_fft_filter().
        """

        if len(X.shape) != 2:
            raise ValueError("the input is not two-dimensional array")
        if np.any(~np.isfinite(X)):
            raise ValueError("X contains non-finite values")
        if X.shape[0] != X.shape[1]:
            raise ValueError("a square array expected, but the shape of X is (%d,%d)" % \
                             (X.shape[0], X.shape[1]))

        # defaults
        win_type = kwargs.get('win_type', 'flat-hanning')
        model = kwargs.get('model', 'power-law')
        weighted = kwargs.get('weighted', True)

        L = X.shape[0]

        X = X.copy()
        if win_type is not None:
            X -= X.min()
            tapering = build_2D_tapering_function((L, L), win_type)
        else:
            tapering = np.ones_like(X)

        if model.lower() == 'power-law':

            # compute radially averaged PSD
            psd = _rapsd(X*tapering)

            # wavenumbers
            if L % 2 == 0:
                wn = np.arange(0, int(L/2)+1)
            else:
                wn = np.arange(0, int(L/2))

            # compute spectral slope Beta
            if weighted:
                p0 = np.polyfit(np.log(wn[1:]), np.log(psd[1:]), 1, w=np.sqrt(psd[1:]))
            else:
                p0 = np.polyfit(np.log(wn[1:]), np.log(psd[1:]), 1)
            beta = -p0[0]

            # compute 2d filter
            if L % 2 == 1:
                XC,YC = np.ogrid[-int(L/2):int(L/2)+1, -int(L/2):int(L/2)+1]
            else:
                XC,YC = np.ogrid[-int(L/2):int(L/2), -int(L/2):int(L/2)]
            R = np.sqrt(XC*XC + YC*YC)
            R = fft.fftshift(R)
            F = R**(-beta)
            F[~np.isfinite(F)] = 1

        else:
            raise ValueError("unknown parametric model %s" % model)

        return F
