xarray-selafin's Issues

Limit compulsory objects in DataSet dims/data_vars/attrs/coords and distinguish 2D/3D

What do you think of removing the dimension "plan" for 2D (it would only be available in 3D)?

I would also remove some redundant data:

  • meshx vs x coords
  • meshy vs y coords
  • ipobo2 attribute vs ipobo3 attribute
  • ...

The aim is to be able to write a Selafin file directly from a simpler Dataset.
Compulsory data would be: x, y, ikle2 and data_vars. All other data would be optional.

serialization as netcdf throws an exception

I am not sure whether xarray-selafin wants to support exporting Selafin files as netCDF, but the way the xarray attributes are currently used is not compatible with netCDF. For example:

import xarray as xr
ds = xr.open_dataset("tests/data/r2d_tidal_flats.slf")
ds.to_netcdf("/tmp/out.nc")

results in:

TypeError: Invalid value for attr 'variables': {'U': ('VELOCITY U', 'M/S'), 'V': ('VELOCITY V', 'M/S'), 'H': ('WATER DEPTH', 'M'), 'S': ('FREE SURFACE', 'M'), 'B': ('BOTTOM', 'M')}. For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple
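
One possible workaround, sketched here, is to JSON-encode any attribute whose value is a dict before calling to_netcdf. The helper name netcdf_safe_attrs is hypothetical and not part of xarray-selafin; it uses only the standard library and works on a plain dict such as ds.attrs. Whether other attributes of a real Selafin dataset also need converting has not been checked.

```python
import json


def netcdf_safe_attrs(attrs):
    """Return a copy of attrs in which dict values are JSON-encoded strings.

    Hypothetical helper, not part of xarray-selafin.
    """
    safe = {}
    for key, value in attrs.items():
        if isinstance(value, dict):
            # netCDF attributes cannot hold a dict, but they can hold a string.
            safe[key] = json.dumps(value)
        else:
            safe[key] = value
    return safe


attrs = {
    "title": "TELEMAC 2D Model",
    "variables": {"U": ("VELOCITY U", "M/S"), "H": ("WATER DEPTH", "M")},
}
safe = netcdf_safe_attrs(attrs)
# The dict can be recovered later with json.loads (tuples come back as lists).
restored = json.loads(safe["variables"])
```

With a real dataset this would be applied as ds.attrs = netcdf_safe_attrs(ds.attrs) just before ds.to_netcdf(...).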

Use Hermes for the fastest io

One would like to benefit from the fastest library available to load selafin files: TelemacFile (via HERMES)

There are two practical requirements for building this xarray package:

  1. Have a minimal working setup (as few required libraries as possible): ease of installation
  2. Use the fastest tools available: performance

Point 2 requires the Hermes libraries to be found in the Python environment.
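
A minimal sketch of how both requirements could coexist: try to import the compiled Hermes extension and fall back to the pure-Python reader when that fails. The module name _hermes is an assumption, and the returned strings are only labels standing in for the two readers, not real classes.

```python
import importlib


def pick_reader(module_name="_hermes"):
    """Return "TelemacFile" if the compiled module imports, else "SerafinFile".

    Illustrative only: the returned labels stand in for the two readers.
    """
    try:
        # Importing also fails (ImportError) when the extension file exists
        # but the shared libraries it links against cannot be resolved.
        importlib.import_module(module_name)
    except ImportError:
        return "SerafinFile"
    return "TelemacFile"


reader = pick_reader()
```

Attempting the import, rather than only checking that the file exists, also catches the case where the extension is present but its shared libraries are not on the loader path.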

I reproduced a minimal setup on a new branch hermes.

Without the HERMES library:

$ python tests/perf_test.py 
Warning: Using SerafinFile. It is recommended to compile Hermes api
Time taken by telemac: 0.049854278564453125 seconds
Time taken by selafin: 0.007260560989379883 seconds

To get the library working, I first had to check which shared libraries it depends on:

$ ldd _hermes.cpython-311-x86_64-linux-gnu.so 
        linux-vdso.so.1 (0x00007ffe06530000)
        libhermes4api.so => not found
        libspecial4api.so => not found
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8275c00000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f8275ebd000)

I had to load my telemac conda environment by doing:

mamba activate telemac 
mamba activate --stack slf 

where slf is the environment in which the package is installed. All the libraries were then found on the path:

$ ldd _hermes.cpython-311-x86_64-linux-gnu.so 
        linux-vdso.so.1 (0x00007ffc27f09000)
        libhermes4api.so => /home/tomsail/miniconda3/envs/telemac/opentelemac/builds/gnu.dynamic/wrap_api/lib/libhermes4api.so (0x00007ff6e2e00000)
        libspecial4api.so => /home/tomsail/miniconda3/envs/telemac/opentelemac/builds/gnu.dynamic/wrap_api/lib/libspecial4api.so (0x00007ff6e35c3000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff6e2a00000)
        libmpi_usempif08.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/libmpi_usempif08.so.40 (0x00007ff6e356b000)
        libmpi_usempi_ignore_tkr.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/libmpi_usempi_ignore_tkr.so.40 (0x00007ff6e355b000)
        libmpi_mpifh.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/libmpi_mpifh.so.40 (0x00007ff6e2d96000)
        libmpi.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/libmpi.so.40 (0x00007ff6e2c74000)
        libgfortran.so.5 => /home/tomsail/miniconda3/envs/telemac/lib/libgfortran.so.5 (0x00007ff6e2855000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ff6e276e000)
        libgcc_s.so.1 => /home/tomsail/miniconda3/envs/telemac/lib/libgcc_s.so.1 (0x00007ff6e2c59000)
        libquadmath.so.0 => /home/tomsail/miniconda3/envs/telemac/lib/libquadmath.so.0 (0x00007ff6e2735000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ff6e2c54000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ff6e35f1000)
        libopen-pal.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/./libopen-pal.so.40 (0x00007ff6e2635000)
        libopen-rte.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/./libopen-rte.so.40 (0x00007ff6e257b000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007ff6e2c4f000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007ff6e2c4a000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007ff6e2c45000)
        libz.so.1 => /home/tomsail/miniconda3/envs/telemac/lib/././libz.so.1 (0x00007ff6e2c2a000)
$ python tests/perf_test.py 
Time taken by telemac: 0.00033593177795410156 seconds
Time taken by selafin: 0.007600069046020508 seconds

@pmav99,
The good news: the hermes libraries appear to be correctly installed in my conda environment (I double-checked).
The tricky part is that I don't know how to make this environment minimal now.

A hint for CI builds might be the matrix build I have set up for the conda environment.

ping @nicogodet, I think this might be interesting for you once we figure out something minimal, so you can adapt it for Windows. Let me know also if you have any ideas for making _hermes more portable.

Convert time from second to datetime64

Saving this so I don't forget it:

Example of a dataset as currently loaded:

<xarray.Dataset> Size: 63GB
Dimensions:       (time: 745, node: 3539584, plan: 1, nelem2: 6898577, ndp2: 3,
                   nelem3: 6898577, ndp3: 3)
Coordinates:
    x             (node) float32 14MB ...
    y             (node) float32 14MB ...
  * time          (time) float64 6kB 0.0 3.6e+03 7.2e+03 ... 2.675e+06 2.678e+06
    ikle2         (nelem2, ndp2) >i4 83MB ...
    ikle3         (nelem3, ndp3) >i4 83MB ...
Dimensions without coordinates: node, plan, nelem2, ndp2, nelem3, ndp3
Data variables:
    VELOCITY U    (time, node, plan) float64 21GB ...
    VELOCITY V    (time, node, plan) float64 21GB ...
    FREE SURFACE  (time, node, plan) float64 21GB ...
Attributes: (12/17)
    title:     TELEMAC 2D Model
    meshx:     [ 145.26067   -117.35177    -96.901276  ...  158.9151    -122....
    meshy:     [ 15.181208 -27.162487  69.59621  ...  84.81109  -27.315842  4...
    nelem2:    6898577
    nelem3:    6898577
    npoin2:    3539584
    ...        ...
    iparam:    (1, 0, 0, 0, 0, 0, 0, 186317, 0, 1)
    var_IDs:   ['U', 'V', 'S']
    varnames:  ['VELOCITY U', 'VELOCITY V', 'FREE SURFACE']
    varunits:  [b'M/S             ', b'M/S             ', b'M               ']
    date:      (2023, 7, 1, 0, 0, 0)
    type:      2D

The time coordinate should instead be:
time (time) datetime64[ns] 2013-01-01 ... 2013-12-31T23:57:30 (example)

The change would also have to propagate back into ds.selafin.write(), which would need to assign:

  • the seconds in the core of the file
  • the first datetime of the file
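
A minimal sketch of the round trip, using only the standard library: seconds plus the header date become datetimes on read, and on write the datetimes are split back into a start date and seconds. The function names are hypothetical, and the write side assumes the first time step is the start date.

```python
from datetime import datetime, timedelta


def seconds_to_datetimes(start_date, seconds):
    """Convert seconds since start_date (a 6-item tuple) to datetime objects."""
    start = datetime(*start_date)
    return [start + timedelta(seconds=s) for s in seconds]


def datetimes_to_seconds(times):
    """Return (start date tuple, seconds elapsed since the first time step)."""
    start = times[0]
    date = (start.year, start.month, start.day,
            start.hour, start.minute, start.second)
    return date, [(t - start).total_seconds() for t in times]


# Values taken from the dataset printed above: date attribute and first times.
times = seconds_to_datetimes((2023, 7, 1, 0, 0, 0), [0.0, 3600.0, 7200.0])
date, seconds = datetimes_to_seconds(times)
```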

Add equation example

Hi @lucduron,

I see you added lots of tools to calculate parameters (in variable).
Could you provide a simple example showing how to calculate CHEZY BOTTOM FRICTION and add it to a geo file?

I could then add it to the README.

Thanks

add feature: ikle2 / ikle3 as variable

This feature would be handy for selecting and cropping windows of the selafin file.

See the crop function in Thalassa for an example.

Tasks:

  • add the 2D elements as a variable
  • add the 3D elements as a variable when dealing with a 3D Selafin
  • add the number of elements as a dimension
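
To illustrate why the connectivity is needed for cropping, here is a minimal sketch that keeps the triangles falling entirely inside a bounding box and renumbers their nodes. It is not the Thalassa implementation; the function name and the plain-list data layout are assumptions, and ikle is taken to be 0-based.

```python
def crop_mesh(x, y, ikle, xmin, xmax, ymin, ymax):
    """Keep triangles whose three nodes lie inside the box and renumber nodes.

    ikle is a list of 0-based node index triplets.
    Returns (kept_nodes, new_ikle), where kept_nodes holds the old indices.
    """
    inside = [xmin <= xi <= xmax and ymin <= yi <= ymax for xi, yi in zip(x, y)]
    kept_elems = [tri for tri in ikle if all(inside[n] for n in tri)]
    # Only nodes still referenced by a kept triangle survive the crop.
    kept_nodes = sorted({n for tri in kept_elems for n in tri})
    new_index = {old: new for new, old in enumerate(kept_nodes)}
    new_ikle = [[new_index[n] for n in tri] for tri in kept_elems]
    return kept_nodes, new_ikle


# Node 0 lies far outside the window, so the second triangle is dropped.
x = [5.0, 0.0, 1.0, 0.0]
y = [5.0, 0.0, 0.0, 1.0]
ikle = [[1, 2, 3], [0, 2, 3]]
kept_nodes, new_ikle = crop_mesh(x, y, ikle, -0.5, 1.5, -0.5, 1.5)
```

The list kept_nodes can then be used to select the matching entries along the node dimension of the data variables.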

Automate releases

@pmav99 I was wondering whether it is possible to avoid having to:

  • modify the version in setup.cfg, setup.py and pyproject.toml
  • build the tar and wheel
  • push to PyPI
  • push the changes to git

every time there is a PR into main.

Is it good practice? And is it doable?

Time stamp error

Hello,

I tried loading a geometry file using

import xarray as xr
ds = xr.open_dataset("geo.slf", engine="selafin")

and I get the following error message:

Traceback (most recent call last):
  File "/home/sebourban/opentelemac/test.py", line 2, in <module>
    ds = xr.open_dataset("geom_world06_stb.slf", engine="selafin")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sebourban/opentelemac/bin/miniforge/envs/otm-env/lib/python3.12/site-packages/xarray/backends/api.py", line 573, in open_dataset
    backend_ds = backend.open_dataset(
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sebourban/opentelemac/bin/miniforge/envs/otm-env/lib/python3.12/site-packages/xarray_selafin/xarray_backend.py", line 272, in open_dataset
    times = [datetime(*slf.header.date) + timedelta(seconds=t) for t in slf.time]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: datetime.datetime() argument after * must be an iterable, not NoneType

When I try with a time-varying file I get:
Format "" is unknown and is forced to "SERAFIN "
but maybe this is OK. It is just a little odd to have to confirm that I have a SERAFIN file when I know I do.

You probably know the answer already? Thanks for looking into it.
Sébastien.
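
The traceback points at datetime(*slf.header.date) being called while the header date is None, which is presumably the case for this geometry file. A minimal sketch of a guard, under the assumption that falling back to raw seconds is acceptable when no date is stored (build_times is a hypothetical name, not the backend's function):

```python
from datetime import datetime, timedelta


def build_times(date, seconds):
    """Return datetimes when the header has a date, else the raw seconds."""
    if date is None:
        # No start date in the header: keep the time axis in seconds.
        return list(seconds)
    start = datetime(*date)
    return [start + timedelta(seconds=t) for t in seconds]


no_date = build_times(None, [0.0])
with_date = build_times((2023, 7, 1, 0, 0, 0), [0.0, 3600.0])
```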

support for 1D result files

This concerns the particular case of 1D time series generated with the keywords:

TIME SERIES FILE 1             = 'results_1D.slf'
TIME SERIES COORDINATES FILE 1 = 'station.in'

in T2D. (I haven't tested yet in T3D or TOMAWAC)

I can provide the generated file:
results_1D.zip (to unzip)

Opening it with the PyTelTools Serafin reader gives me the following traceback:

>>> ds = xr.open_dataset('results_1D.slf', engine='selafin')
SERAFIN VALIDATION ERROR: Unknown mesh type
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray/backends/api.py", line 573, in open_dataset
    backend_ds = backend.open_dataset(
  File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/xarray_backend.py", line 268, in open_dataset
    slf = read_serafin(filename_or_obj, lang)
  File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/xarray_backend.py", line 22, in read_serafin
    resin.read_header()
  File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/Serafin.py", line 1020, in read_header
    self.header.from_file(self.file, self.file_size)
  File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/Serafin.py", line 885, in from_file
    self._check_dim()
  File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/Serafin.py", line 198, in _check_dim
    raise SerafinValidationError("Unknown mesh type")
xarray_selafin.Serafin.SerafinValidationError: Unknown mesh type

BlueKenue cannot open it either.

Opening it with the Selafin class, however, works without problems:

>>> slf = Selafin('results_1D.slf')
>>> slf.meshx
array([13.55799961, 13.23700047, 12.07441902, ...,  9.87355423,
        5.1170001 ,  5.99513721])
>>> slf.meshy
array([-12.33300018,  -8.7869997 , -15.15768051, ...,  57.59385681,
        61.93299866,  53.44028473])
>>> slf.get_values(0)
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
>>> slf.ikle3
array([[   0],
       [   1],
       [   2],
       ...,
       [1146],
       [1147],
       [1148]])
>>> slf.iparam
array([   1,    0,    0,    0,    0,    0,    0, 1149,    0,    1])

The problem is that the function

def _check_dim(self):
    # verify data consistence and determine 2D or 3D
    if self.is_2d:
        if self.nb_nodes_per_elem != 3:
            raise SerafinValidationError("Unknown mesh type")
    else:
        if self.nb_nodes_per_elem != 6:
            raise SerafinValidationError(
                "The number of nodes per element is not equal to 6"
            )
        if self.nb_planes < 2:
            raise SerafinValidationError("The number of planes is less than 2")

does not take this case into consideration, while the header class in Selafin might be more forgiving?
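
A minimal sketch of a more forgiving check, written as a standalone function so it can run on its own. It assumes the time series file is flagged as 2D with one node per element, which is what the single-column slf.ikle3 above suggests. The names mesh_type and MeshTypeError are hypothetical stand-ins, not the PyTelTools API.

```python
class MeshTypeError(Exception):
    """Stand-in for SerafinValidationError in this sketch."""


def mesh_type(is_2d, nb_nodes_per_elem, nb_planes):
    """Classify a mesh from its header values, accepting 1-node elements."""
    if is_2d:
        if nb_nodes_per_elem == 1:
            # Station time series: every "element" is a single point.
            return "1D"
        if nb_nodes_per_elem == 3:
            return "2D"
        raise MeshTypeError("Unknown mesh type")
    if nb_nodes_per_elem != 6:
        raise MeshTypeError("The number of nodes per element is not equal to 6")
    if nb_planes < 2:
        raise MeshTypeError("The number of planes is less than 2")
    return "3D"


kind = mesh_type(is_2d=True, nb_nodes_per_elem=1, nb_planes=0)
```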

It might also be relevant to open this issue directly in PyTelTools.

I'll let you decide @lucduron

fix dependencies metadata

Mentioned in #26 by @pmav99

The following tasks will streamline the metadata and fix the dependency issues mentioned above and in #2

  • move pytest to [tool.poetry.group.dev.dependencies]
  • matplotlib is not used in the package and can be removed
  • dask is not used in the package, and there is no intention of implementing it soon, so it can be removed too

Make dask an optional dependency

Minor issue, but I am opening the ticket so that I don't forget about it.

Currently, dask is a mandatory dependency of the xelafin backend. Nevertheless dask is an optional dependency of xarray.

My understanding is that it should be possible to develop the backend so that it can work either "eagerly" (without dask) or "lazily" (with dask). For now I guess that we should focus on getting either mode to work, but in the future we should test/support both.
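
As a rough sketch of supporting both modes, the caller could check whether dask is installed and only request chunked (lazy) loading in that case. The helper open_kwargs is hypothetical; chunks={} is the xarray.open_dataset option that wraps variables in dask arrays, and whether the selafin backend behaves well with it is exactly what would need testing.

```python
import importlib.util


def open_kwargs():
    """Return open_dataset keyword arguments depending on dask availability."""
    if importlib.util.find_spec("dask") is not None:
        # chunks={} asks xarray to wrap variables in dask arrays (lazy mode).
        return {"engine": "selafin", "chunks": {}}
    # Without dask, leave chunks unset so xarray does not try to use it.
    return {"engine": "selafin"}


kwargs = open_kwargs()
```

The result would be passed as xr.open_dataset(path, **kwargs).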

For the record, it might be easier to fix the issues we have with write in the eager mode.

Consolidation of setup.py, setup.cfg and pyproject.toml

There are 3 different files that contain metadata:

  • setup.py
  • setup.cfg
  • pyproject.toml

Are all 3 of them needed? Unless there are specific reasons that force the usage of e.g. both setup.py and pyproject.toml, I would suggest keeping just one of them. As it is, it is rather difficult to understand which file is being used, and all 3 files need to be kept in sync whenever the metadata gets updated.

ikle indexing: pyTelTools ikle = slf ikle + 1

pyTelTools ikle indexing (the triangle connectivity) is offset by 1 (it starts from 1) compared to the traditional Selafin indexing (which starts from 0).
Maybe I would add a warning (and, even better, an example with a triplot/tricontour plot using matplotlib) so that users understand the subtlety.
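
A minimal sketch of the conversion, using plain lists and a hypothetical helper name. matplotlib's triangulation functions expect 0-based triangle indices, so 1-based connectivity has to be shifted before plotting.

```python
def to_zero_based(ikle):
    """Shift 1-based connectivity (as described for pyTelTools) to 0-based."""
    return [[node - 1 for node in tri] for tri in ikle]


ikle_one_based = [[1, 2, 3], [2, 4, 3]]
ikle_zero_based = to_zero_based(ikle_one_based)
```

With a NumPy array the same shift is simply ikle - 1, and the result can be passed as the triangles argument of triplot or tricontour.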

dtype not fully set in lazy loading

I am not sure why yet, but simply printing the dataset throws the following error:

>>> ds = xr.open_dataset('r3d_tidal_flats.slf', engine = 'selafin')
>>> ds
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/core/dataset.py", line 2534, in __repr__
    return formatting.dataset_repr(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/reprlib.py", line 21, in wrapper
    result = user_function(self)
             ^^^^^^^^^^^^^^^^^^^
  File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/core/formatting.py", line 713, in dataset_repr
    nbytes_str = render_human_readable_nbytes(ds.nbytes)
                                              ^^^^^^^^^
  File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/core/dataset.py", line 1515, in nbytes
    return sum(v.nbytes for v in self.variables.values())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/core/dataset.py", line 1515, in <genexpr>
    return sum(v.nbytes for v in self.variables.values())
               ^^^^^^^^
  File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/namedarray/core.py", line 476, in nbytes
    return self.size * self.dtype.itemsize
           ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for *: 'int' and 'getset_descriptor'

It seems to be connected with the definition of the self.dtype class variable.
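
One way to get the same kind of error, which may or may not be what happens in the backend, is to store a NumPy scalar type class (such as np.float32) as dtype instead of a dtype instance. This sketch only mimics the size * dtype.itemsize expression from the traceback; the variable names are illustrative.

```python
import numpy as np

size = 10

# A scalar type class (not a dtype instance): itemsize is a descriptor here.
wrong_dtype = np.float32
try:
    nbytes = size * wrong_dtype.itemsize
    raised = False
except TypeError:
    raised = True

# A dtype instance exposes itemsize as a plain integer.
right_dtype = np.dtype(np.float32)
nbytes = size * right_dtype.itemsize
```

If this is the cause, wrapping the value in np.dtype(...) where the lazy array sets its dtype would be the fix to try.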

print(ds.variables) also throws the same error but:

print([var for var in ds.variables])
['Z', 'U', 'V', 'W', 'MUD', 'x', 'y', 'time']

works.

So far I have been able to work around this problem (partly because we can set lazy_loading = False when reading the file).
But the bug still needs to:

  • be identified
  • be added to tests too
