seareport / xarray-selafin
An xarray engine for opening Selafin files (TELEMAC)
License: The Unlicense
What do you think of removing the dimension "plan" for 2D (it would only be available in 3D)?
I would also remove some redundant data:
The aim is to be able to write a Selafin file directly from a simpler Dataset.
Compulsory data would be: x, y, ikle2 and data_vars. All other data would be optional.
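As a sketch, the proposed minimal Dataset could look like this (the variable and dimension names follow the repr shown later in this thread; the `ds.selafin.write` accessor and the small made-up mesh are assumptions for illustration):

```python
import numpy as np
import xarray as xr

# Sketch of the proposed minimal input for writing: only x, y, ikle2 and
# the data_vars would be compulsory; everything else would get defaults.
nnode = 4
ds = xr.Dataset(
    data_vars={"S": (("time", "node"), np.zeros((1, nnode)))},
    coords={
        "x": ("node", np.array([0.0, 1.0, 0.0, 1.0])),
        "y": ("node", np.array([0.0, 0.0, 1.0, 1.0])),
        "time": [0.0],
        "ikle2": (("nelem2", "ndp2"), np.array([[0, 1, 2], [1, 3, 2]])),
    },
)
# ds.selafin.write("out.slf")  # accessor name taken from this thread
```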
I am not sure if xarray-selafin wants to support exporting Selafin files as netCDF, but the way the xarray attributes are currently used is not compatible with netCDF. For example:
ds = xr.open_dataset("tests/data/r2d_tidal_flats.slf")
ds.to_netcdf("/tmp/out.nc")
results in:
TypeError: Invalid value for attr 'variables': {'U': ('VELOCITY U', 'M/S'), 'V': ('VELOCITY V', 'M/S'), 'H': ('WATER DEPTH', 'M'), 'S': ('FREE SURFACE', 'M'), 'B': ('BOTTOM', 'M')}. For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple
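Pending a proper fix in the backend, a hedged workaround is to stringify the dict-valued attrs before serializing (`sanitize_attrs` is a hypothetical helper, not part of the package):

```python
import xarray as xr

def sanitize_attrs(ds: xr.Dataset) -> xr.Dataset:
    # netCDF attrs must be str/number/array-like, so stringify
    # dict-valued attrs such as the 'variables' mapping above.
    clean = ds.copy()
    for key, value in list(clean.attrs.items()):
        if isinstance(value, dict):
            clean.attrs[key] = str(value)
    return clean

# usage (paths as in the report):
# sanitize_attrs(xr.open_dataset("tests/data/r2d_tidal_flats.slf")).to_netcdf("/tmp/out.nc")
```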
One would like to benefit from the fastest library available to load selafin files: TelemacFile (via HERMES)
There are two practical prerequisites to build this xarray package:
Point 2 requires the hermes libraries to be recognised in the Python environment.
I reproduced a minimal setup on a new branch: hermes.
Without the HERMES library:
$ python tests/perf_test.py
Warning: Using SerafinFile. It is recommended to compile Hermes api
Time taken by telemac: 0.049854278564453125 seconds
Time taken by selafin: 0.007260560989379883 seconds
To get the library working, I first needed to see which libraries were required:
$ ldd _hermes.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffe06530000)
libhermes4api.so => not found
libspecial4api.so => not found
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8275c00000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8275ebd000)
I had to load my telemac conda environment by doing:
mamba activate telemac
mamba activate --stack slf
(slf being the environment where I have the package), and all libraries were then recognised in the path:
$ ldd _hermes.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffc27f09000)
libhermes4api.so => /home/tomsail/miniconda3/envs/telemac/opentelemac/builds/gnu.dynamic/wrap_api/lib/libhermes4api.so (0x00007ff6e2e00000)
libspecial4api.so => /home/tomsail/miniconda3/envs/telemac/opentelemac/builds/gnu.dynamic/wrap_api/lib/libspecial4api.so (0x00007ff6e35c3000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff6e2a00000)
libmpi_usempif08.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/libmpi_usempif08.so.40 (0x00007ff6e356b000)
libmpi_usempi_ignore_tkr.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/libmpi_usempi_ignore_tkr.so.40 (0x00007ff6e355b000)
libmpi_mpifh.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/libmpi_mpifh.so.40 (0x00007ff6e2d96000)
libmpi.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/libmpi.so.40 (0x00007ff6e2c74000)
libgfortran.so.5 => /home/tomsail/miniconda3/envs/telemac/lib/libgfortran.so.5 (0x00007ff6e2855000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ff6e276e000)
libgcc_s.so.1 => /home/tomsail/miniconda3/envs/telemac/lib/libgcc_s.so.1 (0x00007ff6e2c59000)
libquadmath.so.0 => /home/tomsail/miniconda3/envs/telemac/lib/libquadmath.so.0 (0x00007ff6e2735000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ff6e2c54000)
/lib64/ld-linux-x86-64.so.2 (0x00007ff6e35f1000)
libopen-pal.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/./libopen-pal.so.40 (0x00007ff6e2635000)
libopen-rte.so.40 => /home/tomsail/miniconda3/envs/telemac/lib/./libopen-rte.so.40 (0x00007ff6e257b000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007ff6e2c4f000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007ff6e2c4a000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007ff6e2c45000)
libz.so.1 => /home/tomsail/miniconda3/envs/telemac/lib/././libz.so.1 (0x00007ff6e2c2a000)
$ python tests/perf_test.py
Time taken by telemac: 0.00033593177795410156 seconds
Time taken by selafin: 0.007600069046020508 seconds
@pmav99,
the good news: it appears that the hermes libraries are correctly installed in my conda environment (I double-checked it)
the tricky part is that I don't know how to make this environment minimal now..
A hint maybe for CI builds could be the matrix build I have set up for the conda environment.
ping @nicogodet, I think this might be interesting for you once we figure out something minimal, so you can adapt it for Windows. Let me know also if you have any idea to make _hermes more portable.
Saving this so I don't forget it:
example of a current loaded dataset:
<xarray.Dataset> Size: 63GB
Dimensions: (time: 745, node: 3539584, plan: 1, nelem2: 6898577, ndp2: 3,
nelem3: 6898577, ndp3: 3)
Coordinates:
x (node) float32 14MB ...
y (node) float32 14MB ...
* time (time) float64 6kB 0.0 3.6e+03 7.2e+03 ... 2.675e+06 2.678e+06
ikle2 (nelem2, ndp2) >i4 83MB ...
ikle3 (nelem3, ndp3) >i4 83MB ...
Dimensions without coordinates: node, plan, nelem2, ndp2, nelem3, ndp3
Data variables:
VELOCITY U (time, node, plan) float64 21GB ...
VELOCITY V (time, node, plan) float64 21GB ...
FREE SURFACE (time, node, plan) float64 21GB ...
Attributes: (12/17)
title: TELEMAC 2D Model
meshx: [ 145.26067 -117.35177 -96.901276 ... 158.9151 -122....
meshy: [ 15.181208 -27.162487 69.59621 ... 84.81109 -27.315842 4...
nelem2: 6898577
nelem3: 6898577
npoin2: 3539584
... ...
iparam: (1, 0, 0, 0, 0, 0, 0, 186317, 0, 1)
var_IDs: ['U', 'V', 'S']
varnames: ['VELOCITY U', 'VELOCITY V', 'FREE SURFACE']
varunits: [b'M/S ', b'M/S ', b'M ']
date: (2023, 7, 1, 0, 0, 0)
type: 2D
time should instead be:
time (time) datetime64[ns] 2013-01-01 ... 2013-12-31T23:57:30
(example)
This would also propagate back into ds.selafin.write(), to assign the seconds in the core file and also the first datetime of the file.
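A minimal sketch of the proposed conversion, assuming the header `date` tuple and float-seconds time axis shown in the repr above:

```python
import numpy as np
from datetime import datetime, timedelta

# Origin and seconds follow the attrs/coords shown in the repr above.
date = (2023, 7, 1, 0, 0, 0)
seconds = np.array([0.0, 3600.0, 7200.0])

origin = datetime(*date)
times = np.array(
    [origin + timedelta(seconds=float(s)) for s in seconds],
    dtype="datetime64[ns]",
)
# Writing back would invert this:
# seconds = (times - times[0]) / np.timedelta64(1, "s")
```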
Title says it all :)
This feature would be handy for selecting and cropping windows of the selafin
file.
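One possible shape for such a crop on an unstructured mesh (`crop_mesh` is a hypothetical helper, not the Thalassa implementation; it assumes 0-based connectivity):

```python
import numpy as np

def crop_mesh(x, y, ikle, bbox):
    # Keep elements whose nodes all fall inside bbox, then remap the
    # (0-based) connectivity to the surviving nodes.
    xmin, ymin, xmax, ymax = bbox
    inside = (x >= xmin) & (x <= xmax) & (y >= ymin) & (y <= ymax)
    keep = inside[ikle].all(axis=1)
    old_nodes = np.unique(ikle[keep])
    remap = np.full(x.size, -1, dtype=np.int64)
    remap[old_nodes] = np.arange(old_nodes.size)
    return x[old_nodes], y[old_nodes], remap[ikle[keep]]
```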
See an example of a crop function in Thalassa.
Tasks:
@pmav99 I was wondering if it was possible to avoid having all three of setup.cfg, setup.py and pyproject.toml.
Is it good practice? And is it doable?
Hello,
I tried loading a geometry file using
import xarray as xr
ds = xr.open_dataset("geo.slf", engine="selafin")
and I get the following error message:
Traceback (most recent call last):
File "/home/sebourban/opentelemac/test.py", line 2, in <module>
ds = xr.open_dataset("geom_world06_stb.slf", engine="selafin")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sebourban/opentelemac/bin/miniforge/envs/otm-env/lib/python3.12/site-packages/xarray/backends/api.py", line 573, in open_dataset
backend_ds = backend.open_dataset(
^^^^^^^^^^^^^^^^^^^^^
File "/home/sebourban/opentelemac/bin/miniforge/envs/otm-env/lib/python3.12/site-packages/xarray_selafin/xarray_backend.py", line 272, in open_dataset
times = [datetime(*slf.header.date) + timedelta(seconds=t) for t in slf.time]
^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: datetime.datetime() argument after * must be an iterable, not NoneType
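The traceback suggests that slf.header.date is None for geometry files. A hedged workaround sketch (the 1900-01-01 fallback origin is an assumption for illustration, not documented behaviour of the format or the backend):

```python
from datetime import datetime, timedelta

def build_times(header_date, seconds):
    # Geometry files can carry no date in the header (slf.header.date is
    # None), which is what makes datetime(*slf.header.date) blow up.
    # Fall back to an arbitrary origin in that case.
    origin = datetime(*header_date) if header_date else datetime(1900, 1, 1)
    return [origin + timedelta(seconds=float(t)) for t in seconds]
```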
When I try on a time-varying file I get:
Format "" is unknown and is forced to "SERAFIN "
but maybe this is OK - it is just a little odd to have to confirm that I have a SERAFIN file when I know I do.
You probably know the answer already? Thanks for looking into it.
Sébastien.
This concerns the particular case of 1D time series generated with the keywords:
TIME SERIES FILE 1 = 'results_1D.slf'
TIME SERIES COORDINATES FILE 1 = 'station.in'
in T2D. (I haven't tested yet in T3D or TOMAWAC)
I can provide the file generated:
results_1D.zip (to unzip)
opening it with PyTelTools Serafin gives me the following traceback:
ds = xr.open_dataset('results_1D.slf', engine = 'selafin')
SERAFIN VALIDATION ERROR: Unknown mesh type
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray/backends/api.py", line 573, in open_dataset
backend_ds = backend.open_dataset(
File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/xarray_backend.py", line 268, in open_dataset
slf = read_serafin(filename_or_obj, lang)
File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/xarray_backend.py", line 22, in read_serafin
resin.read_header()
File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/Serafin.py", line 1020, in read_header
self.header.from_file(self.file, self.file_size)
File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/Serafin.py", line 885, in from_file
self._check_dim()
File "/home/tomsail/work/python/pyPoseidon/.venv/lib/python3.10/site-packages/xarray_selafin/Serafin.py", line 198, in _check_dim
raise SerafinValidationError("Unknown mesh type")
xarray_selafin.Serafin.SerafinValidationError: Unknown mesh type
Neither can BlueKenue open it.
Opening it with the Selafin class, however, works without problem:
slf= Selafin('results_1D.slf')
>>> slf.meshx
array([13.55799961, 13.23700047, 12.07441902, ..., 9.87355423,
5.1170001 , 5.99513721])
>>> slf.meshy
array([-12.33300018, -8.7869997 , -15.15768051, ..., 57.59385681,
61.93299866, 53.44028473])
>>> slf.get_values(0)
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
>>> slf.ikle3
array([[ 0],
[ 1],
[ 2],
...,
[1146],
[1147],
[1148]])
>>> slf.iparam
array([ 1, 0, 0, 0, 0, 0, 0, 1149, 0, 1])
The problem is that the function
def _check_dim(self):
# verify data consistence and determine 2D or 3D
if self.is_2d:
if self.nb_nodes_per_elem != 3:
raise SerafinValidationError("Unknown mesh type")
else:
if self.nb_nodes_per_elem != 6:
raise SerafinValidationError(
"The number of nodes per element is not equal to 6"
)
if self.nb_planes < 2:
raise SerafinValidationError("The number of planes is less than 2")
does not take this case into consideration, while the header class in Selafin might be more forgiving.
It might also be relevant to report this issue directly in PyTelTools.
I let you decide @lucduron
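For illustration, here is a sketch of how the check could be relaxed to accept these files (treating one node per element as a 1D/point time-series mesh is an assumption, not validated against the TELEMAC format spec):

```python
class SerafinValidationError(Exception):
    pass

def check_dim(is_2d, nb_nodes_per_elem, nb_planes):
    # Accept point "meshes" produced by TIME SERIES FILE (ikle has one
    # node per element) in addition to the usual triangles and prisms.
    if nb_nodes_per_elem == 1:
        return "1D"  # hypothetical label for time-series files
    if is_2d:
        if nb_nodes_per_elem != 3:
            raise SerafinValidationError("Unknown mesh type")
        return "2D"
    if nb_nodes_per_elem != 6:
        raise SerafinValidationError(
            "The number of nodes per element is not equal to 6"
        )
    if nb_planes < 2:
        raise SerafinValidationError("The number of planes is less than 2")
    return "3D"
```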
The following tasks will streamline metadata and fix the dependency issues mentioned above and in #2
Minor issue, but I am opening the ticket so that I don't forget about it.
Currently, dask is a mandatory dependency of the xarray-selafin backend. Nevertheless, dask is an optional dependency of xarray.
My understanding is that it should be possible to develop the backend so that it can work either "eagerly" (without dask) or "lazily" (with dask). For now I guess that we should focus on getting either mode to work, but in the future we should test/support both.
For the record, it might be easier to fix the issues we have with write in the eager mode.
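A sketch of how the backend could degrade gracefully when dask is absent (`maybe_lazy` is a hypothetical helper, not existing backend code):

```python
import numpy as np

def maybe_lazy(array, use_dask=True, chunks="auto"):
    # Wrap in a dask array only when dask is both requested and
    # importable, so the backend falls back to eager numpy otherwise.
    if use_dask:
        try:
            import dask.array as da
            return da.from_array(np.asarray(array), chunks=chunks)
        except ImportError:
            pass
    return np.asarray(array)
```

With a guard like this, dask could move from a hard requirement to an optional extra.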
There are 3 different files that contain metadata:
setup.py
setup.cfg
pyproject.toml
Are all 3 of them needed? Unless there are specific reasons that force the usage of e.g. both setup.py and pyproject.toml, I would suggest keeping just one of them. As it is, it is rather difficult to understand which file is being used, and all 3 files need to be kept in sync when the metadata gets updated.
we should have a conda package, too.
This is needed e.g. for inspectds
pyTelTools ikle indexing (or triangle connectivity) is offset by 1 (it starts from 1) compared to the traditional Selafin indexing (which starts from 0).
Maybe I would add a warning (and even better, an example with a triplot/tricontour plot using matplotlib) so that users understand the subtlety.
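Such an example could look like this (the tiny two-triangle mesh and the 1-based ikle array are made up for illustration; matplotlib's Triangulation expects 0-based indices):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.tri as mtri
import matplotlib.pyplot as plt

# Made-up 2-triangle mesh with 1-based connectivity (PyTelTools style)
ikle_one_based = np.array([[1, 2, 3], [2, 4, 3]])
x = np.array([0.0, 1.0, 0.0, 1.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

# Subtract 1 to get the 0-based indices matplotlib expects
tri = mtri.Triangulation(x, y, triangles=ikle_one_based - 1)

fig, ax = plt.subplots()
ax.triplot(tri)              # wireframe of the mesh
ax.tricontourf(tri, x + y)   # filled contours of a dummy nodal field
fig.savefig("mesh.png")
```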
Purely cosmetic, but I would avoid names that are too long, also on PyPI.
I am not sure why yet, but a simple printing of the dataset throws out the following error:
>>> ds = xr.open_dataset('r3d_tidal_flats.slf', engine = 'selafin')
>>> ds
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/core/dataset.py", line 2534, in __repr__
return formatting.dataset_repr(self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/reprlib.py", line 21, in wrapper
result = user_function(self)
^^^^^^^^^^^^^^^^^^^
File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/core/formatting.py", line 713, in dataset_repr
nbytes_str = render_human_readable_nbytes(ds.nbytes)
^^^^^^^^^
File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/core/dataset.py", line 1515, in nbytes
return sum(v.nbytes for v in self.variables.values())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/core/dataset.py", line 1515, in <genexpr>
return sum(v.nbytes for v in self.variables.values())
^^^^^^^^
File "/home/tomsail/miniconda3/envs/xr-slf/lib/python3.11/site-packages/xarray/namedarray/core.py", line 476, in nbytes
return self.size * self.dtype.itemsize
~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for *: 'int' and 'getset_descriptor'
It seems to be connected with the definition of the self.dtype class variable.
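One plausible cause, mirroring the error message rather than a confirmed diagnosis of the backend code: storing the scalar *type* (e.g. np.float32) instead of an np.dtype instance makes `.itemsize` resolve to a descriptor, which cannot be multiplied by an int.

```python
import numpy as np

# Hypothesis only: on the type object, .itemsize is a descriptor rather
# than an int, so size * dtype.itemsize raises exactly this TypeError.
bad_dtype = np.float32               # type object, not a dtype instance
good_dtype = np.dtype(np.float32)    # concrete dtype instance

nbytes = 10 * good_dtype.itemsize    # works as xarray's nbytes expects
```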
print(ds.variables)
also throws the same error but:
print([var for var in ds.variables])
['Z', 'U', 'V', 'W', 'MUD', 'x', 'y', 'time']
works.
So far I have been able to work around this problem (also because we can set lazy_loading = False when reading the file).
But the bug still needs to be fixed.
A PyPI package is available:
https://pypi.org/project/xarray-selafin-backend/
@lucduron may I add you to the authors and contact persons? I think you're the most competent person to be listed there :)
@sebourban, I don't think it is relevant to put someone from EDF at this point but let me know if you think otherwise.