cdo-bindings's Introduction

Cdo.{rb,py} - Use Ruby/Python to access the power of CDO

Welcome to the scripting interfaces of CDO! This repository contains interfaces for Ruby and Python. If you are not sure whether this is useful or not, please have a look at: Why the .... should I use this???

What's going on

Currently this package is in a re-design phase. The target is a 2.0 release that will not be compatible with the existing release 1.5.x:

  • Write operator chains like methods chains with . as much as possible
  • hopefully reduce the number of kwargs keys
  • keep the Ruby and Python interface similar
  • possibly drop python-2.x support ... I am not sure when to do this best

Installation

Releases are distributed via pypi and rubygems:

  • Ruby
    gem install cdo (--user-install)
  • Python
    pip install cdo (--user)
    conda -c conda-forge install python-cdo

Requirements

Cdo.{rb,py} requires a working CDO binary and Ruby 2.x or Python 2.7/3.x

PLEASE NOTE: python-2.7 is unmaintained since January 2021. Many dependencies dropped support for 2.7, so I only do manual testing with it.

Multi-dimensional arrays (numpy for python, narray for ruby) require additional netcdf-io modules. These are scipy or python-netcdf4 for python and ruby-netcdf for ruby. Because scipy has some difficulties with netcdf, I dropped the support of it with release 1.5.0.

Thx to Alexander Winkler there is also an IO option for XArray.

Usage

You can find a lot of examples in the unit tests for both languages. Here are the direct links to the ruby tests and the python tests.

The following describes the basic features for both languages

Run operators

Before calling operators, you have to create an object first:

    cdo = Cdo.new   #ruby
    cdo = Cdo()     #python

Please check the documentation for constructor parameters. I try to have equal interfaces in both languages for all public methods.

Choose CDO binary

By default the cdo-bindings use the 'cdo' binary found in your $PATH variable. To change that, you can

  • load a module before calling your script (module command or another package manager like conda or spack)
  • use the CDO environment variable to set the path to be used
  • use the python/ruby method cdo.setCdo('/path/to/the/CDO/executable/you/want'). By this technique you can create different objects for different CDO versions.
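
The three options above can be pictured as a lookup order. The following is a minimal sketch, not the bindings' own code: the function name, the example paths and the precedence (explicit path, then the CDO environment variable, then $PATH) are assumptions made for illustration.

```python
import os
import shutil


def resolve_cdo_binary(explicit=None, environ=None):
    """Return the CDO executable to use, or None if nothing is found.

    Assumed precedence (not taken from the bindings' source):
    explicit path > CDO environment variable > 'cdo' found on PATH.
    """
    environ = os.environ if environ is None else environ
    if explicit:
        return explicit
    if environ.get("CDO"):
        return environ["CDO"]
    return shutil.which("cdo", path=environ.get("PATH"))


# Hypothetical installation paths: an explicit path wins over the variable.
env = {"CDO": "/opt/cdo-1.9.9/bin/cdo", "PATH": ""}
print(resolve_cdo_binary("/opt/cdo-2.0.0/bin/cdo", env))
print(resolve_cdo_binary(None, env))
```

Passing the environment as a plain dictionary keeps the helper easy to try out without touching the real shell environment.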

Debugging

For debugging purposes, both interfaces provide a "debug" attribute. If it is set to a boolean true, the complete commands and the return values are printed during execution:

    cdo.debug = true    #ruby
    cdo.debug = True    #python

The default is false, of course.

File information

    cdo.infov(input: ifile)        #ruby
    cdo.showlevels(input: ifile)
    cdo.infov(input=ifile)         #python
    cdo.showlevels(input=ifile)

Operators with user defined regular output files

    cdo.timmin(input: ifile ,output: ofile)       #ruby
    cdo.timmin(input = ifile,output = ofile)      #python

By default, the return value of each call is the name of the output file(s), no matter whether they are temporary files or not.

Use temporary output files

If the output key is left out, one or more (depending on the operator) temporary files are generated and used as return value(s). In a regular script or a regularly closed interactive session, these files are removed at the end automatically.

    tminFile = cdo.timmin(input: ifile)  #ruby
    tminFile = cdo.timmin(input = ifile) #python

However, these tempfiles remain if the session/script is killed with SIGKILL or if the bindings are used via Jupyter notebooks. Those sessions are usually long-lasting, and heavy usage of tempfiles can easily fill the system tempdir; your system will become unusable then. The bindings offer two ways to cope with that:

  • Set another directory for storing tempfiles with a constructor option and remove anything left in there after a crash or a similar incident:
   cdo = Cdo(tempdir=tempPath)      #python
   cdo = Cdo.new(tempdir: tempPath) #ruby
  • remove all tempfiles created by this or former usage of the cdo-bindings belonging to your current Unix user (taking into account the user-defined tempdir from above):
   cdo.cleanTempDir() #python
   cdo.cleanTempDir   #ruby

Alternatively you can use environment variables to set this. Python's and Ruby's tempfile libraries support the variables 'TMPDIR', 'TEMP' and 'TMP' in their current versions (python-3.8.2, ruby-2.7.0). This feature might be used by administrators to keep users from filling up system directories.
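
The effect of TMPDIR can be checked with Python's standard tempfile module alone. This is a small sketch with a throwaway scratch directory created only for the demonstration; it does not call the bindings.

```python
import os
import tempfile

# Throwaway scratch directory; in practice this would be a path you own.
scratch = tempfile.mkdtemp(prefix="cdo_scratch_")

os.environ["TMPDIR"] = scratch
tempfile.tempdir = None          # drop the cached default so TMPDIR is re-read

# Any temp file created from now on lands in the scratch directory.
with tempfile.NamedTemporaryFile(suffix=".nc") as tmp:
    print(os.path.dirname(tmp.name))
```

In a shell, the same is achieved by exporting TMPDIR before starting Python or Jupyter, which avoids having to reset the cached value.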

Operators with parameter

    cdo.remap([gridfile,weightfile],input:   ifile, output: ofile)   #ruby
    cdo.remap([gridfile,weightfile],input = ifile, output = ofile)   #python

Logging

    cdo = Cdo.new(logging: true, logFile: 'cdo_commands.log') #ruby
    cdo = Cdo(logging=True, logFile='cdo_commands.log')       #python

Set global CDO options

    cdo.copy(input:  ifile, output:  ofile,options:  "-f nc4")     #ruby
    cdo.copy(input = ifile, output = ofile,options = "-f nc4")     #python

Set environment variables

    cdo.splitname(input: ifiles.join(' '),
                  output: 'splitTag',
                  env: {'CDO_FILE_SUFFIX' => '.nc'}) #ruby, or
    cdo.env = {'CDO_FILE_SUFFIX' => '.nc'}
    cdo.splitname(input = ' '.join(ifiles),
                  output =  'splitTag',
                  env={"CDO_FILE_SUFFIX": ".nc"})   #python, or
    cdo.env = {'CDO_FILE_SUFFIX': '.nc'}
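
How the bindings hand these variables to the CDO process is not shown here. As a general pattern, per-call settings are usually overlaid on the current environment before a subprocess is started. The sketch below uses only the standard library, with a tiny Python child process standing in for the CDO binary.

```python
import os
import subprocess
import sys

call_env = {"CDO_FILE_SUFFIX": ".nc"}

# Start from the current environment so PATH etc. survive, then overlay the
# per-call settings.
merged = {**os.environ, **call_env}

# A tiny Python child process stands in for the CDO binary here.
child = [sys.executable, "-c", "import os; print(os.environ['CDO_FILE_SUFFIX'])"]
result = subprocess.run(child, env=merged, capture_output=True, text=True, check=True)
print(result.stdout.strip())
```

Passing only `call_env` instead of the merged dictionary would drop PATH and similar variables for the child process, which is an easy way to end up with a binary that cannot be found.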

Return multi-dimensional arrays

    t = cdo.fldmin(:input => ifile,:returnArray => true).var('T').get  #rb, version <  1.2.0
    t = cdo.fldmin(:input => ifile,:returnCdf => true).var('T').get    #rb, version >= 1.2.0
    t = cdo.fldmin(:input => ifile,:returnArray => 'T')                #rb, version >= 1.2.0
    t = cdo.fldmin(input = ifile,returnArray = True).variables['T'][:] #py, version <  1.2.0
    t = cdo.fldmin(input = ifile,returnCdf = True).variables['T'][:]   #py, version >= 1.2.0
    t = cdo.fldmin(input = ifile,returnArray = 'T')                    #py, version >= 1.2.0

Other options are so-called masked arrays (use returnMaArray) for Ruby and Python, and XArray/XDataset for Python only: use returnXArray or returnXDataset for that.

*) If you use scipy >= 0.14 as the netcdf backend, you have to use the following code instead to avoid possible segmentation faults:

    cdf = cdo.fldmin(input = ifile,returnCdf = True)
    temperatures = cdf.variables['T'][:]

More examples can be found in test/cdo-examples.rb and on the homepage

Avoid re-processing

If you do not want to re-compute files, you can set

  • the instance attribute 'forceOutput' to false: this will affect all later calls of that instance, or
  • the operator option 'forceOutput' to false: this will only affect this operator call of this instance

For more information, please have a look at the unit tests.
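
The decision described above can be sketched in a few lines. This is an illustration of the stated semantics only: the function and parameter names are made up, and the default of true for the instance-wide setting is an assumption.

```python
import os
import tempfile


def should_run(ofile, instance_force=True, call_force=None):
    """Decide whether an operator call needs to be executed.

    A per-call setting (call_force), when given, overrides the instance-wide one.
    """
    force = instance_force if call_force is None else call_force
    return force or not os.path.isfile(ofile)


with tempfile.NamedTemporaryFile(suffix=".nc") as existing:
    print(should_run(existing.name))                         # forced by default
    print(should_run(existing.name, instance_force=False))   # output exists: skip
    print(should_run(existing.name, instance_force=False, call_force=True))
```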

Support, Issues, Bugs, ...

Please use the forum or ticket system of CDO's official web page: http://code.mpimet.mpg.de/projects/cdo

Changelog

  • next 2.0:
    • reduced usage of keywords:
      • many of them just set return type, so they will go to the run() method
      • options only have an effect during the run of the tool, so this can also go into run()
      • the different input types can be handled in something like input() or infiles(). This should clean up the lengthy code, which does this currently
  • 1.6.0:
    • merged pull requests regarding threading, signal and tempfile handling
    • added support for xarray/xdataset input AND output at the same time (request by Pauline Millet)
  • 1.5.6:
    • slight adaptations for CDO-2.0.0
    • limited support for python-2.7: many other libs dropped support for it, so I can only do limited testing
    • new API: cdo.config holds a dictionary/hash with built-in CDO features (available since CDO-1.9.x), empty otherwise
    • removed cdo.hasLib() and cdo.libsVersion(): they relied on unstable output of cdo -V. Use cdo.config instead
  • 1.5.4(python-only):
    • bugfix release for @pgierz regarding #30
  • 1.5.1(ruby-only):
    • fix some warnings with latest ruby release 2.7.x
  • 1.5.0(ruby)/1.5.3(python) API change :
    • simplify the interface:
      • remove returnCdf from constructor, only use it with operator calls
      • remove methods setReturnArray/unsetReturnArray: I fear it's not used anyway, but 'returnArray' in each call
      • remove the optional dependency to scipy since it offers less functionality than netCDF4 and just blows up the code
      • new attributes: hasNetcdf, hasXArray for checking for the respective support
      • fix for cdo-1.9.6: allow non-zero return code for diff operators
    • the 1.5.2 release for python is identical to 1.5.0 (I was testing a new setup.py, and a version cannot be used twice on PyPI)
    • 1.5.1 had building problems with pip based on python2
  • 1.4.0 API change :
    • the operators attribute is no longer a list, but a dict (python) or hash (ruby) holding the number of output streams as value
    • finally fix #16 (missing tempfile generation for more than one output streams)
    • fix #19 (thx @pgierz for the input)
  • 1.3.6:
    • bugfix for non-finding the CDO binary on some systems
    • fix hasCdo (py)
    • add hasCdo (rb)
  • 1.3.5:
    • read/write support for XArray datasets - thx to @pinplex!
    • drop ruby support for 1.9 and older
    • remove module interface from the ruby version
  • 1.3.3:
    • return arrays/lists of output files, which are created by split* operators (suggestion from Karl-Hermann Wieners 🌊). NOTE: this is done by simple globbing! Any other files with the appropriate name will be included in the list!
    • use six for python2 and 3 compatibility (thanks to @jvegasbsc)
    • drop full support of CDO versions older than 1.5.4: undocumented operators in these versions will not be callable
    • new keyword for operators which write to stdout: autoSplit. When set, each line will be split with the given value of the keyword to avoid the need for manual splitting. Nested return arrays of (outer) size 1 are flattened. See #11, thx to @beatorizu
  • 1.3.2
    • improved stdout/stderr handling, thx to jvegasbsc
  • 1.3.1
    • fix environment handling per call (ruby version)
  • 1.3.0
    • require ruby-2.*
    • support for upcoming CDO release 1.7.1
    • improve logging for ruby
    • introduce logging for python
    • unicode bugfix - thanks to Sebastian Illing (illing2005) [python-only]
  • 1.2.7
    • Added class interface for ruby version 2.x, mainly for thread safety
  • 1.2.6
    • bugfix for autocompletion in interactive usage [python-only]
  • 1.2.5
    • bugfix for environment handling (Thanks philipp) [python-only]
    • add logging [ruby-only]
  • 1.2.4
    • support python3: Thanks to @jhamman
    • bugfix for scipy: Thanks to @martinclaus
    • docu fixes: Thanks to @guziy
    • allow environment setting via call and object construction (see test_env in test_cdo.py)
  • 1.2.3
    • bugfix release: adjust library/feature check to latest cdo-1.6.2 release
  • 1.2.2
    • allow arrays in additions to strings for input argument
    • add methods for checking the IO libraries of CDO and their versions
    • optionally return None on error (suggestion from Alex Loew, python only)
  • 1.2.1
    • new return option: Masked Arrays. If the new keyword returnMaArray is given, its value is taken as variable name and a masked array with respect to its FillValues is returned (contribution for python by Alex Loew)
    • error handling: return stderr in case of non-zero return value + raise exception (contribution for python from Estanislao Gonzalez)
    • autocompletion and built-in documentation through help() for interactive use (contribution from Estanislao Gonzalez) [python]
    • Added help operator for displaying help interactively [ruby]
  • 1.2.0 API change:
    • Ruby now uses the same keys as the python interface, i.e. :input and :output instead of :in and :out
    • :returnArray will accept a variable name, for which the multidimensional array is returned
  • 1.1.0 API change:
    • new option :returnCdf : will return the netcdf file handle, which was formerly done via :returnArray
    • new option :force : if set to true, the cdo call will be run even if the given output file is present, default: false

License

Cdo.{rb,py} makes use of the BSD-3-clause license

cdo-bindings's Issues

Hard coded `-s`

Moin Brian @Try2Code,

it seems like I have no option to access the stdout of operators which makes it hard to use verifygrid or cmor.
Could we implement sth to enable that? Maybe a logfile or sth?

Best,
Fabi

`muldpm` was `muldpy` in conda cdo

In conda version, muldpm multiplies days of year. The correct one should be days of month.
Any idea how to fix this error?

cdo -f nc4 -z zip_1 -muldpm infile outfile

Apply new syntax for calls with multiple inputs

With the new syntax cdo.op1().op2().op3().run() the question comes up: How to handle operators with multiple inputs.
Here is one idea I'd like to discuss @Chilipp

cdo -add -fldmean -topo,global_0.2 -fldmax -topo,'global_1' /tmp/CdoTemp234.grb 

generated from

cdo = Cdo()
cdo.add(input=[cdo.fldmean.topo('global_0.2'),cdo.fldmax.topo('global_1')]).run()

Any other idea how to write this down?
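
To make the idea concrete, here is a toy chain builder that turns the call above into the command string. It is a sketch for discussion only: the `Chain` class and its `command()` method are hypothetical, and output files and actual execution are left out.

```python
class Chain:
    """Toy operator chain: attribute access appends an operator,
    a call adds its parameters and, optionally, nested input chains."""

    def __init__(self, parts=()):
        self._parts = tuple(parts)

    def __getattr__(self, name):
        # Only reached for unknown attributes; keep private names out of the chain.
        if name.startswith("_"):
            raise AttributeError(name)
        return Chain(self._parts + ("-" + name,))

    def __call__(self, *params, input=()):
        if not self._parts:
            raise TypeError("call an operator, not the empty chain")
        # Parameters are attached to the most recent operator: -topo,global_1
        head = ",".join([self._parts[-1], *map(str, params)])
        parts = self._parts[:-1] + (head,)
        for sub in input:
            parts += sub._parts
        return Chain(parts)

    def command(self):
        return " ".join(("cdo",) + self._parts)


cdo = Chain()
cmd = cdo.add(input=[cdo.fldmean.topo("global_0.2"),
                     cdo.fldmax.topo("global_1")]).command()
print(cmd)
```

Each step returns a new `Chain`, so the starting object can be reused for both inputs without the two sub-chains interfering with each other.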

wrapper does not find cdo in conda env

Since version 1.3.4 it looks like the initialization of the cdo wrapper has changed. I'm installing both cdo and python-cdo via conda. But the cdo wrapper can not find the cdo tool in the PATH. I need to set the PATH when I initialize the cdo wrapper:

import os
from cdo import Cdo
cdo = Cdo(env=os.environ)

threaded cdo

hi @Try2Code ,
thanks again for this, i like to use cdo through these bindings. i have an issue that just recently came up when i try to use dask.distributed to use the rather heavy sp2gpl operator on ERA5 data. I like to delay the execution and run then on several timesteps in parallel. i'll try to make a reproducible case here, e.g.,

setup dask client

from dask.distributed import Client, progress
client = Client()
client

test dataset

import xarray as xr
ds = xr.tutorial.open_dataset("air_temperature")
ds.sel(time='2013-01-01')

some operator that could be embarrassingly parallel:

from cdo import Cdo

def test_op(input):
    return Cdo().setgridtype('curvilinear', options='-f nc4', input=input)

lazy evaluation with dask: write out each timestep into a single file and run the operator on that.

def apply(ds, use_dask=False):
    if use_dask:
        from dask import delayed
    else:
        def delayed(x):
            return x
    output = []
    for ti in ds.time:
        # expand_dims so that cdo's to_netcdf call keeps time coordinate in nc file.
        ti_res = delayed(test_op)(ds.sel(time = ti).expand_dims(dim='time'))
        output.append(ti_res)
    return output

Now apply our operator onto the dataset (just the first 10 timesteps for testing...)

out = apply(ds.isel(time=range(0,10)), use_dask=True)
out

This all works fine, however, when i actually trigger the computation, i get this signal error:

import dask
%time out_ = dask.compute(out)

proposed solution

The problem seems to be with signal handling in threads. I could actually avoid the error if the signal handling only runs in the main thread, e.g., something like this in cdo.py

if threading.current_thread() is threading.main_thread():
    signal.signal(signal.SIGINT, self.__catch__)
    signal.signal(signal.SIGTERM, self.__catch__)
    signal.signal(signal.SIGSEGV, self.__catch__)
    signal.siginterrupt(signal.SIGINT, False)
    signal.siginterrupt(signal.SIGTERM, False)
    signal.siginterrupt(signal.SIGSEGV, False)

It runs fine then and actually scales quite nicely, but i am not sure about correctly handling signals in threads. my solution might be naive. It would really be nice to be able to delay cdo computations with dask to integrate them with other workflows... hope, i could make myself clear :) thanks!!!
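
The background is that Python only allows signal.signal() to be called from the main thread of the main interpreter; elsewhere it raises ValueError. The following stand-alone sketch shows the guard in isolation. The helper name and the handler are made up for the example and are not part of cdo.py.

```python
import signal
import threading


def install_handlers(handler, signals=(signal.SIGINT, signal.SIGTERM)):
    """Register handler for the given signals, but only from the main thread.

    Returns the previous handlers (to allow restoring them), or None when
    called from a worker thread, where signal.signal() would raise ValueError.
    """
    if threading.current_thread() is not threading.main_thread():
        return None
    return {sig: signal.signal(sig, handler) for sig in signals}


def cleanup(signum, frame):
    print("cleaning up after signal", signum)


# In a worker thread the registration is skipped instead of failing.
results = []
worker = threading.Thread(target=lambda: results.append(install_handlers(cleanup)))
worker.start()
worker.join()
print(results)
```

A consequence of this guard is that objects created in worker threads get no cleanup on signals, so leftover temporary files have to be removed by other means.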

cdo-bindings fail with cdo2.0.0

Just realized yesterday when my ci pipeline failed. The latest cdo version on conda-forge is v2.0.0 since yesterday. I had to state explicitly cdo==1.9.9 to have my ci pipeline run. There is obviously a problem with parsing the cdo version in the bindings.

what i did

conda create -n cdo_test python=3 cdo python-cdo
from cdo import Cdo
cdo = Cdo()

fails with

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/anaconda3/envs/cdo_test/lib/python3.10/site-packages/cdo.py", line 156, in __init__
    self.operators         = self.__getOperators()
  File "/opt/anaconda3/envs/cdo_test/lib/python3.10/site-packages/cdo.py", line 236, in __getOperators
    version = parse_version(getCdoVersion(self.CDO))
  File "/opt/anaconda3/envs/cdo_test/lib/python3.10/site-packages/cdo.py", line 78, in getCdoVersion
    return match.group(1)
AttributeError: 'NoneType' object has no attribute 'group'

works with

conda create -n cdo_test python=3 cdo==1.9.9 python-cdo

New release on PyPI

Hello,

I noticed that the version on PyPI is not up to date with the GitHub version. Can we update this please?

Thanks!
Paul

returnArray cause segmentation fault when using cdfMod "scipy"

Running the test suite aborts with a segmentation fault in test_returnArray on line 252.
Also doing something like

temperature = cdo.stdatm(0,options = '-f nc', returnCdf = True).variables['T'][:]
print temperature

will end in a segmentation fault when temperature is accessed by print. Everything works as expected when cdfMod is "netcdf4". Maybe netcdf4 should be used as the default.

$ cdo -V
Climate Data Operators version 1.6.3 (http://code.zmaw.de/projects/cdo)
Compiled: by mclaus on ares.geomar.de (x86_64-unknown-linux-gnu) Feb 25 2014 10:44:28
Compiler: gcc -std=gnu99 -g -O2 -pthread
 version: gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
Features: PTHREADS NC4 OPeNDAP Z JASPER UDUNITS2 PROJ.4 XML2 MAGICS CURL
Libraries: proj/4.8 xml2/2.7.8 curl/7.35.0(h7.32.0)
Filetypes: srv ext ieg grb grb2 nc nc2 nc4 nc4c 
     CDI library version : 1.6.3 of Feb 25 2014 10:44:20
 CGRIBEX library version : 1.6.3 of Jan  8 2014 19:55:18
GRIB_API library version : 1.11.0
  netCDF library version : 4.3.0 of Sep 20 2013 12:21:08 $
    HDF5 library version : 1.8.10
 SERVICE library version : 1.3.1 of Feb 25 2014 10:44:06
   EXTRA library version : 1.3.1 of Feb 25 2014 10:44:01
     IEG library version : 1.3.1 of Feb 25 2014 10:44:04
    FILE library version : 1.8.2 of Feb 25 2014 10:44:01

python -c "import scipy; print scipy.__version__"
0.14.0
python -c "import netCDF4; print netCDF4.__version__"
1.0.8

Error loading netCDF4 after installing CDO

I have successfully installed CDO and have confirmed that it is working from the command line. But I'm having trouble getting the python bindings to work. The problem seems to be with the netCDF4 module, which is installed in the same directory as cdo.

Here is the error I get when I try to import the netCDF4 module in python:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "netCDF4/__init__.py", line 3, in <module>
    from ._netCDF4 import *
ImportError: dlopen(netCDF4/_netCDF4.so, 2): Symbol not found: _clock_gettime
  Referenced from: netCDF4/.dylibs/libcurl.4.dylib (which was built for Mac OS X 10.13)
  Expected in: /usr/lib/libSystem.B.dylib
 in netCDF4/.dylibs/libcurl.4.dylib

And when I try to initialize cdo:

>>> cdo = Cdo()
Could not load netCDF4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cdo.py", line 135, in __init__
    self.loadCdf() # load netcdf lib if possible and set self.cdf }}}
  File "cdo.py", line 431, in loadCdf
    from netCDF4 import Dataset as cdf
  File "netCDF4/__init__.py", line 3, in <module>
    from ._netCDF4 import *
ImportError: dlopen(netCDF4/_netCDF4.so, 2): Symbol not found: _clock_gettime
  Referenced from: netCDF4/.dylibs/libcurl.4.dylib (which was built for Mac OS X 10.13)
  Expected in: /usr/lib/libSystem.B.dylib
 in netCDF4/.dylibs/libcurl.4.dylib

Any suggestions for how to fix?

Segmentation fault when using bindings

When using the python bindings for cdo I get a segmentation fault that I don't get when running the same command on the command line.
In python I have the following chained command:

cdo.setreftime("1970-01-01,00:00:00", input="-sellonlatbox,{lon0},{lon1},-90,90 -settunits,hours -setcalendar,standard -masklonlatbox,0,360,-90,90 -remapnn,custom_grid.txt {input}".format(lon0=center-180, lon1=center+180.1, input=path_to_input), output=path_to_output, options = "-P 8 -f nc4 -z zip")

When running this script I get the following output:

>>> cdo -O -P 8 -f nc4 -z zip -setreftime,1970-01-01,00:00:00 -sellonlatbox,-180.0,180.1,-90,90 -settunits,hours -setcalendar,standard -masklonlatbox,0,360,-90,90 -remapnn,custom_grid.txt data/gfas_frpfire_Aug2019.nc data/gfas_frpfire_Aug2019_processed.nc<<<
[...]
Segmentation fault (core dumped)

When I now run the command that the python bindings probably executed (cdo -O ...) on the command line (without python) I do not get a Segmentation fault. The python program does nothing else, so the error really comes from running the command above.

Any idea why this is happening? I am not sure if I built all libraries (hdf, netcdf4) threadsafe but this should not be the problem as there is no SegFault when running the command on the command line.

Python 3 support

Would you be interested in including support for python 3? Having looked at the code, it doesn't seem like it would be much work?

Process using cdo cannot be ctrl-C'd

Running a process with a Cdo() interface instantiated in a python module does not allow for a ctrl-C exit. This is likely a problem with the signal handling. It looks like the issue may be here, where some cleanup occurs but the signal is ignored (using signal.raise may solve the issue).

Using environment setting in a single cdo call is not supported

In the current release 1.3.0 the ruby version does not support the assignment of environment variables per call, only as a object setup, i.e.

cdo.env = {"PLANET_RADIUS" => "6379000"}

works, but

cdo.fldmean(input: ..., env: {"PLANET_RADIUS" => "6379000"})

doesn't

the test (test_env) does not cover this feature, because it is testing for the default parameter values

List return

It could be interesting if commands like showname or showtimestamp returned a list of results, not a list containing a single string.

When I run

In[3]: cdo.showdate(input='/some/file/')
# return like this
Out[3]: ['2017-01-19  2017-01-20  2017-01-21  2017-01-22  2017-01-23  2017-01-24  2017-01-25  2017-01-26']

Then I run

In[4]: cdo.showdate(input='/some/file/')[0].split('  ')
Out[4]: ['2017-01-19',
         '2017-01-20',
         '2017-01-21',
         '2017-01-22',
         '2017-01-23',
         '2017-01-24',
         '2017-01-25',
         '2017-01-26']

I believe it would be better if this were a built-in function. Could this be improved?
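
The changelog entry for 1.3.3 describes an autoSplit keyword for this purpose: each line is split at the given value, and nested results of outer size 1 are flattened. A minimal sketch of that described behaviour, not the library's implementation (the helper name is made up), could look like this:

```python
def auto_split(lines, sep):
    """Split every output line at sep; flatten the result if there is only one line."""
    nested = [[item for item in line.split(sep) if item] for line in lines]
    return nested[0] if len(nested) == 1 else nested


raw = ['2017-01-19  2017-01-20  2017-01-21  2017-01-22  '
       '2017-01-23  2017-01-24  2017-01-25  2017-01-26']
dates = auto_split(raw, '  ')
print(dates)
```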

clarify license of `cdo-bindings`

It's not clear to me which license is invoked with cdo-bindings. The conda feedstock says:

https://github.com/conda-forge/python-cdo-feedstock/blob/d8dcf0fa49b178185272c33e6d074df2211f2603/recipe/meta.yaml#L38

about:
  home: https://code.zmaw.de/projects/cdo/wiki/Cdo%7Brbpy%7D
  license: GPL-2.0-or-later
  license_file: COPYING
  summary: Use CDO in the context of Python as if it would be a native library

while in the README:

https://github.com/Try2Code/cdo-bindings/blob/master/README.md#license

Cdo.{rb,py} makes use of the BSD-3-clause license

GPL is copyleft while BSD not, so that makes quite a difference.

Hi sir, I am trying to install python-cdo by using "conda create -n cdo_test python=3 cdo python-cdo" and it shows an error.

Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  • python-cdo
  • cdo==1.9.9

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

Note: you may need to restart the kernel to use updated packages.

Feature request: PyCDO for/from Bash

Hello,

I have a bit of a "bizarre" feature request. It is sometimes nice to use pycdo on your own to build nice command chains. However, I often have colleagues who are not so good at python. It might therefore be nice to have an option to spit out whatever the current command chain would look like to a shell script. Since pycdo calls subprocesses anyway in the background, it should be something as simple as grabbing whatever you are about to send to the shell, and printing it to a file instead.

Might also be nice for debugging.
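
As a rough illustration of the first half of the request: if the command is available as an argument list, the standard library can already quote it for a shell script. The argument list below is a made-up example with placeholder file names, not something extracted from the bindings.

```python
import shlex

# Command as a list of arguments, the form subprocess would receive.
# File names are placeholders.
command = ["cdo", "-O", "-f", "nc4", "-timmin", "input file.nc", "output.nc"]

# shlex.join quotes arguments where needed (here: the file name with a space).
script = "#!/bin/sh\n" + shlex.join(command) + "\n"
print(script)
```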

On the other side (yet this is much more tricky from a gut feeling) it might be nice to sometimes be able to take some sort of shell script, read it in, and generate an equivalent CDO 'object' to keep using afterward. Tricks here might involve recognizing multiple cdo commands, since given any user input, there will likely be more than just one command in there.

What do you think?

Cheers
Paul

Cannot initialize with specified cdopath

When cdo is not on the users path or when the cdo to be used is different, the version 1.6.0 will not work. This was not a problem with v1.5.x. To reproduce the error, look at the example failure section. The fix is surprisingly simple and is described in the resolution section.

Should I fork and make a PR?

Resolution

I think I have traced it down to differences in the Operator class between the two versions [1,2]. In the new code, the init explicitly uses *args and **kwds, which are not provided. As a result, the cdo argument is lost to the child operators. This can be resolved by explicitly passing the CDO property as the argument to Operator.

580c580
<             setattr(self.__class__, method_name, Operator())
---
>             setattr(self.__class__, method_name, Operator(self.CDO))

[1] https://github.com/Try2Code/cdo-bindings/blob/master/python/cdo/cdo.py#L570-L581
[2] https://github.com/Try2Code/cdo-bindings/blob/maintenance-1.5.x/python/cdo.py#L608-L614
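
The mechanism can be reproduced with two toy classes. These are stand-ins written for this explanation, not the classes from cdo.py, and the path is a placeholder.

```python
class ToyCdo:
    """Toy stand-in: only remembers which executable to call."""

    def __init__(self, cdo="cdo"):
        self.CDO = cdo


class ToyOperator(ToyCdo):
    def __init__(self, *args, **kwargs):
        # Forwards whatever it receives; with no arguments the default applies.
        super().__init__(*args, **kwargs)


parent = ToyCdo("/opt/cdo/bin/cdo")   # placeholder path

lost = ToyOperator()                  # no arguments: falls back to "cdo"
kept = ToyOperator(parent.CDO)        # the fix: hand the path down explicitly
print(lost.CDO, kept.CDO)
```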

Example Failure

For example, I have a cdo binary at /usr/local/apps/cdo-2.2.1/intel-21.4/bin/cdo, which is not in my path.

which cdo
/usr/bin/which: no cdo in ...

But it is fully functional

/usr/local/apps/cdo-2.2.1/intel-21.4/bin/cdo sinfo -stdatm,0
cdo(1) stdatm: Process started
   File format : GRIB
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter ID
     1 : unknown  unknown  c instant       1   1         1   1       : 1             
     2 : unknown  unknown  c instant       1   1         1   1       : 130.128       
   Grid coordinates :
     1 : lonlat                   : points=1 (1x1)
                              lon : 0 degrees_east
                              lat : 0 degrees_north
   Vertical coordinates :
     1 : height                   : levels=1
                            level : 0 m
   Time coordinate :
                             time : 1 step
     RefTime =  0000-00-00 00:00:00  Units = hours  Calendar = proleptic_gregorian
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  0001-01-01 00:00:00
cdo(1) stdatm: 
cdo    sinfo: Processed 2 variables over 1 timestep [0.00s 18MB]

If I use the python bindings with a specified path, I get an error FileNotFoundError: [Errno 2] No such file or directory: 'cdo'

import cdo

cpath = '/usr/local/apps/cdo-2.2.1/intel-21.4/bin/cdo'
cdoo = cdo.Cdo(cpath)
f = cdoo.stdatm(0, returnCdf=True)

Error:

Traceback (most recent call last):
  File "/home/bhenders/temp/test_cdo.py", line 5, in <module>
    f = cdoo.stdatm(0, returnCdf=True)
  File "/home/bhenders/temp/cdopy/lib/python3.9/site-packages/cdo/cdo.py", line 580, in __getattr__
    setattr(self.__class__, method_name, Operator())
  File "/home/bhenders/temp/cdopy/lib/python3.9/site-packages/cdo/cdo.py", line 577, in __init__
    super().__init__(*args, **kwargs)
  File "/home/bhenders/temp/cdopy/lib/python3.9/site-packages/cdo/cdo.py", line 173, in __init__
    self.operators = self.__getOperators()
  File "/home/bhenders/temp/cdopy/lib/python3.9/site-packages/cdo/cdo.py", line 246, in __getOperators
    version = parse_version(getCdoVersion(self.CDO))
  File "/home/bhenders/temp/cdopy/lib/python3.9/site-packages/cdo/cdo.py", line 70, in getCdoVersion
    proc = subprocess.Popen(
  File "/usr/local/apps/oneapi/intelpython/python3.9/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/apps/oneapi/intelpython/python3.9/lib/python3.9/subprocess.py", line 1821, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'cdo'

Using pdb, I can go up to 580 line where self.CDO is as expected and calling Operator() raises the error.

> /home/bhenders/temp/cdopy/lib/python3.9/site-packages/cdo/cdo.py(580)__getattr__()
-> setattr(self.__class__, method_name, Operator())
(Pdb) self.CDO
'/usr/local/apps/cdo-2.2.1/intel-21.4/bin/cdo'
(Pdb) Operator()
*** FileNotFoundError: [Errno 2] No such file or directory: 'cdo'

Unable to find CDO version

Dear CDO-Python developers,

As of CDO version 1.7.2, the parsing of the version number no longer works. If I understand the code correctly, the string "Climate Data Operators" is searched for:

match = re.search("Climate Data Operators version (\d.*) .*",cdo_help)

However, the help string now only says CDO, not Climate Data Operators, and the parsing thereby fails.

The fix should be trivial...
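
One possible shape of such a fix is a pattern that accepts both banners and returns None instead of failing when nothing matches. This is a sketch only: the exact wording printed by 1.7.2 is not quoted in this report, so the second sample string is an assumption.

```python
import re

VERSION_RE = re.compile(r"(?:Climate Data Operators|CDO) version (\d+(?:\.\d+)*)")


def parse_cdo_version(banner):
    """Return the version string found in banner, or None if there is none."""
    match = VERSION_RE.search(banner)
    return match.group(1) if match else None


old_banner = "Climate Data Operators version 1.6.3 (http://code.zmaw.de/projects/cdo)"
new_banner = "CDO version 1.7.2"   # assumed wording, not copied from real output
print(parse_cdo_version(old_banner), parse_cdo_version(new_banner))
```

Returning None lets the caller raise a clear error message rather than an AttributeError on a missing match object.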

getOperators() doesn't work with Python from within Cygwin on Windows

For Windows there are Cygwin64 CDO binaries available. When trying to use these I have the problem that the getOperators() function doesn't work properly. The culprit seems to be that the output of cdo --operators is being split at os.linesep (see, e.g., https://github.com/Try2Code/cdo-bindings/blob/master/python/cdo.py#L305). However, os.linesep on Windows is \r\n, while the output of the cdo --operators call is only separated by \n.

I guess it should be possible to use something like (pseudo-code)

if on_cygwin:
    cdosep = '\n'
else:
    cdosep = os.linesep

and then use cdosep instead of os.linesep.

Would you be open to a PR implementing this?
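
An alternative that needs no platform check is str.splitlines(), which accepts "\n" as well as "\r\n". The short demonstration below uses a made-up three-line sample instead of real cdo --operators output.

```python
# Output as a Cygwin CDO binary would emit it: lines end in "\n" only.
output = "abs\nacos\nadd\n"

windows_linesep = "\r\n"                    # value of os.linesep on Windows
print(output.split(windows_linesep))        # one unsplit chunk
print(output.splitlines())                  # three operator names
```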

Can not initialize Cdo object for CDO version 1.9.10

I am working with the python CDO binding in a docker container which runs a debian bullseye distribution (python:3.9.15-slim-bullseye). I installed the cdo binary using apt-get install cdo which installs CDO version 1.9.10 (https://packages.debian.org/bullseye/cdo).

When I call Cdo() I get an IndexError with a Traceback that ends as follows:

  File "/usr/local/lib/python3.9/site-packages/cdo.py", line 190, in __init__
    self.operators         = self.__getOperators()
  File "/usr/local/lib/python3.9/site-packages/cdo.py", line 341, in __getOperators
    operators[op] = int(ios[i][1:len(ios[i]) - 1].split('|')[1])
IndexError: list index out of range

This is caused by unexpected behaviour of cdo --operators. When I run cdo --operators on the command line, the first line of the output is not an operator entry:

PRE-MAIN-DEBUG Registering library [eckit] with address [0x7fa77f2ad9a0]
abs              Absolute value                                                            (1|1)
acos             Arc cosine                                                                (1|1)
add              Add two fields                                                            (2|1)
addc             Add a constant                                                            (1|1)
addtrend         Add trend                                                                 (3|1)
...

So the first line of the output is actually not an operator. I can work around this error with the following monkey patch, which simply skips the first line of the output of cdo --operators:

import os
import subprocess

from cdo import Cdo

class CdoMonkeyPatch(Cdo):
    # Cdo.__init__ calls self.__getOperators(), which Python name-mangles to
    # _Cdo__getOperators, so the override has to use the mangled name.
    def _Cdo__getOperators(self):  # {{{
        operators = {}
        proc = subprocess.Popen([self.CDO, '--operators'],
                                stderr=subprocess.PIPE, stdout=subprocess.PIPE)
        ret = proc.communicate()
        # Decode once and reuse the list of output lines.
        lines = ret[0].decode("utf-8")[0:-1].split(os.linesep)
        ops = [x.split(' ')[0] for x in lines]
        ios = [x.split(' ')[-1] for x in lines]
        for i, op in enumerate(ops):
            # Skip the first line: it is the debug message, not an operator.
            if i != 0:
                operators[op] = int(ios[i][1:len(ios[i]) - 1].split('|')[1])

        return operators  # }}}

Is there something wrong with my CDO installation?

BTW: This error did not appear when I was using the python:3.8.13-slim-buster image for which CDO version 1.9.6 is installed by apt-get install cdo (https://packages.debian.org/buster/cdo).
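Rather than skipping a fixed number of lines, a parser can ignore every line that does not look like an operator entry. The sketch below assumes each entry has the shape shown in the output above (operator name first, "(inputs|outputs)" last). The names OPERATOR_LINE and parse_operators are hypothetical and not part of the binding.

```python
import re

# An operator line starts with the operator name and ends with "(<in>|<out>)".
# A leading minus sign is allowed in case a count is reported as negative.
OPERATOR_LINE = re.compile(r"^(\S+)\s.*\((-?\d+)\|(-?\d+)\)\s*$")


def parse_operators(output):
    """Map each operator name to its output-stream count, skipping other lines."""
    operators = {}
    for line in output.splitlines():
        match = OPERATOR_LINE.match(line)
        if match:  # lines such as "PRE-MAIN-DEBUG ..." do not match
            operators[match.group(1)] = int(match.group(3))
    return operators
```

Like the original code, this keeps the second number of each pair; the debug line is dropped because it does not end in a "(n|m)" pair.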

Python binding captures "kill" and "KeyboardInterrupt" signals in all processes after initialization

Attempting to interrupt a process with Ctrl-C after initializing the python binding with cdo = Cdo() often fails and prints weird messages, even when interrupting python processes unrelated to the cdo commands. The nco python bindings (originally forked from this repo) also have the same issue.

Here's an example before initializing the binding:

In [1]: import time
   ...: while True:
   ...:     print('hi')
   ...:     time.sleep(1)
hi
hi
^[^C---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-2-0bf96da52d4a> in <module>
      2 while True:
      3     print('hi')
----> 4     time.sleep(1)

KeyboardInterrupt:

And here's an example after initializing the binding and holding down Ctrl+C:

In [3]: import time
   ...: from cdo import Cdo
   ...: cdo = Cdo()
   ...: while True:
   ...:     print('hi')
   ...:     time.sleep(1)
hi
hi
^Ccaught signal <cdo.Cdo object at 0x7f596c07dee0> 2 <frame at 0x7f5932049d40, file '<ipython-input-3-742fa89f637e>', line 6, code <module>>
hi
^Ccaught signal <cdo.Cdo object at 0x7f596c07dee0> 2 <frame at 0x7f5932049d40, file '<ipython-input-3-742fa89f637e>', line 6, code <module>>
hi
^Ccaught signal <cdo.Cdo object at 0x7f596c07dee0> 2 <frame at 0x7f5932049d40, file '<ipython-input-3-742fa89f637e>', line 6, code <module>>
hi
^Ccaught signal <cdo.Cdo object at 0x7f596c07dee0> 2 <frame at 0x7f5932049d40, file '<ipython-input-3-742fa89f637e>', line 6, code <module>>
hi

It's impossible to kill the while-loop without killing the parent shell -- even sending the ipython session to the background with Ctrl+Z and trying to kill the session with kill %1 fails because cdo seems to capture that signal as well:

$ kill %1
caught signal <cdo.Cdo object at 0x7f596c07dee0> 15 <frame at 0x7f5932049d40, file '<ipython-input-3-742fa89f637e>', line 6, code <module>>

The only way to stop the while loop is to kill the parent shell running the python process. Not sure how the binding is implemented but it seems to be changing something persistent about the python state.
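A possible workaround, assuming the binding installs its handlers with signal.signal() when the object is created, is to record the handlers beforehand and reinstate them afterwards. This is only a sketch: the helper name create_preserving_signal_handlers is hypothetical, and reinstating the old handlers may disable whatever cleanup the binding intended to perform on interrupt.

```python
import signal


def create_preserving_signal_handlers(factory,
                                      signals=(signal.SIGINT, signal.SIGTERM)):
    """Call factory() and then reinstate the handlers that were active before."""
    saved = {sig: signal.getsignal(sig) for sig in signals}
    try:
        return factory()
    finally:
        for sig, handler in saved.items():
            # getsignal() returns None for handlers not installed from Python;
            # those cannot be reinstated with signal.signal().
            if handler is not None:
                signal.signal(sig, handler)
```

It would be used as cdo = create_preserving_signal_handlers(Cdo). Note that signal.signal() may only be called from the main thread.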

temp files aren't deleted

It appears that the __del__ method of MyTempFile is never called, so the /tmp folder on my system keeps filling up. A reboot would solve this problem, but how can the files be cleaned up on computers that are rarely rebooted (e.g. servers)?
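Until the files are removed reliably by the binding itself, leftover files can be cleared periodically from outside. The following sketch deletes old files with a given name prefix from a directory. The function name remove_stale_files is hypothetical, and the prefix has to be whatever the temporary files on the affected system actually start with, which is not stated here.

```python
import os
import time


def remove_stale_files(directory, prefix, max_age_seconds):
    """Delete regular files in directory that start with prefix and were last
    modified more than max_age_seconds ago. Returns the removed paths."""
    removed = []
    cutoff = time.time() - max_age_seconds
    for entry in os.scandir(directory):
        # Only touch plain files; never follow symlinks or descend into folders.
        if not entry.is_file(follow_symlinks=False):
            continue
        if not entry.name.startswith(prefix):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.path)
    return removed
```

The age threshold keeps the helper from deleting files that a running job is still using.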
