Giter Site home page Giter Site logo

Comments (24)

Try2Code avatar Try2Code commented on August 21, 2024 1

Because you mentioned backward compatibility before wrt using input or infile(): IMO we can drop it. I am willing to maintain the last python release for a while in parallel. I don't get many bug reports, so I do not expect a big amount of work from that. What we plan here is more a disruptive 2.0 version to me - I want to rethink this as a whole.

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

Hey @Try2Code!

Good idea and indeed, this is pretty simple to be implemented.

But let me first note that the call

cdo.add(input=[cdo.fldmean.topo('global_0.2'),cdo.fldmax.topo('global_1')])

raises an error. You specify the input parameter here which, for backwards compatibility, runs the cdo operator. As such, if any kwargs are given to the method, the computation is performed, as controlled by the following lines

see

cdo-bindings/python/cdo.py

Lines 445 to 448 in d4dcd6a

if not user_kwargs or not kwargs.get('compute', True):
return self
elif not kwargs.get('keep', True):
self._cmd.clear()

The returned object from your cdo.add(...) call (a string, or xarray Dataset, or whatsoever) will therefore not have a run method. Within the new framework, you need

cdo.add.infile(cdo.fldmean.topo('global_0.2'),cdo.fldmax.topo('global_1')).run()

And this is also, where I would suggest to implement it, in the infile method

cdo-bindings/python/cdo.py

Lines 355 to 366 in d4dcd6a

def infile(self, *infiles):
for infile in infiles:
if isinstance(infile, six.string_types):
self._cmd.append(infile)
elif self.hasXarray:
import xarray #<<-- python2 workaround
if (type(infile) == xarray.core.dataset.Dataset):
# create a temp nc file from input data
tmpfile = self.tempStore.newFile()
infile.to_netcdf(tmpfile)
self._cmd.append(tmpfile)
return self

My suggestion is to simply combine the _cmd attribute in case of a Cdo object by adding these two lines to the method:

      elif isinstance(infile, Cdo):
        self._cmd.extend(infile._cmd)

In the end, the Cdo.infile method then looks like

  def infile(self, *infiles):
    for infile in infiles:
      if isinstance(infile, six.string_types):
        self._cmd.append(infile)
      elif isinstance(infile, Cdo):
        self._cmd.extend(infile._cmd)
      elif self.hasXarray:
        import xarray #<<-- python2 workaround
        if (type(infile) == xarray.core.dataset.Dataset):
          # create a temp nc file from input data
          tmpfile = self.tempStore.newFile()
          infile.to_netcdf(tmpfile)
          self._cmd.append(tmpfile)
    return self

from cdo-bindings.

Try2Code avatar Try2Code commented on August 21, 2024

yes, I was just suggesting a syntax - never meant to be able to run it right away ;-)

thx for your thoughts! I was thinking in a similar way since I want to implement this in Ruby,too.

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

awesome 👍 Let me know when you have more of these nice ideas 😃

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

Hey @Try2Code! What came to my mind: the infile method might be a bit counter-intuitive from a Pythonic perspective. It makes sense in terms of CDOs, so I think we should implement this, but I would suggest to also combine Cdo operators through the normal + operator, i.e. something like

command = cdo.add + cdo.fldmean.topo('global_0.2') + cdo.fldmax.topo('global_1')
command.run()

This should be very easy to implement by adding an __add__ and __iadd__ method to the Cdo class. Or do you think it would be to confusing because CDOs also have an add operator?

from cdo-bindings.

Try2Code avatar Try2Code commented on August 21, 2024

this looks compelling, but I it's essentially an additional operator + on top of add 🤔

Do you have this in mind for operators that have multiple input stream? or in general? What I like about it, is that it allow to write things parenthesis-free. but I fear a little the freedom to do so 🤣

The + can also be regarded as a replacement for . The correct command line could also be generated by

cdo.add.fldmean.topo('global_0.2').fldmax.topo('global_1')

because the CDO binary is smart enough to parse it. It just looks way too confusing in python.

I see problems when using the + operator in situation where the CDO operators using it is not commutative (like ```sub``). Brackets are less beautiful, but easier to understand - a certain order can be enforced with them,too.

But that's what I like about this re-design: we are free to experiment with the syntax 👍

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

Do you have this in mind for operators that have multiple input stream? or in general?

I would have implemented it in general since there are (in Python) no checks implemented so far, how many inputs streams the various CDO operators have.

The + can also be regarded as a replacement for .

You're absolutely right. The only advantage of the + would be the reusability of operators. For instance something like this would be possible:

cdo1 = cdo.topo('global_0.2')
cdo2 = cdo.fldmean + cdo1
final = cdo.sub + cdo2 + cdo1
final.run()

I see problems when using the + operator in situation where the CDO operators using it is not commutative (like ```sub``)

Yes, this is indeed true it should be mentioned in the documentation that the + operation is not commutative by design. For python, this is however not a problem because the + operation does not need to be commutative. a + b will always be evaluated as a.__add__(b).

Brackets can be used, but with the proposed design, they would not have an effect because of its associative property (i.e. (cdo1 + cdo2) + cdo3 = cdo1 + (cdo2 + cdo3) =...). But of course, one can put things in brackets for the purpose of a cleaner code.

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

What we plan here is more a disruptive 2.0 version to me - I want to rethink this as a whole.

Sounds good to me and I am happy to contribute 😃

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

Because you mentioned backward compatibility before wrt using input or infile(): IMO we can drop it.

Yes, I agree, then we should drop the infile method entirely, and just use the input parameter. Actually, I think it would be cleaner to change this to an infile parameter, because this is how it is framed in the CDO docs. I propose something like

cdo.add(infile=[cdo.fldmean.topo('global_0.2'), cdo.fldmax.topo('global_1')]).run()

This also avoids confusion with pythons built-in input function. But it's up to you, keeping input as a parameter for the operator call works well, too.

from cdo-bindings.

Try2Code avatar Try2Code commented on August 21, 2024

just because we talk about brackets so much: I wrote a short article about the introduction of brackets on the command line for special cases where the parenthesis-free version limits what you can do: in the case of arbitrary number of input streams (https://code.mpimet.mpg.de/boards/53/topics/6702)

I don't use this in cdo.py or cdo.rb because it's still rather new (ok, it's already a year old). But in fact it allows functionality that was not possible before. Maybe it's worth taking this into account,too

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

I wrote a short article about the introduction of brackets on the command line for special cases where the parenthesis-free version limits what you can do: in the case of arbitrary number of input streams

Awesome! I did not know about this new feature 😍. It is indeed a good question how to handle this, because the brackets in cdo1 + (cdo2 + cdo3) are not communicated to the python calls. I have to think about it...

But using the input/infile parameter would allow this, i.e. it would be possible to translate

cdo.add(infile=[cdo.fldmean.topo('global_0.2'), cdo.fldmax.topo('global_1')]).run()

into

cdo -add [ -fldmean -topo.global_0.2 -fldmax -topo,global_1 ]

from cdo-bindings.

Try2Code avatar Try2Code commented on August 21, 2024

yes, it would mimic the brackets on the command line to a certain extent

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

yes, but as far as I understand the new [ ] notation, it also forces to specify all input streams in this call. So something like

cdo.add(infile=cdo.fldmean.topo('global_0.2')) + cdo.fldmax.topo('global_1')

won't work because it would be translated to

cdo -add [ -fldmean -topo.global_0.2 ] -fldmax -topo,global_1

from cdo-bindings.

Try2Code avatar Try2Code commented on August 21, 2024

correct - the above call does not work, it has to be cdo -add [ -fldmean -topo,global_0.2 -fldmax -topo,global_1 ]

infile and + are not interchangeable here. We should not have both I guess. IMO infile is better, because [] is designed to collecting/marking inputs for a certain operator and the infile key is meant for exactly that

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

Considering the brackets in the '+' operation. I think the best possibility would be to allow strings in this operation and then let the user insert the brackets where he wants. For example via

cdo.add + '[' + cdo.fldmean.topo('global_0.2') + cdo.fldmax.topo('global_1') + ']'

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

We should not have both I guess.

So you mean we should not implement the + operation at all, right?

from cdo-bindings.

Try2Code avatar Try2Code commented on August 21, 2024

Considering the brackets in the '+' operation. I think the best possibility would be to allow strings in this operation and then let the user insert the brackets where he wants. For example via

cdo.add + '[' + cdo.fldmean.topo('global_0.2') + cdo.fldmax.topo('global_1') + ']'

hm, this looks more and more like string processing. In fact that's what we do internally 🤣
But originally I wanted to hide this behind methods calls and parameters

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

hm, this looks more and more like string processing

Well, we could also use pythons __getitem__ method and specify input parameters with []:

cdo.add[cdo.fldmean.topo('global_0.2') + cdo.fldmax.topo('global_1')]

or

cdo.add[cdo.fldmean.topo('global_0.2'), cdo.fldmax.topo('global_1')]

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

but personally I do not like this approach to much, because one normally uses this so subset an array which is different than specifying input parameters to an operator call

from cdo-bindings.

Try2Code avatar Try2Code commented on August 21, 2024

yes, it doesn't really tell what it's doing. I like to use methods or keywords for separating things and be expressive at the same time.
Btw: we concentrated on the input, but what about output? We could have an outfile key or method call? it could also be a parameter of the run() method ...

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

yes, it doesn't really tell what it's doing

👍 So I propose the following strategy:

  1. The recommended usage for input streams is to specify them with the infile parameter in the operator call and then the CDO library puts brackets around them
  2. If we go for the + operation (and I think the reusability mentioned in #23 (comment) is a very good reason for it), we should also allow strings in the + operation. It is straight-forward to implement and allows something like
base_op = cdo.add + 'base.nc'
op1 = base_op + 'infile1.nc'
op2 = base_op + 'infile2.nc'

which would not be possible with the cdo.add(infile=...) syntax.

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

we concentrated on the input, but what about output?

So at the moment this is implemented in the run method and I would leave it there. I think this makes the most sense, right? Especially if we remove the backward compatibility issue mentioned above (i.e. cdo.add(infile=...) does not automatically run the cdo command in the shell).

from cdo-bindings.

Try2Code avatar Try2Code commented on August 21, 2024

I agree - the output is only important in moment of execution. We can avoid this before

from cdo-bindings.

Chilipp avatar Chilipp commented on August 21, 2024

hey @Try2Code! I just want to write down some points of what we talked about quickly before my talk.

My proposal is to transform each cdo operator into a string surrounded by [ ]. Let's define

cdo1 = cdo.fldmean.topo('global_0.2')
cdo2 = cdo.fldmean.topo('global_1')

cdo.add(input=[cdo1, cdo2])
would become equivalent to something like

cdo -add [ -fldmean -topo,global_0.2 -fldmean -topo,global_1 ]

Concerning the + operation, the question is whether we would not want to keep this for the next release 😉 Otherwise, if you want to continue the discussion about this,

here are my thoughts concerning `+`:

Something like cdo.add + cdo1 + cdo2 could become equivalent to

cdo -add [ -fldmean -topo,global_0.2 ] [ -fldmean -topo,global_1 ]

but this is probably not working, right? So we should rather not use brackets for the + and restrict the bracket functionality to the input parameter of the operators. i.e. cdo.add(input=...) gives the result I mentioned above, and cdo.add + cdo1 + cdo2 gives the current behaviour, i.e. something like cdo -add -fldmean -topo,global_0.2 -fldmean -topo,global_1 without any brackets.

from cdo-bindings.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.