pipe's People

Contributors

abdur-rahmaanj, amper, babakness, brentp, briaoeuidhtns, dalexander, devrma, dit-zy, fidget-spinner, gliptak, hros, isidroas, j450h1, javadba, jbvsmo, jerabaul29, julienpalard, kianmeng, mrjbq7, nykh, righthandabacus, safwank, sergiors, sobolevn, sugatoray, vstoykov, yorailevi


pipe's Issues

Configurable Pipeline with List of transformations

[Feature Request]
Hi @JulienPalard ,
I have an idea I'd like to run by you before I raise a pull request.
I want to add a capability where the user can define various transformations with @Pipe and then have these transformations form a configurable pipeline. This pipeline can be reused multiple times, rather than defining the same set of transformations for every input.

Idea is:

class Pipeline:
    def __init__(self, methods):
        self.methods = methods

    def execute(self, value):
        return self.__recursive(value, self.methods)

    def __recursive(self, value, methods):
        if methods:
            return self.__recursive(value | methods[0], methods[1:])
        return value


if __name__ == '__main__':
    methods = [
        select(lambda x: x * x),
        where(lambda x: x % 2 == 1),
        collect
    ]
    pipeline = Pipeline(methods=methods)
    assert [1, 9, 25, 49, 81] == pipeline.execute(range(10))

Let me know what you think.
Thanks
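For what it's worth, the recursive helper could also be written as a functools.reduce fold. The sketch below is runnable on its own: it uses minimal stand-ins for Pipe, select, where, and a hypothetical collect, since the real library isn't imported here.

```python
import functools

class Pipe:
    # minimal stand-in for pipe.Pipe; only what this sketch needs
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def select(iterable, selector):
    return (selector(x) for x in iterable)

@Pipe
def where(iterable, predicate):
    return (x for x in iterable if predicate(x))

@Pipe
def collect(iterable):  # hypothetical collector from the proposal above
    return list(iterable)

class Pipeline:
    def __init__(self, methods):
        self.methods = methods

    def execute(self, value):
        # fold the value through each pipe, left to right
        return functools.reduce(lambda acc, method: acc | method, self.methods, value)

pipeline = Pipeline([select(lambda x: x * x), where(lambda x: x % 2 == 1), collect])
assert pipeline.execute(range(10)) == [1, 9, 25, 49, 81]
```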

IDEA: fan out operator

You could implement the & (or +?) operator as a fan-out operator.
It could be used when you want to send the output of one Pipe to several others at the same time, something like:

[1,2,3,4] | where(...) | ( stdout & sum & max )

Of course the result is a bit questionable... a pipeable tuple ?
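A rough sketch of how & fan-out could work, as a standalone class (hypothetical, not part of the library); note the input must be re-iterable, so a one-shot generator would need to be materialized first:

```python
class FanOut:
    # applies several callables to the same input, returns a tuple of results
    def __init__(self, *functions):
        self.functions = functions

    def __and__(self, other):
        # FanOut(f) & g & h accumulates callables left to right
        return FanOut(*self.functions, other)

    def __ror__(self, value):
        return tuple(f(value) for f in self.functions)

result = [1, 2, 3, 4] | (FanOut(sum) & max & min)
assert result == (10, 4, 1)
```

The result is indeed the "pipeable tuple" mentioned above: it can keep flowing into further pipes that accept tuples.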

add "how to install" to readme

Add a "how to install" section to the introduction of the readme, e.g.:
pip install pipe

Minor issue, but one can't assume the package is simply called pipe on PyPI.

passing parameters

Hello,
What if I want to pass parameters to the functions? Like:

[1, 2, 3] | func1(param1) | func2(param2)

The code should inject [1, 2, 3] into func1 along with param1.

Is this possible?
Regards
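With the @Pipe decorator, arguments given at call time are passed after the injected iterable, which is exactly this pattern. A minimal sketch with a stand-in Pipe class (an assumption about how pipe.Pipe behaves; the function names are made up for the demo):

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def add_to_each(items, param):   # `items` is injected by |, `param` is yours
    return [x + param for x in items]

@Pipe
def scale_each(items, param):
    return [x * param for x in items]

assert ([1, 2, 3] | add_to_each(10) | scale_each(2)) == [22, 24, 26]
```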

Is there any possibility to add more functions?

I've been working on a library called pyf that implements other functions that don't exist in JulienPalard/Pipe.

I'd like to know: are there plans to add more functions?
If not, I'd like to know the reason.

Pipe does not work if left side object defines __or__

Hi,

I just found your project while reading my GitHub RSS feed and it is really nice! Congrats.

I have a project, Should-DSL, and I use the same __ror__ approach in some places.

But there is a problem with this approach: if the left-side object defines its own __or__, the right-side object's __ror__ (in your case, a Pipe instance) is never called.

Let me show you an example:

from pipe import count

class MyList(list):
    def __or__(self, other):
        return "FOO"

def test_builtins():
    result = [1, 2, 3] | count
    assert result == 3, result

def test_custom_objects():
    result = MyList([1, 2, 3]) | count
    assert result == 3, result


if __name__ == '__main__':
    test_builtins()
    test_custom_objects()

The output follows:

Traceback (most recent call last):
  File "failing_example.py", line 18, in <module>
    test_custom_objects()
  File "failing_example.py", line 13, in test_custom_objects
    assert result == 3, result
AssertionError: FOO

Unfortunately I have no solution for this. The approach a friend of mine was trying to use in Should-DSL is to add a new operator for special cases - but that's not a good solution.

Please, let me know if you find out a solution!

Cheers,
Hugo.

Lambda replacements

Hi!

I like this package! I wrote something similar: https://github.com/gamis/flo, but I think I like yours better.

The one thing mine has that yours doesn't is a concise replacement for lambda functions. I find lambda x: x**2 kind of annoying. With my library, instead of having code that looks like

mylist = ['pretty','cool','items', 'kiddo']
myindex = mylist | map(lambda x: x.upper()) | where(lambda x: 'E' in x) | groupby(key=lambda x: x[0]) | select(lambda x: x[0])

it could look like

myindex = mylist | map( _.upper() ) | where( _.has('E') ) | groupby( _[0] ) | select( _[0] )

or potentially even terser:

myindex = mylist | map_.upper() | where_.has('E') | groupby_[0] | select_[0]

If you're interested in such an addition, I can work on a PR in the next month or two. In any case, I'd love any feedback you have.

Thanks!

Greg

about lazy iteration

Just a small question to clarify: I looked at the code, and this library is not lazily executed, right? It may be a good idea to add a small note in the readme saying that this library provides "syntactic sugar" but not the "expected" advanced functional-programming goodies like lazy evaluation (which some people coming from other languages may expect quite strongly when they see piping, functional-programming-like syntax :) ).

Just to be clear, I love this library, and this is not a criticism, just I think it would be great to make it clear to the user - and also, that may be inspiration for future authors / future improvements etc :) .

two iterators as args

Is this the correct way of doing it?

import itertools

@Pipe
def prod(its):
    yield from itertools.product(*its)

I use it like this:

(it, it) | prod

chain with PLUS

I tried several things but couldn't make it work:

itertools.chain(it1, it2) | ...

Instead, I want to do it this way:

it1 + it2 | ....

As the reverse of that, I want to pass an iterator to multiple functions and join the results, something like:

it | (fun1 + fun2 + ..) | ...

which is:

fun1(it) + fun2(it) + ... | ...

Collector as a side effect ?

How would you write a function that acts as a collector? I normally have the source as a list or numpy array, but want to collect the result as a 2D bitarray, e.g.:

np.random.randint(0,100,10)  | ....... | before_last |  list2Dbitary()

In this case before_last generates a numpy array of bitarrays, and in the last step I convert it to a 2D bitarray; i.e., the pipe generates two data structures, which can be large and take too much memory.

If I have something as context/state, then as a side effect of the functions I can update a 2D bitarray created before the pipe starts.

How would you do that ?
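One way to avoid building the second large structure is a pass-through pipe that updates external state as a side effect. A sketch with a stand-in Pipe and a hypothetical tee_into helper (the list sink stands in for the pre-allocated 2D bitarray):

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def tee_into(iterable, sink):
    # pass items through unchanged while appending each to an external sink
    for item in iterable:
        sink.append(item)
        yield item

sink = []
result = list(range(3) | tee_into(sink))
assert result == [0, 1, 2]
assert sink == [0, 1, 2]
```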

Starmap

Hi,
I was thinking of submitting a PR with a starmap function, something like this:

@Pipe
def starmap(iterable, selector):
    def starfunc(args):
        return selector(*args)
    return builtins.map(starfunc, iterable)

This is really useful in situations where you have an iterable of args that you want to pass to a function and have them unpacked positionally. An example might be parsing a row of values into a data model, e.g.:

PersonAges = namedtuple("PersonAges", ("name", "age"))

rows = [["john", 32], ["paul", 31], ["ringo", 33], ["george", 34]]

people = list(
    rows
    | starmap(PersonAges)
)

This is what you get:

[PersonAges(name='john', age=32),
 PersonAges(name='paul', age=31),
 PersonAges(name='ringo', age=33),
 PersonAges(name='george', age=34)]

Makes it much simpler than doing something like this (especially when you have a large number of values to map):

map(lambda row: PersonAges(name=row[0], age=row[1]))

I couldn't find anything resembling a starmap but if there is another way to achieve this let me know.

PEP

Have you considered making a PEP for this? Is there one and I missed it?

providing aliases corresponding to the functional programming "lingua franca"

Many thanks again for an awesome library.

As far as I know, there is no "hard" standardization on functional-programming terminology across languages yet, but still, some conventions are found across languages. Some of them are already present in this library (like map, take), some others are not present yet (like filter, reduce, head). The situation is a bit confusing for people switching between languages because, as said, this is not standardized; there are aliases and conflicting uses of the same terminology here and there.

Still, wondering if this library could be a nice occasion to try to stick to / follow / establish some lingua franca, by using aliases to map to other programming languages.

Some aliases I specially miss and that are more or less lingua franca are, for example, i) filter as an alias for where, ii) reduce (seems not implemented yet?).

  • Do you think it would be reasonable to go the extra mile in this library and think carefully about fitting / trying to establish some lingua franca?
  • In addition (quite related, so putting it here, but it could be moved to another issue), regarding the documentation in the readme (section "Existing Pipes in this module"): do you think it would be a good idea / possible to order the entries alphabetically, and maybe name all aliases on the line defining the "main" function?

two loops

I'm trying to do nested iterators, but when I try to list() the result:


In [108]: list( file('text/622_lines.txt') | twofor((sents,words))  )                                                                                                        
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-108-6a273f2a1594> in <module>
----> 1 list( file('text/622_lines.txt') | twofor((sents,words))  )

/......./lib/tools/text.py in twofor(it, its)
     28 @Pipe
     29 def twofor(it,its):
---> 30         for it1 in it | its[0]:
     31                 for item in it1 | its[1]:
     32                         yield item

TypeError: 'Pipe' object is not iterable


How about "apply" instead of "select"

Absolutely brilliant library, thank you for it. Now that I have it as my shiny new hammer, I'm looking for a rusty nail to use it on. One minor suggestion: did you consider other names for "select"? As it is, "select" sounds like it should filter, not operate like map (which I see is also supported). Might "apply" be a better term?

Please tag the commits corresponding to releases

Just noticed this really elegant library from a mention on StackOverflow, and I wanted to browse around and see what the changes are between versions -- and noticed that the versions are not tagged!

Could you tag them, or in case you did but the tags didn't get pushed to GitHub, could you push them? Thanks!

`groupby` Update and Use Case

groupby seems to still produce an itertools._grouper object, which appears to be a type slated for deprecation by 2.0.

I also wonder how the keyfunc parameter works and whether it's in scope for pipe end users. I tried to pass something to it, and I got a multiple-values error. Does it, by chance, allow type recasting of the returned object? If not, a feature like that would be wonderful, so as to quickly generate outputs of pipe operations (maybe pipes have something like that already?)

Finally, in the documentation, the x % 2 and "Even" example will produce unexpected results :)

unexpected indent

Hello, this:
euler2 = fib() | where(lambda x: x % 2 == 0)
| take_while(lambda x: x < 4000000)
| add

gives me an error
| take_while(lambda x: x < 4000000)
^
IndentationError: unexpected indent

can you help?
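Python stops parsing at the end of the first line, so the continuation lines are syntax errors; wrapping the whole expression in parentheses (or ending each line with a backslash) fixes it. A runnable sketch, with minimal stand-ins for the pipes involved (assumptions, not the library's actual implementations):

```python
import itertools

class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def where(iterable, predicate):
    return (x for x in iterable if predicate(x))

@Pipe
def take_while(iterable, predicate):
    return itertools.takewhile(predicate, iterable)

@Pipe
def add(iterable):
    return sum(iterable)

def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# parentheses let the expression span several lines
euler2 = (fib()
          | where(lambda x: x % 2 == 0)
          | take_while(lambda x: x < 4000000)
          | add)
assert euler2 == 4613732
```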

allowing Pipe constructor with more arguments acting as a partial

The examples in the readme

sum(count() | select(lambda x: x ** 2) | take(10))

can also be written as:

count() | select(lambda x: x ** 2) | take(10) | Pipe(sum)

Calling the pipe allows using it similarly to partial:

count() | select(lambda x: x ** 2) | take(10) | Pipe(sum)(10)

The extra brackets are a little odd, I think.

What about this?

count() | select(lambda x: x ** 2) | take(10) | Pipe(sum,10)

Very cool project

Thanks for sharing this amazing project!
The coding style is easy to read, and Pipe is a powerful tool ~

closing pipes, like `reduce`, are currently not supported

Moved from the initial discussion in #67 ; from the author:

For reduce I don't think pipe can handle it, I deprecated "closing pipes" a year or so ago:

What I mean by a "closing pipe" is a pipe that does not return an iterable, but a value, so itself cannot be on the left hand side of a pipe, "breaking" the pipe, example:

>>> range(100) | filter(lambda x: x % 2 == 0) | sum 

would be readable, OK, but sum does not return an iterable, so no further | can be used. I deprecated this in favor of the even shorter and standard:

>>> sum(range(100) | filter(lambda x: x % 2 == 0)) 

I think reduce enters this category of finalizing pipes so it can't be added, as it would be better written as:

>>> from functools import reduce
>>> reduce(lambda x, y: x + y, range(100) | filter(lambda x: x % 2 == 0))

I understand the author's point, but I would like to disagree on it :) . This is maybe mostly aesthetics, but to my eyes this:

res = ( range(100) | filter(lambda x: x % 2 == 0)
                   | reduce(lambda x, y: x + y) )

looks nicer and more readable than this:

>>> from functools import reduce
>>> reduce(lambda x, y: x + y, range(100) | filter(lambda x: x % 2 == 0))

Because in the first case I can just follow the logic as it flows, as my brain expects it, but in the second case I have to remember that while most of the expression flows from left to right, the final step is an exception to this rule, sitting at the far left... Also, this breaks my habits from other places where I see similarly formatted functional-programming expressions, like Rust and co., which look much more like the first way of formatting it.

Would there be a way to enable a syntax that looks like the first case, but without the worries about closing pipes that are raised by the author? For example, using a separate, special "closing pipe" class that would allow to perform checks and to issue meaningful error messages if closing pipes are used at the wrong place in the expression?
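For what it's worth, a reduce pipe in the first style can be sketched as below, with stand-ins for Pipe and where; reduce_with is a hypothetical name chosen to avoid shadowing functools.reduce:

```python
import functools

class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def where(iterable, predicate):
    return (x for x in iterable if predicate(x))

@Pipe
def reduce_with(iterable, function):
    # folds the iterable down to a single value: a "closing" pipe
    return functools.reduce(function, iterable)

res = (range(100) | where(lambda x: x % 2 == 0)
                  | reduce_with(lambda x, y: x + y))
assert res == 2450  # sum of the even numbers below 100
```

Nothing stops a further | after such a pipe, which is exactly the closing-pipe hazard the author describes; a dedicated class could raise a clear error there.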

Could the built-in functions always output an iterator?

When the input is an iterator, it's quite tempting to call next() on the output. But a number of the functions don't return an iterator, resulting in a TypeError, e.g.:

>>> next(range(10) | tail(2))
[...]
TypeError: 'collections.deque' object is not an iterator

It would be sweet if all/most of the functions would return an actual iterator. Exceptions would of course be things like as_list where you're explicitly asking for an output type.

My use-case, in case it matters, is leisurely throwing python scripts at a csv file that doesn't fit in memory. Pandas tries to load the whole thing in memory and fails, unless I use the chunksize argument, which makes it choke every so often -- and the syntax is god awful. Line by line seems to work fine, however slow. I ran into your library while looking for one that could basically support some kind of read_csv | some_stuff | more_stuff | write_csv type of workflow and do the entire thing row by row without me needing to reinvent the wheel. (Suggestions welcome if you know of a better option.)
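Until that happens, a small adapter pipe can coerce any result into an iterator. A sketch with stand-ins (as_iter is hypothetical; tail here mirrors the deque-based behavior in the error above):

```python
import collections

class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def tail(iterable, n):
    # keeps only the last n items; a deque is not itself an iterator
    return collections.deque(iterable, maxlen=n)

@Pipe
def as_iter(obj):
    # adapter: turn whatever the previous pipe returned into an iterator
    return iter(obj)

assert next(range(10) | tail(2) | as_iter) == 8
```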

pylint issue

This code snippet from the readme produces a pylint error:

sum(range(100) | where(lambda x: x % 2 == 0))

No value for argument 'predicate' in function call pylint(no-value-for-parameter)

Shadowing built-in functions

These four functions in the package

  • any()
  • all()
  • max()
  • min()

shadow built-in functions. Since Python doesn't have an easy way to hide certain names during from pipe import *, may I suggest renaming them (perhaps simply prefixing them with p, like pmax)? Since we already took the stance of renaming map to select and filter to where, I think this is in keeping with the whole strategy and will allow users to from pipe import * without worrying about shadowing.

Since this is definitely a breaking change, I respect your opinion. The exact naming is also up to you.

documentation correction

the docs state that

[1, 2, [3]] | chain

gives a TypeError: chain argument #1 must support iteration, and suggests using traverse.

However, the command does not raise an error but rather returns an <itertools.chain object>.
Only when casting to a list:

list([1, 2, [3]] | chain)

is there an error: TypeError: 'int' object is not iterable

Concat on unicode objects

Sorry if this has already been patched in one of the forks.

In [189]: "héllo" | pipe.concat
Out[189]: 'h, \xc3, \xa9, l, l, o'

In [190]: u"héllo" | pipe.concat
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)

/home/chaouche/CODE/<ipython console> in <module>()

/usr/lib/python2.7/site-packages/pipe.py in __ror__(self, other)
350
351     def __ror__(self, other):
--> 352         return self.function(other)
353
354     def __call__(self, *args, **kwargs):

/usr/lib/python2.7/site-packages/pipe.py in concat(iterable, separator)
488 @Pipe
489 def concat(iterable, separator=", "):
--> 490     return separator.join(map(str,iterable))
491
492 @Pipe

UnicodeEncodeError: 'ascii' codec can't encode character u'\xc3' in position 0: ordinal not in range(128)

new version?

I found that I can pip install pipe, but it is kind of old (v1.4.2 from 2010); surprisingly, the pipe.py here is also marked as v1.4.2! Would you consider bumping the version and updating the pypi.org repository as well?

@Pipe decorator for class methods

The Pipe decorator does not work on methods defined on a class.

This would be useful to make a pipe out of an existing method. As an example, I can do something like this:

class Job(requests.Session):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def get(self, items, url, *args, **kwargs):
        for item in items:
            yield super().get(url.format(item), *args, **kwargs).json()

    def jq(self, items, stmt):
        compiled = jq.compile(stmt)
        for item in items:
            yield from iter(compiled.input_value(item))


job = Job()

x = (
    ["hansthen", "JulienPalard"] |
    Pipe(job.get)("https://api.github.com/users/{}/repos") |
    Pipe(job.jq)(".[].license // empty | .name")
)

But I would rather create the Pipe at class level, like this:

class Job(requests.Session):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    @Pipe
    def get(self, items, url, *args, **kwargs):
        for item in items:
            yield self.session.get(url.format(item), *args, **kwargs).json()

    @Pipe
    def jq(self, items, stmt):
        compiled = jq.compile(stmt)
        for item in items:
            yield from iter(compiled.input_value(item))


job = Job()

x = (
    ["hansthen", "JulienPalard"] |
    job.get("https://api.github.com/users/{}/repos") |
    job.jq(".[].license // empty | .name")
)

Could you add a recipe to make the Pipe class work with class methods?
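One possible recipe is a descriptor that binds self before wrapping the method in a Pipe. The sketch below uses a minimal stand-in Pipe and a hypothetical method_pipe decorator (the toy Doubler class exists only for the demo):

```python
import functools

class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

class method_pipe:
    # descriptor: on attribute access, bind `self` and wrap the method in a Pipe
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return Pipe(functools.partial(self.func, obj))

class Doubler:
    def __init__(self, factor):
        self.factor = factor

    @method_pipe
    def scale(self, items, offset):
        return [x * self.factor + offset for x in items]

d = Doubler(2)
assert ([1, 2, 3] | d.scale(10)) == [12, 14, 16]
```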

Using Pipes in the functions

I'm stuck on something that seems like it should be simple:

@Pipe
def flat(it): return sum(it, [])

syns = map(lambda z: z.synsets())

What I want instead is:

syns = map(lambda z: z.synsets()) | flat

so I can say:

.... | syns

instead of:

... | syns | flat
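If Pipe itself composes with | (I believe recent versions of pipe support something like this, but here is a stand-in making the mechanism explicit), the partial pipeline can be named and reused:

```python
class Pipe:
    # minimal stand-in for pipe.Pipe, with pipe-to-pipe composition
    def __init__(self, function):
        self.f = function
    def __ror__(self, other):
        return self.f(other)
    def __or__(self, other):
        # a Pipe on the left of | composes instead of applying
        return Pipe(lambda x: other.f(self.f(x)))
    def __call__(self, *args, **kwargs):
        return Pipe(lambda x: self.f(x, *args, **kwargs))

@Pipe
def select(it, f):
    return [f(x) for x in it]

@Pipe
def flat(it):
    return sum(it, [])

# compose once, reuse everywhere
syns = select(lambda z: [z, z]) | flat
assert ([1, 2] | syns) == [1, 1, 2, 2]
```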

Pipes with Context

In addition to the iterator, how would I also have a context:


@Pipe
def foo(it, ctx): ....

and still use it like:

... | foo | bar | ...
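One way is to bind the context via a closure, so the piped function still only receives the iterable. A sketch with a stand-in Pipe and a hypothetical with_context factory:

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)

def with_context(ctx):
    # the context travels in the closure, not through the pipe
    @Pipe
    def foo(it):
        return [x + ctx["offset"] for x in it]
    return foo

ctx = {"offset": 10}
assert ([1, 2] | with_context(ctx)) == [11, 12]
ctx["offset"] = 100  # the closure sees updates to the shared context
assert ([1, 2] | with_context(ctx)) == [101, 102]
```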

Recommendations for additional operators and make it more Pipe

Wow, what an incredible project and a fantastic idea! When I first came across this repository, it immediately reminded me of Rust iterators and their ability to chain multiple methods lazily. It got me thinking about how great it would be to implement some of those operations, just like in Rust.
For example:

  1. ... | <Pipe function> | collect(factory) instead of factory(... | <Pipe function>) to PIPE ALL!
  2. Creating more official operators that are both practical and captivating for users. Here are some potential examples:
import functools

@Pipe
def step_by(iterable, step):
    "Yield one item out of 'step' in the given iterable."
    for i, item in enumerate(iterable):
        if i % step == 0:
            yield item


@Pipe
def reduce(iterable, predicate):
    "Reduce the given iterable to one element using the given criterion."
    return functools.reduce(predicate, iterable)


@Pipe
def position(iterable, predicate):
    "Get the position of the element in the iterable."
    for i, item in enumerate(iterable):
        if predicate(item):
            return i


@Pipe
def next_chunk(iterable, n):
    ...

# something more

If you consider it to be practical, I would be delighted to contribute.

Conda package information

Please provide link to conda package in readme if it exists; if not, please create a conda package, thank you!

Update package on pip

Hi,

Can you update the pipe package on pip?

Just installed the package and these are the functions available:
['Pipe', '__all__', '__author__', '__builtins__', '__credits__', '__date__', '__doc__', '__file__', '__name__', '__package__', '__version__', 'add', 'aggregate', 'all', 'any', 'as_dict', 'as_list', 'as_tuple', 'average', 'builtins', 'chain', 'chain_with', 'closing', 'concat', 'count', 'first', 'groupby', 'islice', 'itertools', 'izip', 'lineout', 'max', 'min', 'netcat', 'netwrite', 'permutations', 'reduce', 'reverse', 'select', 'skip', 'skip_while', 'socket', 'sort', 'stdout', 'sys', 'tail', 'take', 'take_while', 'tee', 'traverse', 'where']

There are no strip, lstrip, rstrip

Regards

How do you handle types that implement | ?

If variable X of type Y implements the |-operator, the pipe does not work, because the left operand has precedence in evaluation. How do you overcome that case? Ex.:

  X | ZZ

TypeError: bitarray object expected for bitwise operation
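There is no general fix, since Python gives the left operand's __or__ priority. One workaround: wrap the left operand in iter(), whose iterator type defines no __or__ of its own, so the Pipe's __ror__ gets its turn. A sketch (Shouty and count are stand-ins for the demo):

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)

count = Pipe(lambda it: sum(1 for _ in it))

class Shouty(list):
    def __or__(self, other):
        return "FOO"  # the left operand wins; Pipe.__ror__ never runs

assert (Shouty([1, 2, 3]) | count) == "FOO"    # the problem
assert (iter(Shouty([1, 2, 3])) | count) == 3  # the workaround
```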

input Type dependent preprocessing function ?

How would you integrate a type-dependent preprocessing function like this one:

def pre(fun, data):
    if hasattr(data, '__iter__'):
        return [fun(d) for d in data]
    return fun(data)

I tried several different ways, but can't get it to work!
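One way to wire it in: make pre itself a pipe that takes the function as its argument (stand-in Pipe below; the argument order is flipped from the original snippet because the pipe injects the data first):

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda data: self.function(data, *args, **kwargs))

@Pipe
def pre(data, fun):
    # dispatch on the input: map over iterables, apply directly to scalars
    if hasattr(data, '__iter__'):
        return [fun(d) for d in data]
    return fun(data)

assert ([1, 2, 3] | pre(lambda x: x * 2)) == [2, 4, 6]
assert (5 | pre(lambda x: x * 2)) == 10
```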

Add type hints

Consider supporting a static type checker such as mypy. For libraries, I think it's very important to have type hints, so that when users use the lib, code linters can help, knowing exactly what kinds of parameters are accepted and what the return type is.

The ideal solution would be to add type hints as specified by PEP 484, PEP 526, PEP 544, PEP 586, PEP 589, and PEP 591 directly in the code.
