pipe's People

Contributors

abdur-rahmaanj, amper, babakness, brentp, briaoeuidhtns, dalexander, devrma, dit-zy, fidget-spinner, gliptak, hros, isidroas, j450h1, javadba, jbvsmo, jerabaul29, julienpalard, kianmeng, mrjbq7, nykh, righthandabacus, safwank, sergiors, sobolevn, sugatoray, vstoykov, yorailevi


pipe's Issues

Configurable Pipeline with List of transformations

[Feature Request]
Hi @JulienPalard ,
I have an idea I'd like to run by you before I raise a pull request.
I want to add a capability where the user can define various transformations with @Pipe and then have these transformations form a configurable pipeline. This pipeline can be reused multiple times, rather than defining the same set of transformations for every input.

Idea is:

class Pipeline:
    def __init__(self, methods):
        self.methods = methods

    def execute(self, value):
        return self.__recursive(value, self.methods)

    def __recursive(self, value, methods):
        if methods:
            return self.__recursive(value | methods[0], methods[1:])
        return value


if __name__ == '__main__':
    methods = [
        select(lambda x: x * x),
        where(lambda x: x % 2 == 1),
        collect
    ]
    pipeline = Pipeline(methods=methods)
    assert [1, 9, 25, 49, 81] == pipeline.execute(range(10))

Let me know what you think.
Thanks
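For what it's worth, the recursive helper could also be written as a functools.reduce fold. The sketch below is runnable on its own: it uses minimal stand-ins for Pipe, select, where, and a hypothetical collect, since the real library isn't imported here.

```python
import functools

class Pipe:
    # minimal stand-in for pipe.Pipe; only what this sketch needs
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def select(iterable, selector):
    return (selector(x) for x in iterable)

@Pipe
def where(iterable, predicate):
    return (x for x in iterable if predicate(x))

@Pipe
def collect(iterable):  # hypothetical collector from the proposal above
    return list(iterable)

class Pipeline:
    def __init__(self, methods):
        self.methods = methods

    def execute(self, value):
        # fold the value through each pipe, left to right
        return functools.reduce(lambda acc, method: acc | method, self.methods, value)

pipeline = Pipeline([select(lambda x: x * x), where(lambda x: x % 2 == 1), collect])
assert pipeline.execute(range(10)) == [1, 9, 25, 49, 81]
```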

IDEA: fan out operator

You could implement the & (or +?) operator as a fan-out operator.
It could be used when you want to send the output of one Pipe to several others at the same time, something like:

[1,2,3,4] | where(...) | ( stdout & sum & max )

Of course the result is a bit questionable... a pipeable tuple ?
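A rough sketch of how & fan-out could work, as a standalone class (hypothetical, not part of the library); note the input must be re-iterable, so a one-shot generator would need to be materialized first:

```python
class FanOut:
    # applies several callables to the same input, returns a tuple of results
    def __init__(self, *functions):
        self.functions = functions

    def __and__(self, other):
        # FanOut(f) & g & h accumulates callables left to right
        return FanOut(*self.functions, other)

    def __ror__(self, value):
        return tuple(f(value) for f in self.functions)

result = [1, 2, 3, 4] | (FanOut(sum) & max & min)
assert result == (10, 4, 1)
```

The result is indeed the "pipeable tuple" mentioned above: it can keep flowing into further pipes that accept tuples.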

add "how to install" to readme

Add a "how to install" section to the introduction of the readme, e.g.:
pip install pipe

Minor issue, but one can't assume the package is simply called pipe on PyPI.

passing parameters

Hello,
What if I want to pass parameters to the functions? Like:

[1, 2, 3] | func1(param1) | func2(param2)

The code should inject [1, 2, 3] into func1 along with param1.

Is this possible?
Regards
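With the @Pipe decorator, arguments given at call time are passed after the injected iterable, which is exactly this pattern. A minimal sketch with a stand-in Pipe class (an assumption about how pipe.Pipe behaves; the function names are made up for the demo):

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def add_to_each(items, param):   # `items` is injected by |, `param` is yours
    return [x + param for x in items]

@Pipe
def scale_each(items, param):
    return [x * param for x in items]

assert ([1, 2, 3] | add_to_each(10) | scale_each(2)) == [22, 24, 26]
```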

Is there any possibility to add more functions?

I've been working on a library called pyf that implements other functions that don't exist in JulienPalard/Pipe.

I'd like to know: are there plans to add more functions?
If not, I'd like to know the reason.

Pipe does not work if left side object defines __or__

Hi,

I just found your project while reading my GitHub RSS feed and it is really nice! Congrats.

I have a project, Should-DSL, and I use the same __ror__ approach in some places.

But there is a problem with this approach: if the left-side object defines its own __or__, the right-side object's __ror__ (in your case, a Pipe instance) is never called.

Let me show you an example:

from pipe import count

class MyList(list):
    def __or__(self, other):
        return "FOO"

def test_builtins():
    result = [1, 2, 3] | count
    assert result == 3, result

def test_custom_objects():
    result = MyList([1, 2, 3]) | count
    assert result == 3, result


if __name__ == '__main__':
    test_builtins()
    test_custom_objects()

The output follows:

Traceback (most recent call last):
  File "failing_example.py", line 18, in <module>
    test_custom_objects()
  File "failing_example.py", line 13, in test_custom_objects
    assert result == 3, result
AssertionError: FOO

Unfortunately I have no solution for this. The approach a friend of mine was trying to use in Should-DSL is to add a new operator for special cases - but that's not a good solution.

Please, let me know if you find out a solution!

Cheers,
Hugo.

Lambda replacements

Hi!

I like this package! I wrote something similar: https://github.com/gamis/flo, but I think I like yours better.

The one thing mine has that yours doesn't is a concise replacement for lambda functions. I find lambda x: x**2 kind of annoying. With my library, instead of having code that looks like

mylist = ['pretty','cool','items', 'kiddo']
myindex = mylist | map(lambda x: x.upper()) | where(lambda x: 'E' in x) | groupby(key=lambda x: x[0]) | select(lambda x: x[0])

it could look like

myindex = mylist | map( _.upper() ) | where( _.has('E') ) | groupby( _[0] ) | select( _[0] )

or potentially even terser:

myindex = mylist | map_.upper() | where_.has('E') | groupby_[0] | select_[0]

If you're interested in such an addition, I can work on a PR in the next month or two. In any case, I'd love any feedback you have.

Thanks!

Greg

about lazy iteration

Just a small question to clarify: I looked at the code, and this library is not lazily executed, right? It may be a good idea to add a small note in the readme saying that this library provides "syntactic sugar" but not the "expected" advanced functional-programming goodies like lazy evaluation (which some people coming from other languages may expect quite strongly when they see piping, functional-programming-like syntax :) ).

Just to be clear, I love this library, and this is not a criticism, just I think it would be great to make it clear to the user - and also, that may be inspiration for future authors / future improvements etc :) .

two iterators as args

Is this the correct way of doing it?

import itertools

@Pipe
def prod(its):
    yield from itertools.product(*its)

I use it like this:

(it, it) | prod

chain with PLUS

I tried several things but couldn't make it work:

itertools.chain(it1, it2) | ...

Instead, I want to do it this way:

it1 + it2 | ....

As the reverse of that, I want to pass an iterator to multiple functions and join the results, something like:

it | (fun1 + fun2 + ..) | ...

which is:

fun1(it) + fun2(it) + ... | ...

Collector as a side effect ?

How would you write a function that acts as a collector? I normally have the source as a list or numpy array, but want to collect the result as a 2D bitarray, e.g.:

np.random.randint(0,100,10)  | ....... | before_last |  list2Dbitary()

In this case before_last generates a numpy array of bitarrays, and in the last step I convert it to a 2D bitarray; i.e., the pipe generates two data structures, which can be large and take too much memory.

If I have something as context/state, then as a side effect of the functions I can update a 2D bitarray created before the pipe starts.

How would you do that ?
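One way to avoid building the second large structure is a pass-through pipe that updates external state as a side effect. A sketch with a stand-in Pipe and a hypothetical tee_into helper (the list sink stands in for the pre-allocated 2D bitarray):

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def tee_into(iterable, sink):
    # pass items through unchanged while appending each to an external sink
    for item in iterable:
        sink.append(item)
        yield item

sink = []
result = list(range(3) | tee_into(sink))
assert result == [0, 1, 2]
assert sink == [0, 1, 2]
```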

Starmap

Hi,
I was thinking of submitting a PR with a starmap function, something like this:

@Pipe
def starmap(iterable, selector):
    def starfunc(args):
        return selector(*args)
    return builtins.map(starfunc, iterable)

This is really useful in situations where you have an iterable of args that you want to pass to a function and have them unpacked positionally. An example might be parsing a row of values into a data model, e.g.:

PersonAges = namedtuple("PersonAges", ("name", "age"))

rows = [["john", 32], ["paul", 31], ["ringo", 33], ["george", 34]]

people = list(
    rows
    | starmap(PersonAges)
)

This is what you get:

[PersonAges(name='john', age=32),
 PersonAges(name='paul', age=31),
 PersonAges(name='ringo', age=33),
 PersonAges(name='george', age=34)]

Makes it much simpler than doing something like this (especially when you have a large number of values to map):

map(lambda row: PersonAges(name=row[0], age=row[1]))

I couldn't find anything resembling a starmap but if there is another way to achieve this let me know.

PEP

Have you considered making a PEP for this? Is there one and I missed it?

providing aliases corresponding to the functional programming "lingua franca"

Many thanks again for an awesome library.

As far as I know, there is no "hard" standardization on functional-programming terminology across languages yet, but still, some conventions are found across languages. Some of them are already present in this library (like map, take), some others are not present yet (like filter, reduce, head). The situation is a bit confusing for people switching between languages because, as said, this is not standardized; there are aliases and conflicting uses of the same terminology here and there.

Still, wondering if this library could be a nice occasion to try to stick to / follow / establish some lingua franca, by using aliases to map to other programming languages.

Some aliases I specially miss and that are more or less lingua franca are, for example, i) filter as an alias for where, ii) reduce (seems not implemented yet?).

  • Do you think it would be reasonable to go the extra mile in this library and think carefully about fitting / trying to establish some lingua franca?
  • In addition (quite related, so putting it here, but it could be moved to another issue), regarding the documentation in the readme (section "Existing Pipes in this module"): do you think it would be a good idea / possible to order the entries alphabetically, and maybe name all aliases on the line defining the "main" function?

two loops

I'm trying to do nested iterators, but when I try to list() the result:


In [108]: list( file('text/622_lines.txt') | twofor((sents,words))  )                                                                                                        
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-108-6a273f2a1594> in <module>
----> 1 list( file('text/622_lines.txt') | twofor((sents,words))  )

/......./lib/tools/text.py in twofor(it, its)
     28 @Pipe
     29 def twofor(it,its):
---> 30         for it1 in it | its[0]:
     31                 for item in it1 | its[1]:
     32                         yield item

TypeError: 'Pipe' object is not iterable


How about "apply" instead of "select"

Absolutely brilliant library, thank you for it. Now that I have it as my shiny new hammer, I'm looking for a rusty nail to use it on. One minor suggestion: did you consider other names for "select"? As it is, "select" sounds like it should filter, not operate like map (which I see is also supported). Might "apply" be a better term?

Please tag the commits corresponding to releases

Just noticed this really elegant library from a mention on StackOverflow, and I wanted to browse around and see what the changes are between versions -- and noticed that the versions are not tagged!

Could you tag them, or in case you did but the tags didn't get pushed to GitHub, could you push them? Thanks!

`groupby` Update and Use Case

groupby seems to still produce an itertools._grouper object, which appears to be a type slated for deprecation by 2.0.

I also wonder how the keyfunc parameter works and whether it's in scope for pipe end users. I tried to pass something to it, and I got a multiple-values error. Does it, by chance, allow type recasting of the returned object? If not, a feature like that would be wonderful, so as to quickly generate outputs of pipe operations (maybe pipes have something like that already?)

Finally, in the documentation, the x % 2 and "Even" example will produce unexpected results :)

unexpected indent

Hello, this:
euler2 = fib() | where(lambda x: x % 2 == 0)
| take_while(lambda x: x < 4000000)
| add

gives me an error
| take_while(lambda x: x < 4000000)
^
IndentationError: unexpected indent

can you help?
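Python stops parsing at the end of the first line, so the continuation lines are syntax errors; wrapping the whole expression in parentheses (or ending each line with a backslash) fixes it. A runnable sketch, with minimal stand-ins for the pipes involved (assumptions, not the library's actual implementations):

```python
import itertools

class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def where(iterable, predicate):
    return (x for x in iterable if predicate(x))

@Pipe
def take_while(iterable, predicate):
    return itertools.takewhile(predicate, iterable)

@Pipe
def add(iterable):
    return sum(iterable)

def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# parentheses let the expression span several lines
euler2 = (fib()
          | where(lambda x: x % 2 == 0)
          | take_while(lambda x: x < 4000000)
          | add)
assert euler2 == 4613732
```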

allowing Pipe constructor with more arguments acting as a partial

The examples in the readme

sum(count() | select(lambda x: x ** 2) | take(10))

can also be written as:

count() | select(lambda x: x ** 2) | take(10) | Pipe(sum)

Calling the pipe allows using it similarly to partial:

count() | select(lambda x: x ** 2) | take(10) | Pipe(sum)(10)

The extra brackets are a little odd, I think.

What about this?

count() | select(lambda x: x ** 2) | take(10) | Pipe(sum,10)

Very cool project

Thanks for sharing this amazing project!
The coding style is easy to read, and Pipe is a powerful tool ~

closing pipes, like `reduce`, are currently not supported

Moved from the initial discussion in #67 ; from the author:

For reduce I don't think pipe can handle it, I deprecated "closing pipes" a year or so ago:

What I mean by a "closing pipe" is a pipe that does not return an iterable, but a value, so itself cannot be on the left hand side of a pipe, "breaking" the pipe, example:

>>> range(100) | filter(lambda x: x % 2 == 0) | sum 

would be readable, OK, but sum does not return an iterable, so no further | can be used. I deprecated this in favor of the even shorter and standard:

>>> sum(range(100) | filter(lambda x: x % 2 == 0)) 

I think reduce enters this category of finalizing pipes so it can't be added, as it would be better written as:

>>> from functools import reduce
>>> reduce(lambda x, y: x + y, range(100) | filter(lambda x: x % 2 == 0))

I understand the author's point, but I would like to disagree on it :) . This is maybe mostly aesthetics, but to my eyes this:

res = ( range(100) | filter(lambda x: x % 2 == 0)
                   | reduce(lambda x, y: x + y) )

looks nicer and more readable than this:

>>> from functools import reduce
>>> reduce(lambda x, y: x + y, range(100) | filter(lambda x: x % 2 == 0))

Because in the first case I can just follow the logic as it flows, as my brain expects it, but in the second case I have to remember that while most of the expression flows from left to right, the final step is an exception to this rule, sitting at the far left... Also, this breaks my habits from other places where I see similarly formatted functional-programming expressions, like Rust and co., which look much more like the first way of formatting it.

Would there be a way to enable a syntax that looks like the first case, but without the worries about closing pipes that are raised by the author? For example, using a separate, special "closing pipe" class that would allow to perform checks and to issue meaningful error messages if closing pipes are used at the wrong place in the expression?
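For what it's worth, a reduce pipe in the first style can be sketched as below, with stand-ins for Pipe and where; reduce_with is a hypothetical name chosen to avoid shadowing functools.reduce:

```python
import functools

class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def where(iterable, predicate):
    return (x for x in iterable if predicate(x))

@Pipe
def reduce_with(iterable, function):
    # folds the iterable down to a single value: a "closing" pipe
    return functools.reduce(function, iterable)

res = (range(100) | where(lambda x: x % 2 == 0)
                  | reduce_with(lambda x, y: x + y))
assert res == 2450  # sum of the even numbers below 100
```

Nothing stops a further | after such a pipe, which is exactly the closing-pipe hazard the author describes; a dedicated class could raise a clear error there.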

Could the built-in functions always output an iterator?

When the input is an iterator, it's quite tempting to call next() on the output. But a number of the functions don't return an iterator, resulting in a TypeError, e.g.:

>>> next(range(10) | tail(2))
[...]
TypeError: 'collections.deque' object is not an iterator

It would be sweet if all/most of the functions would return an actual iterator. Exceptions would of course be things like as_list where you're explicitly asking for an output type.

My use-case, in case it matters, is leisurely throwing python scripts at a csv file that doesn't fit in memory. Pandas tries to load the whole thing in memory and fails, unless I use the chunksize argument, which makes it choke every so often -- and the syntax is god awful. Line by line seems to work fine, however slow. I ran into your library while looking for one that could basically support some kind of read_csv | some_stuff | more_stuff | write_csv type of workflow and do the entire thing row by row without me needing to reinvent the wheel. (Suggestions welcome if you know of a better option.)
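Until that happens, a small adapter pipe can coerce any result into an iterator. A sketch with stand-ins (as_iter is hypothetical; tail here mirrors the deque-based behavior in the error above):

```python
import collections

class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

@Pipe
def tail(iterable, n):
    # keeps only the last n items; a deque is not itself an iterator
    return collections.deque(iterable, maxlen=n)

@Pipe
def as_iter(obj):
    # adapter: turn whatever the previous pipe returned into an iterator
    return iter(obj)

assert next(range(10) | tail(2) | as_iter) == 8
```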

pylint issue

This code snippet from the readme produces a pylint error:

sum(range(100) | where(lambda x: x % 2 == 0))

No value for argument 'predicate' in function call pylint(no-value-for-parameter)

Shadowing built-in functions

These four functions in the package

  • any()
  • all()
  • max()
  • min()

shadow built-in functions. Since Python doesn't have an easy way to hide certain names during from pipe import *, may I suggest renaming them (perhaps simply prefixing them with p, like pmax)? Since we already took the stance of renaming map to select and filter to where, I think this is in keeping with the whole strategy and will allow users to from pipe import * without worrying about shadowing.

Since this is definitely a breaking change, I respect your opinion. The exact naming is also up to you.

documentation correction

the docs state that

[1, 2, [3]] | chain

gives a TypeError: chain argument #1 must support iteration, and suggests using traverse.

However, the command does not raise an error but rather returns an <itertools.chain object>.
Only when casting to a list:

list([1, 2, [3]] | chain)

is there an error: TypeError: 'int' object is not iterable

Concat on unicode objects

Sorry if this has already been patched in one of the forks.

In [189]: "héllo" | pipe.concat
Out[189]: 'h, \xc3, \xa9, l, l, o'

In [190]: u"héllo" | pipe.concat
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)

/home/chaouche/CODE/<ipython console> in <module>()

/usr/lib/python2.7/site-packages/pipe.py in __ror__(self, other)
350
351     def __ror__(self, other):
--> 352         return self.function(other)
353
354     def __call__(self, *args, **kwargs):

/usr/lib/python2.7/site-packages/pipe.py in concat(iterable, separator)
488 @Pipe
489 def concat(iterable, separator=", "):
--> 490     return separator.join(map(str,iterable))
491
492 @Pipe

UnicodeEncodeError: 'ascii' codec can't encode character u'\xc3' in position 0: ordinal not in range(128)

new version?

I found that I can pip install pipe, but it is kind of old (v1.4.2 from 2010); surprisingly, the pipe.py here is also marked as v1.4.2! Would you consider bumping the version and updating the pypi.org repository as well?

@Pipe decorator for class methods

The Pipe decorator does not work on methods defined on a class.

This would be useful to make a pipe out of an existing method. As an example, I can do something like this:

class Job(requests.Session):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def get(self, items, url, *args, **kwargs):
        for item in items:
            yield super().get(url.format(item), *args, **kwargs).json()

    def jq(self, items, stmt):
        compiled = jq.compile(stmt)
        for item in items:
            yield from iter(compiled.input_value(item))


job = Job()

x = (
    ["hansthen", "JulienPalard"] |
    Pipe(job.get)("https://api.github.com/users/{}/repos") |
    Pipe(job.jq)(".[].license // empty | .name")
)

But I would rather create the Pipe at class level, like this:

class Job(requests.Session):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    @Pipe
    def get(self, items, url, *args, **kwargs):
        for item in items:
            yield self.session.get(url.format(item), *args, **kwargs).json()

    @Pipe
    def jq(self, items, stmt):
        compiled = jq.compile(stmt)
        for item in items:
            yield from iter(compiled.input_value(item))


job = Job()

x = (
    ["hansthen", "JulienPalard"] |
    job.get("https://api.github.com/users/{}/repos") |
    job.jq(".[].license // empty | .name")
)

Could you add a recipe to make the Pipe class work with class methods?
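One possible recipe is a descriptor that binds self before wrapping the method in a Pipe. The sketch below uses a minimal stand-in Pipe and a hypothetical method_pipe decorator (the toy Doubler class exists only for the demo):

```python
import functools

class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda iterable: self.function(iterable, *args, **kwargs))

class method_pipe:
    # descriptor: on attribute access, bind `self` and wrap the method in a Pipe
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return Pipe(functools.partial(self.func, obj))

class Doubler:
    def __init__(self, factor):
        self.factor = factor

    @method_pipe
    def scale(self, items, offset):
        return [x * self.factor + offset for x in items]

d = Doubler(2)
assert ([1, 2, 3] | d.scale(10)) == [12, 14, 16]
```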

Using Pipes in the functions

I'm stuck on something that seems like it should be simple:

@Pipe
def flat(it): return sum(it, [])

syns = map(lambda z: z.synsets())

What I want instead is:

syns = map(lambda z: z.synsets()) | flat

so I can say:

.... | syns

instead of:

... | syns | flat
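If Pipe itself composes with | (I believe recent versions of pipe support something like this, but here is a stand-in making the mechanism explicit), the partial pipeline can be named and reused:

```python
class Pipe:
    # minimal stand-in for pipe.Pipe, with pipe-to-pipe composition
    def __init__(self, function):
        self.f = function
    def __ror__(self, other):
        return self.f(other)
    def __or__(self, other):
        # a Pipe on the left of | composes instead of applying
        return Pipe(lambda x: other.f(self.f(x)))
    def __call__(self, *args, **kwargs):
        return Pipe(lambda x: self.f(x, *args, **kwargs))

@Pipe
def select(it, f):
    return [f(x) for x in it]

@Pipe
def flat(it):
    return sum(it, [])

# compose once, reuse everywhere
syns = select(lambda z: [z, z]) | flat
assert ([1, 2] | syns) == [1, 1, 2, 2]
```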

Pipes with Context

In addition to the iterator, how would I also have a context:


@Pipe
def foo(it, ctx): ....

and still use it like:

... | foo | bar | ...
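One way is to bind the context via a closure, so the piped function still only receives the iterable. A sketch with a stand-in Pipe and a hypothetical with_context factory:

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)

def with_context(ctx):
    # the context travels in the closure, not through the pipe
    @Pipe
    def foo(it):
        return [x + ctx["offset"] for x in it]
    return foo

ctx = {"offset": 10}
assert ([1, 2] | with_context(ctx)) == [11, 12]
ctx["offset"] = 100  # the closure sees updates to the shared context
assert ([1, 2] | with_context(ctx)) == [101, 102]
```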

Recommendations for additional operators and make it more Pipe

Wow, what an incredible project and a fantastic idea! When I first came across this repository, it immediately reminded me of Rust iterators and their ability to chain multiple methods lazily. It got me thinking about how great it would be to implement some of those operations, just like in Rust.
For example:

  1. ... | <Pipe function> | collect(factory) instead of factory(... | <Pipe function>) to PIPE ALL!
  2. Creating more official operators that are both practical and captivating for users. Here are some potential examples:
import functools

@Pipe
def step_by(iterable, step):
    "Yield one item out of 'step' in the given iterable."
    for i, item in enumerate(iterable):
        if i % step == 0:
            yield item


@Pipe
def reduce(iterable, predicate):
    "Reduce the given iterable to one element using the given criterion."
    return functools.reduce(predicate, iterable)


@Pipe
def position(iterable, predicate):
    "Get the position of the element in the iterable."
    for i, item in enumerate(iterable):
        if predicate(item):
            return i


@Pipe
def next_chunk(iterable, n):
    ...

# something more

If you consider it to be practical, I would be delighted to contribute.

Conda package information

Please provide link to conda package in readme if it exists; if not, please create a conda package, thank you!

Update package on pip

Hi,

Can you update the pipe package on pip?

Just installed the package and these are the functions available:
['Pipe', '__all__', '__author__', '__builtins__', '__credits__', '__date__', '__doc__', '__file__', '__name__', '__package__', '__version__', 'add', 'aggregate', 'all', 'any', 'as_dict', 'as_list', 'as_tuple', 'average', 'builtins', 'chain', 'chain_with', 'closing', 'concat', 'count', 'first', 'groupby', 'islice', 'itertools', 'izip', 'lineout', 'max', 'min', 'netcat', 'netwrite', 'permutations', 'reduce', 'reverse', 'select', 'skip', 'skip_while', 'socket', 'sort', 'stdout', 'sys', 'tail', 'take', 'take_while', 'tee', 'traverse', 'where']

There are no strip, lstrip, rstrip

Regards

How do you handle types that implement | ?

If variable X of type Y implements the |-operator, the pipe does not work, because the left operand has precedence in evaluation. How do you overcome that case? Ex.:

  X | ZZ

TypeError: bitarray object expected for bitwise operation
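There is no general fix, since Python gives the left operand's __or__ priority. One workaround: wrap the left operand in iter(), whose iterator type defines no __or__ of its own, so the Pipe's __ror__ gets its turn. A sketch (Shouty and count are stand-ins for the demo):

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)

count = Pipe(lambda it: sum(1 for _ in it))

class Shouty(list):
    def __or__(self, other):
        return "FOO"  # the left operand wins; Pipe.__ror__ never runs

assert (Shouty([1, 2, 3]) | count) == "FOO"    # the problem
assert (iter(Shouty([1, 2, 3])) | count) == 3  # the workaround
```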

input Type dependent preprocessing function ?

How would you integrate a type-dependent preprocessing function like this one:

def pre(fun, data):
    if hasattr(data, '__iter__'):
        return [fun(d) for d in data]
    return fun(data)

I tried several different ways, but can't get it to work!
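One way to wire it in: make pre itself a pipe that takes the function as its argument (stand-in Pipe below; the argument order is flipped from the original snippet because the pipe injects the data first):

```python
class Pipe:
    # minimal stand-in for pipe.Pipe
    def __init__(self, function):
        self.function = function
    def __ror__(self, other):
        return self.function(other)
    def __call__(self, *args, **kwargs):
        return Pipe(lambda data: self.function(data, *args, **kwargs))

@Pipe
def pre(data, fun):
    # dispatch on the input: map over iterables, apply directly to scalars
    if hasattr(data, '__iter__'):
        return [fun(d) for d in data]
    return fun(data)

assert ([1, 2, 3] | pre(lambda x: x * 2)) == [2, 4, 6]
assert (5 | pre(lambda x: x * 2)) == 10
```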

Add type hints

Consider supporting a static type checker such as mypy. For libraries, I think it's very important to have type hints, so that when users use the lib, code linters can help, knowing exactly what kinds of parameters are accepted and what the return type is.

The ideal solution would be to add type hints as specified by PEP 484, PEP 526, PEP 544, PEP 586, PEP 589, and PEP 591 directly in the code.
