pipetools's People

Contributors

0101, arnie97, crazylionheart, gschaffner, mpwang

pipetools's Issues

Enhancement possibility? --> Pipe cache

Creating a pipe carries some overhead. For some use cases it may be advantageous to cache pipes, or even partial pipes. Would it be possible to cache pipes automatically, or to enable caching with a switch?

Here you can see the "penalty" associated with creating pipes.

$ ipython3
Python 3.7.5 (default, Oct 17 2019, 12:21:00) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.18.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from pipetools import pipe,X,foreach

In [2]: def my_func(count=10000, predef = False):
   ...:     if predef == False:
   ...:         for k in range(count):
   ...:             a = range(10) > pipe | foreach(X**2) | sum
   ...:     else:
   ...:         my_pipe = pipe | foreach(X**2) | sum
   ...:         for k in range(count):
   ...:             a = range(10) > my_pipe
   ...:     return a
   ...: 

In [3]: %timeit my_func()
202 ms ± 8.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit my_func(predef=True)
59.5 ms ± 1.67 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [5]: %timeit for k in range(10000): a=sum([x**2 for x in range(10)])
29.9 ms ± 962 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
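
The timings above suggest that most of the cost comes from rebuilding the pipe on every iteration rather than from calling it. Until something like this exists in the library, one way to get a similar effect is to build each composed callable once and memoize it. The sketch below is only an illustration: `compose` is a hand-written stand-in for a pipe (it is not part of pipetools), and `sum_of_squares_pipe` is a hypothetical name.

```python
from functools import lru_cache, reduce


def compose(*funcs):
    """Hypothetical stand-in for a pipe: apply funcs left to right."""
    def composed(value):
        return reduce(lambda acc, f: f(acc), funcs, value)
    return composed


def square_all(xs):
    return [x ** 2 for x in xs]


@lru_cache(maxsize=None)
def sum_of_squares_pipe():
    # Built on the first call only; later calls return the cached callable.
    return compose(square_all, sum)


def my_func(count=10000):
    result = None
    for _ in range(count):
        # Looking up the cached callable is cheap compared with rebuilding it.
        result = sum_of_squares_pipe()(range(10))
    return result
```

Whether an automatic cache inside the library would pay off depends on how pipes are keyed, since lambdas and X expressions are new objects each time they are written.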

Class Methods

Can pipetools be used to compose methods of a class, both inside the class (via self.function_name) and outside it (via an instance, like obj.function_name)?
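
Bound methods such as self.function_name and obj.function_name are ordinary callables, so anything that composes callables should in principle accept them. The sketch below illustrates the idea with a hand-written `compose` helper rather than pipetools itself; `Cleaner` and its methods are made-up names.

```python
from functools import reduce


def compose(*funcs):
    """Hypothetical stand-in for a pipe: apply funcs left to right."""
    def composed(value):
        return reduce(lambda acc, f: f(acc), funcs, value)
    return composed


class Cleaner:
    def __init__(self, suffix):
        self.suffix = suffix

    def strip(self, text):
        return text.strip()

    def tag(self, text):
        return text + self.suffix

    def clean(self, text):
        # Inside the class: compose bound methods reached through self.
        return compose(self.strip, self.tag)(text)


obj = Cleaner("!")
# Outside the class: compose bound methods reached through the instance.
outside = compose(obj.strip, obj.tag)
```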

>> instead of > for input

Is it possible to change > to >> as the input operator? IMO >> is used less often, especially in numerical / science-related code. I looked through the source code but couldn't find where to make the change (I don't see a __gt__ method in the Pipe class).

Thank you.
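
For context, here is a sketch of how such input operators can work in Python. When the left operand's own method returns NotImplemented, `value > obj` falls back to `obj.__lt__(value)` and `value >> obj` falls back to `obj.__rrshift__(value)`, which may be why no __gt__ is visible. `MiniPipe` below is a hypothetical, simplified class and not the pipetools implementation. Note the precedence difference: `|` binds tighter than `>` but looser than `>>`.

```python
class MiniPipe:
    """Hypothetical, simplified pipe; not the pipetools implementation."""

    def __init__(self, funcs=()):
        self.funcs = tuple(funcs)

    def __or__(self, func):
        return MiniPipe(self.funcs + (func,))

    def __call__(self, value):
        for f in self.funcs:
            value = f(value)
        return value

    def __lt__(self, value):
        # `value > pipe` lands here when value's own __gt__ gives up.
        return self(value)

    def __rrshift__(self, value):
        # `value >> pipe` lands here when value's own __rshift__ gives up.
        return self(value)


pipe = MiniPipe()


def double(n):
    return n * 2


def inc(n):
    return n + 1


a = 5 > pipe | double | inc      # parsed as 5 > (pipe | double | inc)
b = 5 >> (pipe | double | inc)   # parentheses are required with >>
```

Without the parentheses, `5 >> pipe | double | inc` is parsed as `(5 >> pipe) | double | inc`, so the input would be consumed by the empty pipe before the functions are attached.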

New Feature: Add the ability to pipe args and kwargs.

PR 'Add the ability to pipe args and kwargs.' -> #23

I added the ability to pipe *args and **kwargs to a function.
You no longer need to pipe a tuple with the function as its first element and the parameters as its second.
Instead, you can pass *args and **kwargs to a function using the pipe operator '|'.
prepare_function_for_pipe now knows how to handle keyword-only arguments.
And __or__ knows how to treat next_func as *args and **kwargs for self.func.

    # Automatic partial with *args
    range_args: tuple[int, int, int] = (1, 20, 2)
    # Using pipe
    my_range: Callable = pipe | range | range_args
    # Using tuple
    my_range: Callable = pipe | (range, range_args)
    # list(my_range()) == [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

    # Automatic partial with **kwargs
    dataclass_kwargs: dict[str, bool] = {'frozen': True, 'kw_only': True, 'slots': True}
    # Using pipe
    my_dataclass: Callable = pipe | dataclass | dataclass_kwargs
    # Using tuple
    my_dataclass: Callable = pipe | (dataclass, dataclass_kwargs)
    @my_dataclass
    class Bla:
        foo: int
        bar: str

    # Bla(5, 'bbb') -> Raises TypeError: takes 1 positional argument but 3 were given
    # Bla(foo=5, bar='bbb').foo == 5

tqdm support?

Suppose I have a pipe out of multiple complicated functions:

long_computation = (pipe | f1 | f2 | f3 | f4)
long_computation(x)

It would be cool to provide tqdm support here (togglable with something like long_computation.enable_progressbar()):

  • the pipe learns the number and the names of the functions it's composed of
  • it uses tqdm to output a progress bar to the terminal: 50%|███████████ | 2/4 [00:33<00:33, 15s/iter, func=f2]

I'd like to help implement this, provided it is possible!
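
A rough sketch of the idea, independent of pipetools internals: if the composed functions are available as a list, progress can be reported before each step. `run_with_progress` and its `report` callback are hypothetical names. A plain callback is used here so the example runs without tqdm; wrapping the loop in `tqdm(funcs)` would be the obvious substitute.

```python
def run_with_progress(funcs, value, report=print):
    """Apply funcs in order, reporting progress before each step."""
    total = len(funcs)
    for index, func in enumerate(funcs, start=1):
        # Fall back to repr() for callables without a __name__.
        name = getattr(func, "__name__", repr(func))
        report(f"{index}/{total} func={name}")
        value = func(value)
    return value


def f1(x):
    return x + 1


def f2(x):
    return x * 2


messages = []
result = run_with_progress([f1, f2], 3, report=messages.append)
```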

@pipe_util usage

I'm trying to do some experiments with pipetools:

@pipe_util
def myaggregate(function):
    return partial(functools.reduce, function)

ret = glob.iglob | foreach(readf) | flatten | foreach(len) | myaggregate(operator.add)
print(ret(path))

and it works. But I'm trying to replace flatten with more_itertools.flatten for speed reasons:

ret = glob.iglob | foreach(readf) | (more_itertools.flatten) | foreach(len) | myaggregate(operator.add)
print(ret(path))

With more_itertools.flatten the run takes 6.49 seconds; with flatten it takes 13.68 seconds.
I have also tried:

@pipe_util
def myflatten():
    return partial(more_itertools.flatten)

but it doesn't work. What am I missing?
Thanks
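
One possible reason for the timing difference, offered as an assumption rather than a measurement: more_itertools.flatten removes exactly one level of nesting (it behaves like itertools.chain.from_iterable), whereas pipetools' flatten appears to descend into arbitrarily nested iterables, which costs a type check per element. The sketch below contrasts the two behaviours using only the standard library; `deep_flatten` is a hypothetical illustration, not pipetools' code.

```python
from itertools import chain


def shallow_flatten(nested):
    # One level only, like itertools.chain.from_iterable.
    return chain.from_iterable(nested)


def deep_flatten(nested):
    # Hypothetical recursive variant: descends into lists and tuples.
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from deep_flatten(item)
        else:
            yield item


data = [[1, [2, 3]], [4]]
shallow = list(shallow_flatten(data))
deep = list(deep_flatten(data))
```

If the data is only ever nested one level deep, both give the same result and the shallow version does less work.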

Streams

Hi everyone,
Just wanted to throw an idea out here: would pipetools be a good fit for stream-based/reactive programming? Specifically, I'm thinking of piping streams of data together rather than piping functions. A stream will not necessarily emit a datum every time it consumes one, whereas a piped function always passes its result to the next function in the sequence.

Would it be enough to override the "compose" method of the Pipe class to adapt to such a use-case?
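
To make the distinction concrete, here is a sketch of stream-style stages written as plain generators, where a stage may emit zero, one, or several items per input. It does not use pipetools, and `pipeline`, `only_even`, and `duplicate` are hypothetical names.

```python
def pipeline(source, *stages):
    """Chain generator stages; each stage takes and returns an iterable."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


def only_even(items):
    # Emits nothing for odd inputs.
    for item in items:
        if item % 2 == 0:
            yield item


def duplicate(items):
    # Emits two outputs per input.
    for item in items:
        yield item
        yield item


result = list(pipeline(range(5), only_even, duplicate))
```

Since each stage is itself a function from iterable to iterable, stages of this shape can be composed like any other functions, which may reduce how much of the composition logic would need to change.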

Computation Using Two X Objects, is it possible?

I was wondering if computation using two X objects is possible, such as in the following situation:

import itertools
from pipetools import *

# given 4 sets: a, b, c, d, find their intersections
abcd = {
  'a': set([1,2,2,8,3]),
  'b': set([1,3,5,7]),
  'c': set([1,1,2,3,5,8]),
  'd': set([4,5]) }
  1. Combination is okay; foreach generates the 3-tuples nicely:
abcd > X.items() | (itertools.combinations, X, 2) | foreach(('{0[0]}&{1[0]}', X[0][1], X[1][1])) | foreach_do(print)
# ('a&b', {8, 1, 2, 3}, {1, 3, 5, 7})
# ('a&c', {8, 1, 2, 3}, {1, 2, 3, 5, 8})
# ('a&d', {8, 1, 2, 3}, {4, 5})
# ('b&c', {1, 3, 5, 7}, {1, 2, 3, 5, 8})
# ('b&d', {1, 3, 5, 7}, {4, 5})
# ('c&d', {1, 2, 3, 5, 8}, {4, 5})
  2. Intersection with a lambda is okay:
abcd > X.items() | (itertools.combinations, X, 2) | foreach(('{0[0]}&{1[0]}', (lambda t: t[0][1] &t[1][1]))) | foreach_do(print)
# ('a&b', {1, 3})
# ('a&c', {8, 1, 2, 3})
# ('a&d', set())
# ('b&c', {1, 3, 5})
# ('b&d', {5})
# ('c&d', {5})
  3. Intersection with X does not work. Is this the limitation mentioned in the documentation?
abcd > X.items() | (itertools.combinations, X, 2) | foreach(('{0[0]}&{1[0]}', X[0][1] & X[1][1])) | foreach_do(print)
# ('a&b', X[1] | X[1] | {8, 1, 2, 3} & X)
# ('a&c', X[1] | X[1] | {8, 1, 2, 3} & X)
# ('a&d', X[1] | X[1] | {8, 1, 2, 3} & X)
# ('b&c', X[1] | X[1] | {1, 3, 5, 7} & X)
# ('b&d', X[1] | X[1] | {1, 3, 5, 7} & X)
# ('c&d', X[1] | X[1] | {1, 2, 3, 5, 8} & X)

I also tried something simpler as follows.
It seems that X cannot be used twice, is that correct?

pipe_square = pipe | X ** 2
pipe_XbyX = pipe | X * X

pipe_square < 7 # 49
pipe_XbyX < 7   # 7 * X
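
As a workaround sketch for cases where one expression needs the incoming value twice, an ordinary function or lambda can take the place of the X expression, as the lambda variant above already does. The version below uses only itertools and a dict comprehension (no pipetools); `pairwise_intersections` is a hypothetical helper name.

```python
from itertools import combinations

abcd = {
    'a': {1, 2, 8, 3},
    'b': {1, 3, 5, 7},
    'c': {1, 2, 3, 5, 8},
    'd': {4, 5},
}


def pairwise_intersections(named_sets):
    # An ordinary function can refer to both operands of each pair.
    return {
        f'{name1}&{name2}': set1 & set2
        for (name1, set1), (name2, set2) in combinations(named_sets.items(), 2)
    }


result = pairwise_intersections(abcd)

# The simpler case: a lambda that uses its argument twice.
x_by_x = lambda x: x * x
```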

`join` util

I'd like to propose a new join util.

The idea is simple: it calls foreach(str) before calling str.join. So, whenever you have something like

| foreach(str)
| ', '.join

you can replace it with

| join(', ')

You can customize the string conversion by passing either a function or a format string as the second parameter, e.g.:

| join(', ', lambda x: '-{}-'.format(x))        # function
| join(', ', '-{}-')                            # fmt string

Implementation

def join(delim, formatter=str):
    '''
    join(' ')
    join(' ', fmtFn)
    join(' ', fmtString)
    '''
    
    return foreach(formatter) | delim.join
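
The implementation above relies on foreach accepting either a callable or a format string. For readers who want to see the intended behaviour spelled out, here is a standalone sketch that does not depend on pipetools; it returns a plain function instead of a pipe, and the explicit format-string branch is an assumption about what foreach does internally.

```python
def join(delim, formatter=str):
    """Stand-in sketch: return a callable that formats and joins items."""
    if isinstance(formatter, str):
        # Treat a string as a format template, e.g. '-{}-'.
        formatter = formatter.format

    def joiner(items):
        return delim.join(formatter(item) for item in items)

    return joiner


plain = join(', ')([1, 2, 3])
with_func = join(', ', lambda x: '-{}-'.format(x))([1, 2, 3])
with_fmt = join(', ', '-{}-')([1, 2, 3])
```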

Tests

def test_join(self):
    
    r = [1, 2, 3] > (pipe
                    | join(', ')
                    )
    
    self.assertEqual(r, '1, 2, 3')
    
    
def test_join_with_formatter(self):
    
    r = [1, 2, 3] > (pipe
                    | join(', ', lambda x: '-{}-'.format(x))
                    )
    
    self.assertEqual(r, '-1-, -2-, -3-')
    
    
def test_join_with_fmtString(self):
    
    r = [1, 2, 3] > (pipe
                    | join(', ', '-{}-')
                    )
    
    self.assertEqual(r, '-1-, -2-, -3-')

foreach_i

I'd like to propose a new util: foreach_i.

Motivation

You know how JS's Array.map also passes the element index as a 2nd parameter to the function?

> ['a', 'b', 'c'].map((x, i) => `Element ${i} is ${x}`)
[ 'Element 0 is a', 'Element 1 is b', 'Element 2 is c' ]

That's exactly what I'm trying to do here. The only difference is that the index would be passed as the 1st param:

def test_foreach_i():
    
    r = ['a', 'b', 'c'] > (pipe
                          | foreach_i(lambda i, x: f'Element {i} is {x}')
                          | list
                          )
    
    assert r == [ 'Element 0 is a'
                , 'Element 1 is b'
                , 'Element 2 is c'    
                ]

(Naïve) Implementation

from pipetools.utils import foreach, as_args
from typing import Callable, TypeVar


A = TypeVar('A')
B = TypeVar('B')

def foreach_i(f: Callable[[int, A], B]):
    
    return enumerate | foreach(as_args(f))

The same could be done for foreach_do.
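
A sketch of what the foreach_do counterpart might look like, written without pipetools so that it runs on its own. `foreach_do_i` is a hypothetical name, and the assumption here is that, like foreach_do, it runs eagerly for side effects and returns nothing.

```python
def foreach_do_i(f):
    """Stand-in sketch: call f(index, element) for every element, eagerly."""
    def runner(items):
        for i, x in enumerate(items):
            f(i, x)
        # Nothing is returned; the function exists for its side effects.
    return runner


seen = []
outcome = foreach_do_i(lambda i, x: seen.append(f'Element {i} is {x}'))(['a', 'b', 'c'])
```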

Does pipetools support Pandas Dataframes?

I am trying to run this example

additives_df.loc[:, ["pallets"]] > pipe | (lambda df: df.head())

but I get the exception

TypeError: '>' not supported between instances of 'float' and 'function'

Is there a proper way to pipe pandas DataFrames?

Add support for Python 3.9

According to the DeprecationWarning output, there are two issues:

pipetools/main.py:1: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
    from collections import Iterable

and

pipetools/utils.py:2: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
    from collections import Mapping

Docs: https://docs.pytest.org/en/stable/warnings.html
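
The usual fix for both warnings is to import the ABCs from collections.abc. A minimal sketch, with a fallback that only matters on interpreters older than Python 3.3:

```python
try:
    # Preferred location since Python 3.3.
    from collections.abc import Iterable, Mapping
except ImportError:
    # Only reached on very old interpreters.
    from collections import Iterable, Mapping

is_iterable = isinstance([1, 2, 3], Iterable)
is_mapping = isinstance({'a': 1}, Mapping)
```

If older interpreters no longer need to be supported, the plain `from collections.abc import ...` line is enough.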

Using `x > pipe | foo | bar` gives different results than `(pipe | foo | bar)(x)`

I'm not totally sure about anything here (this is my first time filing a bug), but it seems I've found a situation where these two seemingly equivalent statements do different things. I've included a minimal example:

import numpy as np
from pipetools import pipe

def scaleDiagonal(matrix):

	m = np.copy(matrix)
	size = matrix.shape[0]

	smallest = min([ m[i,i] for i in range(size) ])

	for i in range(size):
		m[i,i] = m[i,i] / smallest

	return m

x = np.array([1,2,3,4,5,6,7,8,9]).reshape(3,3)

y = (pipe | scaleDiagonal)(x)         # Works
z = x > pipe | scaleDiagonal          # Errors
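
A likely explanation, stated as an assumption rather than a confirmed diagnosis: `x > pipe | f` only reaches the pipe when x's own __gt__ declines the comparison. Types that implement > elementwise, as NumPy arrays and pandas objects do, handle the operator themselves, so the pipe never receives the whole object. The sketch below reproduces the effect with two hypothetical classes, `MiniPipe` and `Elementwise`, and no third-party imports.

```python
class MiniPipe:
    """Hypothetical, simplified pipe; not the pipetools implementation."""

    def __init__(self, func):
        self.func = func

    def __call__(self, value):
        return self.func(value)

    def __lt__(self, value):
        # Reached for `value > pipe` only if value's __gt__ returns NotImplemented.
        return self(value)


class Elementwise:
    """Hypothetical container that, like an array, compares item by item."""

    def __init__(self, items):
        self.items = list(items)

    def __gt__(self, other):
        return [item > other for item in self.items]


total = MiniPipe(lambda v: sum(v.items) if isinstance(v, Elementwise) else v)
data = Elementwise([1, 2, 3])

called = total(data)       # the pipe receives the whole container
reflected = total < data   # MiniPipe.__lt__ is tried first
operator = data > total    # Elementwise.__gt__ wins; items are piped one by one
```

Under that assumption, calling the pipe directly, or writing the pipe on the left with <, avoids the problem because the pipe's method is consulted before the array's.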
