
asuiu / pyxtension


Pure Python extensions library that includes Scala-like streams, Json with attribute access syntax, and other common use stuff

License: MIT License

Python 98.83% Shell 0.99% Batchfile 0.18%
java-streams mapreduce python python-iterables python-itertools python-json python-mapreduce python-multiprocessing python-multithreading python-streaming streaming

pyxtension's People

Contributors

asuiu, marian-rusu, snyk-bot

Forkers

adzilla

pyxtension's Issues

error invalid syntax when startup

Hi there, I got a strange error.

I installed it in two places, locally and on the production server, but on the production server it gives me an error like this:

Traceback (most recent call last):
  File "versi1_telebot.py", line 1, in <module>
    from pyxtension.Json import Json
  File "/usr/local/lib/python3.5/dist-packages/pyxtension/Json.py", line 14, in <module>
    from pyxtension.streams import *
  File "/usr/local/lib/python3.5/dist-packages/pyxtension/streams.py", line 45
    _IDENTITY_FUNC: Callable[[T], T] = lambda _: _
                  ^
SyntaxError: invalid syntax

pip install version does not match

I installed pyxtension with pip, but strangely the installed version doesn't match the version I requested.
I want to install version 1.11:

$ pip install pyxtension==1.11
Collecting pyxtension==1.11
Installing collected packages: pyxtension
Successfully installed pyxtension-1.0

Then I used pip freeze to check the version:

$ pip freeze | grep pyxtension
pyxtension==1.0

It shows version 1.0.
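
One way to cross-check what pip reports is to ask the interpreter itself which distribution version it sees. This is a small sketch assuming Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version recorded in the installed package metadata (or None).
print(installed_version("pyxtension"))
```

If this prints the same value as pip freeze, the mismatch is in the metadata of the published package rather than in the local environment.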

Implement split_in_batches

Example:

def split_in_batches(self, itr: Iterable[T], batch_max_size: int = BATCH_SIZE) -> Generator[slist[T], None, None]:
    batch = slist()
    for el in itr:
        batch.append(el)
        if batch.size() == batch_max_size:
            yield batch
            batch = slist()
    if batch.size():
        yield batch
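
The same batching idea can be tried without pyxtension. The sketch below is a standalone variant that uses plain lists instead of `slist` and `itertools.islice` instead of a manual counter; the default `BATCH_SIZE` value is an assumption, since the issue does not define it.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")
BATCH_SIZE = 3  # hypothetical default; the issue does not say what it should be


def split_in_batches(itr: Iterable[T], batch_max_size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_max_size elements."""
    if batch_max_size < 1:
        raise ValueError("batch_max_size must be at least 1")
    it = iter(itr)
    while True:
        # islice pulls up to batch_max_size items from the shared iterator.
        batch = list(islice(it, batch_max_size))
        if not batch:
            return
        yield batch


print(list(split_in_batches(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```

Like the proposed method, this is lazy: it never holds more than one batch in memory, and the last batch may be shorter than the others.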

Implement try_map() method on streams

It should apply the map function and, if the function throws an exception, return an instance of a custom "Fail" object that evaluates to False in filter() calls and contains the exception data (stack trace, etc.).
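
A rough sketch of what this could look like on plain iterables follows. The `Fail` class, its attribute names, and the free-standing `try_map` function are all hypothetical; nothing here is existing pyxtension API.

```python
import traceback
from typing import Any, Callable, Iterable, Iterator


class Fail:
    """Hypothetical marker for a failed map call; falsy so filter() drops it."""

    def __init__(self, value: Any, exception: BaseException) -> None:
        self.value = value          # the input element that caused the failure
        self.exception = exception  # the exception instance that was raised
        self.stack_trace = "".join(
            traceback.format_exception(type(exception), exception, exception.__traceback__)
        )

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "Fail({!r})".format(self.exception)


def try_map(f: Callable[[Any], Any], itr: Iterable[Any]) -> Iterator[Any]:
    """Lazily apply f to each element, yielding a Fail instead of raising."""
    for el in itr:
        try:
            yield f(el)
        except Exception as exc:
            yield Fail(el, exc)


results = list(try_map(int, ["1", "x", "3"]))
print(list(filter(None, results)))  # [1, 3]
```

One caveat with relying on truthiness: `filter(None, ...)` also drops legitimate falsy results such as 0 or an empty string, so an explicit `isinstance(r, Fail)` check is safer when such values are possible.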

Use pathos multiprocess for mpmap

The standard library's multiprocessing module cannot handle lambda functions because it relies on a limited serializer (pickle). The pathos multiprocess fork can serialize lambdas using the dill serializer.

The proposal is to make pyxtension use this improved multiprocessing library to handle cases such as lambda functions, along with the other advantages it brings.
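
The limitation is easy to reproduce with the standard library alone. The sketch below only demonstrates the pickle side; the helper `is_picklable` is a made-up name, and dill is deliberately left out so the snippet has no third-party dependency.

```python
import pickle
from typing import Any


def is_picklable(obj: Any) -> bool:
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A builtin is pickled by reference to its importable name, so this succeeds.
print(is_picklable(len))
# A lambda has no importable name for pickle to look up, so this fails.
print(is_picklable(lambda x: x))
```

Functions are pickled by name rather than by value, which is why anything a worker process cannot re-import by name is rejected before it ever reaches the pool.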

Installation with pip

Hello,

Are there any plans to standardize the installation process with pip?
P.S. Thank you for such a convenient library!

multiprocessing method mpmap() raises error even for the sample case provided

Hi Andrei,

For the sample given:

corpus = [
    "MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster.",
    "At Google, MapReduce was used to completely regenerate Google's index of the World Wide Web",
    "Conceptually similar approaches have been very well known since 1995 with the Message Passing Interface standard having reduce and scatter operations."]

def reduceMaps(m1, m2):
    for k, v in m2.iteritems():
        m1[k] = m1.get(k, 0) + v
    return m1

word_counts = stream(corpus). \
    mpmap(lambda line: stream(line.lower().split(' ')).countByValue()). \
    reduce(reduceMaps)

The processing results in the following error:

Traceback (most recent call last):
  File "C:\Users\ses\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 537, in _handle_tasks
    put(task)
  File "C:\Users\ses\AppData\Local\Programs\Python\Python39\lib\multiprocessing\connection.py", line 211, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "C:\Users\ses\AppData\Local\Programs\Python\Python39\lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x00000183A90173A0>: attribute lookup <lambda> on __main__ failed

A similar error occurs in other cases, such as:

result = stream([v for v in entry_list.values()]). \
    mpmap(lambda entry: TwigBasic.split_multiple_yang_structures_in_one_line(entry) if TwigBasic._check_for_twig(entry) else [entry]). \
    flatMap(). \
    enumerate(). \
    map(lambda item: (str(item[0]), item[1])). \
    toMap()

Could you please have a look at it?
Best regards,
Serge
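
The PicklingError above comes from the lambda, which standard pickle cannot send to worker processes. A possible workaround, sketched here with the standard library's `multiprocessing.Pool` rather than pyxtension's `mpmap`, is to move the mapped logic into a named module-level function. The names `count_words`, `merge_counts` and `parallel_word_count` are made up for this example, and whether the same substitution is enough for `mpmap` is an assumption that has not been verified.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool
from typing import Dict, Iterable


def count_words(line: str) -> Dict[str, int]:
    """Count words in one line; a module-level function is picklable by name."""
    return dict(Counter(line.lower().split(" ")))


def merge_counts(m1: Dict[str, int], m2: Dict[str, int]) -> Dict[str, int]:
    """Fold the counts from m2 into m1 and return m1."""
    for k, v in m2.items():
        m1[k] = m1.get(k, 0) + v
    return m1


def parallel_word_count(corpus: Iterable[str], processes: int = 2) -> Dict[str, int]:
    """Map count_words over the corpus in worker processes, then reduce."""
    with Pool(processes) as pool:
        partial_counts = pool.map(count_words, list(corpus))
    return reduce(merge_counts, partial_counts, {})


# On Windows, call parallel_word_count(corpus) only from under an
# `if __name__ == "__main__":` guard, because worker processes re-import
# the main module when they start.
```

The same reasoning applies to the second snippet: a lambda that wraps `TwigBasic` calls would need to become a named function before standard pickle can ship it to a worker.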
