
Comments (3)

brentp commented on July 28, 2024

some other suggestions after using this a bit more:

  1. random_op might also be better named as something like parallel_apply or whatever, since there's no actual randomness (this confused me when i first found this method)
  2. maybe you could allow specifying a reduce function so that it could take all the results and return a summary (like randomstats).
  3. is there a way to get around the wrapper functions in stats.py to allow methods more directly? e.g. if func is a method then automatically create a wrapper?
    def new_func(self, other, **kwargs):
        # look the method up by name on the instance, then call it
        return getattr(self, func.__name__)(other, **kwargs)

(or something like that; completely untested)
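To make suggestions 2 and 3 concrete, here is a rough sketch of one way they could fit together. Everything in it is an assumption for illustration: `apply_method`, `map_func` and `reduce_func` are hypothetical names, not pybedtools API, and plain Python sets stand in for BedTool objects. It uses `operator.methodcaller` so that only the method name and arguments have to cross the process boundary, not a bound method.

```python
from multiprocessing import Pool
from operator import methodcaller


def apply_method(objs, method_name, *args, map_func=map, reduce_func=None, **kwargs):
    """Call obj.method_name(*args, **kwargs) on every obj, then optionally reduce.

    Hypothetical helper, not part of pybedtools. Pass pool.map as map_func
    to spread the calls over worker processes; the default is a serial map.
    """
    # methodcaller looks the method up by name on each object, so only the
    # name and the arguments are pickled, never a bound method.
    caller = methodcaller(method_name, *args, **kwargs)
    results = list(map_func(caller, objs))
    # The reduce step runs in the calling process on the collected results.
    return reduce_func(results) if reduce_func is not None else results


if __name__ == "__main__":
    # Plain sets stand in for BedTool objects; set.intersection stands in
    # for a BedTool method.
    shuffled = [{1, 2, 3}, {2, 3, 4}, {5, 6}]
    reference = {2, 3, 5}
    with Pool(2) as pool:
        overlaps = apply_method(shuffled, "intersection", reference, map_func=pool.map)
        total = apply_method(
            shuffled, "intersection", reference,
            map_func=pool.map,
            reduce_func=lambda sets: sum(len(s) for s in sets),
        )
    print(overlaps, total)
```

Whether this carries over to real BedTool instances depends on whether they can be pickled and on how their temp files are tracked, which this sketch does not address.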

from pybedtools.

daler commented on July 28, 2024

  1. good point
  2. also good point
  3. i think this will be complicated, see below.

Class methods are not well supported (maybe not supported at all?) across process boundaries because they can't be pickled (as pool.apply() complains). I don't know the details, but basically I found that you can only pass functions. I haven't tested wrapping a method in a function as you suggest, though.

Also, class variables do not share state across processes. Importantly, this includes BedTool._TEMPFILES. I was getting all sorts of strange behavior using multiprocessing and pybedtools' existing auto-handling of temp files, and files were not being completely cleaned up.

So stats.py was my attempt to address both of these issues by having functions work on instances passed to the process (via the func args) and by being careful about cleaning up tempfiles.
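As an illustration of the temp-file problem, here is a minimal sketch of one possible pattern, not necessarily what stats.py does: since a class-level registry filled in by a worker is not visible to the parent, each worker returns the path it created, and the parent deletes the files itself. The names `work` and `cleanup` are hypothetical.

```python
import os
import tempfile
from multiprocessing import Pool


def work(n):
    """Write n lines to a new temp file and return (path, n)."""
    fd, path = tempfile.mkstemp(suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write("line\n" * n)
    # Return the path explicitly: a registry updated inside a worker
    # process would not be seen by the parent.
    return path, n


def cleanup(paths):
    """Delete the given files in the calling process; return how many were removed."""
    removed = 0
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
            removed += 1
    return removed


if __name__ == "__main__":
    with Pool(2) as pool:
        outputs = pool.map(work, [1, 2, 3])
    paths = [path for path, _ in outputs]
    print(cleanup(paths))
```

The trade-off is that every worker function has to report its files, so nothing is cleaned up automatically if a worker forgets to.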

Anyway, I agree that something like this would be useful but I think it will take some playing around with.


daler commented on July 28, 2024

Thanks to your suggestions, I made a new, general way of applying any arbitrary BedTool method many times in parallel -- see 3f3673c. Eventually, I'd like to deprecate randomstats and the stuff in stats.py in favor of this since it's 1) a lot cleaner and 2) a lot more general.

