
Comments (8)

kendonB commented on June 18, 2024

The docs for future.scheduling say: Average number of futures ("chunks") per worker. If ‘0.0’, then a single future is used to process all elements of ‘x’. If ‘1.0’ or ‘TRUE’, then one future per worker is used. If ‘2.0’, then each worker will process two futures (if there are enough elements in ‘x’). If ‘Inf’ or ‘FALSE’, then one future per element of ‘x’ is used.
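
For concreteness, a minimal sketch of what these values mean in practice (assuming the future.apply package, which now provides future_lapply(), and a local multisession backend; neither is specified in this thread):

    library(future.apply)              # future_lapply(); in older releases it lived in the 'future' package
    plan(multisession, workers = 4)    # four local background R workers

    x <- 1:100

    # future.scheduling = 1: one future per worker, i.e. x is split into 4 chunks of 25.
    y1 <- future_lapply(x, sqrt, future.scheduling = 1)

    # future.scheduling = 2: two futures per worker, i.e. 8 chunks of 12-13 elements.
    y2 <- future_lapply(x, sqrt, future.scheduling = 2)

    # future.scheduling = FALSE (or Inf): one future per element of x, i.e. 100 futures.
    y3 <- future_lapply(x, sqrt, future.scheduling = FALSE)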

For the future package, is it appropriate to refer to each element of x as a job, and each cluster process (when using distributed) as a worker?

Then, as I understand it, a future is some chunk of x (or number of jobs) that can resolve at some future point in time from the point of view of the host process. If a worker is processing more than one future, do the first ones resolve before the worker is done with all of its work?

If I'm getting this at all right, future.scheduling controls when the results of the jobs resolve from the perspective of the host process?

With this issue, I'm really looking for a way for new workers to spring up when old ones finish their work, while limiting the total number that exist at one time. This is kind of like the behavior of parLapplyLB. With length(x) == 5000, I would want future_lapply to ask batchtools to first create 100 workers for x[1:100], then once the first finishes it would create the 101st worker, and so on.

Is this what future.scheduling = FALSE does when using batchtools?
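
Expressed in code, the requested behaviour (and the question above) would look roughly like the sketch below; batchtools_slurm stands in for whatever HPC backend is in use, and process_one() is a placeholder for the real per-element work:

    library(future.apply)
    library(future.batchtools)

    plan(batchtools_slurm, workers = 100)    # never more than 100 jobs at a time (the request)

    x <- seq_len(5000)
    process_one <- function(xi) xi^2         # placeholder for the real work

    # One batchtools job per element, with new jobs submitted as slots free up --
    # rather than pre-splitting the 5000 elements into 100 equally sized jobs.
    res <- future_lapply(x, process_one, future.scheduling = FALSE)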


HenrikBengtsson commented on June 18, 2024

> The workers argument to plan doesn't work as I'd like to keep the jobs small so I can get load balancing working.

Thanks for feedback, though this is a bit too sparse to fully understand what you've observed/concluded. Can you please clarify/expand on this? (*)

(*) It could be related to what I've recently realized myself (while implementing future.callr; HenrikBengtsson/future.callr@32a8e3c) - argument 'workers' is only respected for future_lapply() calls but ignored when using future() directly. It's something I've overlooked here and that part is a bug.
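
Roughly, the two usage patterns being contrasted (with batchtools_local as an arbitrary example backend):

    library(future)
    library(future.apply)
    library(future.batchtools)

    plan(batchtools_local, workers = 2)

    x <- 1:10

    # Pattern 1: future_lapply() -- here the 'workers' setting was respected.
    y1 <- future_lapply(x, sqrt)

    # Pattern 2: creating futures directly -- at the time of this comment,
    # 'workers' was not respected for this pattern (the bug described above).
    fs <- lapply(x, function(xi) future(sqrt(xi)))
    y2 <- lapply(fs, value)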


kendonB commented on June 18, 2024

Apologies for the lack of clarity! I am using future_lapply at the moment. With workers = 100, the behavior I currently see is that the N >> 100 jobs are distributed amongst 100 workers, so the time spent is determined by the slowest set of jobs sent to one of the 100 workers.

The feature requested is an argument that would limit the total number of workers at any time to 100, but would create new workers once any of the first 100 completes.
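
For comparison, this is the kind of dynamic hand-out that parLapplyLB() from base R's parallel package does within a fixed pool of workers (a purely local example, not a batchtools one):

    library(parallel)

    cl <- makeCluster(4)    # fixed pool of 4 workers

    # parLapplyLB() is the load-balancing variant: whenever a worker finishes
    # its current piece of work, it is handed the next unprocessed piece.
    res <- parLapplyLB(cl, 1:20, function(i) { Sys.sleep(runif(1)); i^2 })

    stopCluster(cl)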


kendonB commented on June 18, 2024

Of course, the workers argument is also very useful for preventing the overhead of creating workers from dominating the runtime.
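
Per the docs quoted at the top, values of future.scheduling between 1 and Inf offer a middle ground between submission overhead and load balancing, e.g. (same hypothetical backend and sizes as above):

    library(future.apply)
    library(future.batchtools)

    plan(batchtools_slurm, workers = 100)    # hypothetical backend
    x <- seq_len(5000)

    # future.scheduling = 1  -> 100 futures of 50 elements each (least submission overhead)
    # future.scheduling = 10 -> 1000 futures of 5 elements each (more jobs, better balancing)
    res <- future_lapply(x, sqrt, future.scheduling = 10)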


HenrikBengtsson commented on June 18, 2024

Still not 100% sure (because of clashing naming conventions for what is meant by a "job", a "future", and a "worker"), but does the argument future.scheduling (especially if set to FALSE) for future_lapply() provide what you need?


HenrikBengtsson commented on June 18, 2024

I've had a chance to look a bit more into this. Turns out that due to the bug I mention in #18 (comment), future_lapply(x, FUN, future.scheduling = FALSE) will not acknowledge n in plan(batchtools_nnn, workers = n). I'll fix that.


HenrikBengtsson commented on June 18, 2024

I haven't forgotten about this one; this is still a bug in future.batchtools, cf. Issue #19.


HenrikBengtsson commented on June 18, 2024

This was actually resolved when issue #19 was fixed in the v0.11.0 [2022-12-13] release.

