Comments (8)
The docs for `future.scheduling` say:

> Average number of futures ("chunks") per worker. If `0.0`, then a single future is used to process all elements of `x`. If `1.0` or `TRUE`, then one future per worker is used. If `2.0`, then each worker will process two futures (if there are enough elements in `x`). If `Inf` or `FALSE`, then one future per element of `x` is used.
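To make that mapping concrete, here is a minimal sketch of how I read the docs above (my own interpretation, not the package's internal code):

```r
# Approximate number of futures ("chunks") created for n_elements of x,
# given n_workers and a future.scheduling value, per the documented rules.
n_futures <- function(n_elements, n_workers, scheduling) {
  # TRUE/FALSE are shorthands, per the docs above
  if (isTRUE(scheduling))  scheduling <- 1.0
  if (isFALSE(scheduling)) scheduling <- Inf
  if (scheduling == 0)         return(1L)          # one future for all of x
  if (is.infinite(scheduling)) return(n_elements)  # one future per element
  # otherwise: about 'scheduling' futures per worker
  min(n_elements, ceiling(n_workers * scheduling))
}

n_futures(5000, 100, 1.0)  # 100: one future per worker
n_futures(5000, 100, 2.0)  # 200: two futures per worker
n_futures(5000, 100, Inf)  # 5000: one future per element
```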
For the future package, is it appropriate to refer to each element of `x` as a job, and each cluster process (when using distributed processing) as a worker?
Then, as I understand it, a future is some chunk of `x` (i.e., some number of jobs) that can resolve at some future point in time from the point of view of the host process. If a worker is processing more than one future, do the first ones resolve before the worker is done with all of its work?
If I'm getting this at all right, `future.scheduling` controls when the results of the jobs resolve from the perspective of the host process?
With this issue, I'm really looking for a way for new workers to spring up when old ones finish their work, while limiting the total number that exist at any one time. This is similar to the behavior of `parLapplyLB()`. With `length(x) == 5000`, I would want `future_lapply()` to ask batchtools to first create 100 workers for `x[1:100]`, then, once the first finishes, to create the 101st worker, and so on.

Is this what `future.scheduling = FALSE` does when using batchtools?
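For concreteness, the setup I have in mind would look something like this (`batchtools_slurm` and `slow_fun` are just placeholders for whatever backend and function are actually in use):

```r
library(future.batchtools)

# Desired: at most 100 jobs on the scheduler at any one time,
# but new jobs submitted as earlier ones finish.
plan(batchtools_slurm, workers = 100)

x <- seq_len(5000)
# One future per element; does 'workers = 100' then cap concurrency?
y <- future_lapply(x, slow_fun, future.scheduling = FALSE)
```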
from future.batchtools.
The `workers` argument to `plan()` doesn't work for me, because I'd like to keep the jobs small so I can get load balancing working.
Thanks for the feedback, though this is a bit too sparse to fully understand what you've observed/concluded. Can you please clarify/expand on this? (*)
(*) It could be related to what I've recently realized myself (while implementing future.callr; HenrikBengtsson/future.callr@32a8e3c): the `workers` argument is only respected for `future_lapply()` calls but ignored when using `future()` directly. It's something I've overlooked here, and that part is a bug.
Apologies for the lack of clarity! I am using `future_lapply()` at the moment. With `workers = 100`, the behavior I currently see is that the `N >> 100` jobs are distributed amongst 100 workers, so the time spent is determined by the slowest set of jobs sent to one of the 100 workers.
The feature requested is an argument that would limit the total number of workers at any time to 100, but would create new workers once any of the first 100 completes.
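To illustrate why this matters, here is a small self-contained simulation (hypothetical exponential runtimes, not measurements from my actual workload) comparing static chunking against greedy load balancing:

```r
set.seed(42)
t_job   <- rexp(5000)   # hypothetical per-job runtimes
workers <- 100

# Static chunking (one chunk per worker, assigned up front):
# total wall time is the slowest worker's chunk.
chunk_id    <- rep(seq_len(workers), each = length(t_job) / workers)
static_time <- max(tapply(t_job, chunk_id, sum))

# Greedy load balancing: each job goes to the currently least-loaded worker.
load <- numeric(workers)
for (t in t_job) {
  i <- which.min(load)
  load[i] <- load[i] + t
}
dynamic_time <- max(load)

static_time > dynamic_time  # typically TRUE: balancing finishes sooner
```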
Of course, the `workers` argument is also very useful for preventing the overhead of creating workers from dominating the runtime.
Still not 100% sure (because of naming-convention clashes over what is meant by a "job", a "future", and a "worker"), but does the argument `future.scheduling` (especially if set to `FALSE`) for `future_lapply()` provide what you need?
I've had a chance to look a bit more into this. It turns out that, due to the bug I mention in #18 (comment), `future_lapply(x, FUN, future.scheduling = FALSE)` will not acknowledge `n` in `plan(batchtools_nnn, workers = n)`. I'll fix that.
I haven't forgotten about this one; it's still a bug in future.batchtools, cf. issue #19.
This was actually fixed when issue #19 was resolved, in the v0.11.0 release [2022-12-13].