Comments (12)
going to have a go at this
from verk.
That sounds great! I've opened #67 for this issue.
I'd be glad to take a look or help with the worker manager/worker changes whenever :)
WorkersManager watches the workers and keeps track of the free ones. We do this because `:poolboy.status` is a `GenServer.call` and we already have the worker information available. We could simply start using `status` to know if we have available workers.

I would try this approach, changing this private function: https://github.com/edgurgel/verk/blob/v0.9.13/lib/verk/workers_manager.ex#L210-L212

Then do some quick benchmarks to see if it affects performance too much. If it degrades performance, we could potentially call `status` only when we think we have free workers, basically using both the WorkersManager ETS table AND `:poolboy.status`.

Does this make any sense? :P
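For concreteness, the proposed change might look roughly like this (a sketch only: the `free_workers/1` helper name and its return shape are assumptions based on the linked lines; poolboy's `:poolboy.status/1` does return a `{state, available, overflow, checked_out}` tuple):

```elixir
# Sketch of the proposed change (assumed helper name; the actual private
# function lives at the linked lines in workers_manager.ex).
defp free_workers(pool_name) do
  # :poolboy.status/1 is a GenServer.call that returns
  # {state, available_workers, overflow, checked_out_workers}.
  {_state, available, _overflow, _checked_out} = :poolboy.status(pool_name)
  available
end
```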
Makes sense to me.
Awesome @mitchellhenke. I was thinking of a simple way to run benchmarks for workers:

- Enqueue a "StartBenchWorker" that prints the time
- Enqueue 10k/100k NoOpWorkers
- Enqueue a "FinishBenchWorker" that prints the time

We would get a pipeline of running workers for free. Of course the FinishBenchWorker wouldn't strictly be the last worker to run, as it will run concurrently with the NoOpWorkers at the end, but it would still be a good estimate.

This may even become part of our test suite somehow.

Thoughts?
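The three bench workers above could be sketched like this (the module names, the zero-argument `perform`, and the `:bench_queue` name are illustrative; enqueuing via `Verk.enqueue/1` with a `%Verk.Job{}` is Verk's public API):

```elixir
defmodule StartBenchWorker do
  # Marks the start of the benchmark window.
  def perform, do: IO.puts("start: #{System.monotonic_time(:millisecond)}")
end

defmodule NoOpWorker do
  # Does nothing; we only measure dispatch/ack overhead.
  def perform, do: :ok
end

defmodule FinishBenchWorker do
  # Marks the (approximate) end of the benchmark window.
  def perform, do: IO.puts("finish: #{System.monotonic_time(:millisecond)}")
end

# Enqueue the pipeline: start marker, 10k no-ops, finish marker.
Verk.enqueue(%Verk.Job{queue: :bench_queue, class: "StartBenchWorker", args: []})

for _ <- 1..10_000 do
  Verk.enqueue(%Verk.Job{queue: :bench_queue, class: "NoOpWorker", args: []})
end

Verk.enqueue(%Verk.Job{queue: :bench_queue, class: "FinishBenchWorker", args: []})
```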
So I started investigating this on my machine (2.7 GHz Intel Core i5 / 8 GB 1867 MHz DDR3, for what it's worth).

From this limited benchmark, it doesn't look like `:poolboy.status` makes a significant difference in throughput for 10k or 100k jobs:
| 10k Jobs | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Average | Std Dev |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Current (ms) | 8250.157 | 6568.419 | 6970.148 | 7006.6 | 6191.896 | 6966.409 | 6132.113 | 6773.351 | 6320.074 | 7959.506 | 6913.8673 | 708.4890134 |
| Poolboy Status (ms) | 5777.431 | 7104.012 | 6427.04 | 6457.407 | 8404.173 | 6383.751 | 6420.81 | 8187.788 | 7864.628 | 6475.438 | 6950.2478 | 895.9131822 |
| 100k Jobs | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Average | Std Dev |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Current (ms) | 71920.293 | 73990.513 | 68666.102 | 65416.878 | 68894.512 | 71506.667 | 63874.567 | 62768.951 | 64139.824 | 68178.992 | 67935.7299 | 3811.426986 |
| Poolboy Status (ms) | 77480.684 | 75714.092 | 71881.198 | 68302.955 | 68916.468 | 62165.245 | 61049.917 | 62572.434 | 63234.677 | 61973.636 | 67329.1306 | 6071.034864 |
The code for the benchmark is here: https://github.com/mitchellhenke/verk/compare/master...benching?expand=1

The code for the `:poolboy.status` change is here: https://github.com/mitchellhenke/verk/compare/benching...poolboy_status?expand=1

What I did notice is that the current bottleneck is likely the `WorkersManager` handling the large number of `DOWN` and `:done` messages, which would get in the way of sending jobs to the `Worker`s.
Would it make sense to have the workers handle the dequeuing of jobs? Sidekiq switched from something resembling verk's current manager-style dequeuing to having the workers themselves dequeue jobs. It would be a heavier undertaking, but would solve the issue at hand and potentially increase throughput. I will probably play around with it and see what happens.
Outside of that, should I PR the `:poolboy.status` change for now? Happy to explore other options to solve this issue as well if you have other ideas.
Great findings. I already have a local branch with some work to investigate extracting part of the workers manager processing to the workers. I will push this soon so we can start discussing. Is this fair?
I would still push forward the :poolboy.status change so we fix the current issue. What do you think?
I should also point out that it's important to tweak the benchmark to check workers that spend more time doing something. I think it's safe to say you wouldn't run something as a job if you could do it instantly. I'm keen to try some random sleep (say 0 to 60 seconds), so that we have a less artificial benchmark. I will play with the benchmark today :)
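A bench worker with a random sleep could be as simple as this (module name is illustrative; `:rand.uniform(60_000)` yields 1ms–60s, which approximates the 0–60 second range):

```elixir
defmodule SleepWorker do
  # Simulate "real" work: block for a random duration of up to 60 seconds.
  def perform do
    :timer.sleep(:rand.uniform(60_000))
  end
end
```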
I forgot to add a comment on this:

> Would it make sense to have the workers handle the dequeuing of jobs? Sidekiq switched from something resembling verk's current manager-style dequeuing to having the workers themselves dequeue jobs. It would be a heavier undertaking, but would solve the issue at hand and potentially increase throughput. I will probably play around with it and see what happens.
Sidekiq can't be a real comparison because of the concurrency the Erlang VM can handle. The number of workers and queues that Verk can handle is much greater. We have 400+ queues, each running more than 10 workers, per machine. If all of them start hitting Redis to dequeue jobs, we may overload the Redis connections with unnecessary requests. The monitoring also can't be avoided completely, as a Worker can simply die and we need to keep track of such failures.
(I thought that it would be a good addition to this discussion.)
I'm currently thinking of different approaches to speed up the "happy path".
That makes sense, yeah. Appreciate the explanation.
Is splitting up the things the workers manager has to do (queueing, handling monitoring, etc.) into separate processes worth looking into?
@mitchellhenke yeah, I just created this issue so we can discuss more! I'm keen to improve the `WorkersManager` :)