Comments (3)
That's a great question!
25 is somewhat magic. It was the worker count that proved most stable on my test setup early on, when there were still some stability issues, and it has kind of stuck around since then. So right now it's somewhat arbitrary.
That said, running more phantom workers than the number of threads the CPU supports did yield a slight performance gain in my benchmarks, in the sense that the server could handle more requests, at the cost of a small additional delay on all requests. That trade-off makes sense for some use cases. But it also means the server becomes IO/memory bound, which can be problematic when running it on virtual servers.
The "right" number of workers for keeping the CPU at an ideal load generally depends on the CPU: both the number of cores and, even more importantly, whether the CPU supports hyperthreading.
There's another side to it as well, which you also mentioned: workers have to be restarted every once in a while, as performance starts degrading after they've been running for some time. The way the pooling works is that each worker has a maximum number of pieces of work it can perform before it's killed and replaced.
Having a larger pool means it takes longer for each individual worker to reach its maximum work count, which improves performance quite a bit: there's overhead involved in starting up a new worker, and under heavy traffic that overhead quickly adds up to severe delays. Each worker is also seeded with a random initial work count to avoid situations where all the workers restart at the same time, and the scheduler tries to distribute traffic uniformly across the entire pool when a new piece of work is posted. The result is somewhat queue-esque, I suppose, from a lower-level perspective, but from node's perspective tasks are performed continuously (though at the mercy of the OS scheduler).

There's also a small request queue in which work ends up if all workers are busy. This queue is limited to 5 pieces of work, which is also a magic number, but this one was reached based on real-life data from heavy traffic surges in our production system: a larger queue would grow out of control if several workers were hogged for longer periods. Connections are dropped if the pool is saturated and the queue is full, which almost never happens in our production system.
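To make the mechanics above concrete, here's a minimal sketch of that pooling scheme: per-worker work counters with random initial seeding, recycling at a maximum count, and a bounded overflow queue. All names and the specific constants are illustrative assumptions, not the actual node-export-server implementation.

```javascript
// Hypothetical sketch of the pooling scheme described above. Each worker
// tracks how many pieces of work it has performed and is replaced once it
// hits its maximum; the counter is randomly seeded so the whole pool
// doesn't restart at the same time.
const MAX_WORK_COUNT = 60; // illustrative: pieces of work before recycling
const QUEUE_LIMIT = 5;     // pending work allowed when all workers are busy

function createWorker(id) {
  return {
    id,
    busy: false,
    // Random seed spreads restarts out over time
    workCount: Math.floor(Math.random() * MAX_WORK_COUNT),
  };
}

function createPool(size) {
  return {
    workers: Array.from({ length: size }, (_, i) => createWorker(i)),
    queue: [],
  };
}

function postWork(pool, task) {
  const worker = pool.workers.find((w) => !w.busy);
  if (worker) {
    worker.busy = true;
    worker.workCount += 1;
    // ...perform the task, then mark the worker free again...
    if (worker.workCount >= MAX_WORK_COUNT) {
      // Kill and replace the worker once it reaches its maximum
      pool.workers[worker.id] = createWorker(worker.id);
    }
    return 'assigned';
  }
  if (pool.queue.length < QUEUE_LIMIT) {
    pool.queue.push(task); // small overflow queue
    return 'queued';
  }
  return 'dropped'; // pool saturated and queue full: connection is dropped
}
```

A larger pool amortizes the worker startup overhead because each worker reaches `MAX_WORK_COUNT` (and thus gets recycled) less often.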
In production (for export.highcharts.com), we use a worker count that ensures there are rarely more active workers than the number of supported HW threads. Our traffic patterns are very predictable, so we chose hardware that can sustain the average request count without saturating the pool. We have a headroom of around 1.7x on top of the hardware-supported thread count (counting cores + hyperthreading). If the pool saturates during traffic spikes, the overall delay is fairly negligible. It took some iterations before we hit the sweet spot, but the service has now been running in production for about three months without any issues or intervention.
As for the timeouts, they're there to stop a bad chart from hogging a worker indefinitely, for instance if misbehaving JavaScript is injected (e.g. infinite loops). Without them, the pool would be very susceptible to DoS.
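A minimal sketch of such a per-task timeout, assuming a promise-based render step: race the work against a timer so a task that never finishes is rejected and the pool can recycle the hogged worker. This is illustrative only, not the actual node-export-server code.

```javascript
// Sketch: bound how long a single piece of work may occupy a worker.
// If the work doesn't settle within `ms`, reject so the caller can
// kill and replace the worker instead of letting it hang forever.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('worker timed out')), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A render stuck in an infinite loop simply never resolves its promise, so the race rejects after `ms` and the request fails fast instead of blocking the pool.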
Anyway, the ideal worker count varies with the hardware it runs on as well as with traffic patterns, so it will differ from use case to use case. The best approach is to benchmark different settings against your specific workload to find a number that makes sense for you.
I'm going to adjust the defaults though, as 25 is quite high for most use cases. :)
from node-export-server.
Thank you for this very detailed explanation and reasoning; I'm grateful you took the time to elaborate on real production experiences. I will do more performance testing and try to find the ideal settings for us. We're not so afraid of DoS, as we really want to process all requests if possible, and we're only testing the service internally.
So far, running load tests under Docker has mostly taken the whole Docker virtual host down, something that doesn't happen with a PhantomJS + nginx load balancer setup. Something I need to investigate more.
No problem at all!
I haven't tried running it in Docker, so I'm afraid I can't really offer any advice there (we run the production service on Elastic Beanstalk), but do let me know if there's anything we can do to help.