Comments (5)
Hello,
this Pool implementation was added because there was none which would support disaster recovery and hanging workers while still offering a clean interface.
Billiard was the closest one but its undocumented and confusing API was causing issues in our systems. Moreover, it was not properly handling timeouts: celery/billiard#104. It might have improved by now.
The Pebble implementation is quite stable and we use it in some high load systems.
Among the issues/features it handles/offers:
- timing out operations
- task cancellation (with worker termination)
- crashing workers (python interpreter disasters or C libraries segfaults)
- transferring large data between server and workers
- iteration (map) over faulty results
The only known issue I'm investigating (apart from issue #10) is the pool hanging on rare occasions when very large data is transferred from the workers back to the server at once (in a single result). I'm not 100% sure it is due to Pebble though.
The lines you linked deal with hard-killing an unresponsive worker (hanging in a C loop for example) without corrupting the Queue. But there are plenty of other small corner cases to deal with.
from pebble.
Matteo, thanks for the detailed and quick response!
It seems that the code is pretty stable. When you say "very large data is being transferred from the workers back to the server", does that happen when the workers are remote (over the network)? How large is "very large"?
I'll start writing some unit tests for my current implementation and then migrate to Pebble. Hopefully it will all work well 👍
By server I mean the Pool process. Pebble cannot deal with remote processes. For that, I would recommend taking a look at Celery or Luigi.
As I said, I'm not sure about that issue, as I got pulled onto other things a while ago. I'll try to resume the investigation and, if there seems to be a real issue, I'll open a report myself.
Let me know if you encounter any problem integrating Pebble.
I will close this issue for now.
Integration with my code went really well: andresriancho/w3af@27c6e25
I now have less code to maintain, and the whole thing seems to be working as expected.
Thanks for making pebble open source!
Np, glad it helps somebody.
I took a look at my notes regarding the "large data issue". It was a test which was hanging due to a mistake of mine in the test itself. I will fix the test in the coming days (not really urgent).
On Windows, transferring large amounts of data through the Pool might be problematic. I need to research a bit on how to improve that.
Nevertheless it's not a good idea to transfer large chunks of data via IPC. Better to rely on the filesystem for such use cases.
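One way to follow that advice is to have the worker write its result to a temporary file and send back only the path. The sketch below uses only the standard library. `produce_large_result` and `load_result` are hypothetical names, and pickle is just one possible serialization choice.

```python
import os
import pickle
import tempfile


def produce_large_result(size):
    """Hypothetical worker task: write a large payload to disk and return its path."""
    payload = bytes(size)  # stand-in for real data
    fd, path = tempfile.mkstemp(suffix=".pickle")
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(payload, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_result(path):
    """Runs in the parent process: read the payload back, then remove the temp file."""
    try:
        with open(path, "rb") as handle:
            return pickle.load(handle)
    finally:
        os.remove(path)
```

With a pool, the worker would run `produce_large_result` and the parent would call `load_result(future.result())`, so only a short path string crosses the IPC channel.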