Comments (22)
Hi folks,
Thanks for pushing boundaries. Broadway has the potential to make our lives much simpler, and as I currently understand it, support for multiple producers is key here.
In our case latency and correctness are the two top priorities. Say we need to consume 1k messages at the same time (1k producers?) and be sure that each of them ends up in the same processing unit (partitioned demand dispatcher?).
Our events have a lifecycle, and to avoid queuing we are OK with spawning a process per event and shutting it down at the end of the event lifecycle or after a timeout measured in days.
from broadway.
Alright, it is more about ordering than locality. Thank you!
from broadway.
Our events have a lifecycle, and to avoid queuing we are OK with spawning a process per event and shutting it down at the end of the event lifecycle or after a timeout measured in days.
Follow up question: why is it important to go to the same unit? Do you keep intermediate results in memory?
from broadway.
- Each message has a unique event_id field.
- Nope, it's about sequentiality. Some of the messages processed in parallel may update the exact same entity, and in our system all updates must happen in exactly the same order as they were in the queue.
Similar to the problem with the word "are" in the word-counting example ;)
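To make the ordering requirement concrete, here is a minimal sketch of key-based partitioning, written in Python purely for illustration (it is not Broadway code). It assumes each message carries the event_id field mentioned above; the function names, the seq field, and the partition count of 4 are hypothetical. Messages with the same key always map to the same partition, so a single worker per partition sees them in arrival order.

```python
import zlib
from collections import defaultdict


def partition_for(event_id: str, num_partitions: int) -> int:
    """Map a key to a stable partition index (same key -> same partition)."""
    # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(event_id.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Group messages by partition, preserving arrival order within each one."""
    partitions = defaultdict(list)
    for msg in messages:
        index = partition_for(msg["event_id"], num_partitions)
        partitions[index].append(msg)
    return partitions


# Hypothetical messages: three updates to entity "a" interleaved with one for "b".
messages = [
    {"event_id": "a", "seq": 1},
    {"event_id": "b", "seq": 1},
    {"event_id": "a", "seq": 2},
    {"event_id": "a", "seq": 3},
]
routed = route(messages, num_partitions=4)
```

Different keys can still be processed in parallel on different partitions; only updates that share a key are serialised.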
from broadway.
The broadway sqs producer is very difficult to tune without this feature. If I cannot separate download concurrency from processing concurrency (by making a separate stage for downloaders and a separate stage for processors) then I cannot scale them independently.
The best workaround currently is to have the processor spawn a few tasks to do the download but that leaves the processor waiting instead of processing an already ready message (say from another downloader).
from broadway.
Hi @mgwidmann!
If I cannot separate download concurrency from processing concurrency (by making a separate stage for downloaders and a separate stage for processors) then I cannot scale them independently.
They are already independent. You can define the concurrency level of the producer or processor by setting the :stages option individually. This issue is about supporting multiple different kinds of producers/sources simultaneously, like consuming data from different queues or even from SQS and RabbitMQ.
from broadway.
Sorry I misread the title, thought this was about multiple processors! Is there a separate issue for that?
from broadway.
@mgwidmann the issue right above this one. :D #39 I have some comments on this, so please copy and paste your original comment there and we can discuss solutions.
from broadway.
We have a use case for multiple producers where we have different (RabbitMQ) producers producing from different RabbitMQ connections but producing the same kind of messages, that we want to process in the same way. I think it might not be such a unique use case, so it might be worth adding this :) As always, I volunteer to help if we want to go through with this at some point.
from broadway.
In this case you can share the code using modules. I think we won't add this feature because we are adding the ability for a producer to change the topology, so multiple producers could change the topology in conflicting ways.
from broadway.
@josevalim I can share the code, yep, but I need to start two different Broadway pipelines with basically the same set of options except for the producer. It's fine, it's what we do now, but since I saw the open issue I thought I would discuss. I'm a bit concerned about the current Broadway API which suggests that multiple producers/processors should be supported (for example, passing a list of producers/processors, passing the producer name in handle_message, and so on). So maybe it might be worth deprecating the current API at some point.
from broadway.
@josevalim btw, can you expand on the feature of a producer changing the topology?
from broadway.
It is issue #100.
from broadway.
I have a situation where we have ~40 SQS queues that need to be consumed from.
I have concerns about setting each SQS queue up in an isolated BroadwaySQS pipeline, with the major concern being that tuning the global concurrency of message handling isn't possible.
In an ideal scenario I would be able to merge the messages from all queues into a singular pipeline, of which a limited number of processors would handle messages across all 40 queues (perhaps with custom priority logic).
With each BroadwaySQS pipeline in isolation, each of the 40 isolated pipelines would have a fixed number of processors, and under heavy load could overwhelm the system.
from broadway.
I have two suggestions:
- Change BroadwaySQS to allow multiple queues
- Allow the queue/queues themselves to be customised per producer index (BroadwayRabbitMQ already has this feature)
Then the idea is that you start X producers with Y queues. This is better than 40 producers because demand is always individual between producer/processor, and not shared.
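As a rough illustration of "X producers with Y queues", the sketch below spreads a list of queues across producers by producer index. It is written in Python only to show the idea and is not the BroadwaySQS or BroadwayRabbitMQ API; the function name, the queue names, and the choice of 4 producers are assumptions, while the figure of 40 queues comes from the use case described earlier in the thread.

```python
def queues_for_producer(queues, producer_index, num_producers):
    """Return the queues owned by one producer, assigned round-robin by index."""
    return queues[producer_index::num_producers]


# Hypothetical queue names; 40 matches the use case above, 4 producers is arbitrary.
queues = [f"queue-{i}" for i in range(40)]
num_producers = 4

assignment = {
    index: queues_for_producer(queues, index, num_producers)
    for index in range(num_producers)
}
```

Each queue ends up with exactly one producer, so the number of producers (and the processors behind them) can be tuned without changing the number of queues.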
from broadway.
Pull requests are welcome! :)
from broadway.
Is someone working on this?
My use case would be to handle multiple sources of information in one pipeline, which requires multiple types of producers in that pipeline.
from broadway.
@xandervr do you mean the SQS case that José mentioned above?
from broadway.
Not in particular, in general I just want Broadway to be able to have multiple types of producers in 1 pipeline. SQS, RabbitMQ... does not really matter, just every type of GenStage should be supported.
Let me know if I misunderstood your question.
from broadway.
I cleaned up the thread a bit and reopened it.
from broadway.
My use case is I have a bunch of GCP deadletter topics that I'd like to consume from and process all the exact same way: store them in a table for review and possible retry. Would be nice not to set up a pipeline for each one.
from broadway.