Comments (5)
Hi @carlthuringer! That is indeed an extreme case. Do you have 4k unique job classes or are you generating those dynamically? This sounds more like you need simpler batching and/or splitting into smaller workflows.
The workflow I'm attempting to orchestrate is a "Seeding" workflow. Just imagine a typical Rails monolith riddled with vast and complex belongs_to and has_many relationships, and you'll have some idea of what I'm trying to produce.
Generating all the data in one shot is unreliable and time-consuming. There's no way I can do it in a transaction, so logically the next step is to break it down into idempotent stages. Gush seemed like a great solution to elegantly declare what data needed to be created, and which data depended upon which, all without needing to juggle specific references or use confusing flow control.
I've got about 18 job classes and I'm using a lot of looping to run MyJobs with various after: [batch, of, jobs] type configuration. In the leaves I end up with M * N * O combinations that I want to generate data for and orchestrate with Gush.
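To make the fan-out concrete, here is a rough sketch of how nested loops produce M * N * O leaf jobs that then converge on one final job. It does not use Gush itself: SketchWorkflow is a hypothetical stand-in that only records job ids and dependencies, its run method merely imitates the shape of run MyJobs, after: [...], and the job names (SeedAccount, SeedProject, SeedTask, VerifySeed) are invented for illustration.

```ruby
# Hypothetical stand-in for a workflow DSL: records jobs and their
# dependencies as plain data, so the fan-out can be counted without
# Redis or Gush installed.
class SketchWorkflow
  attr_reader :jobs

  def initialize
    @jobs = {} # job id => array of job ids it waits on
  end

  # Imitates the shape of `run SomeJob, after: [...]` and returns an id
  # that later jobs can list as a dependency.
  def run(name, after: [])
    id = "#{name}-#{@jobs.size}"
    @jobs[id] = Array(after)
    id
  end
end

# Builds m top-level jobs, n children per top-level job, and o leaves
# per child, then one final job that waits on every leaf.
def build_seed_workflow(m, n, o)
  wf = SketchWorkflow.new
  accounts = Array.new(m) { wf.run("SeedAccount") }

  leaves = accounts.flat_map do |account|
    Array.new(n) do
      project = wf.run("SeedProject", after: [account])
      Array.new(o) { wf.run("SeedTask", after: [project]) }
    end
  end.flatten

  # Convergence: a single downstream job depends on all leaves.
  wf.run("VerifySeed", after: leaves)
  wf
end

workflow = build_seed_workflow(3, 4, 5)
leaf_count = workflow.jobs.keys.count { |id| id.start_with?("SeedTask") }
puts "total jobs: #{workflow.jobs.size}, leaves: #{leaf_count}"
```

Even with small loop bounds (3, 4, 5) this yields 60 leaves and 76 jobs in total, which is how a modest number of job classes can grow into thousands of job instances.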
The biggest advantage to using Gush is that I can specify complex convergent workflows, which is impossible with Sidekiq Batches.
Without the convergence, I have to have several breakpoints where data is prepared, the system arrives in a known state, and then the next stage is started. The programming becomes more procedural and literal, instead of the elegant and implicit way I can set up all the initial data with Gush, and then have various downstream jobs dependent on initial data waiting for those jobs to complete.
Thanks for the detailed explanation. There are ways to improve how the state is stored, and I will explore them. My colleague here at Chaps suggested some good ideas, which we will try to implement, as the current approach is rather naive (we never expected such large workflows :) )
Stay tuned!
@carlthuringer a year later, but I just pushed a change to the activejob branch (will be in 1.0) which greatly reduces the time needed to spawn hundreds or thousands of jobs. Overall it should speed up execution by a lot.
Closing this as 1.0.0 was released with major improvements to performance. If the issue still occurs, please open a new ticket.