simplekiq's Issues

Ability to quiet orchestrations

Sidekiq has the ability to quiet processing (i.e. stop picking up new jobs), enabling graceful shutdown. Because orchestrations contain jobs that are often coupled to each other, code changes can be painful. If you have an orchestration like:

run JobB
run JobA

and you move work from JobB to JobA, then at deploy time you might already have JobB jobs queued. When your updated code starts running, the queued JobB jobs will still run, but you'll lose the work that you moved to JobA. For these reasons, a convenient tool would be the ability to stop starting new orchestrations; you could then wait for your running orchestrations to finish before shipping your code changes.
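A minimal sketch of what such a guard could look like, assuming a Redis-backed flag that the orchestration checks before starting (the flag key, the re-check delay, and the re-enqueue approach are all hypothetical, not an existing Simplekiq feature):

class OrchestrateSomethingJob
  include Simplekiq::OrchestrationJob

  def perform_orchestration(*args)
    # Hypothetical "quiet" flag: while it is set, don't start new orchestrations;
    # park the work and re-check later so in-flight orchestrations can drain.
    if Sidekiq.redis { |conn| conn.get("simplekiq:quiet_orchestrations") }
      self.class.perform_in(5 * 60, *args) # re-check in five minutes
      return
    end

    run JobB, *args
    run JobA, *args
  end
end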

Batch invalidation

We have an initial experiment of this in Campaigns that went well. We could extract it into the gem, but we also need to figure out what exactly should happen when a job realizes that its orchestration has been invalidated.
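As a rough sketch of the open question, assume a per-orchestration invalidation flag that batched jobs check before doing work (the key scheme, arguments, and flag mechanism are all hypothetical, not the Campaigns implementation):

class SomeBatchedJob
  include Simplekiq::BatchedJob

  def perform(orchestration_id, record_id)
    # Hypothetical invalidation check. The open question is what belongs here:
    # silently no-op, raise, or try to cancel the rest of the orchestration?
    return if Sidekiq.redis { |c| c.get("simplekiq:invalidated:#{orchestration_id}") }

    # ... normal work ...
  end
end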

Equivalent to perform_in for the run method?

Hey all! I've been using Simplekiq for several months and love it. ❤️ I hit an interesting challenge this past week and had an idea. I have an orchestration job that looks something like this:

class OrchestrateFooBarJob
  include Simplekiq::OrchestrationJob

  def perform_orchestration(id)
    foo = Foo.find(id)
    return if foo.complete?

    in_parallel do
      foo_bar_ids = foo.foo_bars.ids # Foo has_many :foo_bars
      foo_bar_ids.each do |foo_bar_id|
        run FooBarJob, foo_bar_id
      end
    end
  end
end

Each FooBarJob makes an API request and stores some data on the FooBar AR model. The problem is that during peak app usage there may be several OrchestrateFooBarJob jobs running at a time, firing off many FooBarJob jobs nearly simultaneously and exceeding our API request concurrency limit.

We've upgraded to Sidekiq Enterprise and I'm about to implement Concurrent Rate Limiting on the FooBarJob class. But in my mind, ideally I wouldn't be relying solely on the OverLimit exceptions plus a backoff with jitter.
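For context, the limiter setup I have in mind is roughly the following (a sketch based on my reading of the Sidekiq Enterprise docs; the limiter name, the concurrency of 10, and the timeouts are placeholders):

class FooBarJob
  include Sidekiq::Worker

  # Allow at most 10 concurrent API calls across all FooBarJob instances. A job
  # that can't get a slot within wait_timeout raises Sidekiq::Limiter::OverLimit
  # and gets rescheduled with backoff.
  API_LIMITER = Sidekiq::Limiter.concurrent("foo_bar_api", 10, wait_timeout: 5, lock_timeout: 30)

  def perform(foo_bar_id)
    API_LIMITER.within_limit do
      # make the API request and store the result on the FooBar record
    end
  end
end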

As the docs say under Limiting is not Throttling:

If you push 1000 jobs to Redis, Sidekiq will run those jobs as fast as possible which may cause many of those jobs to fail with an OverLimit error. If you want to trickle jobs into Sidekiq slowly, the only way to do that is with manual scheduling. Here's how you can schedule 1 job per second to ensure that Sidekiq doesn't run all jobs immediately:

1000.times do |index|
  SomeWorker.perform_in(index, some_args)
end

Looking at the Batches docs Notes section, I see that:

Batches can contain scheduled jobs too, e.g. perform_in(10.minutes).

So now I'm wondering: would it be possible (and would it make sense) to have a run equivalent of perform_in, something like run_in below?

class OrchestrateFooBarJob
  include Simplekiq::OrchestrationJob

  def perform_orchestration(id)
    foo = Foo.find(id)
    return if foo.complete?

    in_parallel do
      foo_bar_ids = foo.foo_bars.ids
      foo_bar_ids.each_with_index do |foo_bar_id, i|
        run_in i.seconds, FooBarJob, foo_bar_id
      end
    end
  end
end

Or have you maybe found another way to solve this?

Gracefully handle callback removal

BatchingJob auto-registers Sidekiq batch callbacks when you define an on_x method. The problem is that if you no longer need a callback and remove the method in a code update, you'll get exceptions when the previously registered callbacks try to run. Since we're auto-registering, it seems reasonable for us to also check that the methods are still there before calling them: basically, the callbacks BatchingJob registers should no-op if the user's job no longer implements the corresponding method. There's some tricky async code-change behavior to think through.
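A rough sketch of the idea (this is not the actual BatchingJob dispatch code; the job_class option and the proxy shape are illustrative):

class BatchCallbackProxy
  # Hypothetical: Sidekiq invokes this registered callback, and we only forward
  # to the user's job if the corresponding on_* method still exists, so removing
  # a callback method from the job no longer raises NoMethodError.
  def on_success(status, options)
    forward(:on_success, status, options)
  end

  def on_death(status, options)
    forward(:on_death, status, options)
  end

  private

  def forward(event, status, options)
    job = Object.const_get(options["job_class"]).new
    job.public_send(event, status, options) if job.respond_to?(event)
  end
end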

Orchestration failure handling

Simplekiq::BatchedJob provides access to sidekiq-pro's on_death callback, but there's no similar behavior for orchestrations. And even if we did add a way to tap into batch callbacks, that wouldn't give us the actual exception, stack trace, etc.

There are some additional error-handling mechanisms provided by Sidekiq which might be useful for this - https://github.com/mperham/sidekiq/wiki/Error-Handling - perhaps we can get the exception info from there and somehow tie it into on_death batch handling, or perhaps batch-level error handling wouldn't be useful for handling orchestration failures.

Ideally we would make sure this respects retry behavior: a job that failed, retried, and then succeeded shouldn't trigger it (the sidekiq-pro on_death callback already behaves this way, so we should match that). It should also have a simple interface so custom handling can easily be defined for an orchestration, similar to how easy it is to define failure handling for Simplekiq::BatchedJob.
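Purely to illustrate the desired ergonomics (none of this exists today; the hook name and payload are made up), an orchestration-level hook might look something like:

class SomeOrchestrationJob
  include Simplekiq::OrchestrationJob

  def perform_orchestration(id)
    run StepOneJob, id
    run StepTwoJob, id
  end

  # Hypothetical: called once when a job in the orchestration exhausts its
  # retries, mirroring how Simplekiq::BatchedJob exposes sidekiq-pro's
  # on_death today.
  def on_orchestration_death(exception_class, message, backtrace)
    # notify, clean up partial state, etc.
  end
end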

Setting reserved queue for job doesn't include BatchTrackerJob

I set up a Simplekiq::OrchestrationJob which I then call via a sidekiq schedule, specifying that it should run on a new queue called "priority". This mostly works well, and I can see that the job executes on this reserved queue. However, there are some Simplekiq logs related to this job which appear on the default queue, from class=Simplekiq::BatchTrackerJob.

Is it possible to configure Simplekiq to perform all of its work on a specified reserved queue? E.g. something along the lines of setting sidekiq_options queue: "priority" in the OrchestrationJob file, as in the Sidekiq docs.
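To make the ask concrete, something along these lines (the class and job names are illustrative; whether this would also route Simplekiq::BatchTrackerJob is exactly the question):

class MyOrchestrationJob
  include Simplekiq::OrchestrationJob

  # The hope: route this job and all of the Simplekiq internals it enqueues,
  # including Simplekiq::BatchTrackerJob, onto the reserved queue.
  sidekiq_options queue: "priority"

  def perform_orchestration(id)
    run SomeJob, id
  end
end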

Should we opensource workflows?

I think we should -- it was part of our original vision and helps solve the same set of problems. But we probably should first make significant changes -- mainly because there is a lot of random code in there that none of us understand.

Also, while tree-view can definitely be helpful, it probably shouldn't be the primary view, since the raison d'être of simplekiq is to enable you to think of your background code as a sequential flow. This view would be more orchestration-centric: leveraging knowledge of orchestrations to flatten the tree based on orchestration callbacks, hiding jobs that aren't user-owned, and leveraging orchestration "checkpoints" if we build those out. The two views can share some underlying abstractions, but separating them also means you could use Simplekiq workflows without orchestrations.

Limited parallelization for in_parallel and/or BatchedJob

https://github.com/doximity/campaigns/blob/f3b5a7d457a0fc4b5e310c851ac9b55570ce18b2/app/jobs/audiences/sleepy_mass_cache_orchestration_job.rb

^ this job limits parallelization by recursively enqueuing itself after each partial set has completed. We could probably use a similar approach transparently in our batching behaviors, perhaps something like in_parallel(10), a sidekiq option on BatchedJob, or something along those lines.

This is useful for avoiding flooding the queue so that new jobs can still interleave with the orchestrated batch.
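A rough sketch of that recursive approach in generic form (the names and slice size are placeholders, the linked Campaigns job may differ in the details, and whether an orchestration can run itself as a step like this is part of what we'd need to verify):

class LimitedParallelOrchestrationJob
  include Simplekiq::OrchestrationJob

  SLICE_SIZE = 10 # placeholder cap on how many jobs run in parallel at once

  def perform_orchestration(ids)
    current_slice = ids.first(SLICE_SIZE)
    remaining = ids.drop(SLICE_SIZE)

    in_parallel do
      current_slice.each { |id| run SomeWorkJob, id }
    end

    # After the slice's batch completes, continue with a recursive step that
    # handles the rest, so no more than SLICE_SIZE of these jobs are in
    # flight from this orchestration at any time.
    run self.class, remaining if remaining.any?
  end
end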
