Comments (19)
I apologize for the less than useful initial reply. It looks like the transactional push delay causes the batch to go out of scope and so the job does not get a BID associated. That seems like a footgun I should handle. 👍🏻
from sidekiq.
We can either guarantee that Batch jobs are pushed atomically at the end of the jobs
block or transactionally, but not both. What do you expect to happen?
from sidekiq.
Heya @mperham! Long time no see. Is this a case of "we're using it wrong" or is there a real question here?
from sidekiq.
Clarified the description a little bit, I hope that helps. Ideally the jobs would still be pushed at the end of the transaction, but they would get their BID assigned. If it's not possible to do both, then I guess it's more important for a BID to be assigned than to enqueue at the end of the transaction.
It seems like transactional_push!
already doesn't support push_bulk
, so it would make sense to explicitly decide that it doesn't support batches either, but the way it currently stands it's definitely a footgun.
Thanks for looking into it!
from sidekiq.
Somehow this works for me. I can't reproduce the behavior yet. Can you use 7.2.0?
it "works transactionally" do
q = Sidekiq::Queue.new
assert_equal 0, q.size
require "active_record"
ActiveRecord::Base.establish_connection(
adapter: "sqlite3",
database: "test.db"
)
require "after_commit_everywhere"
Sidekiq.default_job_options["client_class"] = Sidekiq::TransactionAwareClient
Sidekiq::JobUtil::TRANSIENT_ATTRIBUTES << "client_class"
class XAJob
include Sidekiq::Job
end
ActiveRecord::Base.transaction do
b = Sidekiq::Batch.new
b.jobs do
XAJob.perform_async
end
assert_equal 1, q.size
end
assert_equal 1, q.size
end
from sidekiq.
I suspect the batch refactoring in 7.1.x fixing inline execution also changed this behavior.
from sidekiq.
Try main and see if it works better for you.
from sidekiq.
The problem still exists on main. The test you wrote isn't testing the issue I mentioned, which involves the batch id not being set. I'm also not convinced this test passes, given that it's trying to use transactional push, I would expect the q.size
outside the transaction to be +1 the q.size
inside the transaction...
I would rewrite it as:
it "works transactionally" do
q = Sidekiq::Queue.new
assert_equal 0, q.size
require "active_record"
ActiveRecord::Base.establish_connection(
adapter: "sqlite3",
database: "test.db"
)
require "after_commit_everywhere"
Sidekiq.default_job_options["client_class"] = Sidekiq::TransactionAwareClient
Sidekiq::JobUtil::TRANSIENT_ATTRIBUTES << "client_class"
class XAJob
include Sidekiq::Job
end
ActiveRecord::Base.transaction do
b = Sidekiq::Batch.new
b.jobs do
XAJob.perform_async
end
assert_equal 1, q.size
assert_equal q.first.klass, 'Sidekiq::Batch::Empty'
end
assert_equal 2, q.size
assert_equal q.map{|j| j.klass}, ['Sidekiq::Batch::Empty', 'XAJob']
assert_equal q.map{|j| j.bid}, [b.bid, b.bid]
end
the last assertion is still broken on main.
from sidekiq.
@mperham sorry it took me a while to respond, wanted to make sure you saw this? ^
Can we reopen this issue?
from sidekiq.
Are you using Sidekiq + Sidekiq Pro 7.2.0? 7.1.0 will not work.
from sidekiq.
Batch jobs do not pay attention to transactional boundaries. The batch.jobs
semantic is clear: all job are pushed at the end of the jobs block.
ActiveRecord::Base.transaction do
b = Sidekiq::Batch.new
b.jobs do
XAJob.perform_async
end # the job will be pushed here
assert_equal 1, q.size
assert_equal q.first.klass, 'Sidekiq::Batch::Empty' # nope, this will be XAJob
end
assert_equal 2, q.size # nope, only one job
assert_equal q.map{|j| j.klass}, ['Sidekiq::Batch::Empty', 'XAJob']
assert_equal q.map{|j| j.bid}, [b.bid, b.bid] # and yes, you should get a bid
from sidekiq.
Batch jobs do not pay attention to transactional boundaries
Is this documented somewhere? This is surprising to me and also does not match the behavior I'm seeing, in both 7.1.0 and 7.2.0
With Sidekiq + Sidekiq Pro both on 7.2.0 what I'm seeing is:
ActiveRecord::Base.transaction do
b = Sidekiq::Batch.new
b.jobs do
XAJob.perform_async
end # the XAJob job is not pushed here, instead there is a Sidekiq::Batch::Empty job
assert_equal 1, q.size
assert_equal q.first.klass, 'Sidekiq::Batch::Empty' # this part passes
end
assert_equal 2, q.size # I see two jobs here
assert_equal q.map{|j| j.klass}, ['Sidekiq::Batch::Empty', 'XAJob'] # this part passes
assert_equal q.map{|j| j.bid}, [b.bid, b.bid] # this part fails, with the second bid being nil
from sidekiq.
As a side note, with 7.1.0 I am not seeing the Sidekiq::Batch::Empty
job, but the missing bid issue is still there
ActiveRecord::Base.transaction do
b = Sidekiq::Batch.new
b.jobs do
XAJob.perform_async
end # the XAJob job is not pushed here
assert_equal 0, q.size # passes
end
assert_equal 1, q.size # passes
assert_equal q.map{|j| j.klass}, ['XAJob'] # passes
assert_equal q.map{|j| j.bid}, [b.bid] # fails, j.bid is nil
from sidekiq.
You need Sidekiq main + Pro 7.2.
from sidekiq.
🤦 I didn't realize main was ahead of 7.2.0 and included this fix. My bad 😬
Looks like that resolved the issue! I will (ask our infra team to) update when it gets released. Thank you!
from sidekiq.
@mperham I'm working on updating sidekiq and bump into this issue ((isolator alerted us of pushing to jobs while inside a transaction). Our scenario is as follows:
- create a batch
- push jobs into the batch that we actually want to track and consider part of the batch
- when those jobs are processed they might open a db transaction perform some tasks that might also push other jobs to sidekiq that we don't really care to track inside the batch and would rather just keep the old behavior of pushing after the transaction is done.
For the pushing after the transaction part I believe we can add AfterCommitEverywhere.after_commit { Worker.peform_async }
manually to all those places which is doable but kind of pain (and not needed when those jobs are pushing for other parts of the app that's not using batches). Can you think of any other way to get back old behavior?
It seems that side effect of that would be that those jobs would be hanlded outside the batch. Is this assumption correct?
I think the uniqueness of our scenario is that we have jobs that we care to track as part of the batch and some jobs that we don't. Idealy we would be able to provide a list to the bach of what classes are part of it and it would only track those instead of assuming every job pushed in the context of a batch is part of it.
pd: sorry to post on a closed issue, not sure if that was the way to go or if I should have opened a new one
from sidekiq.
- Batch guarantees that jobs are pushed at the end of the
jobs
block atomically. - Transactional push guarantees that jobs are pushed after the transaction is committed.
We cannot provide both guarantees at the same time.
Here's an idea: do the work to separate those two concerns. Either commit your transactions before creating the batch or scope the transaction within the jobs
block so your jobs are committed before the batch jobs are pushed:
Commit in batch:
batch.jobs do
User.transaction do
...
end
end
Commit then batch:
User.transaction do
...
end
batch.jobs do
end
from sidekiq.
We don't have a problem or a transaction at the initial level. We do:
batch.jobs do
UserWorker.perform_async(id)
end
class UserWorker
def perform(id)
user = User.find(id)
user.transaction do
user.thing
OtherWorker.perform_async(id)
end
end
end
with the latest changes OtherWorker
is fired inmediatly (before the commit) and its also added as part of the batch. Where before it would wait for the transaction to commit and not be part of the barch.
We could do:
class UserWorker
def perform(id)
user = User.find(id)
user.transaction do
user.thing
AfterCommitEverywhere.after_commit { OtherWorker.perform_async(id) }
end
end
end
but thats kind of annoing since the call for that worker might be nested insde a bunch of method calls (and sometimes its called without a batch being present in those cases we wouldn't need the AfterCommitEverywhere
block.
from sidekiq.
Just to close the loop, turns out that pushing jobs from a job inside a batch DOES NOT push the jobs to the batch automatically that was happening on our test because of some workarounds we have to be able to perform integration tests on jobs that depend on batches. Sorry for the noise!
from sidekiq.
Related Issues (20)
- `Object#respond_to_missing?` redefined with too few arguments HOT 2
- Sidekiq Web: Unhandled can't convert String into an exact number (TypeError) HOT 15
- Errors after configuring pro web ui with redis pool HOT 5
- [Feature Proposal] Compressing scheduled jobs to save memory HOT 5
- Difference between capsule and concurrency=1 HOT 3
- Suggestion: Raise error when calling `batch.jobs` more than once when defining the batch? HOT 2
- Capsules don't observe the `-q` parameter of their process HOT 2
- Update Sidekiq API with capsule data HOT 1
- set(unique_for: ...) not working with ActiveJob HOT 3
- Batches wrapping unique_for jobs completing before jobs have run HOT 5
- "Error fetching job" after upgrade to Sidekiq 7 HOT 8
- Worker not running in schedule.rb HOT 7
- Sidekiq::Limiter.unlimited does not comply with signature of other limiters
- Jobs Stuck in Enqueue state (Enqueued Actionmailer) HOT 20
- Current attributes lost after inline execution HOT 2
- Child batches with enqueued jobs don't prevent parent batches from completing HOT 8
- `sidekiq-pro` gem 7.2.1 not found in the repository HOT 1
- Add splash logo on startup when using a custom log formatter HOT 2
- Job payload inconsistently persisted through retries HOT 3
- delete_by_class returns zero and doesn't remove any jobs HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from sidekiq.