nesquena / backburner
Simple and reliable beanstalkd job queue for Ruby
Home Page: http://nesquena.github.com/backburner
License: MIT License
Allow ResqueCompat module to support old resque 1.1 users
Discussing with @bradgessler:
Currently:
Backburner::Worker.enqueue NewsletterSender, [self.id, user.id], :ttr => 1000
instead:
Backburner::Worker.enqueue Backburner::Job.new(NewsletterSender, [self.id, user.id], :ttr => 1000)
or:
# include module
Backburner::Worker.enqueue NewsletterSender.job(self.id, user.id).tap { |p| p.ttr = 200 }
Right now jobs are buried if they raise an exception or time out. Instead, perhaps retry with a delay up to some maximum number of times before burying.
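The proposed behaviour could be sketched like this, with a stubbed failing job and the release/bury calls recorded in an array instead of talking to beanstalkd (MAX_RETRIES, RETRY_DELAY, and the linear backoff are all my own illustrative choices, not backburner settings):

```ruby
MAX_RETRIES = 3
RETRY_DELAY = 5 # seconds

actions = []
retries = 0

perform = -> { raise "boom" } # a job that always fails

loop do
  begin
    perform.call
    break
  rescue
    if retries < MAX_RETRIES
      retries += 1
      actions << [:release, RETRY_DELAY * retries] # back off a little more each time
    else
      actions << [:bury] # only bury once the retries are exhausted
      break
    end
  end
end

p actions
# => [[:release, 5], [:release, 10], [:release, 15], [:bury]]
```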
From @bradgessler:
A stub could be provided for people who want to assert that jobs are put on the queue in a test environment, making it easy to test that a job is performed. Right now I use a hacky thing in my projects:
# Backburner::Worker.enqueue NewsletterSender, [self.id, user.id], :ttr => 1000
Backburner::Worker.class_eval do
  class << self; alias_method :original_enqueue, :enqueue; end
  def self.enqueue(job_class, args=[], opts={})
    job_class.perform(*args)
  end
end
to force the jobs to be executed inline. Open to a simple, proper way to do this.
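A slightly less hacky shape for the same stub (my own sketch, not a backburner feature) is to prepend a module onto the worker's singleton class, so the original enqueue stays reachable via super. Worker here is a stand-in for Backburner::Worker so the example is self-contained:

```ruby
class Worker
  def self.enqueue(job_class, args = [], opts = {})
    :queued # stand-in for the real network call
  end
end

module InlineEnqueue
  def enqueue(job_class, args = [], opts = {})
    job_class.perform(*args) # run the job now instead of queueing it
  end
end

# Prepending beats alias_method: no name collisions, easy to undo or stack.
Worker.singleton_class.prepend(InlineEnqueue)

class EchoJob
  def self.perform(*args)
    args
  end
end

p Worker.enqueue(EchoJob, [1, 2]) # => [1, 2]
```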
The beanstalk-client gem allows you to define multiple beanstalk servers, but I don't see how that works with backburner's configure block.
This way, workers could listen to the same queue on multiple beanstalkd servers, and if enqueuing a job fails, backburner should try another server in the pool.
When the queue is empty, the worker explodes with this message:
/Users/jdudulski/.rvm/gems/ruby-1.9.2-p290/gems/backburner-0.1.0/lib/backburner/worker.rb:100:in `rescue in work_one_job': undefined method `bury' for nil:NilClass (NoMethodError)
Hello
I defined one job in my Rails app as follows:
class ImageDownloadJob
  include Backburner::Queue
  queue "image-download"
  queue_priority 1000 # most urgent priority is 0

  def self.perform(image_id, url, expected_mime_type)
    Rails.logger.debug("ImageDownloadJob.perform #{image_id} #{url} #{expected_mime_type}")
    Image.find(image_id).save_file_from_url(url, expected_mime_type)
  end
end
In config/initializers/backburner.rb I put the following code:
Backburner.configure do |config|
  config.beanstalk_url = "beanstalk://127.0.0.1"
  config.tube_namespace = "myapp.#{Rails.env}"
  config.on_error = lambda { |e| Rails.logger.error "Backburner / Beanstalk error = #{e}" }
  config.max_job_retries = 0 # default 0 retries
  config.retry_delay = 2 # default 5 seconds
  config.default_priority = 65536
  config.respond_timeout = 120
  config.default_worker = Backburner::Workers::Simple
  config.logger = Rails.logger
  config.primary_queue = "backburner-jobs"
  config.priority_labels = { :custom => 50, :useless => 1000 }
end
Beanstalkd is running.
Then I am starting a worker with the following command:
QUEUES=image-download bundle exec rake backburner:work
I get the following in my log
Working 1 queues: [ myapp.development.backburner-jobs ]
which does not make sense to me, because it should start waiting on my image-download queue.
Then if I start my Rails app with bundle exec rails s, the app enqueues some jobs, but the worker never picks them up.
Then, if I kill the backburner:work process with ctrl+c and relaunch it, it picks up my image-download queue and processes the jobs.
Working 2 queues: [ myapp.development.image-download, myapp.development.backburner-jobs ]
Work job ImageDownloadJob with [45, "test", "image"]
ImageDownloadJob.perform 45 test image
Why do I have to start the worker after my Rails app has put something in the queue?
Is it a bug or just me?
Thanks for your help!
Best
Geoffroy
Is it possible to move this block to a YAML file?
Backburner.configure do |config|
...
end
It's needed when you have different configurations for development, staging, and production.
Or is there another convenient way to split the configuration?
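One way to do this (a sketch, not a built-in backburner feature; the file contents and key names are my own invention) is to keep per-environment values in a YAML file and feed them into the configure block from an initializer:

```ruby
require 'yaml'

# Hypothetical config/backburner.yml, inlined here so the example is runnable.
yaml = <<~YAML
  development:
    beanstalk_url: "beanstalk://127.0.0.1"
    max_job_retries: 0
  production:
    beanstalk_url: "beanstalk://10.0.0.5"
    max_job_retries: 3
YAML

env = "development" # e.g. Rails.env in a Rails app
settings = YAML.safe_load(yaml).fetch(env)

# In a real initializer you would then pass these into Backburner.configure:
# Backburner.configure do |config|
#   config.beanstalk_url   = settings["beanstalk_url"]
#   config.max_job_retries = settings["max_job_retries"]
# end
p settings["beanstalk_url"] # the value for the current environment
```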
Some companies will only use gems with a certain license.
The canonical and easy way to check is via the gemspec, e.g.
spec.license = 'MIT'
# or
spec.licenses = ['MIT', 'GPL-2']
Even for projects that already specify a license, including a license in your gemspec is a good practice, since it is easily
discoverable there without having to check the readme or for a license file.
For example, there is a License Finder gem to help companies ensure all gems they use
meet their licensing needs. This tool depends on license information being available in the gemspec. This is an important enough
issue that even Bundler now generates gems with a default 'MIT' license.
If you need help choosing a license (sorry, I haven't checked your readme or looked for a license file),
github has created a license picker tool.
In case you're wondering how I found you and why I made this issue: I'm collecting stats on gems (I was originally
looking for download data) and decided to collect license metadata, too, and make issues for gemspecs not specifying a license as a public service :).
I hope you'll consider specifying a license in your gemspec. If not, please just close the issue and let me know. In either case, I'll follow up. Thanks!
p.s. I've written a blog post about this project
I didn't find it anywhere in docs, but what I would like to do is within a job call a delay method, for example:
class ImportSongs
  include Backburner::Queue

  def self.perform(api_token, songs)
    api = API.new api_token
    songs.each_with_index do |song, i|
      # make current worker proceed with another job while it's sleeping
      delay 60*60 if i != 0 && i % 100 == 0
      api.import_song song
    end
  end
end
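Rather than sleeping inside perform (which ties up the worker), one option is to split the songs into batches up front and enqueue each batch with an increasing :delay, which backburner's enqueue forwards to beanstalkd as a job option. The batch size, delays, and the commented enqueue call are my own sketch; ImportSongs is the job class from the question:

```ruby
songs = (1..250).to_a # stand-in for the real song list

# Slice into batches of 100 and give each batch a delay of 0h, 1h, 2h, ...
batches = songs.each_slice(100).with_index.map do |batch, i|
  { songs: batch, delay: i * 60 * 60 }
end

batches.each do |b|
  # Backburner::Worker.enqueue ImportSongs, [api_token, b[:songs]], delay: b[:delay]
end

p batches.map { |b| [b[:songs].size, b[:delay]] }
# => [[100, 0], [100, 3600], [50, 7200]]
```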
Is there any init script for linux to start/stop backburner?
For example, if you use monit and want to monitor your workers by pid, it's really good to have an init script for start/stop/restart.
Backburner::Worker#retry_connection! does not call close, leaving open connections to servers until the process holding those connections is killed.
Can be reproduced by starting 2 servers. Then, queue jobs in a loop.
require 'backburner'

Backburner.configure do |config|
  config.beanstalk_url = ['beanstalk://127.0.0.1:11300', 'beanstalk://127.0.0.1:11301']
end

class Job
  def self.perform(message)
    p message
  end
end

loop do
  Backburner::Worker.enqueue(Job, ['Hello'])
end
In another process, run the worker:
Backburner.work
Kill one of the servers. The other server's output will look like this:
$ beanstalkd -V -p 11300
pid 4318
bind 3 0.0.0.0:11300
accept 5
accept 6
accept 7
accept 8
accept 9
accept 10
accept 11
accept 12
accept 13
accept 14
accept 15
lsof output for this process:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
beanstalk 32480 vagrant cwd DIR 8,3 4096 261123 /home/vagrant
beanstalk 32480 vagrant rtd DIR 8,3 4096 2 /
beanstalk 32480 vagrant txt REG 8,3 63744 789844 /usr/bin/beanstalkd
beanstalk 32480 vagrant mem REG 8,3 156928 522458 /lib64/ld-2.12.so
beanstalk 32480 vagrant mem REG 8,3 1926800 522604 /lib64/libc-2.12.so
beanstalk 32480 vagrant 0u CHR 136,4 0t0 7 /dev/pts/4
beanstalk 32480 vagrant 1u CHR 136,4 0t0 7 /dev/pts/4
beanstalk 32480 vagrant 2u CHR 136,4 0t0 7 /dev/pts/4
beanstalk 32480 vagrant 3u IPv4 97801 0t0 TCP *:11300 (LISTEN)
beanstalk 32480 vagrant 4u REG 0,9 0 3780 [eventpoll]
beanstalk 32480 vagrant 5u IPv4 97810 0t0 TCP localhost:11300->localhost:55278 (ESTABLISHED)
beanstalk 32480 vagrant 6u IPv4 98268 0t0 TCP localhost:11300->localhost:55306 (ESTABLISHED)
beanstalk 32480 vagrant 7u IPv4 98274 0t0 TCP localhost:11300->localhost:55308 (ESTABLISHED)
beanstalk 32480 vagrant 8u IPv4 98279 0t0 TCP localhost:11300->localhost:55310 (ESTABLISHED)
beanstalk 32480 vagrant 9u IPv4 98284 0t0 TCP localhost:11300->localhost:55312 (ESTABLISHED)
beanstalk 32480 vagrant 10u IPv4 98291 0t0 TCP localhost:11300->localhost:55314 (ESTABLISHED)
beanstalk 32480 vagrant 11u IPv4 98296 0t0 TCP localhost:11300->localhost:55316 (ESTABLISHED)
beanstalk 32480 vagrant 12u IPv4 98301 0t0 TCP localhost:11300->localhost:55318 (ESTABLISHED)
beanstalk 32480 vagrant 13u IPv4 98306 0t0 TCP localhost:11300->localhost:55320 (ESTABLISHED)
beanstalk 32480 vagrant 14u IPv4 98311 0t0 TCP localhost:11300->localhost:55322 (ESTABLISHED)
beanstalk 32480 vagrant 16u IPv4 98236 0t0 TCP localhost:11300->localhost:55302 (ESTABLISHED)
beanstalk 32480 vagrant 17u IPv4 98252 0t0 TCP localhost:11300->localhost:55304 (ESTABLISHED)
A Sinatra web frontend would be awesome.
Right now the logger only prints to stdout; maybe allow a custom logger to be specified.
Talking with @bradgessler, we want a way to hook into the worker and define arbitrary code to run before and after jobs.
Backburner::Worker.on_start do |job|
  NewRelic.add_instrumentation "..."
end

Backburner::Worker.before_enqueue do |job|
end

Backburner::Worker.after_enqueue do |job|
end
Hi, I'm loving the backburner/beanstalk combo and am using it to process large amounts of data within my app.
From what I have read so far, the prevailing advice seems to be to pass database ids into enqueue and then have the worker read the information it needs from the database.
I am doing this now: I persist data, the backburner worker then (almost) immediately reads that data back out, and the row is deleted once processing is complete.
This database write/read/delete cycle is proving to be a bit of a bottleneck within my app.
What I'd like to do is not to have to touch the database at all but to pass all my data straight to backburner and then have my workers process it without having to read it back from the database.
My data is around 12 distinct text strings (none more than a few hundred characters long) although some can be non-ASCII (ie, UTF-8) text.
Am I likely to hit any real problems with this approach?
Is anyone else already doing it this way?
Thanks in advance for any help.
Darren.
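For what it's worth, the main hard limit to watch with this approach is beanstalkd's maximum job size, which defaults to 65535 bytes (raisable with the -z flag). A quick sanity check that a dozen few-hundred-character UTF-8 strings serialize well under that limit (the payload field names here are invented for illustration, as is the commented job class):

```ruby
require 'json'

payload = {
  "title"  => "Drosophilidae compound eye",
  "notes"  => "très détaillé " * 20,                # non-ASCII is fine once serialized
  "fields" => Array.new(10) { |i| "field-#{i} " * 30 }
}

body = JSON.generate(payload)
p body.bytesize          # a few KB: well under the 65535-byte default limit
p body.bytesize < 65_535 # => true
# Backburner::Worker.enqueue(ProcessText, [payload]) # then enqueue the data directly
```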
Hello
I am running Ubuntu 12.04 LTS and beanstalkd 1.9.
I have a very simple Rails 4 application which uses Backburner to do some audio processing, image and video downloads in the background.
It mostly works, but sometimes jobs are not processed.
I can reproduce it fairly easily; here is the behaviour I see.
The code to enqueue the image download is:
Rails.logger.debug("[Asset.enqueue_download_from_url] id = #{id}, url = #{url}")
Backburner.enqueue ImageDownloadJob, self.id, url
And the code in my job is:
class ImageDownloadJob
  include Backburner::Queue
  queue "image-download"
  queue_priority 1000 # most urgent priority is 0

  def self.perform(image_id, url)
    Rails.logger.debug("[ImageDownloadJob.perform] id = #{image_id}, url = #{url}")
    Image.find(image_id).download_from_url(url)
  end
end
Even though it works 9 times out of 10, sometimes I get the following in my development log:
[Asset.enqueue_download_from_url] id = 71, url = http://upload.wikimedia.org/wikipedia/commons/8/89/Drosophilidae_compound_eye_.jpg
but no ImageDownloadJob.perform trace afterwards. It seems like the job is never enqueued or never processed.
I'm using backburner:threads_on_fork:work
Here is my Procfile
web: bundle exec rails s
backburner: env QUEUE=image-download:3:50:2,video-generation:1:50:1,audio-analysis:1:50:1 rake backburner:threads_on_fork:work
rails_logs: tail -f log/development.log
backburner_logs: tail -f log/backburner.development.log
Is it a known bug of the backburner:threads_on_fork:work strategy? Any advice?
Thanks in advance, best regards
Geoffroy
beanstalkd://localhost/foo?retry_times=5
# parse prefix and retry from the url
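A sketch of parsing the host, tube prefix, and retry count out of a URL like the one above, using only stdlib URI/CGI. The retry_times query-parameter name comes from the example; nothing here is an existing backburner API:

```ruby
require 'uri'
require 'cgi'

url = URI.parse("beanstalkd://localhost/foo?retry_times=5")

host        = url.host                    # "localhost"
tube_prefix = url.path.delete_prefix("/") # "foo"
params      = CGI.parse(url.query.to_s)   # {"retry_times"=>["5"]}
retry_times = Integer(params.fetch("retry_times", ["0"]).first)

p [host, tube_prefix, retry_times]
# => ["localhost", "foo", 5]
```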
From @bradgessler:
I don't think having different queues per app is a good idea in a beanstalkd world. Resque does this because there's no concept of priorities in Redis. Since beanstalkd lets people specify priorities of jobs, it's a moot point to run different numbers of workers for different queue names. Put more emphasis on priority to deal with this.
I like the idea of having all classes by default piping to the same tube and using priorities more heavily. If a job should be in a different tube, then that is of course possible using the same format today.
Hi, I am looking into setting up beanstalkd and backburner in an HA context.
To protect against a beanstalkd instance going down I have multiple instances fronted by haproxy.
My test is using the Simple worker.
Whilst executing the work_one_job method in worker.rb, backburner attempts to reserve a job without using a timeout. When there are no jobs on a particular tube, the connection is kept open until one arrives.
When connecting to beanstalkd directly, the connection is held open for as long as it takes for a job to arrive. However haproxy terminates the connection, after the timeout that it is configured with is exceeded.
At this point the exception is caught, but a further exception is thrown because the job variable is nil. The fix to this problem is unfortunately not a one-liner, as the TCPSocket connection held in a class variable is also broken, so subsequent retries would also fail.
What do you think the solution should be to make connections to beanstalkd more resilient when operating in this context? Should these terminations be handled? I would like to contribute to the project, but would like to know your thoughts before I spend too much time going in the wrong direction.
Thanks for your help,
lashd
Backburner looks great by the way.
I've noticed a problem with the ThreadsOnFork worker once the job queue goes empty. If there is a thread trying to reserve a job on an empty queue, it holds the mutex so no other communication can happen on that connection (note: the mutex code is in beaneater, though the mutex code itself is not necessarily a bug, IMO). This is problematic if another thread has just reserved a job and is trying to process said job. Most importantly, in order to do actual job processing, the 'stats-job' is run to retrieve the ttr from beanstalkd, and if this job is held up then the already-reserved jobs will fail to process, and you're deadlocked.
The only way I've seen this deadlock broken is when beanstalkd eventually returns 'DEADLINE_SOON' to the blocking reserve jobs, by default 120s later. This is pretty terrible latency, however.
You can alleviate this by having more connection addresses defined in your backburner config; beaneater will use these as a pool, but you'll still get collisions where two threads are trying to use the same connection. Really, no two threads should ever try to use the same connection/socket if a blocking reserve is involved.
An easy way to reproduce this is to instantiate a ThreadsOnFork worker for a particular tube with n (>= 2) max threads, and then queue m (n < m < 2n) jobs onto that tube. The first n jobs should process immediately, and the remaining ones should get hung in a state where they have technically been reserved, but can't communicate with the beanstalkd server so they won't process for 120 seconds (unless you've overridden ttr). In my particular case, I had 10 threads and 15 jobs.
Resque uses forking to control memory management and bloat, and I think it would be a good idea to have a forking worker that applies the same principles.
From @bradgessler:
CLI for querying/filtering jobs for performing batch operations on jobs that are in the queue. This is important for when things go bad in production and certain jobs with certain payloads may need to be buried until a patch can be pushed to prod and the jobs are re-run. Web GUI might be able to use this CLI interface.
From @bradgessler:
We had a mis-numbered priority in production once that brought our system down. We'd like to have "named" priorities that are stack-ranked. This is a simple DSL that looks like Backburner.config.priorities = [:high, :medium, :custom_pri, :low], which is mapped to the int values that beanstalkd understands. When tossing a job on the queue, a named priority could be specified like User.new.async(pri: :high).blah.
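The stack-ranking could be sketched like this, mapping labels onto beanstalkd's integer priorities (0 is most urgent). The spacing of 100 between ranks is an arbitrary choice of mine, not something backburner defines:

```ruby
PRIORITIES = [:high, :medium, :custom_pri, :low]

# Rank order becomes the integer priority: earlier label => smaller int => more urgent.
PRIORITY_MAP = PRIORITIES.each_with_index.to_h { |label, i| [label, i * 100] }
# => {:high=>0, :medium=>100, :custom_pri=>200, :low=>300}

def resolve_priority(pri)
  PRIORITY_MAP.fetch(pri) { Integer(pri) } # raw ints still work for power users
end

p resolve_priority(:high) # => 0
p resolve_priority(:low)  # => 300
p resolve_priority(50)    # => 50
```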
I am using DataMapper callbacks to call a method that imports user data from Zendesk when I create a "desk" object in my app. This is working great. However, importing the user data can take some time for some of the larger accounts I work with. Does anyone have any advice on how to call this callback via backburner?
http://datamapper.org/docs/callbacks.html
How does backburner handle restarts or shutdowns? Do the running jobs get put back into the queue or do they just "timeout"?
From #52, need to support priority labels in configure block
Track backburner tasks with newrelic API for background processing.
When I update my code and restart Backburner, the worker still seems to run the old code. Any pointers?
From @bradgessler:
I'm not convinced that the Job mix-in is the best approach. A job should be a class itself. Mixins make sense as syntactical sugar to make it easier to queue an instance of a class into a job, and to pull the job off the queue and shove it back into that class for processing.
I agree, would be ideal to support any ruby object that responds to perform ala resque. I still want to keep the mixin around for the nice syntactic sugar it affords.
After chatting with @kr today, have a few ideas to jot down for a first class backburner admin frontend panel. First I would want visibility into the jobs in the ready queue as well as a view showing failed jobs with backtraces and a way to 'kick' a job, all available via a sinatra web UI.
In particular, some ideas on how to do this.
Create a sinatra view that reserve(0) 100 jobs and then collects the information and immediately releases them. Aggregate the 100 jobs and display them in the sinatra view. By reserving and releasing immediately, we can create a view for both development and production by showing the next 100 jobs.
I would want a place for buried jobs where you can see the next 100 buried jobs across all queues perhaps. For this we could have a special tube called 'failed-jobs' that is used by the admin panel. In Backburner, every time a job fails and is buried, we can then insert the job into the 'failed-jobs' queue before burying it. Then when we want to show the last 100 failing jobs, simply reserve(0) and release 100 jobs from that tube. In this way we can have a buried jobs list. We can even have a button to kick the job (which will then kick the job based on the real id). The way I was thinking about it, the failed-jobs tube could have jobs with this format:
{ "job-id" : 1234, "tube" : "foo", "backtrace" : "...", "tries" : 3 }
and then this can be used to display the buried jobs in a table.
Also, I love beanstalkd_view and it's such a great start. I wonder, @denniskuczynski, if you would have any interest in being a core contributor to backburner and helping out with the admin panel. Ideally we could have a familiar feel to https://github.com/defunkt/resque#section_The_Front_End and obviously be unabashedly clear that we are inspired by that project's interface as a point of reference (as well as beanstalkd_view itself).
Cleanup github pages and add logo. As suggested by @bradgessler we can use http://thenounproject.com/noun/stove/#icon-No4325 as a starting point.
Download the SVG and PNG here: https://www.dropbox.com/s/kndb4kv5py4m5nh/stove.zip
Wouldn't it be convenient if we could specify a timeout in the reserve method called from work_one_job? If the worker waits on the reserve command longer than the specified timeout, it would stop automatically. This is actually the first parameter that can be passed to the reserve method of the underlying Beaneater::Tubes class.
I want to run such a worker from time to time, so it would only process the current requests and then stop.
I am going to send 20 tasks at once to beanstalkd. Each pack of 20 tasks will be assigned to a different tube, and each tube will have only one worker processing tasks, so I can log the data from those particular 20 tasks to a separate file. That is why I would like to be able to run a worker that stops once it sees there is nothing more for it to do, and this 'seeing' could basically be a timeout on reserve.
I believe reserve-with-timeout is the way this may be achieved.
Please correct me if I am wrong, and tell me whether you are going to add this nice feature to backburner.
Regards,
Mateusz
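The desired behaviour, sketched against a plain in-process queue: Thread::Queue stands in for a beanstalkd tube, and reserve-with-timeout becomes a non-blocking pop. With beaneater the equivalent would be tube.reserve(timeout) and stopping when the reserve times out; that wiring is hedged since this is not an existing backburner option:

```ruby
queue = Thread::Queue.new
3.times { |i| queue << "job-#{i}" }

processed = []
loop do
  job = begin
    queue.pop(true) # non-blocking pop: raises ThreadError when the queue is empty
  rescue ThreadError
    break           # nothing left to do, so the worker stops instead of blocking
  end
  processed << job
end

p processed # => ["job-0", "job-1", "job-2"]
```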
It is expected that the client should try to connect to each server. If a connection is not established, it will remove the connection from its connection pool. As long as the number of successful connections to a server is greater than zero, work should continue to occur. Ideally, the client would be smart enough to retry the connection to the failed server and add it back to the pool when its connection is restored.
The following will reproduce the error if you start a worker on port 11300 or 11301, not both.
require 'backburner'

Backburner.configure do |config|
  config.beanstalk_url = ['beanstalk://127.0.0.1:11300', 'beanstalk://127.0.0.1:11301']
end

class Job
  def self.perform(message)
    p message
  end
end

Backburner::Worker.enqueue(Job, ['Hello'])
Beaneater::NotConnected: Could not connect to '127.0.0.1:11301'
from /usr/lib64/ruby/gems/2.1.0/gems/beaneater-0.3.2/lib/beaneater/connection.rb:96:in `rescue in establish_connection'
from /usr/lib64/ruby/gems/2.1.0/gems/beaneater-0.3.2/lib/beaneater/connection.rb:92:in `establish_connection'
from /usr/lib64/ruby/gems/2.1.0/gems/beaneater-0.3.2/lib/beaneater/connection.rb:36:in `initialize'
from /usr/lib64/ruby/gems/2.1.0/gems/beaneater-0.3.2/lib/beaneater/pool.rb:25:in `new'
from /usr/lib64/ruby/gems/2.1.0/gems/beaneater-0.3.2/lib/beaneater/pool.rb:25:in `block in initialize'
from /usr/lib64/ruby/gems/2.1.0/gems/beaneater-0.3.2/lib/beaneater/pool.rb:25:in `map'
from /usr/lib64/ruby/gems/2.1.0/gems/beaneater-0.3.2/lib/beaneater/pool.rb:25:in `initialize'
from /usr/lib64/ruby/gems/2.1.0/gems/backburner-0.4.5/lib/backburner/connection.rb:27:in `new'
from /usr/lib64/ruby/gems/2.1.0/gems/backburner-0.4.5/lib/backburner/connection.rb:27:in `connect!'
from /usr/lib64/ruby/gems/2.1.0/gems/backburner-0.4.5/lib/backburner/connection.rb:13:in `initialize'
from /usr/lib64/ruby/gems/2.1.0/gems/backburner-0.4.5/lib/backburner/worker.rb:58:in `new'
from /usr/lib64/ruby/gems/2.1.0/gems/backburner-0.4.5/lib/backburner/worker.rb:58:in `connection'
from /usr/lib64/ruby/gems/2.1.0/gems/backburner-0.4.5/lib/backburner/worker.rb:185:in `retry_connection!'
from /usr/lib64/ruby/gems/2.1.0/gems/backburner-0.4.5/lib/backburner/worker.rb:173:in `rescue in retryable_command'
from /usr/lib64/ruby/gems/2.1.0/gems/backburner-0.4.5/lib/backburner/worker.rb:170:in `retryable_command'
from /usr/lib64/ruby/gems/2.1.0/gems/backburner-0.4.5/lib/backburner/worker.rb:33:in `enqueue'
The problem lies within Backburner::Worker#retry_connection!. Instead of the simple @connection = nil, the failed connection should be removed before retrying.
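The suggested fix could be sketched like this: keep a pool of connections and, when one raises on use, drop it from the pool and retry with the remainder instead of just clearing @connection. The FakeConn struct and DeadConnection error are stand-ins for beaneater connections; this is not backburner's actual code:

```ruby
DeadConnection = Class.new(StandardError)

FakeConn = Struct.new(:addr, :alive) do
  def put(job)
    raise DeadConnection, addr unless alive
    "put #{job} on #{addr}"
  end
end

pool = [FakeConn.new("127.0.0.1:11301", false), FakeConn.new("127.0.0.1:11300", true)]

def enqueue_with_failover(pool, job)
  begin
    pool.first.put(job)
  rescue DeadConnection
    pool.shift           # remove the failed connection from the pool...
    raise if pool.empty? # ...give up only when no connections are left...
    retry                # ...and retry on the next one
  end
end

result = enqueue_with_failover(pool, "Hello")
p result      # => "put Hello on 127.0.0.1:11300"
p pool.size   # => 1 (the dead connection was removed)
```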
I have some trouble with backburner. I have a setup with two workers running against two beanstalkd instances. Somehow, jobs sometimes seem to get 'stuck'. I can see the workers in beanstalkd view like this:
And if I keep restarting the workers, it eventually picks up the jobs and process them. I'm wondering if this could be related to having multiple beanstalk servers, but I am a bit at a loss about how to debug further.
When I run backburner as a daemon, it seems to always prefix the queue with 'backburner.worker' irrespective of what is in my apps config file or what I provide as command-line parameters.
Working 1 queues: [ backburner.worker.queue.myapp.mailer ]
I tried to change the app config to load work into the queue named above, but then it seems that the environment isn't loaded properly so jobs that get queued get immediately buried.
The rake task always works but is ugly and difficult to connect to monit or God. This is what I'm doing as a short-term work around.
nohup rake backburner:work &
Any ideas?
config.beanstalk_url = ["beanstalk://127.0.0.1:11300"]
config.tube_namespace = "myapp"
config.on_error = lambda { |e| puts e }
config.max_job_retries = 3 # default 0 retries
config.retry_delay = 5 # default 5 seconds
config.default_priority = 65536
config.respond_timeout = 120
config.default_worker = Backburner::Workers::Simple
config.default_queues = ["staging", "staging-mailer"]
config.logger = Logger.new("backburner-staging.log")
Currently the max_job_retries and retry_delay settings apply to every tube that backburner watches. For some tubes I'd like a large number of retries, while for others I'd like a single retry. It would be great if backburner could be configured to support this.
I am trying to use backburner in my rails application and am using Backburner.enqueue to add jobs to the beanstalkd queue. However, if the connection to the beanstalkd server breaks, all calls to enqueue fail with the exception Beaneater::NotConnected. Even if beanstalkd is back online, enqueue continues to fail since @connection in Backburner::Worker is still set and a new Connection is not made.
Maybe I am just missing something simple in the code/documentation, but what is the best way to reattempt/reset a connection without creating a setter for the connection method (patching) in the Worker? I do not want to restart my rails application in order for a new connection to be made.
Is there a way for me to check on a job after it's been enqueued? Preferably, after calling "async.command" I could get back an id that I can check later so I can follow up.
My use-case is this. I am triggering async jobs from several parts of my web application. I'd like to give the user feedback about the completion of those jobs and I can't move the user forward until they finish. There are other parts where we trigger a job but I need to give them a way to see if they were successful or failed. I'm trying to find a way where I don't need to wrap all of these with another layer of database tables to provide this information.
Not really an issue, I just don't know where is the best place to ask this question.
I'm just starting with background jobs and am wondering what the advantage of backburner over beaneater is.
Shreko
As part of moving work away from rails runner based cron jobs into a backburner based system I've started on a simple way to enqueue jobs from the command line. Is that something you'd be interested in incorporating into backburner?
Tube namespacing to support versioning? in the wiki
When I run my workers, they can't seem to find the objects. I get this error:
burying undefined method `getagents' for #<Enumerator: Desk:find(2)>
Has anyone else seen this before?
@ShadowBelmolve is making great progress with an interesting threads_on_fork worker hybrid. Hope to get that released into backburner with docs and tests soon.
At the moment I have 16 workers processing jobs in parallel.
The outcome of these jobs is either no action or a write to a database.
As each process saves to the database it invokes its own separate BEGIN/COMMIT database transaction. This is proving to be quite slow and I'd like to find a way to speed that bit up.
I was wondering about this approach instead:
Instead of saving to the database, send the data to a different beanstalk queue/tube.
For that queue I'd like to read a batch of jobs into a single worker and then commit them in a single database transaction.
If I could process jobs in a batch of 50 at a time I could significantly reduce the number of database commits.
Can I do such a thing with backburner?
Or if not, maybe using beaneater instead?
Any issues with it from a theoretical standpoint?
Thanks in advance, Darren.
How can I delete a job from the beanstalkd queues when using the backburner gem?
If a worker process receives SIGTERM while a mysql query is running, rails apparently rescues the SignalException and re-raises it as an ActiveRecord::StatementInvalid. This causes Backburner to gracefully recover from the exception, bury the job, and keep on working additional jobs, effectively ignoring the intent of the TERM signal.
Apparently this issue is at least somewhat common:
http://stackoverflow.com/questions/548048/activerecordstatementinvalid-when-process-receives-sigterm
I'm not sure if this is a problem on all workers, but we can confirm it is present on the simple worker, and the net effect is that once every few deploys, our old workers simply refuse to die when we attempt to shut them down.
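One way a worker loop could detect this: Ruby chains the original error as Exception#cause, so the rescue block can walk the cause chain and shut down (instead of burying the job) when a SignalException is buried inside an ActiveRecord error. This is my own sketch; WrappedError stands in for ActiveRecord::StatementInvalid:

```ruby
WrappedError = Class.new(StandardError)

# Walk the cause chain looking for a wrapped signal.
def signal_in_cause_chain?(error)
  while error
    return true if error.is_a?(SignalException)
    error = error.cause
  end
  false
end

# Simulate what rails effectively does: rescue the signal mid-query and
# re-raise it as a database error (the original becomes the cause).
caught = begin
  begin
    raise SignalException, "TERM"
  rescue SignalException
    raise WrappedError, "Mysql2::Error: query interrupted"
  end
rescue WrappedError => e
  e
end

p signal_in_cause_chain?(caught) # => true: the worker should stop, not bury and continue
```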
I'm trying to use Backburner in my application to create, say, 50 workers all watching the same tube for incoming jobs. The problem is, each of those 50 workers, when performing a job, does some heavy file manipulation, etc., and each of them needs to perform some operations on a uniquely numbered directory. For example, worker 7 needs to work with a directory named dir7. How do I handle this situation using Backburner?
Initially I thought I could make a ThreadsOnFork worker with 50 threads, and I'd be able to access a number associated with each thread from the worker's perform() method, but I haven't been able to do this yet.
Please help. Thanks!
PS: Apologies for asking this question on GitHub Issues, but I couldn't find a link to any official forum / google group on the backburner page at http://nesquena.github.com/backburner/.
Add hooks for plugins ala resque