albrow / jobs

A persistent and flexible background jobs library for go.

License: MIT License

Go 91.81% Lua 8.19%

go golang redis worker-pool

jobs's Issues

Support environmental prefix

Problem: The current version of the package does not support environment prefixes. If a single Redis instance is shared between two environments (say, alpha and beta), there is no mechanism to ensure that consumers/workers only execute jobs from producers in their own environment.
Proposed solution: One approach is to prefix all Redis keys with the environment string, so there should be a way to set a prefix that varies per environment. Before implementing that in the Go layer, the keys hard-coded in the Lua scripts should also be made configurable, i.e. the scripts should accept the prefix as one of their arguments.
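
A minimal sketch of the proposed key prefixing, for illustration only. The keyspace type, its key method, and the example key name are hypothetical and are not part of the jobs API; the Lua scripts would need to receive the same prefix as an argument.

```go
package main

import "fmt"

// keyspace is a hypothetical holder for a per-environment prefix
// such as "alpha" or "beta". It is not part of the jobs library.
type keyspace struct {
	prefix string
}

// key returns the Redis key for name, namespaced by the environment prefix.
// An empty prefix leaves the key unchanged, so existing data stays readable.
func (k keyspace) key(name string) string {
	if k.prefix == "" {
		return name
	}
	return k.prefix + ":" + name
}

func main() {
	alpha := keyspace{prefix: "alpha"}
	beta := keyspace{prefix: "beta"}
	// The same logical key maps to two different Redis keys,
	// so workers in one environment never see the other's jobs.
	fmt.Println(alpha.key("jobs:abc123"))
	fmt.Println(beta.key("jobs:abc123"))
}
```

Treating the empty prefix as "no change" would keep the feature backwards compatible for users who never set one.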

Use sets instead of sorted sets where appropriate

Every job has a status. Currently there is a sorted set for each status, where all jobs that have that status are stored. Technically only the "queued" set needs to have a score (the priority) since that allows higher-priority jobs to get picked up first. Using sets instead of sorted sets when possible should reduce memory usage.

Job runtime error messages should include a stack trace

If a job panics, the only error recorded is a short error message. For a runtime error such as a nil pointer access or incorrect use of slices, this makes debugging very hard because there is no stack trace. You end up with an error like the following in the Redis hash for your job, which is very vague for runtime errors.

runtime error: invalid memory address or nil pointer dereference

I've found where this happens and I think the stack should either be logged or also sent to redis as the error message. But I'd be fine if it was just logged as this won't happen often.
https://github.com/albrow/jobs/blob/master/worker.go#L39

Here's an example of logging the stack in case you've never done this before. If you want more examples, search for HTTP recovery middlewares, as they all have to do this.

stack := make([]byte, 1024*8)
stack = stack[:runtime.Stack(stack, false)]
log.Println("Panic stack:", err, string(stack))
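
The same idea can be sketched as a self-contained wrapper that turns a panic into an error carrying the stack trace. The runSafely helper is hypothetical and is not how worker.go is written; it uses runtime/debug.Stack, which sizes the buffer itself, instead of a fixed 8 KB slice.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

type user struct{ Name string }

// runSafely is a hypothetical helper: it runs handler and, if the handler
// panics, returns an error that includes the panic value and the stack trace.
func runSafely(handler func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			// debug.Stack is called while the panicking frames are still
			// on the stack, so the trace points at the failing line.
			err = fmt.Errorf("panic: %v\n%s", r, debug.Stack())
		}
	}()
	return handler()
}

func main() {
	err := runSafely(func() error {
		var u *user
		fmt.Println(u.Name) // nil pointer dereference panics here
		return nil
	})
	fmt.Println(err)
}
```

Whether the resulting string is logged or stored in the job's Redis hash is a separate decision; the wrapper only makes the trace available.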

Examples?

Is it possible for an example to be created (full .go files, etc.)? I'm pretty new to Go and I love this implementation (I come from using Resque), but I'm finding it difficult to set everything up.

The only example I found that could possibly help me learn was this: #28

So I've got it up and running with the above issue and I've split off workers / pools (I want to have N workers on distributed machines, adding new jobs via a back-end API).

However, it's a bit messy now. Is there a straightforward "hello world" example that encompasses all current features of jobs, other than #28?

Sorry about all this! Pretty new to Go >.>

Allow for multiple arguments to job handler functions

Currently job handler functions can only accept one argument. This is not a priority for me, but it would not be too difficult to allow for multiple parameters.

job not exec with panic

my code

var sche scheduler

type scheduler struct {
    tp           *jobs.Type
    golbalHeader map[string]string
}

func main() (err error) {
    tp, err := jobs.RegisterType(DEFAULT_JOB_TYPE, RETRY_TIMES, sche.getHandler())
    if err != nil {
        utils.Log().Error(err)
        return
    }
    sche.tp = tp
    pool, err := jobs.NewPool(nil)
    if err != nil {
        utils.Log().Error(err)
        return
    }
    defer func() {
        pool.Close()
        if err = pool.Wait(); err != nil {
            utils.Log().Error(err)
            return
        }
    }()
    if err = pool.Start(); err != nil {
        utils.Log().Error(err)
        return
    }
    return
}

... 



func (p *scheduler) getHandler() handler {
    return func(req *models.AddReq) error {
        utils.Log().Info("job start")
        post := gorequest.New().Post(req.Target)
        for k, v := range p.golbalHeader {
            post = post.Set(k, v)
        }
        for k, v := range req.Header {
            post = post.Set(k, v)
        }
        _, body, errs := post.Send(string(req.Body)).End()
        if len(errs) > 0 {
            utils.Log().Error(errs)
            return errs[0]
        }
        var rep response
        err := json.Unmarshal([]byte(body), &rep)
        if err != nil {
            utils.Log().Error(err)
            return err
        }
        if rep.Code != 0 {
            utils.Log().Error(rep.Code, rep.Message)
            return errors.New(rep.Message)
        }
        utils.Log().Info(rep)
        return nil
    }
}

Passing params / string name of job.

In Resque, you only have to pass in the name of the function ("HelloWorldJob") and a list of params ({ foo: "bar" }, etc.).

Is this possible with jobs? (or perhaps in future #14 implementation?)

Rough example (more pseudo-code than Golang lol):

job, err := Schedule("HelloWorldJob", 100, time.Now(), `{EmailAddress: "foo@example.com"}`)
if err != nil {
    // Handle err
}

instead of

job, err := welcomeEmailJobs.Schedule(100, time.Now(), &User{EmailAddress: "foo@example.com"})
if err != nil {
    // Handle err
}

That way, in whatever program schedules the jobs (an HTTP server or anything else), I don't have to declare a variable like this for every job type:

var (
 sche *jobs.Type
)

Use an interface for jobs instead of a non-type-safe handler function (would break backwards-compatibility)

I would like to change the public API in a pretty major way by using an interface for jobs instead of a handler function (which is not typesafe). The current implementation feels a little bit messy and unidiomatic to me.

Jobs follows semantic versioning, which means breaking changes are fair game until version 1.0. However, I never put a warning in the README about this, so I wanted to make sure it was okay with everyone before making this breaking change. I would also be happy to hear feedback on the approach.

The basic idea is to create an interface that might look something like this:

type Job interface {
    Execute() error
    JobId() string
    SetJobId(string)
    JobStatus() Status
    SetJobStatus(Status)
}

Most of these have straightforward implementations which could be covered with an embeddable DefaultJob or JobData type, similar to the approach I use in zoom.

type DefaultJob struct {
    Id string
    Status Status
}

func (j DefaultJob) JobId() string {
    return j.Id
}

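// SetJobId needs a pointer receiver; with a value receiver the
// assignment would only modify a copy of the struct.
func (j *DefaultJob) SetJobId(id string) {
    j.Id = id
}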

// etc for other getters and setters

So job type declarations would now look like this:

type EmailJob struct {
    User *model.User
    jobs.DefaultJob
}

func (j EmailJob) Execute() error {
    msg := fmt.Sprintf("Hello, %s! Thanks for signing up for foo.com.", j.User.Name)
    if err := emails.Send(j.User.EmailAddress, msg); err != nil {
        return err
    }
    return nil
}

I can leverage zoom as a library to easily serialize all the exported fields in any job struct. So when a job gets retrieved from the database, all the struct fields will be filled in. Then the worker will just call the Execute method.

There are a few advantages to this approach:

Type-safety: If you don't embed a DefaultJob or provide your own implementations of the methods needed, or if you don't define an Execute method, the compiler will tell you, whereas previously these types of omissions would be runtime errors. Workers can also execute jobs by calling the Execute method instead of jumping through fiery hoops with reflection.
Flexibility: The Execute function can safely access any exported properties of the job type, so in effect this solves the multiple argument problem.
Idiomaticness: Using an empty interface as an argument to RegisterJob just feels wrong.

Let me know what you think. If I don't hear any objections I'll plan on converting to the new implementation sometime in the coming weeks.

Use go generate to convert lua script files to strings

Over at albrow/zoom, it was brought to my attention that reading scripts from a file at runtime can cause problems for people using certain dependency managers, or people who want to release a binary executable. See PRs albrow/zoom#9, albrow/zoom#10, and albrow/zoom#11.

The solution I came up with was to use a small script (compatible with go generate) to read the contents from the .lua file and write them to a generated file as strings. I'm going to port a similar solution over to albrow/jobs. It will be a little bit more complicated because jobs uses templates for string constants in the lua scripts, but the general idea is the same. This is targeted for version 0.3.0 and will hopefully be released in the next couple days.
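
A rough sketch of what such a generator could look like. The generateSource function and the destroyScript constant name are hypothetical, file I/O is left out, and the template handling that jobs needs for string constants in its scripts is not covered here.

```go
package main

import (
	"fmt"
	"strconv"
)

// generateSource is a hypothetical generator step: it returns the text of a
// Go file that declares script as a string constant named constName.
// strconv.Quote escapes quotes and newlines so the output is valid Go.
func generateSource(pkg, constName, script string) string {
	return fmt.Sprintf(
		"// Code generated from a .lua file. DO NOT EDIT.\n\npackage %s\n\nconst %s = %s\n",
		pkg, constName, strconv.Quote(script))
}

func main() {
	// A real generator would read the script with os.ReadFile and write the
	// result with os.WriteFile; both are omitted to keep the sketch short.
	fmt.Print(generateSource("jobs", "destroyScript", "return redis.call('DEL', KEYS[1])"))
}
```

Such a program would typically be wired up with a //go:generate comment in one of the package's source files, so that running go generate refreshes the generated file.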

Error Handling?

Is it possible to add a common error handler for all jobs that return an error so that I can say log my errors during development and send mails during production. Because right now in development I just return the error and don't know when and where the job failed.
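
Until the library offers something built in, one possible workaround is to wrap each handler before registering it. This is only a sketch: withErrorHook and sendEmail are hypothetical names, and it assumes handlers have the form func(T) error.

```go
package main

import (
	"errors"
	"fmt"
)

// withErrorHook is a hypothetical wrapper: it returns a handler that calls
// onError for every non-nil error and still returns that error to the caller.
func withErrorHook[T any](handler func(T) error, onError func(error)) func(T) error {
	return func(arg T) error {
		err := handler(arg)
		if err != nil {
			onError(err)
		}
		return err
	}
}

// sendEmail stands in for a real job handler.
func sendEmail(addr string) error {
	if addr == "" {
		return errors.New("empty email address")
	}
	return nil
}

func main() {
	// In development the hook might log; in production it might send mail.
	logging := withErrorHook(sendEmail, func(err error) {
		fmt.Println("job failed:", err)
	})
	_ = logging("")
}
```

Because the wrapper returns the original error unchanged, any retry behaviour that depends on the handler's return value is unaffected.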

FindById() doesn't return error for missing jobs

FindById() returns an empty job struct (with the id set to whatever you gave it) and a nil error if the id doesn't exist.

Example

j, err := jobs.FindById("Hello world")
if err != nil {
    panic(err)
}
log.Println("Found job:", j, err)

@albrow It looks like transaction.exec() doesn't check for an empty set/list error.

Error when calling Destroy() from within a job handler function

I've been seeing errors like the following lately. @albrow I was wondering if you could help me diagnose them.

ERR Error running script (call to f_7be0eed842fab7944ce1393b76bf8f46826c6656): @user_script:20: user_script:20: attempt to concatenate local 'Status' (a boolean value)
I found the line where this happens here

I'm trying to destroy the job from within my handler function once it reaches a certain state. Is it not safe to do this?

When I look up the job id I just get finished and time fields using hgetall.

redis:6379[3]> hgetall jobs:bX8Vye9LGk80SDkbawj9qt3vvkpi
1) "finished"
2) "1435185638756948580"
3) "time"
4) "1435200038500419354"
redis:6379[3]>

Is it supposed to leave the job in redis after you destroy it? From the destroy_script.lua docs it doesn't sound like it.

Support redis sentinel

@albrow Have you thought about adding sentinel support? I see you use the redigo package, which is what I also use, but it hasn't been updated much lately (the last commit was 3 months ago, and the one before that is 6 months old).

I was thinking about switching to go-redis/redis which has built in sentinel support and I think better connection pooling support. As you can see it's a very active project and they are also about to release v3 which has an even nicer api.

I know redigo has a fork which seems to support sentinel but to me it really seems like a dead project.

Add a changelog

I think it'd be nice if there was a changelog. This can either be done by hand or it can be automated if you enforce a syntax for commits, i.e. changeType(component): msg. For an example of an automated changelog, take a look at the angular-material project. Their changelog is generated from the commit messages, and then I think they fix any problems with it before a release. That might be a bit much for this project, but it lets you know what can be done. Either way, I think a basic changelog would be really nice.

@albrow What do you think about this? I think it'd definitely make this easier to use in production.

Export job.freq

It would be useful if we could see the job frequency for recurring jobs.

Use a unique machine identifier for generating pool ids

Currently, a rebooted machine would get a new pool id. If it was in the middle of executing any jobs when it rebooted, those jobs would be stale. To detect that, the new pool attempts to ping the old pool and, when it doesn't get a response, re-queues the stale jobs. This is more work than necessary.

If the same machine always gets the same pool id, a machine could quickly clean up after itself when it is rebooted. On initialization, it could check for any jobs in the executing state with its own pool id. We wouldn't need to ping the old pool to determine that those jobs are stale.

Implement spread out retries

Currently, if a job fails it will be immediately queued for retry. This is appropriate in some but not all circumstances. For example, if a third-party API is down for a few hours, retrying the job immediately would cause it to be retried many times before permanently failing. It would be better to spread out the retries over time. E.g. the first retry is immediate, the next one is 15 minutes later, the next one is 1 hour later, etc.
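
The schedule described above could be sketched as a simple lookup table. The retryDelay function and the retrySchedule values beyond the first three (immediate, 15 minutes, 1 hour) are assumptions made for illustration, not part of jobs.

```go
package main

import (
	"fmt"
	"time"
)

// retrySchedule is an illustrative schedule. The first three entries follow
// the example in the text; the 4-hour entry is an assumption.
var retrySchedule = []time.Duration{
	0,
	15 * time.Minute,
	time.Hour,
	4 * time.Hour,
}

// retryDelay returns how long to wait before the given retry attempt
// (1-based). Attempts past the end of the schedule reuse the last delay.
func retryDelay(attempt int) time.Duration {
	if attempt < 1 {
		return 0
	}
	if attempt > len(retrySchedule) {
		return retrySchedule[len(retrySchedule)-1]
	}
	return retrySchedule[attempt-1]
}

func main() {
	for attempt := 1; attempt <= 5; attempt++ {
		fmt.Printf("retry %d after %v\n", attempt, retryDelay(attempt))
	}
}
```

A worker could add the returned delay to the current time and use the result as the job's new start time instead of re-queuing it immediately.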

super slow

I benchmarked a few job processing libraries that use Redis as a backend: https://github.com/gocraft/work#benchmarks

Most can do about 10,000 to 20,000 jobs/second. Using default options, the albrow/jobs library clocked in at 40 jobs/second. I was able to increase that by increasing the number of workers and the batch size, but I wasn't sure what you consider good values for those params.

Benchmark code: https://github.com/gocraft/work/blob/master/benches/bench_jobs/main.go

I'd love it if you could review my benchmark to see if I made a glaring mistake, and also what your thoughts are about good params for batch size.

Application can't start due to the error 'jobs: In scanJob: Could not find Type with name = %s'

FYI,
How can we fix this problem?

Some jobs had not yet executed when we restarted the application.
I think Redis still holds those jobs and the application tries to run them, but the job type is no longer in memory because of the restart.
The only fix we have found is to clear the data stored in Redis, but I don't think that is a good solution.

I hope you can understand what I mean; sorry for my poor English.

Best regards.

FindById and job identifiers/human names

So jobs.FindById attempts to retrieve job by its unique ID, which is generated randomly - so it's not really possible to fetch a job by "some-identifier". Is there any way to list jobs by their readable name?

Add Redis Password as a Configuration Option

Currently, you cannot connect to a Redis database that is protected by a password. However, it's not hard at all to add this feature. The basic idea is to add a config variable Config.Db.Password. If set, then all connections will issue the AUTH command when they are initialized to authenticate with the database.

Intercept UNIX signals

Currently, if you cause a worker pool process to quit by sending a UNIX signal (e.g. by pressing ctrl-c), it will quit immediately without waiting for jobs to finish executing. This is good behavior for testing purposes (because it lets me simulate hard failures), but ideally the process should intercept certain types of signals and wait before exiting.
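
A minimal sketch of signal interception using the standard os/signal package. The helper names are hypothetical, and the drain callback stands in for whatever the pool needs to do before exiting (for example closing the pool and waiting for running jobs).

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// shutdownSignals lists the signals this sketch treats as a request to stop.
var shutdownSignals = []os.Signal{os.Interrupt, syscall.SIGTERM}

// newSignalChannel returns a buffered channel registered for shutdownSignals.
func newSignalChannel() chan os.Signal {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, shutdownSignals...)
	return sigs
}

// waitForShutdown blocks until a signal arrives, runs drain so that
// in-flight work can finish, and returns the signal that was received.
func waitForShutdown(sigs <-chan os.Signal, drain func()) os.Signal {
	sig := <-sigs
	drain()
	return sig
}

func main() {
	sigs := newSignalChannel()
	defer signal.Stop(sigs)

	// Simulated here so the example terminates on its own; a real process
	// would simply block until ctrl-c or SIGTERM is delivered.
	sigs <- os.Interrupt

	sig := waitForShutdown(sigs, func() {
		fmt.Println("closing pool and waiting for running jobs")
	})
	fmt.Println("exiting after", sig)
}
```

Signals such as SIGKILL cannot be intercepted, so hard failures can still be simulated for testing.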

Job status is changed to executing but job has not started execution

Hi,
I scheduled a job for 9/15/2017, 1:58:36 PM IST (passed as an epoch timestamp). The job entered the executing status at that time and stayed in that state for the next 5 minutes without starting. It actually started at 9/15/2017, 2:03:08 PM IST.
There were no other jobs to execute, as this was the very first job.
Please have a look at this, as this much delay will lead to problems in our application.

[Screenshot: job details while the job is stuck in the executing state]

[Screenshot: the job when it started]

Allow job handler functions to return an error

Errors could then be picked up by the worker and logged, and if appropriate the job will be queued for retry. Currently the only way to get this behavior is to panic when there are errors, which seems unidiomatic.

Panic in xen container

I have a xen container, and this is the output of ip addr show:

ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: venet0: <BROADCAST,POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/void 
    inet 127.0.0.2/32 scope host venet0
    inet V.W.X.Y/32 brd V.W.X.Y scope global venet0:0

This is a default container created using proxmox, and I get a panic on the getHardwareAddr.

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x20 pc=0x4737a4]

Depending on something like this is not a good idea. Why not let the hardware id be set through a method? The developer would then take care of uniqueness in the pool.
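
One way to avoid the panic could look like the sketch below: accept an explicit id, fall back to the first interface that actually has a hardware address, and only then generate a random value. The poolMachineID function is hypothetical and is not how getHardwareAddr is implemented.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net"
)

// poolMachineID is a hypothetical replacement for a hardware-address lookup.
// A non-empty override always wins, so the developer can supply the id.
func poolMachineID(override string) (string, error) {
	if override != "" {
		return override, nil
	}
	ifaces, err := net.Interfaces()
	if err == nil {
		for _, iface := range ifaces {
			// Loopback and venet-style interfaces may have no hardware
			// address at all, so empty ones are skipped rather than used.
			if len(iface.HardwareAddr) > 0 {
				return iface.HardwareAddr.String(), nil
			}
		}
	}
	// Last resort: a random id. It is not stable across restarts.
	buf := make([]byte, 6)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("generating fallback id: %w", err)
	}
	return "random-" + hex.EncodeToString(buf), nil
}

func main() {
	id, err := poolMachineID("")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("machine id:", id)
}
```

In the container shown above, neither lo nor venet0 reports a hardware address, so this sketch would end up on the random fallback unless an override is given.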

Make database/storage pluggable

It would be nice to be able to plug in any database that implements an interface instead of having to use redis. For example, boltdb could be an interesting backend that could also be embeddable.

Endless loop?

What would be the best way / practical way to have an endless loop of the pool? From the current looks of it / examples, the pool will automatically close when there are no new jobs.

I know that you can simply add:

for {
}

to the end, since the pools run as goroutines.

But shouldn't this behavior be by default? Or maybe I'm just too used to using Resque haha.

Find a better way to purge stale pools and re-queue stale jobs

Currently, there is a process during initialization in which a pool pings all the other pools to determine if any of them have gone down. If they have, any jobs that belong to that pool and were still executing are considered stale and are re-queued. This prevents jobs from staying stale, provided that whenever a worker pool machine goes down it is either rebooted or replaced by another machine.

It would be better if this process occurred periodically instead of just on initialization. The frequency of the pings should be configurable.
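
A periodic check with a configurable interval could be sketched with time.Ticker. The runPeriodically function is hypothetical, and the task passed to it stands in for the existing ping-and-re-queue logic.

```go
package main

import (
	"fmt"
	"time"
)

// runPeriodically calls task once per interval until stop is closed.
// interval must be positive, because time.NewTicker panics otherwise.
func runPeriodically(stop <-chan struct{}, interval time.Duration, task func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			task()
		}
	}
}

func main() {
	stop := make(chan struct{})
	done := make(chan struct{})
	runs := 0
	go func() {
		defer close(done)
		// A short interval keeps the demo quick; a real pool would use
		// a configurable value, likely seconds or minutes.
		runPeriodically(stop, 10*time.Millisecond, func() {
			runs++
			fmt.Println("checking for stale pools, run", runs)
			if runs == 3 {
				close(stop)
			}
		})
	}()
	<-done
}
```

Closing the stop channel when the pool shuts down ends the loop and releases the ticker.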

Reschedule with freq

Currently Reschedule() only allows you to set the start time. It doesn't let you change the frequency, which means I need to destroy the original job and create a new one. This is more error-prone and slower because I need to update the referenced job IDs in my db to do this. It might also be a good idea to let us change a job's retry count.
