scripbox / flume

A blazing fast job processing system backed by GenStage & Redis.

Elixir 99.44% Lua 0.56%

elixir-lang genstage background-jobs concurrent redis batch-processing scheduled-jobs rate-limiting

flume's People

Contributors

vasuadari ms-choudhary kousikmitra ananthakumaran sanjeevmalagi1 m1ome ryanwinchester

flume's Issues

How to cancel a scheduled job

Thank you for this amazing library.
Is there a way to cancel a scheduled job?
Thanks

Add telemetry for producer, producer_consumer & consumer

Producer

When the producer starts (flume.producer.init = system_time)
No. of events fetched (flume.producer.events.fetch = 10)
Time taken to fetch events (flume.producer.events.fetch.stop)
Metadata for all events:
- :pipeline_name - name of the pipeline for which the producer is running
- :queue_name - name of the queue in Redis
- :rate_limit_count - no. of events that can be processed by the consumer
- :rate_limit_scale - duration over which rate_limit_count events can be processed
- :rate_limit_key - unique key to which the rate limit is applied
- :max_demand - maximum number of jobs to pull from the queue

ProducerConsumer

When the producer consumer starts (flume.producer_consumer.init = system_time)
No. of events received (flume.producer_consumer.events.received = 10)

Consumer

When the consumer starts (flume.consumer.init = system_time)
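
As a rough illustration of the producer events listed above, here is a minimal sketch. The module TelemetrySketch and its functions are hypothetical and not part of Flume; the event name and metadata keys are taken from this issue. The sketch only calls :telemetry.execute/3 when the :telemetry library is available, so it also runs without that dependency.

```elixir
defmodule TelemetrySketch do
  # Hypothetical helper, not part of Flume.
  # Event name mirrors the dotted name proposed in the issue.
  @fetch_stop [:flume, :producer, :events, :fetch, :stop]

  @metadata_keys [
    :pipeline_name,
    :queue_name,
    :rate_limit_count,
    :rate_limit_scale,
    :rate_limit_key,
    :max_demand
  ]

  # Picks the metadata fields listed in the issue out of a producer state map.
  def producer_metadata(state), do: Map.take(state, @metadata_keys)

  # Runs `fetch_fun`, then reports the event count and the elapsed time
  # (in native time units) together with the producer metadata.
  def instrumented_fetch(state, fetch_fun) do
    start = System.monotonic_time()
    events = fetch_fun.()

    measurements = %{
      count: length(events),
      duration: System.monotonic_time() - start
    }

    emit(@fetch_stop, measurements, producer_metadata(state))
    events
  end

  # Emits only when the :telemetry library is present; otherwise a no-op.
  defp emit(event, measurements, metadata) do
    if Code.ensure_loaded?(:telemetry) do
      apply(:telemetry, :execute, [event, measurements, metadata])
    end

    :ok
  end
end
```

A producer could wrap its Redis fetch in instrumented_fetch/2, leaving the choice of handlers and reporters to the host application.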

cc @nitinstp23

Add job counts API to Flume

This API can be used by apps for instrumentation purpose.

Use logger option from otp_app, remove custom logger option

Currently Flume defaults to Flume.DefaultLogger, and we make clients define their own logger module, Flume.Logger. This was added for reasons that no longer apply. We don't need it; we just need to remove these two modules and use the Logger library.

Return error for an invalid job

An invalid job is accepted. It fails and is then retried the configured number of times.
A job is invalid when one or more of the following is set incorrectly:

queue name
worker module
worker function

Fix

A job in the aforementioned case is bound to fail and eventually move to the dead queue.
Therefore, we can return an error before enqueuing in such cases by checking the following:

Check the queue name against the configured pipelines
Check existence of the worker module
Check existence of worker function(name + arity)
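
The three checks above could be sketched roughly as follows. JobValidationSketch and its error atoms are hypothetical names used for illustration, and the `queues` argument stands in for the queue names of the configured pipelines; the real lookup of configured pipelines is not shown.

```elixir
defmodule JobValidationSketch do
  # Hypothetical validator, not part of Flume.
  # `queues` stands in for the queue names of the configured pipelines.
  def validate(queues, queue, worker, function_name, args) when is_list(args) do
    known_queues = Enum.map(queues, &to_string/1)

    cond do
      # 1. The queue must belong to a configured pipeline.
      to_string(queue) not in known_queues ->
        {:error, :unknown_queue}

      # 2. The worker module must exist and be loadable.
      not Code.ensure_loaded?(worker) ->
        {:error, :unknown_worker_module}

      # 3. The worker function must exist with an arity matching the args.
      not function_exported?(worker, function_name, length(args)) ->
        {:error, :unknown_worker_function}

      true ->
        :ok
    end
  end
end
```

An enqueue function could call validate/5 first and return the error tuple to the caller instead of writing the job to Redis.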

Fix async pipeline control function defs

The asynchronous calls for pause/resume fail with

iex(1)> Flume.pause(:default_pipeline, async: true)
** (FunctionClauseError) no function clause matching in Flume.Pipeline.Event.Producer.pause/3

    The following arguments were given to Flume.Pipeline.Event.Producer.pause/3:

        # 1
        "default_pipeline"

        # 2
        true

        # 3
        5000

    Attempted function clauses (showing 1 out of 1):

        def pause(pipeline_name, false = _async, timeout)

    (flume) lib/flume/pipeline/event/producer.ex:29: Flume.Pipeline.Event.Producer.pause/3

The function definitions at

flume/lib/flume/pipeline/event/producer.ex

Line 25 in 1672003

def pause(pipeline_name, true = _async) do

and

flume/lib/flume/pipeline/event/producer.ex

Line 33 in 1672003

def resume(pipeline_name, true = _async) do

need to be fixed.
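
The error above shows the call arriving as pause/3 with `true` as the second argument, while the only `true` clause is defined at arity 2. One possible shape of the fix is to give the async clauses the same arity as the sync ones. The sketch below uses a plain GenServer as a stand-in for the producer; ProducerControlSketch is a hypothetical module, and using a cast for the async path is an assumption, not necessarily how Flume implements it.

```elixir
defmodule ProducerControlSketch do
  use GenServer

  # Stand-in for the producer process: its state is just a paused flag.
  def start_link(name), do: GenServer.start_link(__MODULE__, false, name: name)

  # Async and sync clauses share arity 3, so a call with async = true matches.
  # The timeout is ignored on the async path because a cast does not wait.
  def pause(name, true = _async, _timeout), do: GenServer.cast(name, :pause)
  def pause(name, false = _async, timeout), do: GenServer.call(name, :pause, timeout)

  def resume(name, true = _async, _timeout), do: GenServer.cast(name, :resume)
  def resume(name, false = _async, timeout), do: GenServer.call(name, :resume, timeout)

  def paused?(name), do: GenServer.call(name, :paused?)

  @impl true
  def init(paused), do: {:ok, paused}

  @impl true
  def handle_call(:pause, _from, _paused), do: {:reply, :ok, true}
  def handle_call(:resume, _from, _paused), do: {:reply, :ok, false}
  def handle_call(:paused?, _from, paused), do: {:reply, paused, paused}

  @impl true
  def handle_cast(:pause, _paused), do: {:noreply, true}
  def handle_cast(:resume, _paused), do: {:noreply, false}
end
```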

Support Redis Sentinels

The Flume configuration works with a single Redis server.
However, it would not work in a setup with Redis Sentinels.

Possible Solution

Extend the configuration to accept Sentinel options.
The underlying Redis library supports and manages the switching to a master server after a fail-over.
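
For illustration only, and assuming the underlying Redis library is Redix, an extended configuration might carry a nested :sentinel option like the one below. How Flume would name or accept these options is hypothetical; only the nested :sentinels and :group keys follow Redix's own start options. The host names and group name are placeholders.

```elixir
# Hypothetical option list that Flume could accept and hand on to Redix.
# Only the nested :sentinel keys (:sentinels, :group) follow Redix's options.
redis_opts = [
  sentinel: [
    # Placeholder Sentinel addresses.
    sentinels: [
      "redis://sentinel-1.example.com:26379",
      "redis://sentinel-2.example.com:26379"
    ],
    # Placeholder name of the monitored primary group.
    group: "mymaster"
  ],
  database: 0
]

# With Redix available, these options could be passed on as:
# Redix.start_link(redis_opts)
```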

Support asynchronous pipeline pause/resume

Currently, Flume.pause and Flume.resume issue synchronous calls to the Producer GenStage to pause/resume a pipeline.
The package should also support asynchronous calls, to allow for cases where GenStage calls time out and the operation fails.

test slack integration

Inconsistent pipeline state on permanent pipeline pause/resume

On a Flume.pause("pipeline-name", _temporary = false) call, the key in Redis store is set even in the case when Producer GenStage pause call fails.

On a Flume.resume("pipeline-name", _temporary = false) call, the key in Redis store is deleted even in the case when Producer GenStage resume call fails.

This leads to an inconsistent pipeline state.

We should set/delete the Redis key on success of Producer GenStage pause/resume.
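
The ordering proposed above can be sketched with a `with` chain. PauseConsistencySketch is a hypothetical module, and the two injected functions stand in for the producer call and the Redis write; neither is a real Flume function.

```elixir
defmodule PauseConsistencySketch do
  # Hypothetical sketch, not part of Flume.
  # `pause_producer` stands in for the Producer GenStage pause call and
  # `persist` for setting the key in Redis; both are zero-arity functions
  # expected to return :ok or {:error, reason}.
  def pause_permanently(pause_producer, persist) do
    # The Redis key is only written once the producer has confirmed the pause.
    # Any {:error, reason} from the first step is returned as-is and the
    # second step is skipped.
    with :ok <- pause_producer.(),
         :ok <- persist.() do
      :ok
    end
  end
end
```

Note that a GenServer or GenStage call that times out exits the caller rather than returning an error tuple, so real code would also need to handle that case before this chain is reliable. The resume path would follow the same pattern with a key deletion as the second step.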

Add a test helper to mock flume APIs

One advantage is that we can avoid using Redis in the test environment.

Usage will be similar to this:

  import Flume.Mock

  describe "enqueue/4" do
    test "mock works" do
      with_flume_mock do
        Flume.enqueue(:test, List, :last, [[1]])

        assert_receive %{
          queue: :test,
          worker: List,
          function_name: :last,
          args: [[1]]
        }
      end
    end
  end

Serialize-deserialize job args in mocked APIs

When a Flume enqueue is mocked in the test env, we do not serialize the args. This is inconsistent with how the worker module receives the args when the job is dequeued from the queue.

Example

Operation

Flume.enqueue(:test, WorkerModule, :do_it, [%{a: 1, b: 2}])

Test env with mocks

with_flume_mock do
  Flume.enqueue(:test, WorkerModule, :do_it, [%{a: 1, b: 2}])

  assert_receive %{
    queue: :test,
    worker: WorkerModule,
    function_name: :do_it,
    args: [%{a: 1, b: 2}]
  }
end

However, in an environment without mocks, the job is serialized (Jason.encode!/1) before we enqueue it, so atom keys in the arguments come back as string keys when the job is dequeued.
A worker may therefore receive arguments in a different shape than the mocked test suggests.
In effect, the worker module sees the equivalent of the following operation:

Flume.enqueue(:test, WorkerModule, :do_it, [%{"a" => 1, "b" => 2}])

So, the assertion should actually be made against a serialized-deserialized version of the arguments:

with_flume_mock do
  Flume.enqueue(:test, WorkerModule, :do_it, [%{a: 1, b: 2}])

  assert_receive %{
    queue: :test,
    worker: WorkerModule,
    function_name: :do_it,
    args: [%{"a" => 1, "b" => 2}]
  }
end

Solution

Update the Mocked API to serialize and deserialize the arguments.

args = Jason.encode!(args) |> Jason.decode!()

Incorrect documentation for scheduling

Readme -> Scheduling

# 10 seconds
schedule_time = 10_000

Flume.enqueue_in(:queue_name, schedule_time, MyApp.FancyWorker, [arg_1, arg_2])

Here schedule_time is shown in milliseconds. It should actually be a Unix time in seconds.

Also, Flume.enqueue_in/4, Flume.enqueue_in/5 and Flume.enqueue_in/6 accept an argument named time_in_seconds, which is confusing. A better name for this argument would be unix_time_in_seconds.
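
If the argument is indeed an absolute Unix timestamp in seconds, as this issue reports, a corrected Readme example might compute it along these lines. This is a sketch based only on the issue's description, not on verified library behavior; the enqueue call is left commented out because its worker and arguments are placeholders from the Readme.

```elixir
# Assumption from this issue: the time argument is an absolute Unix
# timestamp in seconds, not a relative delay in milliseconds.
delay_in_seconds = 10

# Current Unix time in seconds plus the desired delay.
unix_time_in_seconds = DateTime.to_unix(DateTime.utc_now()) + delay_in_seconds

# Placeholder call copied from the Readme, with the renamed time value:
# Flume.enqueue_in(:queue_name, unix_time_in_seconds, MyApp.FancyWorker, [arg_1, arg_2])
```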

Move config options to Flume.Supervisor

The current approach requires us to define config options in config.exs (or env-specific files).

Moving config options to Flume.Supervisor will have these benefits:

- Run multiple instances of Flume in an app with different settings
- Flexibility in Flume tests: each test can run Flume with different settings, for example one test with a single pipeline and another with multiple pipelines
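
To make the idea concrete, here is a minimal sketch of a supervisor that takes its settings as start options instead of reading application config. InstanceSketch is a hypothetical stand-in, not Flume.Supervisor, and the :name and :pipelines keys are illustrative. The sketch only stores the options in an Agent where a real implementation would start its pipelines.

```elixir
defmodule InstanceSketch do
  use Supervisor

  # Hypothetical stand-in for a supervisor configured through start options.
  # `:name` identifies the instance; `:pipelines` is an illustrative setting.
  def start_link(opts) do
    Supervisor.start_link(__MODULE__, opts, name: Keyword.fetch!(opts, :name))
  end

  # A per-instance child id lets several instances share one parent supervisor.
  def child_spec(opts) do
    %{
      id: Keyword.fetch!(opts, :name),
      start: {__MODULE__, :start_link, [opts]},
      type: :supervisor
    }
  end

  @impl true
  def init(opts) do
    # A real implementation would start producers and consumers for each
    # pipeline here; the sketch only keeps the options in an Agent.
    children = [
      %{id: :config, start: {Agent, :start_link, [fn -> opts end]}}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

An application or a test could then list `{InstanceSketch, name: :flume_a, pipelines: [...]}` and `{InstanceSketch, name: :flume_b, pipelines: [...]}` as children of the same supervisor, each with its own settings.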