jclawson / hazeltask

Advanced distributed task distribution library for Hazelcast. Customizable task load balancing with failover. For example: fair task execution for multi-tenant systems to prevent starvation.

License: GNU Lesser General Public License v3.0

Java 100.00%

hazeltask's People

Contributors

jclawson, sirusdv

hazeltask's Issues

Incorrect handling of task submission when no workers available.

DistributedExecutorServiceImpl::submit() mishandles the case where no workers are available to receive work. In that case submitHazeltaskTask() returns false, the same value it returns for a duplicate task.

submit() therefore treats the task as a duplicate. The task remains in the HazelcastExecutorTopologyService's pendingTask list with no way to track it, and the TaskRecoveryTimerTask resubmits it with no way to cancel the resubmission.

This state should either raise an exception from submit() or, at the very least, be configurable.
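
A minimal sketch of the exception-based option, using simplified stand-ins rather than the real Hazeltask classes. NoWorkersAvailableException, SubmitResult and SubmitSketch are hypothetical names introduced only for illustration. The point is that "no workers" and "duplicate" become distinguishable outcomes, and that nothing is left in the pending set when no worker can take the task.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical exception type; not part of the Hazeltask API.
class NoWorkersAvailableException extends RuntimeException {
    NoWorkersAvailableException(String message) { super(message); }
}

// Hypothetical result type replacing the ambiguous boolean return value.
enum SubmitResult { ACCEPTED, DUPLICATE }

// Simplified stand-in for the submission path described in this issue.
class SubmitSketch {
    private final Set<String> pendingTaskIds = new HashSet<>();
    private final List<String> workers;

    SubmitSketch(List<String> workers) { this.workers = workers; }

    SubmitResult submit(String taskId) {
        // Check for workers first so the task never enters the pending set
        // when there is nobody to run it.
        if (workers.isEmpty()) {
            throw new NoWorkersAvailableException("no workers available for task " + taskId);
        }
        // Set.add returns false when the id is already present.
        if (!pendingTaskIds.add(taskId)) {
            return SubmitResult.DUPLICATE;
        }
        return SubmitResult.ACCEPTED;
    }

    boolean isPending(String taskId) { return pendingTaskIds.contains(taskId); }
}
```

A configurable variant could swap the throw for a caller-supplied policy; that choice is left open here.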

Support Async task submission

It may be desirable not to wait for a worker node to receive a task. If the worker is particularly busy or in the middle of a GC cycle, waiting can cause pauses in the submitting node. Add an optional buffer (off by default) that places submissions in a bounded queue drained by a single-thread executor. This gives us some leeway when waiting for strained nodes. Define a maxWaitTime on submission operations before trying a different worker node.
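
One way this buffer could look, sketched with plain java.util.concurrent types. WorkerSketch and AsyncSubmitBuffer are hypothetical names, and the offer(taskId, maxWaitMillis) call is an assumed stand-in for the real remote submission. What happens when every worker refuses, or when the buffer is full, is a policy decision this sketch does not settle (here a full buffer rejects the submission).

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical worker abstraction: returns true if the worker accepted
// the task within maxWaitMillis.
interface WorkerSketch {
    boolean offer(String taskId, long maxWaitMillis);
}

class AsyncSubmitBuffer {
    private final ExecutorService sender;
    private final List<WorkerSketch> workers;
    private final long maxWaitMillis;

    AsyncSubmitBuffer(List<WorkerSketch> workers, int capacity, long maxWaitMillis) {
        this.workers = workers;
        this.maxWaitMillis = maxWaitMillis;
        // A single sender thread drains a bounded queue. AbortPolicy makes
        // execute() throw RejectedExecutionException when the queue is full.
        this.sender = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(capacity), new ThreadPoolExecutor.AbortPolicy());
    }

    // Returns immediately; the actual hand-off happens on the sender thread.
    void submitAsync(String taskId) {
        sender.execute(() -> {
            // Try each worker in turn, moving on when one does not accept in time.
            for (WorkerSketch worker : workers) {
                if (worker.offer(taskId, maxWaitMillis)) {
                    return;
                }
            }
            // All workers refused: handling is intentionally left open in this sketch.
        });
    }

    // Stops accepting new submissions and waits for buffered ones to be sent.
    boolean shutdownAndWait(long timeoutMillis) {
        sender.shutdown();
        try {
            return sender.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```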

TaskBatchingService API Cleanup

  • Only expose one generic type to the developer using TaskBatchingService: the item type.
  • Create an interface to represent the return type from HazeltaskInstance with only 4 methods: startup(), shutdown(), add, createBatch.
  • Add a default simple batcher that requires a Runnable (SimpleBatchTask) class with a no-arg constructor and a setItems method, for zero configuration.

ListenableFuture support buggy due to GC of future references

The DistributedFutureTracker keeps references in a Guava Cache. It used the "weakValues()" option to allow futures to be GC'd if they weren't used. (This was a "feature".) Later, ListenableFuture support was added, which doesn't require the caller to hold a reference to the future. The future was therefore eligible for garbage collection wherever ListenableFutures were used. When GC ran depended heavily on the app, so this was a difficult bug to track down.

The fix is to not use the weakValues() option.

Credit goes to Mike Hill for finding this issue.

Fix logging issues

Logging doesn't work as expected with Hazelcast's built-in logging. I am guessing that this is because of the way Hazelcast binds its logging configuration. Just use SLF4J for now, as it's required for Yammer Metrics anyway.

Support task retry

If a task throws a specific exception, allow it to be retried up to a defined number of times.

This can rely on issue #24 for its implementation. The downside is that the submitting server must be online for a task to be retried. This is probably fine for the initial implementation.
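
A sketch of the retry policy alone, assuming a marker exception signals "retry me". RetryableTaskException and RetrySketch are hypothetical names, and the loop runs locally for illustration; in the design described above, the resubmission would presumably be driven by the submitting server rather than an in-process loop.

```java
import java.util.function.Supplier;

// Hypothetical marker exception a task throws to request a retry.
class RetryableTaskException extends RuntimeException {
    RetryableTaskException(String message) { super(message); }
}

class RetrySketch {
    // Runs the task once, then retries up to maxRetries more times while it
    // keeps throwing RetryableTaskException. Other exceptions propagate at once.
    static <T> T callWithRetry(Supplier<T> task, int maxRetries) {
        int retriesUsed = 0;
        while (true) {
            try {
                return task.get();
            } catch (RetryableTaskException e) {
                if (retriesUsed >= maxRetries) {
                    throw e; // retry budget exhausted, surface the last failure
                }
                retriesUsed++;
            }
        }
    }
}
```

With maxRetries set to 2, a task is attempted at most three times in total.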

Prevent deadlock on Future.get when cluster data is lost due to multiple nodes going down

It is currently possible to deadlock on Future.get if it is waiting on task completion and multiple nodes go down such that data from the Hazelcast Map backed write ahead log is lost.

  1. Use a cluster listener to detect situations where data loss may have occurred. If that is not easily possible, just react whenever a member leaves.

  2. Each member knows which futures are waiting on which tasks. Simply check whether the task exists in the write ahead log. If it doesn't exist, the task was lost and the Future should be errored.
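
The check in step 2 can be sketched as follows. LostTaskDetector is a hypothetical class, CompletableFuture stands in for whatever future type the tracker holds, and a plain Set of task ids stands in for the key set of the write ahead log. The method is meant to be called from the cluster listener in step 1.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class LostTaskDetector {
    // Futures this member is waiting on, keyed by task id.
    private final Map<String, CompletableFuture<Object>> waiting = new ConcurrentHashMap<>();

    CompletableFuture<Object> track(String taskId) {
        return waiting.computeIfAbsent(taskId, id -> new CompletableFuture<>());
    }

    // writeAheadLogIds is an assumed snapshot of the task ids still present
    // in the write ahead log. Returns how many futures were failed.
    int failLostTasks(Set<String> writeAheadLogIds) {
        int failed = 0;
        for (Map.Entry<String, CompletableFuture<Object>> entry : waiting.entrySet()) {
            if (!writeAheadLogIds.contains(entry.getKey())) {
                // The task is gone from the log, so no result can ever arrive:
                // fail the future so that get() stops blocking.
                entry.getValue().completeExceptionally(
                        new IllegalStateException("task lost: " + entry.getKey()));
                waiting.remove(entry.getKey());
                failed++;
            }
        }
        return failed;
    }
}
```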

Write custom task distribution service provider

Hazelcast allows you to write custom low-level distributed services. Let's take advantage of that to allow distributing a task in the same call as writing to the write ahead log. Currently it makes 3 remote calls: 1) write to the write ahead log, 2) write to the write ahead log backup, 3) send the task to the worker server.

Add support for future listeners

Register a TaskResponseListener to listen to responses from across the cluster via DistributedFutureTracker.onMessage. This could also be where issue #9, "Support task retry", is implemented.

Support Hazelcast lite and client mode

If the client is a "lite member", it will not have any data partitions and can safely turn off some cluster tasks, such as task recovery. "Lite members" should be allowed.

Support clearing a queue by group

This won't be super efficient, but:

  1. send callable to each local executor server to clean the queue
  2. each local executor service will remove the grouped queue
  3. it will iterate over the removed queue, removing each task from the write ahead log one by one
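
Steps 2 and 3 above can be sketched on a single member as follows. GroupedQueueSketch is a hypothetical class, and a plain Map stands in for the write ahead log. Step 1, sending the callable to every member, is omitted here.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class GroupedQueueSketch {
    // One local queue of task ids per group.
    private final Map<String, Queue<String>> queuesByGroup = new HashMap<>();
    // Stand-in for the write ahead log: task id -> group.
    private final Map<String, String> writeAheadLog;

    GroupedQueueSketch(Map<String, String> writeAheadLog) {
        this.writeAheadLog = writeAheadLog;
    }

    void enqueue(String group, String taskId) {
        queuesByGroup.computeIfAbsent(group, g -> new ArrayDeque<>()).add(taskId);
        writeAheadLog.put(taskId, group);
    }

    // Removes the whole queue for the group, then deletes each of its tasks
    // from the write ahead log one by one. Returns the number of tasks cleared.
    int clearGroup(String group) {
        Queue<String> removed = queuesByGroup.remove(group);
        if (removed == null) {
            return 0;
        }
        int cleared = 0;
        for (String taskId : removed) {
            writeAheadLog.remove(taskId);
            cleared++;
        }
        return cleared;
    }
}
```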
