crontabs's Introduction

Crontabs


NOTE: I've recently discovered the Rocketry and Huey projects, which you should probably use instead of crontabs. They are just better than crontabs.


Think of crontabs as a quick-and-dirty solution you can throw into one-off python scripts to execute tasks on a cron-like schedule.

Crontabs is a small pure-python library that was inspired by the excellent schedule library for python.

In addition to having a slightly different API, crontabs differs from the schedule module in the following ways.

  • You do not need to provide your own event loop.
  • Job timing is guaranteed not to drift over time. For example, if you specify that a job should run every five minutes, you can rest assured that it will always run at 5, 10, 15, etc. past the hour with no drift.
  • The Python functions all run in child processes. A memory-friendly flag is available to run each iteration of your task in its own process, thereby mitigating memory problems caused by Python's memory high-water-mark behavior.
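
To illustrate the no-drift claim, here is a minimal sketch of how a run time can be pinned to fixed marks past the hour instead of being computed as "last run plus interval". This is not crontabs' actual implementation; the helper name next_aligned_run is hypothetical and only the standard library is used.

```python
from datetime import datetime, timedelta


def next_aligned_run(now, interval_seconds):
    """Return the first time strictly after `now` that lands on a whole
    multiple of the interval, counted from the top of the hour."""
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    elapsed = (now - top_of_hour).total_seconds()
    # Whole intervals already completed in this hour, plus one.
    ticks = int(elapsed // interval_seconds) + 1
    return top_of_hour + timedelta(seconds=ticks * interval_seconds)


# A job started at an awkward moment still lands on the 5-minute marks.
late_start = datetime(2017, 12, 27, 16, 47, 13, 250000)
first = next_aligned_run(late_start, 300)
second = next_aligned_run(first, 300)
print(first)   # 2017-12-27 16:50:00
print(second)  # 2017-12-27 16:55:00
```

Because every run time is derived from the clock rather than from the previous run, a slow iteration cannot push later iterations off the schedule.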

Why Crontabs

Python has no shortage of cron-like job scheduling libraries, so why create yet another? The honest answer is that I couldn't find one that met a simple list of criteria.

  • Simple installation with no configuration. An extremely robust and scalable solution to this problem already exists: Celery. But for quick and dirty work, I didn't want the hassle of setting up and configuring a broker, which Celery requires to do its magic. For simple jobs, I just wanted to pip install and go.
  • Human readable interface. I loved the interface provided by the schedule library and wanted something similarly intuitive to use.
  • Memory safe for long running jobs. Celery workers can suffer from severe memory bloat due to the way Python manages memory. As of 2017, the recommended solution for this was to periodically restart the workers. Crontabs runs each job in a subprocess. It can optionally also run each iteration of a task in its own process, thereby mitigating the memory bloat issue.
  • Simple solution for cron-style workflow and nothing more. I was only interested in supporting cron-like functionality, and wasn't interested in all the other capabilities and guarantees offered by a real task-queue solution like celery.
  • Suggestions for improvement welcome. If you encounter a bug or have an improvement that remains within the scope listed above, please feel free to open an issue (or even better... a PR).

Installation

pip install crontabs

Usage

Schedule a single job

from crontabs import Cron, Tab
from datetime import datetime


def my_job(*args, **kwargs):
    print('args={} kwargs={} running at {}'.format(args, kwargs, datetime.now()))


# Will run with a 5 second interval synced to the top of the minute
Cron().schedule(
    Tab(name='run_my_job').every(seconds=5).run(my_job, 'my_arg', my_kwarg='hello')
).go()

Schedule multiple jobs

from crontabs import Cron, Tab
from datetime import datetime


def my_job(*args, **kwargs):
    print('args={} kwargs={} running at {}'.format(args, kwargs, datetime.now()))


# All logging messages are sent to stdout
Cron().schedule(
    # Turn off logging for job that runs every five seconds
    Tab(name='my_fast_job', verbose=False).every(seconds=5).run(my_job, 'fast', seconds=5),

    # Go ahead and let this job emit logging messages
    Tab(name='my_slow_job').every(seconds=20).run(my_job, 'slow', seconds=20),
).go()

Schedule future job to run repeatedly for a fixed amount of time

from crontabs import Cron, Tab
from datetime import datetime


def my_job(*args, **kwargs):
    print('args={} kwargs={} running at {}'.format(args, kwargs, datetime.now()))


Cron().schedule(
    Tab(
        name='future_job'
    ).every(
        seconds=5
    ).starting(
        '12/27/2017 16:45'  # This argument can be either parsable text or a datetime object.
    ).run(
        my_job, 'fast', seconds=5
    )
# max_seconds starts from the moment go is called.  Pad for future run times accordingly.
).go(max_seconds=60)
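
Since max_seconds is counted from the moment go is called, a job with a future start time needs the waiting period added on top of the desired run time. The sketch below shows one way to compute that padding with the standard library. The helper padded_max_seconds is hypothetical and not part of crontabs, and whether go accepts a float here is an assumption.

```python
from datetime import datetime


def padded_max_seconds(start, run_for_seconds, now=None):
    """Seconds to allow so that a job beginning at `start` still gets
    `run_for_seconds` of run time after it begins."""
    now = now or datetime.now()
    # Time spent waiting for the start; zero if the start is already past.
    wait = max((start - now).total_seconds(), 0.0)
    return wait + run_for_seconds


called_at = datetime(2017, 12, 27, 16, 40)
start = datetime(2017, 12, 27, 16, 45)
print(padded_max_seconds(start, 60, now=called_at))  # 360.0
```

The result could then be passed along as go(max_seconds=...), rounded to an integer if needed.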

Cron API

The Cron class has a very small API.

Method          Description
.schedule()     [Required] Specify the different jobs you want using Tab instances.
.go()           [Required] Start the crontab manager to run all specified tasks.
.get_logger()   A class method you can use to get an instance of the crontab logger.

Tab API with examples

The API for the Tab class is designed to be composable and readable in plain English. It supports the following "verbs" as chainable methods.

Method          Description
.run()          [Required] Specify the function to run.
.every()        [Required] Specify the interval between function calls.
.starting()     [Optional] Specify an explicit time for the function calls to begin.
.lasting()      [Optional] Specify how long the task will continue being iterated.
.until()        [Optional] Specify an explicit time past which the iteration will stop.
.during()       [Optional] Specify time conditions under which the function will run.
.excluding()    [Optional] Specify time conditions under which the function will be inhibited.

Run a job indefinitely

from crontabs import Cron, Tab
from datetime import datetime


def my_job(name):
    print('Running function with name={}'.format(name))


Cron().schedule(
    Tab(name='forever').every(seconds=5).run(my_job, 'my_func'),
).go()

Run one job indefinitely, another for thirty seconds, and another until 1/1/2030

from crontabs import Cron, Tab
from datetime import datetime


def my_job(name):
    print('Running function with name={}'.format(name))


Cron().schedule(
    Tab(name='forever').run(my_job, 'forever_job').every(seconds=5),
    Tab(name='for_thirty').run(my_job, 'mortal_job').every(seconds=5).lasting(seconds=30),
    Tab(name='real_long').run(my_job, 'long_job').every(seconds=5).until('1/1/2030'),
).go()

Run job every half hour from 9AM to 5PM excluding weekends

from crontabs import Cron, Tab
from datetime import datetime

def my_job(name):
    # Grab an instance of the crontab logger and write to it.
    logger = Cron.get_logger()
    logger.info('Running function with name={}'.format(name))


def business_hours(timestamp):
    return 9 <= timestamp.hour < 17

def weekends(timestamp):
    return timestamp.weekday() > 4


# Run a job every 30 minutes during weekdays.  Stop crontabs after it has been running for a year.
# This will indiscriminately kill every Tab it owns at that time.
Cron().schedule(
    Tab(
        name='my_job'
    ).run(
        my_job, 'my_job'
    ).every(
        minutes=30
    ).during(
        business_hours
    ).excluding(
        weekends
    )
).go(max_seconds=3600 * 24 * 365)
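
The two predicates above are plain functions of a timestamp, so they can be checked on their own without starting a scheduler. The sketch below assumes the combination rule implied by the method names (run when the during condition holds and the excluding condition does not). The helper would_run is hypothetical and not part of crontabs.

```python
from datetime import datetime


def business_hours(timestamp):
    return 9 <= timestamp.hour < 17


def weekends(timestamp):
    return timestamp.weekday() > 4


def would_run(timestamp):
    # Assumed rule: the `during` predicate must be true and the
    # `excluding` predicate must be false.
    return business_hours(timestamp) and not weekends(timestamp)


print(would_run(datetime(2017, 12, 27, 9, 30)))  # Wednesday morning: True
print(would_run(datetime(2017, 12, 27, 17, 0)))  # Wednesday at 5PM: False
print(would_run(datetime(2017, 12, 30, 10, 0)))  # Saturday morning: False
```

Note that business_hours treats 5PM itself as outside the window, because the upper bound is exclusive.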

Run test suite with

git clone [email protected]:robdmc/crontabs.git
cd crontabs
pip install -e .[dev]
py.test -s -n 8   # Might need to change the -n amount to pass

Contributors

robdmc, vshih

crontabs's Issues

How does the seconds argument differ?

How does the effect of the seconds argument to every differ from that of the seconds argument to run in:
Tab(name='my_slow_job').every(seconds=20).run(my_job, 'slow', seconds=20),

as part of

Cron().schedule(
    # Turn off logging for job that runs every five seconds
    Tab(name='my_fast_job', verbose=False).every(seconds=5).run(my_job, 'fast', seconds=5),

    # Go ahead and let this job emit logging messages
    Tab(name='my_slow_job').every(seconds=20).run(my_job, 'slow', seconds=20),
).go()
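
A note on the question above: judging from the log output quoted in the "Performance on multi cronjobs" issue further down (kwargs={'seconds': 5}), the seconds passed to every sets the schedule interval, while the seconds passed to run appears to be an ordinary keyword argument that is forwarded to my_job. The sketch below does not use crontabs; it only shows what my_job would receive under that assumption.

```python
def my_job(*args, **kwargs):
    # Return the text instead of printing so the result is easy to inspect.
    return 'args={} kwargs={}'.format(args, kwargs)


# Assumed equivalent of what .run(my_job, 'slow', seconds=20) asks the
# scheduler to call on every tick.
print(my_job('slow', seconds=20))  # args=('slow',) kwargs={'seconds': 20}
```

Under this reading, the two values are independent, and the README examples simply pass the interval along so the job can print it.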

commit 949d843 broke "starting()" method

Hi, robdmc. I really appreciate your work.
I found crontabs yesterday, but I couldn't get the 'starting()' method working.
I dug a little and found that the changes in the latest commit break the "starting()" method.
I'm now using version 0.2.1, and everything works fine.

Refactor To Guard Against Memory High Water Mark

Refactor the SubProcess class to actually use threading instead of subprocesses. Then run each invocation of the scheduled function in its own process. This will force Python to release memory to the OS after each run of the scheduled function.
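
As a rough illustration of the "one process per invocation" idea, the sketch below launches each run in a fresh Python interpreter using the standard subprocess module. This is not the refactor described above, and crontabs itself may use a different mechanism. The helper run_in_fresh_process is hypothetical.

```python
import os
import subprocess
import sys


def run_in_fresh_process(code):
    """Run a snippet of Python in a brand-new interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, '-c', code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Each iteration gets its own interpreter. Whatever memory the snippet
# allocates is handed back to the OS when that interpreter exits.
for _ in range(3):
    child_pid = int(run_in_fresh_process('import os; print(os.getpid())'))
    print('parent={} child={}'.format(os.getpid(), child_pid))
```

The trade-off is start-up cost: spawning an interpreter per run is only sensible when the interval is long compared with the launch time.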

How to stop the job?

Used example in PyCharm 2020.1 on Windows 10

import datetime
from crontabs import Cron, Tab

def my_job(*args, **kwargs):
    print('args={} kwargs={} running at {}'.format(args, kwargs, datetime.datetime.now()))

Cron().schedule(
    Tab(name='run_my_job').every(seconds=5).run(my_job, 'my_arg', my_kwarg='hello')
).go()

After I try to stop it from the console, it looks like the job is still running, just without calling my_job.

Performance on multi cronjobs

    Cron().schedule(
        # Turn off logging for job that runs every five seconds
        Tab(name='first_job', verbose=True).every(
            seconds=5).run(my_job, 'first', seconds=5),
        Tab(name='second_job', verbose=True).every(
            seconds=15).run(my_job, 'second', seconds=15),
    ).go()

As shown in this log output:

2022-11-16 12:13:40,003 [35667] INFO     first_job: Running first_job
args=('first',) kwargs={'seconds': 5} running at 2022-11-16 12:13:40.003657
2022-11-16 12:13:45,004 [35667] INFO     first_job: Running first_job
2022-11-16 12:13:45,005 [35668] INFO     second_job: Running second_job
args=('first',) kwargs={'seconds': 5} running at 2022-11-16 12:13:45.005105
args=('second',) kwargs={'seconds': 15} running at 2022-11-16 12:13:45.005147

The second job is always slower than the first job. I am willing to open a PR, but I have no idea how to improve it.
@robdmc could you share some thoughts on how to improve this so that tasks run asynchronously?

Fix High Water Mark Issue

I think I can get the process to re-spawn every run by wrapping the returned self.loop in self.get_target to have the kwarg max_iter=1.

API Improvement

Think about giving Tab a new method stopping_at(<datetime or string>).
