cuttlepool's Introduction

DEPRECATED

I don't have the time or desire to continue maintaining this project.

If somebody wants to maintain this project and continue publishing to PyPI, please reach out and I will relinquish the name cuttlepool on PyPI if you want it.

CuttlePool

CuttlePool is a general-purpose, thread-safe resource pooling implementation for use with long-lived resources and/or resources that are expensive to instantiate. Its key features are:

Pool overflow
Creates additional resources if the pool capacity has been reached and will remove the overflow when demand for resources decreases.
Resource harvesting
Any resources that haven't been returned to the pool and are no longer referenced by anything outside the pool are returned to the pool. This helps prevent pool depletion when resources aren't explicitly returned to the pool and the resource wrapper is garbage collected.
Resource queuing
If all else fails and no resource can be immediately found or made, the pool will wait a specified amount of time for a resource to be returned to the pool before raising an exception.

How-to Guide

Using CuttlePool requires subclassing CuttlePool, optionally with the user-defined methods normalize_resource() and ping(). The example below uses mysqlclient connections as a resource, but CuttlePool is not limited to connection drivers.

>>> import MySQLdb
>>> from cuttlepool import CuttlePool
>>> class MySQLPool(CuttlePool):
...     def ping(self, resource):
...         try:
...             c = resource.cursor()
...             c.execute('SELECT 1')
...             rv = (1,) in c.fetchall()
...             c.close()
...             return rv
...         except MySQLdb.OperationalError:
...             return False
...     def normalize_resource(self, resource):
...         # For example purposes, but not necessary.
...         pass
>>> pool = MySQLPool(factory=MySQLdb.connect, db='ricks_lab', passwd='aGreatPassword')

Let's break this down line by line.

First, the MySQLdb module is imported. MySQLdb.connect will be the underlying resource factory.

CuttlePool is imported and subclassed. The ping() method is implemented; it takes a resource as a parameter and checks that the resource is functional, which in this case means the MySQLdb.Connection instance is open. ping() returns True if the resource is functional and False otherwise. In the above example, a simple statement is executed, and if the expected result comes back, the resource is open and True is returned. The implementation of this method depends entirely on the resource created by the pool and may not even be necessary.

There is an additional method, normalize_resource(), that can be implemented. It takes a resource, in this case a MySQLdb.Connection instance created by MySQLdb.connect, as a parameter and changes its properties. This can be important because a resource can be modified while it's outside of the pool, and any modifications made during that time will persist; this can have unintended consequences when the resource is later retrieved from the pool. Essentially, normalize_resource() allows the resource to be set to an expected state before it is released from the pool for use. Here it does nothing (and in this case, it's not necessary to define the method), but it's shown for example purposes.
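As a rough illustration of what a non-trivial normalize_resource() might do, the sketch below resets a single attribute on a resource. FakeConnection, BaseCursor, DictCursor and PoolSketch are hypothetical stand-ins invented for this example so that it runs without a database; in real use the method would be defined on the CuttlePool subclass, and which attributes need resetting depends on the driver.

```python
class BaseCursor:
    """Stand-in for a driver's default cursor class."""


class DictCursor(BaseCursor):
    """Stand-in for an alternative cursor class a borrower might switch to."""


class FakeConnection:
    """Stand-in for a driver connection with a mutable cursorclass attribute."""

    def __init__(self):
        self.cursorclass = BaseCursor


class PoolSketch:
    # Assumption: in real use this method lives on a CuttlePool subclass.
    def normalize_resource(self, resource):
        # Undo a change that a previous borrower may have made.
        resource.cursorclass = BaseCursor


con = FakeConnection()
con.cursorclass = DictCursor          # a borrower modifies the resource
PoolSketch().normalize_resource(con)  # the pool resets it before reuse
```

After the call, con.cursorclass is BaseCursor again, so the next borrower sees the expected state.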

Finally, an instance of MySQLPool is made. The MySQLdb.connect function is passed to the instance as the factory, along with the database name and password.

The CuttlePool object, and as a result the MySQLPool object, accepts any parameters that the underlying resource factory accepts as keyword arguments. The pool object accepts three other parameters that are unrelated to the resource factory. capacity sets the maximum number of resources the pool will hold at any given time. overflow sets the maximum number of additional resources the pool will create when depleted. Overflow resources are removed from the pool when the pool is at capacity. timeout sets the amount of time in seconds the pool will wait for a resource to become free if the pool is depleted when a request for a resource is made.
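To make the interplay of capacity, overflow and timeout concrete, here is a toy, single-threaded model built only on the standard library. ToyPool and PoolEmpty are hypothetical names for this sketch; it is not CuttlePool's implementation and it leaves out locking, ping() and harvesting.

```python
import queue


class PoolEmpty(Exception):
    """Hypothetical error raised when no resource becomes free in time."""


class ToyPool:
    def __init__(self, factory, capacity=1, overflow=0, timeout=None):
        self._factory = factory
        self._capacity = capacity
        self._maxsize = capacity + overflow
        self._timeout = timeout  # None would block indefinitely in get()
        self._idle = queue.Queue()
        self._size = 0  # resources created so far, idle or in use

    def get(self):
        # 1. Prefer an idle resource.
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            pass
        # 2. Otherwise create one, up to capacity + overflow.
        if self._size < self._maxsize:
            self._size += 1
            return self._factory()
        # 3. Otherwise wait up to `timeout` seconds for a return.
        try:
            return self._idle.get(timeout=self._timeout)
        except queue.Empty:
            raise PoolEmpty('no resource became free in time') from None

    def put(self, resource):
        if self._idle.qsize() >= self._capacity:
            # The pool is at capacity, so this is overflow: drop it.
            self._size -= 1
        else:
            self._idle.put(resource)


pool = ToyPool(factory=object, capacity=1, overflow=1, timeout=0.01)
first = pool.get()   # fills the pool's capacity
second = pool.get()  # created as overflow
try:
    pool.get()       # nothing idle and nothing may be created
    timed_out = False
except PoolEmpty:
    timed_out = True
pool.put(first)      # kept as the single idle resource
pool.put(second)     # discarded because the pool is at capacity
```

In this model the third get() times out, and after both resources are returned only one remains in the pool.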

A resource from the pool can be treated the same way as an instance created by the resource factory passed to the pool. In our example a resource can be used just like a MySQLdb.Connection instance.

>>> con = pool.get_resource()
>>> cur = con.cursor()
>>> cur.execute(('INSERT INTO garage (invention_name, state) '
...              'VALUES (%s, %s)'), ('Space Cruiser', 'damaged'))
>>> con.commit()
>>> cur.close()
>>> con.close()

Calling close() on the resource returns it to the pool instead of closing it. Calling close() is not strictly necessary: the pool tracks resources, so any unreferenced resources will be collected and returned to the pool. It is still a good idea to call close(), since explicit is better than implicit.

Note

Once close() is called on the resource object, the object is useless. The resource object received from the pool is a wrapper around the actual resource object; calling close() on it returns the resource to the pool and removes it from the wrapper, effectively leaving an empty shell to be garbage collected.

To automatically "close" resources, get_resource() can be used in a with statement.

>>> with pool.get_resource() as con:
...     cur = con.cursor()
...     cur.execute(('INSERT INTO garage (invention_name, state) '
...                  'VALUES (%s, %s)'), ('Space Cruiser', 'damaged'))
...     con.commit()
...     cur.close()
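The wrapper behaviour described above (attribute forwarding, close() handing the resource back, and use in a with statement) can be sketched with a small class. ToyWrapper and its release callback are hypothetical and only approximate the idea; CuttlePool's own wrapper may differ in detail.

```python
class ToyWrapper:
    """Hypothetical wrapper that forwards attribute access to a resource."""

    def __init__(self, resource, release):
        self._resource = resource
        self._release = release  # callback that hands the resource back

    def __getattr__(self, name):
        # Called only for names not defined on the wrapper itself.
        if self._resource is None:
            raise AttributeError('resource was already returned to the pool')
        return getattr(self._resource, name)

    def close(self):
        # Hand the resource back once and leave the wrapper empty.
        if self._resource is not None:
            self._release(self._resource)
            self._resource = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions from the with body


returned = []  # stands in for the pool's idle queue
with ToyWrapper([1, 2, 3], returned.append) as res:
    count = res.count(2)  # forwarded to list.count
```

Leaving the with block calls close(), so the list ends up in returned and any further attribute access on res raises AttributeError.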

API

The API documentation can be found on Read the Docs.

FAQ

How do I install it?

pip install cuttlepool

How do I use cuttlepool with sqlite3?

Don't.

SQLite does not play nice with multiple connections and threads. If you need to make concurrent writes to a database from multiple connections, consider using a database with a dedicated server like MySQL, PostgreSQL, etc.

Contributing

It's highly recommended to develop in a virtualenv.

Fork the repository.

Clone the repository:

git clone https://github.com/<your_username>/cuttlepool.git

Install the package in editable mode:

cd cuttlepool
pip install -e .[dev]

Now you're set. See the next section for running tests.

Running the tests

Tests can be run with the command pytest.

Where can I get help?

If you haven't read the How-to Guide above, please do that first. Otherwise, check the issue tracker. Your issue may be addressed there, and if it isn't, please file an issue :)

cuttlepool's People

Contributors

spenceforce, nuuk42

cuttlepool's Issues

Reset cursor class on connection

A connection's cursor class can be modified while the connection is out of the pool. This can cause undesired behavior when the connection is later retrieved from the pool, since it will be expected to create regular cursors but will not. When the connection is returned to the CuttlePool object, its cursorclass attribute should be set to the base cursor class.

Clean up bare exceptions

There are a few bare exceptions like:

try:
    ...
except:  # bare exception
    ...

These should catch more specific exceptions.
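For illustration only, here is what narrowing such a handler could look like. The example uses the standard library's sqlite3 module purely because it needs no server; is_alive is a hypothetical helper, and the right exception class depends on the driver in use.

```python
import sqlite3


def is_alive(connection):
    """Return True if a trivial query succeeds on the connection."""
    try:
        connection.execute('SELECT 1')
    except sqlite3.Error:
        # Only driver errors are treated as "connection is dead".
        # Unrelated problems such as KeyboardInterrupt still propagate.
        return False
    return True


con = sqlite3.connect(':memory:')
alive_before = is_alive(con)  # True: the connection is open
con.close()
alive_after = is_alive(con)   # False: sqlite3.ProgrammingError is an sqlite3.Error
```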

Clean up tests

Test the public API and use environment variables to determine which type of SQL to use.

Use ping instead of try/except in _close_connection.

Using ping will check if the connection is open and it moves the burden of proper exception handling to the user. It's impossible to determine the user's needs given any SQL driver so it's best left to them to decide what's best. Related to #21

Improper use of RLock

Currently RLock is instantiated every time it is needed. The proper usage is for a connection pool object to have one RLock object that handles all locking instead.
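A lock created fresh on every call protects nothing, because each caller acquires a different lock object. A minimal sketch of the pattern the issue asks for is shown below; LockedPool is a hypothetical class, not the project's code.

```python
import threading


class LockedPool:
    def __init__(self):
        # One lock for the lifetime of the pool, shared by every method.
        self._lock = threading.RLock()
        self._size = 0

    def grow(self):
        with self._lock:
            self._size += 1
            # Re-entrant: size() takes the same lock on the same thread.
            return self.size()

    def size(self):
        with self._lock:
            return self._size


pool = LockedPool()


def worker():
    for _ in range(100):
        pool.grow()


threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
final_size = pool.size()  # 800: no increments are lost
```

An RLock rather than a plain Lock matters here because grow() calls size() while already holding the lock.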

Internally track connections?

Keep references to all connections in CuttlePool object whether they are in the queue or not?

Would make it easier to prevent improper things being passed in and would simplify dealing with the size increments/decrements.

Is the sqlite3 option "check_same_thread" not supported?

I want to use cuttlepool in a multi-threaded environment, but it doesn't seem to support the sqlite3 option "check_same_thread=False".
Am I wrong, or do I have to look for another library?
self.pool = SQLitePool(factory=sqlite3.connect,capacity=4,database='/mnt/config/test.db',isolation_level=None,check_same_thread=False)

It is not working at all

Fix tutorial paragraph about `normalize_connection()`

Here's the paragraph:

CuttlePool is imported and subclassed. The normalize_connection() method takes a Connection object as a parameter and changes it's properties. This is important because a Connection object can be modified while it's outside of the pool and any modifications made during that time

The final sentence is unfinished.

Race-Condition

The class CuttlePool has a race-condition in its method get_resource.

Setup:

  • the pool contains one available resource
  • no resources are in use
  • two threads are using the pool

Sequence of events:

  1. thread-1:: calls "get_resource"
  2. thread-1:: the method "_get()" returns the object "_ResourceTracker-1" and moves this
    object to the part of the "_reference_queue" that contains the resources that
    are in use.
  3. thread-1:: calls "self.ping" using the resource from "_ResourceTracker-1" as argument.
    Note: at this point in time a call to the method "available()" of the object
    "_ResourceTracker-1" returns "True" because the "weakref" to the wrapped resource
    has not yet been established. This happens later in "get_resource" with a call to
    the method "wrap_resource".
  4. thread-2:: calls "get_resource"
  5. thread-2:: The pool's "empty()" returns "True" and so "_harvest_lost_resources" is called to look
    for resources that have not been properly returned to the pool.
  6. thread-2:: "_harvest_lost_resources" loops over the part of the "_reference_queue" that contains the
    "_ResourceTracker" objects of resources that are in use. It finds "_ResourceTracker-1"
    and calls the "available" method which returns "True".
    The method "_harvest_lost_resources" then returns the object "_ResourceTracker-1" to the
    part of the "_reference_queue" that contains the available resources.
  7. thread-2:: The method "_get()" returns "_ResourceTracker-1"

As a result, both threads are using the same resource.
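The window described in steps 2 to 6 can be replayed deterministically without threads. The sketch below is a simplified model: Tracker and Wrapper are stand-ins for the objects named in the issue, and the pending flag is a hypothetical mitigation invented for this example, not a fix taken from the project. In a real pool the flag would have to be set and read under the same lock that guards "_get()" and harvesting.

```python
import gc
import weakref


class Wrapper:
    """Stand-in for the object handed to the caller."""

    def __init__(self, resource):
        self.resource = resource


class Tracker:
    """Stand-in for the _ResourceTracker described in the issue."""

    def __init__(self, resource):
        self.resource = resource
        self.wrapper_ref = None
        self.pending = False  # hypothetical flag, not part of the issue

    def available(self):
        # Behaviour described in the issue: no live wrapper means "available".
        return self.wrapper_ref is None or self.wrapper_ref() is None

    def available_guarded(self):
        # Hypothetical guard: handed out but not yet wrapped is not harvestable.
        return not self.pending and self.available()

    def wrap(self):
        wrapper = Wrapper(self.resource)
        self.wrapper_ref = weakref.ref(wrapper)
        self.pending = False
        return wrapper


tracker = Tracker(object())

# Steps 2 and 3: thread-1 holds the tracker but has not wrapped the resource.
tracker.pending = True
unguarded_in_window = tracker.available()        # True: thread-2 would harvest it
guarded_in_window = tracker.available_guarded()  # False

wrapper = tracker.wrap()
guarded_while_in_use = tracker.available_guarded()  # False: wrapper is alive

del wrapper
gc.collect()
guarded_after_drop = tracker.available_guarded()  # True: safe to harvest now
```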
