repoze.lru's People

Contributors

brodul, cosminbasca, jbohman, jul, mariovilas, mcdonc, tseaver

repoze.lru's Issues

catch the evicted key/value pair?

hi -

For a key/value pair I put into the cache, I'd like to take certain actions before or after it is evicted. Reading through the API, I didn't get a sense of how to do that. Any help?

thanks
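
As a point of comparison, here is a minimal sketch of a cache that reports evictions through a callback. It is not repoze.lru's API: CallbackLRU and on_evict are hypothetical names, the class is built on collections.OrderedDict, it uses strict LRU ordering, not a clock approximation, and it is not thread safe.

```python
from collections import OrderedDict


class CallbackLRU:
    """Hypothetical strict-LRU cache that reports each evicted pair."""

    def __init__(self, size, on_evict):
        self.size = size
        self.on_evict = on_evict      # called as on_evict(key, value)
        self._data = OrderedDict()    # oldest entry first

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)   # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.size:
            # Drop the least recently used pair and hand it to the callback.
            old_key, old_value = self._data.popitem(last=False)
            self.on_evict(old_key, old_value)


if __name__ == "__main__":
    evicted = []
    cache = CallbackLRU(2, lambda k, v: evicted.append((k, v)))
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")        # "a" becomes the most recently used entry
    cache.put("c", 3)     # "b" is evicted and passed to the callback
    print(evicted)
```

In this sketch the callback runs after the pair has been removed; to act before removal, the callback call would move ahead of popitem() and read the oldest pair first.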

Releases

I don't see 0.7 or 0.7.1 on PyPI, only 0.6 which is pretty old.

LRUCache.clock wastes RAM

The LRUCache.clock is a list of dictionaries, one dictionary for each cache entry. With 64 bit Linux + Python 2.7, an empty(!) cache with 10 million entries occupies 2.8 GB of RAM, because 10 million dictionaries are instantiated.

Since each dictionary has exactly two entries ("ref" and "key"), this could just as well be implemented using two lists called LRUCache.clock_keys and LRUCache.clock_refs. This way, no dictionary is needed and RAM consumption for an empty cache with 10 million entries goes down to 158 MB (even that already includes what Python itself needs).
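
The difference between the two layouts can be estimated with sys.getsizeof. The sketch below is only an approximation under stated assumptions: the function names are hypothetical, getsizeof is shallow, and the absolute numbers depend on the Python version and platform, so they will not match the figures quoted above.

```python
import sys


def dict_clock_bytes(size):
    """Approximate bytes for a clock stored as one dict per slot."""
    clock = [{"key": None, "ref": False} for _ in range(size)]
    return sys.getsizeof(clock) + sum(sys.getsizeof(slot) for slot in clock)


def parallel_clock_bytes(size):
    """Approximate bytes for the same clock stored as two parallel lists."""
    clock_keys = [None] * size
    clock_refs = [False] * size
    return sys.getsizeof(clock_keys) + sys.getsizeof(clock_refs)


if __name__ == "__main__":
    size = 100000
    print("dict per slot :", dict_clock_bytes(size))
    print("parallel lists:", parallel_clock_bytes(size))
```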

Repeated pushing of same entry removes other cache entries

When the same cache entry is repeatedly put() into the cache, it will remove existing entries:

>>> c = LRUCache(3)
>>> c.put(1, 2)
>>> c.put(11, 22)
>>> c.put(111, 222)
>>> print c.data
{1: (0, 2), 11: (1, 22), 111: (2, 222)}

>>> c.put(1, 2)
>>> c.put(1, 2)
>>> c.put(1, 2)
>>> print c.data
{1: (2, 2)}

The reason is that put() does not check whether the key is already in the cache. Instead, every put() call inserts a new entry in self.clock, eventually replacing all other entries.
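
One possible shape of the fix is to update an existing key in place before looking for a free slot. The following is a minimal sketch, not repoze.lru's code: TinyClockCache, clock_keys, clock_refs and _EMPTY are assumed names, and locking is left out.

```python
_EMPTY = object()  # marks a clock slot that has never been filled


class TinyClockCache:
    """Hypothetical clock cache whose put() reuses the slot of a known key."""

    def __init__(self, size):
        self.size = size
        self.hand = 0
        self.clock_keys = [_EMPTY] * size
        self.clock_refs = [False] * size
        self.data = {}  # key -> (slot, value)

    def put(self, key, value):
        entry = self.data.get(key)
        if entry is not None:
            # Key already cached: overwrite the value, keep the slot.
            slot = entry[0]
            self.data[key] = (slot, value)
            self.clock_refs[slot] = True
            return
        # New key: advance the hand until a slot with ref == False is found.
        while self.clock_refs[self.hand]:
            self.clock_refs[self.hand] = False
            self.hand = (self.hand + 1) % self.size
        slot = self.hand
        old_key = self.clock_keys[slot]
        if old_key is not _EMPTY:
            del self.data[old_key]  # evict the previous occupant
        self.clock_keys[slot] = key
        self.clock_refs[slot] = True
        self.data[key] = (slot, value)
        self.hand = (slot + 1) % self.size


if __name__ == "__main__":
    c = TinyClockCache(3)
    c.put(1, 2)
    c.put(11, 22)
    c.put(111, 222)
    for _ in range(3):
        c.put(1, 2)  # repeated puts no longer push out 11 and 111
    print(c.data)
```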

pip install fails on windows (2.7 x32)

The install works, but C:\Python27\Lib\site-packages\repoze\__init__.py is not installed, which causes import repoze.lru to fail with ImportError: No module named repoze.lru.

Creating an empty file C:\Python27\Lib\site-packages\repoze\__init__.py solves the issue, but I'd say setup.py should not skip installing the __init__.py.

pip install log:

...>pip install repoze.lru
Downloading/unpacking repoze.lru
  Downloading repoze.lru-0.6.tar.gz
  Running setup.py (path:c:\users\...\appdata\local\temp\pip_build_deen\repoze.lru\setup.py) egg_info for package repoze.lru

Installing collected packages: repoze.lru
  Running setup.py install for repoze.lru

    Skipping installation of C:\Python27\Lib\site-packages\repoze\__init__.py (namespace package)
    Installing C:\Python27\Lib\site-packages\repoze.lru-0.6-py2.7-nspkg.pth
Successfully installed repoze.lru
Cleaning up...

LRUCache may fail to release lock

Locking in LRUCache is done like this:

self.lock.acquire()
try:
  something()
finally:
  self.lock.release()

This will fail to release the lock if an exception occurs after the acquire() but before the try: block is entered. Such an exception could be a KeyboardInterrupt caused by somebody pressing CTRL+C while the program is running in a terminal.

To check that this really happens, I wrote the program below and pressed CTRL+C while it was running. It failed 7 out of 10 times for me. This is much less likely to happen in a real program, but it is still a race condition.

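#!/usr/bin/python
import threading

LOCK = threading.Lock()

def run():
    # Spin on acquire/release so that CTRL+C is likely to land
    # between acquire() and the start of the try: block.
    while True:
        LOCK.acquire()
        try:
            pass
        finally:
            LOCK.release()

if __name__ == "__main__":
    try:
        run()
    finally:
        # If the interrupt arrived after acquire() but before try:,
        # the lock is still held here.
        if LOCK.locked():
            print("ooops")
        else:
            print("OK")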
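
The acquire/try/finally pattern above is usually written with a with statement in current Python. A minimal sketch follows, with guarded() as a hypothetical helper name. Whether this fully closes the window between acquiring the lock and entering the protected block depends on the interpreter and its version, so it should be read as the idiomatic form, not as a proven fix for the CTRL+C race.

```python
import threading

LOCK = threading.Lock()


def guarded(action):
    """Run action() while holding LOCK; the lock is released on exit."""
    with LOCK:
        return action()


if __name__ == "__main__":
    print(guarded(lambda: "done"))
    try:
        guarded(lambda: 1 / 0)   # an exception inside the block
    except ZeroDivisionError:
        pass
    print(LOCK.locked())         # the lock was released despite the error
```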

LRUCache.put() can take multiple seconds on large caches

When put() looks for a place to store a new entry, the worst case situation is that all entries in the cache have ref==True. In this case put() walks through the entire cache, setting everything to ref==False. For a cache with 10 million entries, this process takes 3.6 seconds on my machine. It is very, very undesirable to have an application hang for this amount of time.

How to reproduce in Python shell:

>>> cache = LRUCache(10**7)
>>> for i in xrange(10**7):
...   cache.put(i, i)
... 
>>> cache.put("foo", "bar")     # <---- takes multiple seconds.
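
The cost of that worst case can be modelled without the library. The sketch below counts how many slots the clock hand visits before it finds one with ref == False. It is a model of the behaviour described above, not repoze.lru's code: sweep_steps is a hypothetical name and refs is assumed to be a non-empty list of booleans.

```python
def sweep_steps(refs, hand=0):
    """Count the slots visited until one with ref == False is found.

    Clears every ref flag it passes, as the clock sweep does.
    """
    size = len(refs)
    steps = 0
    while True:
        steps += 1
        if not refs[hand]:
            return steps
        refs[hand] = False
        hand = (hand + 1) % size


if __name__ == "__main__":
    # Every slot referenced: the hand clears all of them, wraps around,
    # and only then finds a free slot.
    print(sweep_steps([True] * 10))
    # First slot already unreferenced: a single visit is enough.
    print(sweep_steps([False] + [True] * 9))
```

In this model the number of visited slots grows linearly with the cache size whenever every entry has its ref flag set.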

Consider deep copy for put/get

Hi,

When putting mutable items into the cache or getting them from it, the value should be a deep copy. Otherwise the cache contents can be modified implicitly, which produces behavior that I find quite unintuitive for a cache.

Here is an example:

from repoze.lru import LRUCache
cache = LRUCache(10)

# place one item in cache:
val = {"hallo": 1}
cache.put("world", val)

# implicit modification of cache:
val["new"] = 2

# item returned from cache
item = cache.get("world")
print(item)  # prints {'hallo': 1, 'new': 2}, but should be {'hallo': 1}

item['third'] = 3
print(cache.get("world"))  # prints {'hallo': 1, 'new': 2, 'third': 3}, but should still be {'hallo': 1}

The items in the cache should never change after put(), unless we update them explicitly. This happens for all mutable types (dicts, lists, ...).

I cannot think of a case where someone would want to save references in the cache, apart from performance reasons. I suggest taking a deep copy by default, and adding an optional parameter that reverts to the current behavior if somebody really needs that (and knows what it does).
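
Until something like this exists in the library, the proposed behavior can be approximated with a thin wrapper. This is a minimal sketch under stated assumptions: CopyingCache and its copy_values parameter are hypothetical names, DictCache is only a stand-in so the example runs without repoze.lru, and the wrapped object is assumed to expose get(key, default) and put(key, val).

```python
import copy


class DictCache:
    """Stand-in for a cache with get()/put(); not repoze.lru itself."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, val):
        self._data[key] = val


class CopyingCache:
    """Hypothetical wrapper that deep-copies values on put() and get()."""

    def __init__(self, inner, copy_values=True):
        self._inner = inner
        self._copy_values = copy_values  # False restores reference semantics

    def put(self, key, val):
        self._inner.put(key, copy.deepcopy(val) if self._copy_values else val)

    def get(self, key, default=None):
        val = self._inner.get(key, default)
        return copy.deepcopy(val) if self._copy_values else val


if __name__ == "__main__":
    cache = CopyingCache(DictCache())
    val = {"hallo": 1}
    cache.put("world", val)
    val["new"] = 2                  # no longer leaks into the cache
    item = cache.get("world")
    item["third"] = 3               # neither does this
    print(cache.get("world"))
```

The price is one deep copy per call, which is the performance trade-off mentioned above.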

(readme) clarify whether repoze works on python 3.3+

Readme says repoze.lru 'works under Python 2.5, Python 2.6, Python 2.7, and Python 3.2'. Python 3.6 was recently released. Seems likely it still works, but would be good to see that verified and called out in readme.

LRUCache will evict entries even though not full

The LRUCache's get() function sets the self.hand variable to just after the current position. If the same entry is read repeatedly, put() calls will always work in the same area of the cache and not reach the empty parts. As a result, the cache will start removing entries even though it is not full:

>>> c = LRUCache(1000)
>>> c.put(1, 1)
>>> c.put(2, 2)        # <---- "2" is added
>>> c.get(1)
1
>>> c.put(3, 3)        # <---- ref is set to False for "2"
>>> c.get(1)
1
>>> c.put(4, 4)        # <----- "2" is evicted and replaced by "4"
>>> c.data
{1: (0, 1), 3: (2, 3), 4: (1, 4)}   # <---- "2" is not there, even though cache is 99% empty

A solution would be to just not set self.hand in the get() method.
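
The effect of that change can be checked on a small model of the clock. This is a sketch, not repoze.lru's code: HandDemoCache, its move_hand_on_get flag and replay() are hypothetical names, locking is omitted, and put() assumes each key is inserted only once.

```python
_EMPTY = object()  # marks a clock slot that has never been filled


class HandDemoCache:
    """Hypothetical clock cache; get() optionally moves the hand."""

    def __init__(self, size, move_hand_on_get):
        self.size = size
        self.move_hand_on_get = move_hand_on_get
        self.hand = 0
        self.clock_keys = [_EMPTY] * size
        self.clock_refs = [False] * size
        self.data = {}  # key -> (slot, value)

    def get(self, key, default=None):
        entry = self.data.get(key)
        if entry is None:
            return default
        slot, value = entry
        self.clock_refs[slot] = True
        if self.move_hand_on_get:
            # Behaviour described in the report: hand jumps behind the hit.
            self.hand = (slot + 1) % self.size
        return value

    def put(self, key, value):
        while self.clock_refs[self.hand]:
            self.clock_refs[self.hand] = False
            self.hand = (self.hand + 1) % self.size
        slot = self.hand
        old_key = self.clock_keys[slot]
        if old_key is not _EMPTY:
            self.data.pop(old_key, None)  # evict the previous occupant
        self.clock_keys[slot] = key
        self.clock_refs[slot] = True
        self.data[key] = (slot, value)
        self.hand = (slot + 1) % self.size


def replay(cache):
    """Run the sequence from the report and return the surviving keys."""
    cache.put(1, 1)
    cache.put(2, 2)
    cache.get(1)
    cache.put(3, 3)
    cache.get(1)
    cache.put(4, 4)
    return sorted(cache.data)


if __name__ == "__main__":
    print(replay(HandDemoCache(1000, move_hand_on_get=True)))
    print(replay(HandDemoCache(1000, move_hand_on_get=False)))
```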

LRUCache is not thread safe

When the clear() method is called on an LRUCache, concurrently running get() and put() calls may throw exceptions.

The get() method may throw an exception because it tries to execute
self.clock[pos]['ref'] = True
while the clear() method is still slowly rebuilding the self.clock list. This results in an index out of bounds error.

put() may also raise an exception while it is searching for a place to put an item. put() walks along self.clock and can run into the same index out of bounds error as the get() method.
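
One way to avoid exposing a half-built list is to build the replacement off to the side and swap it in with a single assignment under the lock. The sketch below shows only that idea on a reduced model: SwapOnClear, touch() and clock_refs are hypothetical names, touch() stands in for the ref update done by get(), and this is not how repoze.lru is actually written.

```python
import threading


class SwapOnClear:
    """Hypothetical clock whose clear() never exposes a partial list."""

    def __init__(self, size):
        self.size = size
        self.lock = threading.Lock()
        self.clock_refs = [False] * size

    def touch(self, pos):
        # Stand-in for the ref update performed on a cache hit.
        with self.lock:
            self.clock_refs[pos] = True

    def clear(self):
        fresh = [False] * self.size  # built before taking the lock
        with self.lock:
            self.clock_refs = fresh  # readers see the old or the new list


if __name__ == "__main__":
    clock = SwapOnClear(1000)
    errors = []

    def reader():
        try:
            for i in range(20000):
                clock.touch(i % clock.size)
        except Exception as exc:  # collected so the main thread can report it
            errors.append(exc)

    def clearer():
        for _ in range(200):
            clock.clear()

    threads = [threading.Thread(target=reader) for _ in range(3)]
    threads.append(threading.Thread(target=clearer))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("errors:", len(errors))
```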
