oldpanda / bloomfilter-py

Yet another Bloomfilter implementation in Python, compatible with Java's Guava library

Home Page: https://pypi.org/project/bloomfilter-py/

License: MIT License

Python 100.00%
bloomfilter bloomfilter-python python python-library python3

bloomfilter-py's Issues

Byte decoding fails while converting to string.

Use case:

  • Convert the bloom filter object into a string so that it can be stored and retrieved when required.

Problem:

  • Calling decode() on the bytes object returned by dumps(), in order to turn the serialized bloom filter into a string, fails in some cases (typically when the integers added to the filter are large), as shown below:

    File ~/.pyenv/versions/3.10.6/lib/python3.10/encodings/utf_8.py:16, in decode(input, errors)
       15 def decode(input, errors='strict'):
    ---> 16     return codecs.utf_8_decode(input, errors, True)
    
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 6: invalid start byte
    

Code to reproduce:

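import random
from bloomfilter import BloomFilter

bf = BloomFilter(expected_insertions=10, err_rate=0.1)
# Add two large random integers to the filter.
bf.put(random.randint(100000000, 10000000000))
bf.put(random.randint(100000000, 10000000000))
# dumps() returns bytes; decoding them as utf-8 (the default) raises UnicodeDecodeError.
bf.dumps().decode()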

My observations:

The bytes produced when serializing the bloom filter do not seem to be encoded with any of utf-8, ascii, utf-16 or utf-32, as they contain byte values that these codecs cannot decode.
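
To illustrate the observation without depending on the library, here is a minimal sketch that uses a hand-written byte string as a hypothetical stand-in for the output of dumps(). The payload is an assumption chosen only so that byte 0x80 sits at position 6, as in the traceback above. It is not the real serialized format.

```python
# Hypothetical stand-in for bf.dumps(); the real payload will differ.
payload = bytes([0x01, 0x04, 0x00, 0x00, 0x00, 0x01, 0x80, 0x00])


def can_decode(raw: bytes, encoding: str) -> bool:
    """Return True if raw is valid text in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Both strict text codecs reject the payload because of the 0x80 byte.
results = {enc: can_decode(payload, enc) for enc in ("utf-8", "ascii")}

# Capture where the utf-8 decoder gives up.
try:
    payload.decode("utf-8")
    error_position = None
except UnicodeDecodeError as exc:
    error_position = exc.start

print(results, error_position)
```

In utf-8, 0x80 can only appear as a continuation byte, never as the first byte of a character, which is why the decoder reports "invalid start byte".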

Probable solutions:

  1. Use the utf-8 encoding while serializing the bloom filter object.
  2. Allow users to specify the encoding to be used.
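
In the meantime, a possible workaround on the caller's side is to wrap the bytes in a text-safe encoding such as Base64 before storing them. The sketch below uses only the standard library. The helper names to_text and from_text and the sample payload are hypothetical, and the payload merely stands in for the result of bf.dumps().

```python
import base64


def to_text(raw: bytes) -> str:
    """Encode arbitrary bytes as an ASCII-only Base64 string."""
    return base64.b64encode(raw).decode("ascii")


def from_text(text: str) -> bytes:
    """Recover the original bytes from a Base64 string."""
    return base64.b64decode(text.encode("ascii"))


# Hypothetical stand-in for bf.dumps(); the real payload will differ.
payload = bytes([0x01, 0x04, 0x00, 0x00, 0x00, 0x01, 0x80, 0x00])

stored = to_text(payload)     # safe to keep in a text column or JSON field
restored = from_text(stored)  # identical to the original bytes

print(stored, restored == payload)
```

The restored bytes can then be handed to whatever deserialization routine the library provides. Base64 grows the stored size by roughly one third, which is the cost of keeping the value as plain text.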
