bloomfilter.js's People

Contributors

dey-dey, dmcgrath, ept, eugeneware, gleenn, jasondavies, pchaigno

bloomfilter.js's Issues

Deserialization works only if filter's length in bits is divisible by 32

> bloom=new BloomFilter(100, 3)
BloomFilter {m: 100, k: 3, buckets: Int32Array[4], _locations: Uint8Array[3], locations: function…}
> bloom.add('test')
undefined
> bl2=new BloomFilter(bloom.buckets, 3)
BloomFilter {m: 128, k: 3, buckets: Int32Array[4], _locations: Uint8Array[3], locations: function…}
> bl2.test('test')
false
> bloom.test('test')
true
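
The transcript above suggests why the round trip fails: the filter packs its bits into 32-bit words, so a 100-bit filter needs 4 words, and rebuilding from those words alone yields m = 4 × 32 = 128. Below is a minimal sketch of that arithmetic, assuming bit positions are derived as a hash value modulo m. The helper names and the hash value are illustrative assumptions, not taken from the library.

```javascript
// Number of 32-bit words needed to hold m bits.
function wordCount(m) {
  return Math.ceil(m / 32);
}

// Size in bits that can be inferred from the word array alone.
function inferredBits(words) {
  return words.length * 32;
}

const originalM = 100;
const words = new Int32Array(wordCount(originalM)); // 4 words
const restoredM = inferredBits(words);              // 128, not 100

// Arbitrary example hash value (an assumption for illustration only).
const exampleHash = 1000003;
const originalIndex = exampleHash % originalM; // 3
const restoredIndex = exampleHash % restoredM; // 67

console.log({ originalM, restoredM, originalIndex, restoredIndex });
```

Because the two filters map the same key to different bit positions, a lookup in the restored filter can miss. Choosing m as a multiple of 32 avoids the mismatch.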

Use optimal m/k given n/p option

I was just wondering why you don't have an option to use the optimal m and k parameters (see here) based on n and p.

There are a couple ways I imagine this going:

  • Add an optional 3rd argument in the constructor that when present and set to true would interpret the first and second values of the constructor as n and p instead of m and k.
  • Change the constructor to accept a hash of either {m: blah, k: blah} or {n: blah, p: blah} (hopefully with better names than the single letters).
  • Add a new function, something like BloomFilter.useOptimal(n, p).

I could implement any of these, but I wanted to ask you first if there was any reason why you didn't do this.
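
For reference, the standard sizing formulas are m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2, where n is the expected number of elements and p the target false-positive rate. A minimal sketch follows; the function name optimalParams is hypothetical and not part of the library.

```javascript
// Standard Bloom filter sizing formulas:
//   m = -n * ln(p) / (ln 2)^2   (number of bits)
//   k = (m / n) * ln 2          (number of hash functions)
function optimalParams(n, p) {
  if (!(n > 0) || !(p > 0 && p < 1)) {
    throw new RangeError("n must be > 0 and p must be in (0, 1)");
  }
  const m = Math.ceil((-n * Math.log(p)) / (Math.LN2 * Math.LN2));
  const k = Math.max(1, Math.round((m / n) * Math.LN2));
  return { m, k };
}

// 1000 elements at a 1% target false-positive rate.
const params = optimalParams(1000, 0.01);
console.log(params); // { m: 9586, k: 7 }
```

The result could then be passed to the existing constructor, for example new BloomFilter(params.m, params.k).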

bloom.buckets doesn't work

Hello!
var array = [].slice.call(bloom.buckets),
json = JSON.stringify(array);

This method doesn't work. How can I serialize a filter with your package? The bloom object doesn't expose buckets. It only has bitview, view, serialize, etc.
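
As a point of comparison, here is a minimal sketch of a JSON round trip for a typed array. It assumes the filter's bit array is an Int32Array, as the console transcripts elsewhere on this page show; the stand-in values are made up for illustration.

```javascript
// Stand-in for bloom.buckets (assumption: an Int32Array of 32-bit words).
const buckets = Int32Array.from([1, -2, 0x7fffffff, 0]);

// Typed array -> plain array -> JSON string.
const json = JSON.stringify(Array.from(buckets));

// JSON string -> plain array -> typed array.
const restored = Int32Array.from(JSON.parse(json));

console.log(json); // [1,-2,2147483647,0]
```

Converting to a plain array first matters: calling JSON.stringify directly on a typed array produces an object keyed by index rather than a JSON array.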

Adding filters

If I wanted to combine two filters of the same size, can I get the array forms and add the values at each respective index? (Then reserialize into another filter)
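
For two filters with the same size, the same number of hash functions, and the same hash functions, the usual way to combine them is a bitwise OR of corresponding words rather than arithmetic addition, since addition can carry into neighbouring bits. A minimal sketch on plain Int32Array values follows; the function name unionBuckets is hypothetical and not part of the library.

```javascript
// Union of two bit arrays of equal length: a bit is set in the result
// if it is set in either input.
function unionBuckets(a, b) {
  if (a.length !== b.length) {
    throw new RangeError("filters must have the same number of words");
  }
  const out = new Int32Array(a.length);
  for (let i = 0; i < a.length; i++) {
    out[i] = a[i] | b[i];
  }
  return out;
}

const left = Int32Array.from([0b0101, 0]);
const right = Int32Array.from([0b0011, 1 << 31]);
const merged = unionBuckets(left, right);
console.log(Array.from(merged)); // [ 7, -2147483648 ]
```

The merged array could then be handed back to the constructor in the same way the deserialization transcript on this page does, with the same k as the inputs.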

contrib/ directory containing Python version?

Hi Jason,

Great library! Thanks for writing this.

I needed to send Bloom filters to and from my webapp frontend to the Python backend (i.e., do add() in JS and test() in Python and the reverse). I ended up porting bloomfilter.js to Python by doing a line-by-line translation. Maybe I missed a note in the docs about an easier way? :)

If you're interested, I can send a pull request with a contrib/ directory containing the Python version. It's a little hacky, because I used a C module to get JavaScript numeric semantics (modulo and integer arithmetic behave differently in Python than in JavaScript).
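
The numeric differences mentioned above can be seen from the JavaScript side in a few lines. This is only an illustrative sketch of standard language behaviour, not code from the port; floorMod is a hypothetical helper.

```javascript
// The remainder operator takes the sign of the dividend in JavaScript.
const jsRemainder = -7 % 3; // -1 (Python's -7 % 3 evaluates to 2)

// Bitwise operators coerce their operands to signed 32-bit integers.
const wrapped = 0xffffffff | 0; // -1

// Unsigned right shift reinterprets the value as an unsigned 32-bit integer.
const unsigned = -1 >>> 0; // 4294967295

// Math.imul multiplies with 32-bit wraparound.
const product = Math.imul(0x7fffffff, 2); // -2

// Floor-style modulo, matching Python's % for a positive divisor.
function floorMod(a, n) {
  return ((a % n) + n) % n;
}

console.log(jsRemainder, wrapped, unsigned, product, floorMod(-7, 3));
```

A port to Python has to reproduce the 32-bit truncation and the remainder sign explicitly, since Python integers are arbitrary precision.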

Let me know.

Ranga

License clarification

First of all, thank you for the great implementation of Bloom Filter. We're using this package from our h2-auto-push package, and it's working great.

But I want to make sure we have no license problems by depending on this package. Your LICENSE file looks similar to BSD-3-Clause but not quite the same. Can I understand it as BSD-3-Clause?

If my understanding is correct, can you please put the license info into your package.json? Because it is missing, the npm page says "license: none": https://www.npmjs.com/package/bloomfilter. It'll be as simple as adding this line:

{ "license" : "BSD-3-Clause" }

https://docs.npmjs.com/files/package.json#license

Thank you!

fnv-plus

Hey I ran across your project as I was researching bloom filters, and I noticed this on your website:
"Unfortunately I can't use the 64-bit trick in the linked post as JavaScript only supports bitwise operations on 32 bits."

I don't know if you'd be interested in this, but a little while ago I wrote a version of fnv with an expanded keyspace (up to 1024-bit): https://github.com/tjwebb/fnv-plus.
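
For context on the 32-bit limitation quoted above, here is a minimal sketch of the standard 32-bit FNV-1a hash in JavaScript. It is not the library's own hash function, and it assumes the input is ASCII, since only the low byte of each character code is hashed.

```javascript
// Standard 32-bit FNV-1a: XOR each byte into the state, then multiply
// by the FNV prime with 32-bit wraparound.
function fnv1a32(str) {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i) & 0xff;
    h = Math.imul(h, 0x01000193); // FNV prime, 32-bit multiply
  }
  return h >>> 0; // report as an unsigned 32-bit integer
}

console.log(fnv1a32("a").toString(16)); // e40c292c
```

Math.imul keeps the multiplication within 32 bits. A plain * would produce a floating-point result that loses low-order bits for large products.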

Create function to serialize/deserialize bloom filter

Given the standalone nature of your bloomfilter, I was wondering if it would make sense to serialize/deserialize the bloomfilter bytearray/array.

My use case is that I'm trying to send a dictionary to the front-end to see if words the user types in are in it. I thought it might be faster/smaller to store the dictionary in a bloomfilter.

To Typescript?

Is it worth converting this implementation to TypeScript?

It will be easier for others to understand as well as contribute.

False positives

I'm getting a much higher false positive rate than I would expect from a bloom filter of the size that I'm using.

I'm using a 1024-bit bloomfilter with 16 hashes and 20 elements in each filter.

I'm running a test which adds 20 elements to a filter, checking before adding each that the item isn't already in the filter.

After running the test 500 times, there are ~4 collisions.

Given a bloom filter with those parameters, there should only be about a 1/1.3 billion chance of collision (https://hur.st/bloomfilter/?n=20&p=&m=1024&k=16)

Here's the short script:

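    // Assumption: the npm package "bloomfilter" exports BloomFilter.
    const { BloomFilter } = require('bloomfilter')

    let collisions = 0
    for (let i = 0; i < 500; i++) {
      const filter = new BloomFilter(1024, 16)
      const dict = {} // strings already added to this filter
      for (let j = 0; j < 20; j++) {
        const str = Math.floor(Math.random() * 1000000000).toString(16)
        // A hit for a string that was never added is a false positive.
        if (filter.test(str) && dict[str] !== true) {
          console.log("COLLISION: ", str)
          collisions++
        }
        filter.add(str)
        dict[str] = true
      }
      console.log(i)
    }
    console.log('done: ', collisions)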
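
The expected rate quoted above follows from the standard approximation p ≈ (1 - e^(-kn/m))^k. A minimal sketch of that calculation follows; the function name falsePositiveRate is hypothetical.

```javascript
// Approximate false-positive probability of a Bloom filter with
// m bits and k hash functions after n insertions.
function falsePositiveRate(n, m, k) {
  return Math.pow(1 - Math.exp((-k * n) / m), k);
}

const rate = falsePositiveRate(20, 1024, 16);
console.log(rate);     // roughly 7e-10
console.log(1 / rate); // on the order of a billion
```

This figure applies to a filter that already holds all 20 elements, and the formula assumes independent, uniformly distributed hash functions. The earlier checks in the loop see fewer elements and would be expected to hit even less often.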
