Giter Site home page Giter Site logo

hyperbloom's Introduction

HyperBloom

Build Status godoc complete

Bloom

A collection of high performance bloom filter data structures for use in Go. They all use the 64 bit version of Google's XXHASH "extremely fast non-cryptographic" hashing algorithm. Detailed documentation is available via godoc.

BloomFilter

A textbook implementation of a bloom filter. Like StripedBloomFilter, it uses an array of unsigned 64 bit integers. However, it uses centralized locking (via a RWMutex) in place of sharded locking. In addition, it supports non-locking inserts and lookups (InsertAsync) and (LookupAsync). Use if you plan on doing mostly reads and not many writes OR if you plan on using the bloomfilter in a single-threaded scenario (make sure to use InsertAsync and LookupAsync to bypass the mutex in this case)

StripedBloomFilter

A bloom filter implemented using an array of unsigned 64 bit integers which is divided into 'n' shards. Each shard contains its own mutex, allowing for a high degree of concurrent insert and lookup throughput. One restriction is that the filter size must be a multiple of the number of shards.

Assuming 16 cores, a 32 shard StriedBloomFilter provides a three order of magnitude decrease in lock contention over a standard bloom filter.

Use if you plan on a roughly equal balance of writes and reads and are using the bloomfilter in a multi-threaded scenario. This variant is especially good for large (1 billion buckets or more) filters.

NaiveBloomFilter

A bloom filter implemented using a byte array (with each bit in the filter assigned to a byte). This reduces the number of instructions necessary to perform a lookup in the go 1.4 and latest gcc-go compilers. As a result, the NaiveBloomFilter will almost always be faster than the standard variants at the cost of an 8x penalty in memory usage.

This version uses centralized locking (via a RWMutex) and is perfect for a filter that will mainly be used for reads or in a single threaded context (use the LookupAsync and InsertAsync functions to bypass the mutex in this case). For very small (<20 million buckets) bloom filters, the NaiveBloomFilter can yield enormous performance boosts since most of the filter fits in the processor cache.

NaiveStripedBloomFilter

A bloom filter implemented using a byte array (with each bit in the filter assigned to a byte) but with distributed locking over 'n' shards. This provides increased concurrent throughput. This is the perfect choice for filters where read performance over multiple threads needs to be maximized (you get a performance gain from not bit mangling).

hyperbloom's People

Contributors

alfredocogo avatar carbocation avatar dafaqdhruv avatar iamthebot avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.