Giter Site home page Giter Site logo

cuckoofilter's Introduction

Cuckoo Filter

GitHub go.mod Go version of a Go module GoDoc GoReportCard

Well-tuned, production-ready cuckoo filter that performs best in class for low false positive rates (at around 0.01%). For details, see full evaluation.

Background

Cuckoo filter is a Bloom filter replacement for approximated set-membership queries. While Bloom filters are well-known space-efficient data structures to serve queries like "if item x is in a set?", they do not support deletion. Their variances to enable deletion (like counting Bloom filters) usually require much more space.

Cuckoo filters provide the flexibility to add and remove items dynamically. A cuckoo filter is based on cuckoo hashing (and therefore named as cuckoo filter). It is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact, thus a cuckoo filter could use less space than conventional Bloom filters, for applications that require low false positive rates (< 3%).

"Cuckoo Filter: Better Than Bloom" by Bin Fan, Dave Andersen and Michael Kaminsky

Implementation details

The paper cited above leaves several parameters to choose. In this implementation

  1. Every element has 2 possible bucket indices
  2. Buckets have a static size of 4 fingerprints
  3. Fingerprints have a static size of 16 bits

1 and 2 are suggested to be the optimum by the authors. The choice of 3 comes down to the desired false positive rate. Given a target false positive rate of r and a bucket size b, they suggest choosing the fingerprint size f using

f >= log2(2b/r) bits

With the 16 bit fingerprint size in this repository, you can expect r ~= 0.0001. Other implementations use 8 bit, which correspond to a false positive rate of r ~= 0.03.

Example usage

import (
	"fmt"

	cuckoo "github.com/panmari/cuckoofilter"
)

func Example() {
	cf := cuckoo.NewFilter(1000)

	cf.Insert([]byte("pizza"))
	cf.Insert([]byte("tacos"))
	cf.Insert([]byte("tacos")) // Re-insertion is possible.

	fmt.Println(cf.Lookup([]byte("pizza")))
	fmt.Println(cf.Lookup([]byte("missing")))

	cf.Reset()
	fmt.Println(cf.Lookup([]byte("pizza")))
	// Output:
	// true
	// false
	// false
}

For more examples, see the example tests. Operations on a filter are not thread safe by default. See this example for using the filter concurrently.

cuckoofilter's People

Contributors

seiflotfy avatar panmari avatar chessman avatar trichner avatar shabbyrobe avatar codehuntio avatar virrages avatar jnishikawa-carbonblack avatar marreck avatar martinpinto avatar mholt avatar sckelemen avatar glaslos avatar jerry-vite avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.