Bloomfeld

A simple Bloom filter implementation. If you are not familiar with what a Bloom filter does or why it works the way it does, please read up on it first.

How-To

Create filter

Creating a Bloom filter optimally is the trickiest part. Some parameters have to be calculated first, depending on your use case. Choosing the right hash function is also difficult: fast non-cryptographic hashes such as Murmur are preferred over MD5 and similar.

For example, let's say we want to store 1000 strings with a false positive probability of 0.1 (10%). According to the calculation utility (BloomFilterCalculations), we should create a Bloom filter with 4793 bits and 3 hash functions. You can also use an online calculator. For the sake of simplicity, we will just use the standard hashCode. To get three different hash functions, we can simply transform the string to lower or upper case before hashing.
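The numbers above can be reproduced with the standard Bloom filter sizing formulas, m = ceil(-n ln p / (ln 2)^2) bits and k = round((m / n) ln 2) hash functions. The sketch below is a standalone illustration; the class BloomSizing and its method names are hypothetical and are not the library's BloomFilterCalculations utility, whose API is not shown here.

```java
// Hypothetical helper for illustration only; not part of bloomfeld.
final class BloomSizing {

    /** Number of bits: m = ceil(-n * ln(p) / (ln 2)^2). */
    static long optimalBits(long expectedElements, double falsePositiveRate) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-expectedElements * Math.log(falsePositiveRate) / (ln2 * ln2));
    }

    /** Number of hash functions: k = round(m / n * ln 2), at least 1. */
    static int optimalHashes(long bits, long expectedElements) {
        return Math.max(1, (int) Math.round((double) bits / expectedElements * Math.log(2)));
    }

    public static void main(String[] args) {
        long bits = optimalBits(1000, 0.1);
        int hashes = optimalHashes(bits, 1000);
        // Prints "4793 bits, 3 hash functions", matching the example above.
        System.out.println(bits + " bits, " + hashes + " hash functions");
    }
}
```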

So the filter definition would look like this:

BloomFilter<String> filter = new DefaultBloomFilter<String>(
    1000,
    s -> s.hashCode(),
    s -> s.toLowerCase().hashCode(),
    s -> s.toUpperCase().hashCode()
);
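One caveat with the case-transform shortcut is that the three functions are not independent. For input that is already all lower case, two of them return the same value, and for input without letters all three coincide. The small check below illustrates this; the class HashOverlapDemo is a hypothetical name used only for this sketch.

```java
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical demo class; it only reuses the three lambdas from the filter definition above.
final class HashOverlapDemo {

    static final List<ToIntFunction<String>> HASHES = List.of(
            s -> s.hashCode(),
            s -> s.toLowerCase().hashCode(),
            s -> s.toUpperCase().hashCode());

    /** Counts how many distinct hash values the three functions produce for one input. */
    static long distinctHashes(String s) {
        return HASHES.stream().mapToInt(h -> h.applyAsInt(s)).distinct().count();
    }

    public static void main(String[] args) {
        System.out.println(distinctHashes("Hello")); // 3: mixed case, all hashes differ
        System.out.println(distinctHashes("hello")); // 2: hashCode equals the lower-case hash
        System.out.println(distinctHashes("12345")); // 1: no letters, all hashes are equal
    }
}
```

When fewer distinct hashes are produced, fewer bits are set per element than the sizing calculation assumes, so a production filter would normally use genuinely different hash functions.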

Query filter

To query a filter, call its probablyContains method.

boolean contains = filter.probablyContains("hello");

As the name suggests, you should really pay attention to what the returned value means:

  • TRUE = element is PROBABLY in the set (the answer may be a false positive)
  • FALSE = element is not in the set (always correct)

In other words: you can say for certain that an element is NOT in the set, but you cannot say for sure that an element REALLY IS in the set. That is the limitation of a Bloom filter. It still has very nice use cases though!
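To make these semantics concrete, here is a minimal sketch of how a Bloom filter can be built on java.util.BitSet. It is an assumption-laden stand-in written for this explanation: TinyBloomFilter is a hypothetical class, not the library's DefaultBloomFilter, and the real implementation may differ.

```java
import java.util.BitSet;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical stand-in for illustration only; not bloomfeld's DefaultBloomFilter.
final class TinyBloomFilter<T> {
    private final BitSet bits;
    private final int size;
    private final List<ToIntFunction<T>> hashes;

    TinyBloomFilter(int size, List<ToIntFunction<T>> hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    /** Maps a hash value (possibly negative) to a bit index in [0, size). */
    private int index(ToIntFunction<T> hash, T element) {
        return Math.floorMod(hash.applyAsInt(element), size);
    }

    void add(T element) {
        for (ToIntFunction<T> h : hashes) {
            bits.set(index(h, element));
        }
    }

    boolean probablyContains(T element) {
        for (ToIntFunction<T> h : hashes) {
            if (!bits.get(index(h, element))) {
                return false; // one clear bit proves the element was never added
            }
        }
        return true; // all bits set: either added, or a false positive
    }

    public static void main(String[] args) {
        List<ToIntFunction<String>> hashes = List.of(
                s -> s.hashCode(),
                s -> s.toLowerCase().hashCode(),
                s -> s.toUpperCase().hashCode());

        TinyBloomFilter<String> filter = new TinyBloomFilter<>(4793, hashes);
        System.out.println(filter.probablyContains("hello")); // false: nothing added yet
        filter.add("hello");
        System.out.println(filter.probablyContains("hello")); // true: added elements are always found

        // A deliberately undersized filter: its single bit is set by the first add.
        TinyBloomFilter<String> oneBit = new TinyBloomFilter<>(1, hashes);
        oneBit.add("hello");
        System.out.println(oneBit.probablyContains("never added")); // true: a false positive
    }
}
```

Bits are only ever set, never cleared, which is why a FALSE answer is always correct while a TRUE answer can be caused by bits that other elements set.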

Put elements in filter

Simply call the add method.

filter.add("hello");

Usage

You can include this library in your Maven project using the JitPack service.

This has two steps. Step one, include this repository:

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

Step two, add this dependency (you can find the latest version in the pom.xml file):

<dependency>
    <groupId>com.github.voho</groupId>
    <artifactId>bloomfeld</artifactId>
    <version>{SPECIFY_VERSION_HERE}</version>
</dependency>
