jbapple / libfilter Goto Github PK

View Code? Open in Web Editor NEW

30.0 5.0 6.0 2.17 MB

High-speed Bloom filters and taffy filters for C, C++, and Java

License: Apache License 2.0

Makefile 3.10% C++ 41.18% C 31.66% Java 16.81% CMake 2.15% Go 2.25% Python 2.85%

c cpp data-structures library bloom-filter bloom-filters bloomfilter java

libfilter's People

Contributors

Stargazers

Watchers

Forkers

enddynayn zabrane tenzir zyc1314 ttsugriy

libfilter's Issues

Sort coupled (aka static) filter before peeling

See the Binary Fuse Filter paper for the rationale. It increases locality.

Support Union and Intersection for more filter types

[Java] Block filter has high fpp

I noticed the com.github.jbapple.libfilter.TaffyBlockFilter has a high fpp (100% at some point). Likely there is a simple bug:

private void MakeMask(long hash, int[] result) {
    // The loops are done individually for easier auto-vectorization by the JIT
    for (int i = 0; i < 8; ++i) {
      result[0] = (int) hash * INTERNAL_HASH_SEEDS[i]; <== BUG HERE
    }
    for (int i = 0; i < 8; ++i) {
      result[i] = result[i] >>> (32 - 5);
    }
    for (int i = 0; i < 8; ++i) {
      result[i] = 1 << result[i];
    }
  }

To me it looks like it should be result[i] instead of result[0].

In Java XorFilter, migrate from arrays of structs to structs of arrays

There's no need for null values in the internal arrays used for constructing filter via peeling

Benchmark for different filters

Thanks for the amazing work.
Can we please also add some benchmarking results for different filters.
Trying to understand the Build latency , query latency , fpp rates and bits per value for these different filters to choose something that fits under some of opur sla requirements.
use case is upper bound 50-100 million values to be added , for a given bits/value , what would be the memory size for the filter and build latency when adding huge number of elements to the filter

SIMD for cuckoo filters

API documentation

Support multi-threading

Ensure binary compatibility between Java and C filters

Rename XorFilter to StaticFilter in Java

Submit to PyPI

SIMD for peeling filters

It is possible cache the bloom filter to NVMe? (support filters larger than 2^38 bytes)

libfilter is very fast.
I have about 2^36 items to add in bf, it is possible cache the bloom filter to NVMe?

Save the filter to SSD

Thank you for implementing this awesome filter, I have a question I would like to ask, I wonder if you can give me some pointers.
I want to use C/C++ to save the filter to SSD so that it can be load again next time, or directly use mmap the filter to the memory instead of reading the entire filter into the memory.
grateful.

serialization and cross language/platform support

Are these filters serializable ?
can we load and serialize filters in a compatible way in different languages and platform ?

Submit to pkg.go.dev

_mm256_testc_si256 performance is bad

@jbapple I test the blocked bloom filter (which is https://github.com/FastFilter/fastfilter_cpp/blob/master/src/bloom/simd-block.h version) in my codes. The vtune result shows that Find avx2 implementation's _mm256_testc_si256 opcode's performance is bad.

So, do we have other tricks to improve it ? Maybe a prefetch will help ?

Java Taffy Block Filter bug

Hi! I've found an important bug in TaffyBlockFilter Java realization which causes creation of new Block Filters with the same fpp.
I guess, in this line there should be x instead of cursor.

libfilter/java/libfilter/src/main/java/com/github/jbapple/libfilter/TaffyBlockFilter.java

Line 95 in 4ebeaef

ndv, fpp / Math.pow(cursor + 1, 2) * 6 / Math.pow(Math.PI, 2));

split block bloom filters implementation

While i am going through some amazing work you are doing , wanted to get your thoughts on some implementation details or caveats

IIUC SBBF is same as in https://github.com/apache/kudu/blob/master/src/kudu/util/block_bloom_filter.h ? Are there any implementation difference or caveats in your particular approach (any significant perf difference) ? Bit confused around terminology here between block and split block so wanted to understand better if your implementation is something additional on top of classical blocked bloom filter.

I found your repo through : https://arxiv.org/pdf/2101.01719.pdf , it says for the split block bloom fpp are somewhat fixed due to pre-determined number of hash functions used . In your repo example is see double fpp = 0.0065; , is this because we are using a fixed number of hashes and bits per value (pre-configured at the given fpp level) . Am i able to tune this for lower fpp by using more hash functions and may be more bits per value ? What is the current default bits per value set in current SBBF implementation ? which parameter is that ?

In the apache kudu project which uses block bloom filter , here : https://github.com/apache/kudu/blob/master/src/kudu/util/block_bloom_filter.h#L66 it says we are able to get 0.1% fpp with 15 bits per value , is it possible to achieve this in your split block BF implementation as well , or are we stuck to 0.4% - 19% range only ,

thanks for bearing with my naive questions as i am new into this area and still trying to figure out best filter for my case where i have strict latency requirement and need to insert ~10-100 million ids , these ids are not uuid but sequential range [0-100M] , so was trying to use roaring bitmaps before but now looking into this probabilistic filters since constructing filters is a potential bottleneck for me .