go-ewah
is a pure Go implementation of the EWAH bitmap compression format with no dependencies other than the assertion library for the tests.
- Read and write EWAH bitmaps using the serialization format used by git.
- Fast read and writes.
go get github.com/erizocosmico/go-ewah
bitmap, err := bitmap.FromBytes(somebytes, binary.BigEndian)
if err != nil {
// check error
}
value, err := bitmap.Get(6)
if err != nil {
// check error
}
fmt.Println(value) // either `true` or `false`
bitmap, err := bitmap.FromReader(somereader, binary.BigEndian)
if err != nil {
// check error
}
// the rest is the same as the previous example
By default all bits are 0, so we only call Set(bitPos)
to mark the ones. Once a bit N has been set, you can't set a bit M whose index is lower than the index of N.
b := bitmap.New()
if err := b.Set(6); err != nil {
// handle error
}
b := bitmap.New()
// fill the bitmap
w := bytes.NewBuffer(nil)
bytesWritten, err := b.Write(w, binary.BigEndian)
if err != nil {
// handle error
}
For more details regarding the compression format, please see Section 3 of the following paper:
Daniel Lemire, Owen Kaser, Kamel Aouiche, Sorting improves word-aligned bitmap indexes. Data & Knowledge Engineering 69 (1), pages 3-28, 2010. http://arxiv.org/abs/0901.3751
Or check the Java reference implementation.
$ go test . -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: github.com/erizocosmico/go-ewah
BenchmarkBitmapGet-4 100000000 17.5 ns/op 0 B/op 0 allocs/op
BenchmarkBitmapWrite-4 5000000 338 ns/op 64 B/op 9 allocs/op
BenchmarkBitmapRead-4 3000000 537 ns/op 272 B/op 12 allocs/op
BenchmarkBitmapSet-4 200000000 8.14 ns/op 0 B/op 0 allocs/op
PASS
ok github.com/erizocosmico/go-ewah 44.797s
Benchmarks run on a MacBook Pro Retina (mid-2014) with a 2,6 GHz Intel Core i5 processor running macOS High Sierra 10.13.3.
MIT, see LICENSE