dgraph-io / badger
Fast key-value DB in Go.
Home Page: https://dgraph.io/badger
License: Apache License 2.0
heya, I'm trying out badger as an alternative to leveldb. When I ported my code over, I found that there is no data being synced to disk:
total 252K
drwxrwxr-x 2 peter peter 4.0K Jun 28 14:28 .
drwxrwxrwt 710 root root 244K Jun 28 14:28 ..
-rw-rw-r-- 1 peter peter 0 Jun 28 14:28 000000.vlog
-rw-rw-r-- 1 peter peter 0 Jun 28 14:28 clog
Note that the log.Println("wrote", item.ID) line gets called for each item, and no error is produced by log.Println("write error", err).
I have tested the encoding/decoding logic and it works fine with leveldb, so don't worry about that.
any help would be appreciated :)
// open database connection
conn := &badger.Connection{}
conn.Open(leveldbPath)
defer conn.Close()
package badger

import "github.com/dgraph-io/badger/badger"

// Connection - wraps the badger KV handle
type Connection struct {
	DB *badger.KV
}

// Open - open connection and set up
func (c *Connection) Open(path string) {
	opt := badger.DefaultOptions
	opt.Dir = path
	opt.ValueDir = opt.Dir
	db, err := badger.NewKV(&opt)
	if err != nil {
		panic(err)
	}
	c.DB = db
}

// Close - close connection and clean up
func (c *Connection) Close() {
	c.DB.Close()
}
package badger

import (
	"encoding/binary"
	"log"
	"math"

	"github.com/dgraph-io/badger/badger"
	"github.com/missinglink/gosmparse"
)

// WriteCoord - encode and write lat/lon pair to db
func (c *Connection) WriteCoord(item gosmparse.Node) error {
	// encode id
	key := make([]byte, 8)
	binary.BigEndian.PutUint64(key, uint64(item.ID))
	// encode lat
	lat := make([]byte, 8)
	binary.BigEndian.PutUint64(lat, math.Float64bits(item.Lat))
	// encode lon
	lon := make([]byte, 8)
	binary.BigEndian.PutUint64(lon, math.Float64bits(item.Lon))
	// value
	value := append(lat, lon...)
	// write to db
	err := c.DB.Set(key, value)
	log.Println("wrote", item.ID)
	if err != nil {
		log.Println("write error", err)
		return err
	}
	return nil
}
// ReadCoord - read lat/lon pair from db and decode
func (c *Connection) ReadCoord(id int64) (*gosmparse.Node, error) {
	// encode id
	key := make([]byte, 8)
	binary.BigEndian.PutUint64(key, uint64(id))
	// read from db
	var item badger.KVItem
	err := c.DB.Get(key, &item)
	if err != nil {
		return nil, err
	}
	data := item.Value()
	// decode item
	return &gosmparse.Node{
		ID:  id,
		Lat: math.Float64frombits(binary.BigEndian.Uint64(data[:8])),
		Lon: math.Float64frombits(binary.BigEndian.Uint64(data[8:])),
	}, nil
}
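Since the encode/decode halves mirror each other, they can be sanity-checked in isolation from the KV store. A minimal sketch using only the standard library (encodeCoord/decodeCoord are hypothetical stand-ins for the bodies of WriteCoord/ReadCoord above):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeCoord packs a lat/lon pair the same way WriteCoord does:
// two big-endian IEEE-754 float64 values, 16 bytes total.
func encodeCoord(lat, lon float64) []byte {
	buf := make([]byte, 16)
	binary.BigEndian.PutUint64(buf[:8], math.Float64bits(lat))
	binary.BigEndian.PutUint64(buf[8:], math.Float64bits(lon))
	return buf
}

// decodeCoord is the inverse, mirroring ReadCoord.
func decodeCoord(data []byte) (lat, lon float64) {
	lat = math.Float64frombits(binary.BigEndian.Uint64(data[:8]))
	lon = math.Float64frombits(binary.BigEndian.Uint64(data[8:]))
	return lat, lon
}

func main() {
	value := encodeCoord(52.52, 13.405)
	lat, lon := decodeCoord(value)
	fmt.Println(lat == 52.52 && lon == 13.405) // exact round-trip
}
```

Because math.Float64bits / Float64frombits are exact bit casts, the round-trip is lossless, which rules out the encoding itself as the source of the missing data.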
$ go version
go version go1.8.3 linux/amd64
$ uname -a
Linux peterpro 4.6.0-040600-generic #201606100558 SMP Fri Jun 10 10:01:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/issue
Ubuntu 16.04.2 LTS \n \l
$ cd ${GOPATH}/src/github.com/dgraph-io/badger/
$ git log -1
commit 8809f1c3328a7f17a5bf89a1a14f4ee11124e4c6
Author: Jason Chu <[email protected]>
Date: Wed Jun 28 15:02:09 2017 +0800
2017/06/14 09:26:25 error.go:62: Arena too small, toWrite:18 newTotal:134217742 limit:134217728
github.com/dgraph-io/badger/y.AssertTruef
/home/ubuntu/go/src/github.com/dgraph-io/badger/y/error.go:62
github.com/dgraph-io/badger/skl.(*Arena).PutKey
/home/ubuntu/go/src/github.com/dgraph-io/badger/skl/arena.go:70
github.com/dgraph-io/badger/skl.newNode
/home/ubuntu/go/src/github.com/dgraph-io/badger/skl/skl.go:120
github.com/dgraph-io/badger/skl.(*Skiplist).Put
/home/ubuntu/go/src/github.com/dgraph-io/badger/skl/skl.go:308
github.com/dgraph-io/badger/badger.(*KV).writeToLSM
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/kv.go:445
github.com/dgraph-io/badger/badger.(*KV).writeRequests
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/kv.go:498
github.com/dgraph-io/badger/badger.(*KV).doWrites
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/kv.go:534
runtime.goexit
/usr/lib/go-1.8/src/runtime/asm_amd64.s:2197
Sometimes I get this error while restarting Dgraph for development.
2017/07/07 13:31:12 error.go:55: Assert failed
github.com/dgraph-io/badger/y.AssertTrue
/home/pawan/go/src/github.com/dgraph-io/badger/y/error.go:55
github.com/dgraph-io/badger.getKeyRange
/home/pawan/go/src/github.com/dgraph-io/badger/compaction.go:51
github.com/dgraph-io/badger.(*levelsController).fillTablesL0
/home/pawan/go/src/github.com/dgraph-io/badger/levels.go:343
github.com/dgraph-io/badger.(*levelsController).doCompact
/home/pawan/go/src/github.com/dgraph-io/badger/levels.go:478
github.com/dgraph-io/badger.(*levelsController).runWorker
/home/pawan/go/src/github.com/dgraph-io/badger/levels.go:161
Hi
I am very glad to see this project and want to try it on my own project. I need the snapshot feature to dump the database concurrently but now Badger misses it, do you have any plan to support this?
Thanks!
Hi there
I just started trying out badger and it seems very nice so far!
I'm not sure about the behavior of the iterator at one point.
package badger

import (
	"fmt"
	"strconv"
	"testing"

	"github.com/dgraph-io/badger/badger"
)

func TestWrite(t *testing.T) {
	opts := badger.DefaultOptions
	opts.Dir = "data"
	opts.ValueDir = "data"
	kv, err := badger.NewKV(&opts)
	if err != nil {
		panic(err)
	}
	defer kv.Close()
	for i := 0; i < 15000; i++ {
		err = kv.Set([]byte("test_"+strconv.Itoa(i)), []byte("value_"+strconv.Itoa(i)))
		if err != nil {
			panic(err)
		}
	}
}

func TestIterator(t *testing.T) {
	opts := badger.DefaultOptions
	opts.Dir = "data"
	opts.ValueDir = "data"
	kv, err := badger.NewKV(&opts)
	if err != nil {
		panic(err)
	}
	defer kv.Close()
	optsi := badger.DefaultIteratorOptions
	it := kv.NewIterator(optsi)
	var kvItem *badger.KVItem
	it.Rewind() // it.Next() produces an error without it.Rewind()
	for it.Next(); it.Valid(); it.Next() {
		kvItem = it.Item()
		fmt.Println("k", string(kvItem.Key()), "v", string(kvItem.Value()), "c", kvItem.Counter())
	}
}
I called TestWrite and then TestIterator. Without it.Rewind() before the loop, you will encounter an error.
My question is pretty simple: do I need to call Rewind at this point, or is it a bug?
We evaluated badger (mostly for fun) when discussing external storage for CockroachDB, and it didn't go well. There is probably a lot of low-hanging fruit there; apologies for leaving relatively unpolished work here, but I thought it could be a good starting point for making badger more competitive. See
cockroachdb/cockroach#16069 (comment)
and
cockroachdb/cockroach#16343 (the branch is still available).
Unfortunately, the standard invocation
make bench PKG=./pkg/sql/distsqlrun TESTS=TestDiskSorts TESTFLAGS=-v
no longer compares badger against the other options; that part was abandoned. However, the history of the branch does have a func sortBadger(t *testing.T, input RowSource, out procOutputHelper), so it's probably just a matter of reverting appropriately.
Also implemented is (ss *sortAllStrategy) sortBadger(s *sorter), though @asubiotto would have to remind us how to actually exercise that code. My understanding is that this bit is best ignored.
I don't think we're going to put any real work into this, but hope it's useful to you in some way.
Firstly, I love what you guys have made, I think Badger is amazing and I really appreciate the creation of a really fast pure Go key-value store. Thank you so much for it.
EDIT: Sorry! The first problem is actually documented, my bad! The second problem I had still is unexplained though. I've edited the video link to skip to it.
So it appears that concurrent reads then a write can sometimes not trigger a CAS mismatch error, I've made a screen capture to demonstrate this: https://youtu.be/iMOskJT--2k?t=3m1s
Here's the program I wrote to demonstrate this: https://gist.github.com/1lann/30cc4c981449fa13db7e8d9943dfc2e6
I also have a question: is there a defined way to use CompareAndSet for non-existent documents? For example, if I read a document and find that it doesn't exist, so I decide to create it, is there a way to check that nothing else created it concurrently in between me checking and writing? Thanks.
If you need it, here is my Go version and environment:
# go version
go version go1.8.3 darwin/amd64
# go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/jason/Workspace/Go"
GORACE=""
GOROOT="/usr/local/Cellar/go/1.8.3/libexec"
GOTOOLDIR="/usr/local/Cellar/go/1.8.3/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/mj/9h_k_k_1751_7c85z0b2_l0w0000gn/T/go-build899730521=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
This happens regularly on Travis macOS when running tests for Dgraph. Have to check how to reproduce it locally.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x43a9a7a]
goroutine 30675 [running]:
github.com/dgraph-io/badger/skl.(*Skiplist).IncrRef(0x0)
/Users/travis/gopath/src/github.com/dgraph-io/badger/skl/skl.go:95 +0xa
github.com/dgraph-io/badger.(*KV).getMemTables(0xc42011c600, 0x0, 0x0, 0x0, 0x0)
/Users/travis/gopath/src/github.com/dgraph-io/badger/kv.go:316 +0x10c
github.com/dgraph-io/badger.(*KV).get(0xc42011c600, 0xc4604419d0, 0x10, 0x10, 0x0, 0x0, 0x0, 0x4a00, 0x0, 0x0)
/Users/travis/gopath/src/github.com/dgraph-io/badger/kv.go:359 +0x7d
github.com/dgraph-io/badger.(*KV).Get(0xc42011c600, 0xc4604419d0, 0x10, 0x10, 0xc474544b80, 0xa71eb65795429c54, 0x402c316)
/Users/travis/gopath/src/github.com/dgraph-io/badger/kv.go:374 +0x67
github.com/dgraph-io/dgraph/posting.getNew(0xc4604419d0, 0x10, 0x10, 0xc42011c600, 0x0)
/Users/travis/gopath/src/github.com/dgraph-io/dgraph/posting/list.go:185 +0x1d0
github.com/dgraph-io/dgraph/posting.GetOrCreate(0xc4604419d0, 0x10, 0x10, 0xc400000001, 0x0, 0x0)
/Users/travis/gopath/src/github.com/dgraph-io/dgraph/posting/lists.go:345 +0x1a3
github.com/dgraph-io/dgraph/worker.processTask(0x6363230, 0xc47a0966c0, 0xc4b7961500, 0x1, 0x0, 0x0, 0x0)
/Users/travis/gopath/src/github.com/dgraph-io/dgraph/worker/task.go:292 +0xac8
github.com/dgraph-io/dgraph/worker.ProcessTaskOverNetwork(0x6363230, 0xc47a0966c0, 0xc4b7961500, 0x0, 0x0, 0x0)
/Users/travis/gopath/src/github.com/dgraph-io/dgraph/worker/task.go:63 +0x8a9
github.com/dgraph-io/dgraph/query.ProcessGraph(0x4c8ed00, 0xc47a0966c0, 0xc4538186c0, 0xc4d4443d40, 0xc47a096960)
/Users/travis/gopath/src/github.com/dgraph-io/dgraph/query/query.go:1528 +0x260a
created by github.com/dgraph-io/dgraph/query.ProcessGraph
/Users/travis/gopath/src/github.com/dgraph-io/dgraph/query/query.go:1758 +0x110f
Hello,
I ran a benchmark writing 1M records to badger, which took about 18 minutes. I forgot to close the database, and the database size on disk was 0, i.e. badger didn't write anything to disk.
Doesn't badger sync anything to disk across 18 minutes and 1M records?
And isn't 18 minutes quite slow for memory-only operations?
PS: default options, simple Put() of random strings of the form "event RANDOM_INTEGER".
Got this while running some tests in dgraph with the race flag.
==================
WARNING: DATA RACE
Write at 0x00c4201345d0 by goroutine 23:
sync/atomic.CompareAndSwapInt32()
/usr/local/go/src/runtime/race_amd64.s:293 +0xb
github.com/dgraph-io/badger/skl.(*Skiplist).Put()
/home/pawan/go/src/github.com/dgraph-io/badger/skl/skl.go:313 +0x2bc
github.com/dgraph-io/badger.(*KV).writeToLSM()
/home/pawan/go/src/github.com/dgraph-io/badger/kv.go:461 +0x30f
github.com/dgraph-io/badger.(*KV).writeRequests()
/home/pawan/go/src/github.com/dgraph-io/badger/kv.go:520 +0x495
github.com/dgraph-io/badger.(*KV).doWrites()
/home/pawan/go/src/github.com/dgraph-io/badger/kv.go:556 +0x369
Previous read at 0x00c4201345d0 by goroutine 101:
github.com/dgraph-io/badger/skl.(*Skiplist).findNear()
/home/pawan/go/src/github.com/dgraph-io/badger/skl/skl.go:192 +0x68
github.com/dgraph-io/badger/skl.(*Skiplist).Get()
/home/pawan/go/src/github.com/dgraph-io/badger/skl/skl.go:373 +0x7e
github.com/dgraph-io/badger.(*KV).get()
/home/pawan/go/src/github.com/dgraph-io/badger/kv.go:363 +0x121
github.com/dgraph-io/badger.(*KV).Get()
/home/pawan/go/src/github.com/dgraph-io/badger/kv.go:374 +0x7a
github.com/dgraph-io/dgraph/posting.getNew()
/home/pawan/go/src/github.com/dgraph-io/dgraph/posting/list.go:185 +0x2bc
github.com/dgraph-io/dgraph/posting.GetOrCreate()
/home/pawan/go/src/github.com/dgraph-io/dgraph/posting/lists.go:341 +0x203
github.com/dgraph-io/dgraph/worker.runMutations()
/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/mutation.go:74 +0x48a
github.com/dgraph-io/dgraph/worker.(*node).processMutation()
/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/draft.go:462 +0x18c
github.com/dgraph-io/dgraph/worker.(*node).process()
/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/draft.go:515 +0x410
Goroutine 23 (running) created at:
github.com/dgraph-io/badger.NewKV()
/home/pawan/go/src/github.com/dgraph-io/badger/kv.go:231 +0x114f
github.com/dgraph-io/dgraph/cmd/dgraph.prepare()
/home/pawan/go/src/github.com/dgraph-io/dgraph/cmd/dgraph/main_test.go:72 +0x1b9
github.com/dgraph-io/dgraph/cmd/dgraph.TestMain()
/home/pawan/go/src/github.com/dgraph-io/dgraph/cmd/dgraph/main_test.go:1271 +0x38
main.main()
github.com/dgraph-io/dgraph/cmd/dgraph/_test/_testmain.go:88 +0x20f
Goroutine 101 (finished) created at:
github.com/dgraph-io/dgraph/worker.(*node).processApplyCh()
/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/draft.go:594 +0x453
Hi, is it normal that Badger consumes 392 MB of resident memory when my database holds 5 MB of data?
CPU usage also looks quite high.
Thanks!
Sorry to be a bother but I was scanning through the code base and noticed these mutex lock / unlocks without a defer on the unlock. Not sure if it was a bug or if there's a reason behind it.
func (c *Closer) SignalAll() {
	c.RLock()
	c.RUnlock() // <---- I would have expected a defer here.
	for _, l := range c.levels {
		l.Signal()
	}
}

func (c *Closer) WaitForAll() {
	c.RLock()
	c.RUnlock() // <---- I would have expected a defer here.
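For comparison, the conventional idiom the comment alludes to defers the unlock, so it also runs on early return or panic. A generic sketch using only the standard library (the levels type here is hypothetical, not the actual Closer code):

```go
package main

import (
	"fmt"
	"sync"
)

type levels struct {
	mu    sync.RWMutex
	names []string
}

// snapshot takes the read lock and releases it via defer, so the
// unlock runs even if the body panics or returns early.
func (l *levels) snapshot() []string {
	l.mu.RLock()
	defer l.mu.RUnlock()
	out := make([]string, len(l.names))
	copy(out, l.names)
	return out
}

func main() {
	l := &levels{names: []string{"L0", "L1"}}
	fmt.Println(l.snapshot())
}
```

An immediate RLock/RUnlock pair, by contrast, only acts as a barrier against a concurrently held write lock; whether that is intended here is exactly the reporter's question.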
2017/05/03 01:00:54 invalid argument
github.com/dgraph-io/badger/y.Wrap
/home/ubuntu/go/src/github.com/dgraph-io/badger/y/error.go:83
github.com/dgraph-io/badger/y.Check
/home/ubuntu/go/src/github.com/dgraph-io/badger/y/error.go:43
github.com/dgraph-io/badger/table.(*Table).DecrRef
/home/ubuntu/go/src/github.com/dgraph-io/badger/table/table.go:82
github.com/dgraph-io/badger/table.(*TableIterator).Close
/home/ubuntu/go/src/github.com/dgraph-io/badger/table/iterator.go:211
github.com/dgraph-io/badger/badger.(*levelHandler).get
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/levels.go:660
github.com/dgraph-io/badger/badger.(*levelsController).get
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/levels.go:613
github.com/dgraph-io/badger/badger.(*KV).get
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/kv.go:261
github.com/dgraph-io/badger/badger.(*valueLog).rewrite.func1
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/value.go:184
github.com/dgraph-io/badger/badger.(*valueLog).rewrite.func2
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/value.go:223
github.com/dgraph-io/badger/badger.(*logFile).iterate
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/value.go:154
github.com/dgraph-io/badger/badger.(*valueLog).rewrite
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/value.go:225
github.com/dgraph-io/badger/badger.(*valueLog).doRunGC
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/value.go:714
github.com/dgraph-io/badger/badger.(*valueLog).runGCInLoop
/home/ubuntu/go/src/github.com/dgraph-io/badger/badger/value.go:589
I noticed that even if I set verbose to false, I get some noise like: "LOG Compact. Iteration to generate one table took: 110.092µs".
Badger is calling fmt.Printf directly, which is not configurable client-side.
Thanks
I am wondering if this is going to be integrated into dgraph?
Partly because I find dgraph awesome, but also to see usage examples and get an idea of good approaches. I want to try using go generate on top of this for some ideas.
I want to try this out and help improve it, so I was wondering if it's worth making a basic roadmap? For example, the WAL looks really interesting for doing data sync/reconciliation between many instances.
Congrats on the release!
I've run a few benchmarks of iteration performance, but got results that diverge from those in the blog post. Is it possible to publish additional details on the test rig/test code used to compare with RocksDB?
No point in prefetching extra data when the values are scattered all over.
Currently we are using uint32 for Fid and uint64 for Offset. It's enough to have uint16 and uint32 respectively.
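To make the saving concrete, here is a hedged sketch of the narrower encoding; the field names follow the {Fid Len Offset} value pointer printed elsewhere in these issues, but the exact layout in badger may differ:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// valuePointer mirrors the {Fid, Len, Offset} triple; the actual
// badger layout may differ -- this only illustrates the width change.
type valuePointer struct {
	Fid    uint16 // was uint32
	Len    uint32
	Offset uint32 // was uint64
}

// encode packs the pointer into a fixed 10-byte buffer
// (2 + 4 + 4), down from 16 bytes (4 + 4 + 8).
func (p valuePointer) encode() []byte {
	buf := make([]byte, 10)
	binary.BigEndian.PutUint16(buf[0:2], p.Fid)
	binary.BigEndian.PutUint32(buf[2:6], p.Len)
	binary.BigEndian.PutUint32(buf[6:10], p.Offset)
	return buf
}

func main() {
	p := valuePointer{Fid: 3, Len: 128, Offset: 4096}
	fmt.Println(len(p.encode())) // 10
}
```

Since these pointers are stored per key in the LSM, shrinking each one from 16 to 10 bytes compounds across the whole tree.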
Do you recommend using badger in production?
Some time ago I played with badger, and between some commits the file format changed and it was not possible to read the previous data.
Also, it would be nice to have some way to get updates on badger.
See if we need to implement bloom filters after implementing the Succinct paper.
Hello,
OS: MacOSX
Badger version: 20170624
First process:
Second process:
Thank you
I was looking at trying this out as it seems like it might fit a good use case for me, but I'm not sure exactly how it should be used. Specifically, there doesn't seem to be any sort of error handling; what happens when things go wrong? A Set call can't always succeed, so how will I know when it doesn't?
Experiment with the Succinct compression algorithm, to see if it could be used to compress tables.
After a crash we replay entries from the value log. We are handling only the case where the value should be stored in the LSM, and omit the case where the value is stored as a value pointer in the tree.
See: https://github.com/dgraph-io/badger/blob/master/badger/kv.go#L181
I tried to do some stress test and hit assertion failure, I managed to simplify the test program to reproduce this:
Seeking at value pointer: {Fid:0 Len:0 Offset:0}
1170637 ... Fid: 0 Data status={total:100.0005292892456 keep:0 discard:99.99941825866699}
=====> REWRITING VLOG 0
rewrite called
1173587 ... 2017/05/23 18:10:41 Assert failed
hangxie/vendor/github.com/dgraph-io/badger/y.Errorf
/go/src/hangxie/vendor/github.com/dgraph-io/badger/y/error.go:103
hangxie/vendor/github.com/dgraph-io/badger/y.AssertTrue
/go/src/hangxie/vendor/github.com/dgraph-io/badger/y/error.go:68
hangxie/vendor/github.com/dgraph-io/badger/badger.(*valueLog).writeToKV
/go/src/hangxie/vendor/github.com/dgraph-io/badger/badger/value.go:305
hangxie/vendor/github.com/dgraph-io/badger/badger.(*valueLog).rewrite
/go/src/hangxie/vendor/github.com/dgraph-io/badger/badger/value.go:264
hangxie/vendor/github.com/dgraph-io/badger/badger.(*valueLog).doRunGC
/go/src/hangxie/vendor/github.com/dgraph-io/badger/badger/value.go:848
hangxie/vendor/github.com/dgraph-io/badger/badger.(*valueLog).runGCInLoop
/go/src/hangxie/vendor/github.com/dgraph-io/badger/badger/value.go:734
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:2197
exit status 1
While wiring up badger for use in IPFS, I've run into an issue which looks like some sort of race condition, after which there is a panic from badger internals on a nil pointer dereference. It's hard for me to create a minimal test case for this, though I'm able to reproduce it with a 100% 'success' rate.
Here is all the info I was able to gather about that issue: https://gist.github.com/magik6k/b9634d85e84fe90906744de0127dbc20
Scenario A is the easiest one to reproduce, though go race detector doesn't detect anything related to it.
Info on messages:
Has is implemented using Get
Query is implemented using NewIterator
I hope this is enough data to solve this issue, if it's not I'm happy to provide more.
When accessing a badger kv store from a goroutine other than the one it was created on, even in a synchronized manner (using a sync.Mutex), I get:
panic: send on closed channel
goroutine 31 [running]:
github.com/dgraph-io/badger/badger.(*KV).BatchSet(0xc4200be580, 0xc42926e000, 0x1, 0x1, 0x0, 0x0)
/.../github.com/dgraph-io/badger/badger/kv.go:541 +0x125
github.com/dgraph-io/badger/badger.(*KV).Set(0xc4200be580, 0xc4250ec020, 0x10, 0x40, 0xc42001e090, 0x10, 0x40, 0xc420032e98, 0x43c6c4)
The code can be found here.
I think it would be good to have BoltDB-style documentation in the README of Badger. We have docs in godocs, so we can use that for reference.
Are there any plans to add support for batched writes?
We store Fid in 2 bytes, which can address roughly 65k files. We need to implement reuse of Fids.
Why is CAS a uint16? Is the kv store consistent for this CAS at that specific time? Is there any lazy (eventual) behavior there?
BTW, your kv's performance is amazing even on conventional (not SSD) hard disks! Nice job!
$ go get -u -a "github.com/dgraph-io/badger/badger"
# github.com/dgraph-io/badger/table
./builder.go:113: constant 4294967295 overflows int
./builder.go:192: constant 4294967295 overflows int
./iterator.go:162: constant 4294967295 overflows int
$ go version
go1.8.1 linux/386
This is a symptom of sloppy coding. You might want to grep -RI "int$" . and verify that all variables have appropriately sized primitives and bounds.
Trying this out, and I'm quite happy with read and write performance, but I've noticed that iterating through items in the datastore is significantly slower than what we're currently using. Is there going to be work on improving that performance?
We should be able to break away the waiting of requests in BatchSet into a separate goroutine, which can wait and then run a callback function. This can be used in Dgraph for getting rid of syncCh.
Currently, to test whether a key exists, I use Get and then test whether the Value is nil.
Does that mean we hit the value log every time, or do we only touch the memtable and LSM?
It seems that we're creating files but not calling fsync() (in Go: .Sync()) on the directory containing said files. If we don't do that, a system crash can result in the directory entry (in other words, the file) going missing.
Hello, I've just installed using go get github.com/dgraph-io/badger. I then went to the badger folder and ran go test. I got this error message:
# github.com/dgraph-io/badger
kv_test.go:29:2: cannot find package "github.com/stretchr/testify/assert" in any of:
/usr/local/go/src/github.com/stretchr/testify/assert (from $GOROOT)
/Users/carlca/Code/go/src/github.com/stretchr/testify/assert (from $GOPATH)
FAIL github.com/dgraph-io/badger [setup failed]
It would be handy if, at the very least, there were some installation instructions detailing such requirements.
Cheers,
Carl
I'm trying to sweep through my KV store and delete all keys with a particular prefix. I assume at some point I may have millions of keys, so I've set up the deletion to sweep through a prefix, collect X keys, delete them, and try again until no more keys are found.
I seem to be seeing the same keys multiple times. In one instance the code went into an infinite loop. I haven't been able to replicate that instance, but the code below seems to insert 1000 entries with a prefix and then delete 1204 of them.
package main

import (
	"bytes"
	"fmt"
	"log"
	"os"

	"github.com/dgraph-io/badger/badger"
)

func main() {
	DB_PATH := "./test.db"
	_, err := os.Stat(DB_PATH)
	if os.IsNotExist(err) {
		os.Mkdir(DB_PATH, 0700)
	}
	opts := badger.DefaultOptions
	opts.Dir = DB_PATH
	opts.ValueDir = DB_PATH
	kv, err := badger.NewKV(&opts)
	if err != nil {
		log.Printf("Error: %s", err)
	}
	PREFIX := "pre"
	for i := 0; i < 1000; i++ {
		k := []byte(fmt.Sprintf("%s%d", PREFIX, i))
		kv.Set(k, []byte{})
	}
	for i := 0; i < 1000; i++ {
		k := []byte(fmt.Sprintf("%s%d", "ALT", i))
		kv.Set(k, []byte{})
	}
	DELETE_BLOCK_SIZE := 100
	deleted := 0
	for found := true; found; {
		found = false
		wb := make([]*badger.Entry, 0, DELETE_BLOCK_SIZE)
		opts := badger.DefaultIteratorOptions
		opts.FetchValues = false
		it := kv.NewIterator(opts)
		for it.Seek([]byte(PREFIX)); it.Valid() && bytes.HasPrefix(it.Item().Key(), []byte(PREFIX)) && len(wb) < DELETE_BLOCK_SIZE; it.Next() {
			wb = badger.EntriesDelete(wb, it.Item().Key())
		}
		it.Close()
		if len(wb) > 0 {
			log.Printf("Deleting: %d", len(wb))
			deleted += len(wb)
			kv.BatchSet(wb)
			found = true
		}
	}
	log.Printf("Total Deleted:%d", deleted)
	kv.Close()
}
This then outputs:
$ go run db_del.go
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 100
2017/06/18 23:10:59 Deleting: 4
2017/06/18 23:10:59 Total Deleted:1204
Saw these logs on Dgraph Travis. Not reproducible, though. Just wanted to bring it to notice.
TESTING: github.com/dgraph-io/dgraph/gql
2017/06/07 12:10:37 error.go:45: EOF
Unable to replay value log: "/tmp/000000.vlog"
github.com/dgraph-io/badger/badger.(*valueLog).Replay
/home/travis/gopath/src/github.com/dgraph-io/badger/badger/value.go:574
github.com/dgraph-io/badger/badger.NewKV
/home/travis/gopath/src/github.com/dgraph-io/badger/badger/kv.go:226
github.com/dgraph-io/dgraph/gql.TestMain
/home/travis/gopath/src/github.com/dgraph-io/dgraph/gql/parser_test.go:3359
main.main
github.com/dgraph-io/dgraph/gql/_test/_testmain.go:520
runtime.main
/home/travis/.gimme/versions/go1.8.linux.amd64/src/runtime/proc.go:185
runtime.goexit
/home/travis/.gimme/versions/go1.8.linux.amd64/src/runtime/asm_amd64.s:2197
exit status 1
FAIL github.com/dgraph-io/dgraph/gql 0.021s
I built an app using Go and I'm trying to use badger as the key-value store. I have successfully set and gotten values, but I found a weird error when I stop the binary and start it again. This is my error message:
2017/05/29 13:12:41 Assert failed
github.com/dgraph-io/badger/y.Errorf
/go/src/github.com/dgraph-io/badger/y/error.go:98
github.com/dgraph-io/badger/y.AssertTrue
/go/src/github.com/dgraph-io/badger/y/error.go:63
github.com/dgraph-io/badger/badger.(*levelHandler).replaceTables
/go/src/github.com/dgraph-io/badger/badger/levels.go:131
github.com/dgraph-io/badger/badger.(*levelsController).doCompact.func2
/go/src/github.com/dgraph-io/badger/badger/levels.go:506
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:2197
How to reproduce this error:
1. Call the badger.NewKV(badger.DefaultOptions) method in the main function
2. defer kv.Close() in the main function too
I have tried some cases, and it seems that if I call badger.NewKV, close it without any insertion to the store, and then call badger.NewKV again, it will trigger the error.
If I delete the store directory, my app can start again, but the data is gone because it was deleted.
Did I do anything wrong? I can't really debug it because the error message does not contain much information about what is happening. Thanks for the help.
What would be the consequences of having the value log in a different directory than the rest of the files?
I'm thinking of having two kinds of disk: one for the LSM tree and one for the value log. I was thinking of using a slower disk for the value log to lower the cost further.
I dived into the code a bit and have a PR ready to enable this feature if you are interested.
Is it possible to add this feature to badger?
Hi,
I'm trying to use badger, but I've got errors about undefined sort.Slice().
When I tried to run
go get -t -v github.com/dgraph-io/badger
then,
src/github.com/dgraph-io/badger/level_handler.go:64: undefined: sort.Slice
src/github.com/dgraph-io/badger/level_handler.go:69: undefined: sort.Slice
src/github.com/dgraph-io/badger/levels.go:200: undefined: sort.Slice
src/github.com/dgraph-io/badger/levels.go:363: undefined: sort.Slice
src/github.com/dgraph-io/badger/value.go:271: undefined: sort.Slice
src/github.com/dgraph-io/badger/value.go:493: undefined: sort.Slice
I'm running golang 1.7.6 on Fedora 25
I am trying to compile a system with
src/vendor/github.com/dgraph-io/badger/kv.go:139: constant 2147483648 overflows int
You can repro it for any 32-bit system.
I'm trying badger as a replacement for github.com/syndtr/goleveldb, and one thing that hit me immediately is that there are no errors in the badger Go API. Any error will just exit the process through log.Fatalf. Is that a design decision or just temporary?
Is there a hard coded value size limit for badger?
Since badger saves keys and pointer to values, it seems it would not have problems with a large value size - say 2 MB.
For example, I need to do something like this SQL:
SELECT value
FROM table
WHERE key >= 'key1'
AND key <= 'key2'
Does badger have a function or command to do this?
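There is no single command for this, but an iterator gives the same result: Seek to the lower bound and stop once the key passes the upper bound. The inclusive bound test itself is just bytes.Compare; a standalone sketch (inRange is a hypothetical helper; in badger it would guard a loop started with Seek on the lower bound):

```go
package main

import (
	"bytes"
	"fmt"
)

// inRange reports whether key lies in the inclusive range [lo, hi],
// matching WHERE key >= 'key1' AND key <= 'key2'.
func inRange(key, lo, hi []byte) bool {
	return bytes.Compare(key, lo) >= 0 && bytes.Compare(key, hi) <= 0
}

func main() {
	lo, hi := []byte("key1"), []byte("key2")
	for _, k := range []string{"key0", "key1", "key15", "key2", "key3"} {
		fmt.Println(k, inRange([]byte(k), lo, hi))
	}
}
```

Note that keys compare lexicographically as byte strings, so "key15" falls inside ["key1", "key2"]; that matches the SQL semantics for string keys but not for numeric ones.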
Running go test -race ./... causes some tests to fail.
I get this crash when running my program, which embeds badger @ bc04380.
panic: runtime error: index out of range
goroutine 55 [running]:
github.com/ethereum/go-ethereum/vendor/github.com/dgraph-io/badger/badger.(*KV).updateOffset(0xc420107680, 0x0, 0x0, 0x0)
/Users/fjl/develop/eth/src/github.com/ethereum/go-ethereum/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/dgraph-io/badger/badger/kv.go:316 +0x10c
github.com/ethereum/go-ethereum/vendor/github.com/dgraph-io/badger/badger.(*KV).writeRequests(0xc420107680, 0xc42560beb8, 0x1, 0xa)
/Users/fjl/develop/eth/src/github.com/ethereum/go-ethereum/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/dgraph-io/badger/badger/kv.go:392 +0x407
github.com/ethereum/go-ethereum/vendor/github.com/dgraph-io/badger/badger.(*KV).doWrites(0xc420107680, 0xc42013a450)
/Users/fjl/develop/eth/src/github.com/ethereum/go-ethereum/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/dgraph-io/badger/badger/kv.go:420 +0x2b7
created by github.com/ethereum/go-ethereum/vendor/github.com/dgraph-io/badger/badger.NewKV
/Users/fjl/develop/eth/src/github.com/ethereum/go-ethereum/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/dgraph-io/badger/badger/kv.go:152 +0x593
To reproduce:
git clone --branch badger-exp https://github.com/fjl/go-ethereum
cd go-ethereum
make
./build/bin/geth --badgerdb --datadir=DATA
geth is a client for the Ethereum cryptocurrency network and is probably a really good smoke test for your KV store because it's read/write heavy. When you run it, blockchain data will be downloaded from the p2p network and stored in badger (at least that's what should happen; it doesn't really, though, because of the crash). The commit that adds experimental support for badger is fjl/go-ethereum@a7f7238.
Steps to reproduce:
1. Compile and run dgraph on branch badger: ./dgraph
2. Load goldendata using dgraphloader: ./dgraphloader -r ~/go/src/github.com/dgraph-io/benchmarks/data/goldendata.rdf.gz
3. As soon as dgraphloader finishes, issue a shutdown command to the server: curl localhost:8080/admin/shutdown
4. Start Dgraph again: ./dgraph
5. Now try taking a backup: curl localhost:8080/admin/backup
Dgraph panics with error:
panic: runtime error: slice bounds out of range
goroutine 19899 [running]:
github.com/dgraph-io/dgraph/x.Parse(0xc43331498a, 0x6, 0x6, 0x2)
/home/pawan/go/src/github.com/dgraph-io/dgraph/x/keys.go:195 +0x236
github.com/dgraph-io/dgraph/worker.backup(0x1, 0xcd16da, 0x6, 0x0, 0x0)
/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/backup.go:254 +0x85e
github.com/dgraph-io/dgraph/worker.handleBackupForGroup(0x7f6d936a4970, 0xc4200105b0, 0x627d3dd58c69d20, 0xc400000001, 0x0)
/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/backup.go:330 +0xa83
github.com/dgraph-io/dgraph/worker.BackupOverNetwork.func1(0xc42b4f6d20, 0x7f6d936a4970, 0xc4200105b0, 0x1)
/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/backup.go:445 +0x4b
created by github.com/dgraph-io/dgraph/worker.BackupOverNetwork
/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/backup.go:446 +0x138