steveyen / gkvlite
Simple, ordered, key-value persistence library for the Go Language
License: MIT License
Transactions (for any combination of reading and writing during the transaction) maintain a consistent view of the DB during some operation, typically an upsert-like one: read value V under key K, update part of V so that it's based on its previous value, and finally save V under K.
Having multiple goroutines (or even real threads from a pool), where one is a mutator and the others are readers, makes transactions a requirement.
If one needs only non-persistent transactions, then synchronization primitives might be enough (though not easy to grasp from the code). But if one expects transactions to take longer, then such blocking behavior is not an option, and persistent transactions become a requirement.
The library should also support auto-expiration for defined keys.
If I understand correctly how it works, the fix should be:
diff --git a/collection.go b/collection.go
index 6e92191..1a91b0c 100644
--- a/collection.go
+++ b/collection.go
@@ -245,6 +245,7 @@ func (t *Collection) EvictSomeItems() (numEvicted uint64) {
})
if i != nil && err != nil {
t.store.ItemDecRef(t, i)
+ numEvicted++
}
return numEvicted
}
Thanks for the awesome library!
I'm trying to reuse a byte slice as a key in a for loop, and it appears that gkvlite keeps using the same key.
Some quick code to show an example:
b := make([]byte, 4)
var i uint32 = 0
for {
    if i >= 10 {
        break
    }
    binary.LittleEndian.PutUint32(b, i)
    c.Set(b, []byte(""))
    i++
}
This seems to keep reusing the same key (the file size is equal to that of one entry). If you create a new byte slice though on each iteration and use that, it seems to work as expected:
var i uint32 = 0
for {
    if i >= 10 {
        break
    }
    b := make([]byte, 4)
    binary.LittleEndian.PutUint32(b, i)
    c.Set(b, []byte(""))
    i++
}
Any ideas?
I would like to be able to evict an item (or at least its value) from memory.
Reopening a database as advised in #4 (comment) is suboptimal, as it introduces long pauses, and memory usage is still rather high, because items are stored in memory all the time until I call Write, Flush, and reopen the database.
Can you provide the following methods, please?
func (t *Collection) EvictValue(i *Item)
func (t *Collection) EvictItem(i *Item)
I would call one of them after I call Set or Get to reduce memory usage.
I haven't confirmed, but I can't see how this thing could possibly return an error.
In general, I think anything more than two or three levels of nesting is a sign that something is not right. In this case, there are something like nine levels of conditionals nested, each declaring a distinct err var and throwing it away.
It seems like it should be possible to make an implementation that's correct and less... wide.
Hey! While working on a downstream project, I traced a problem that I was experiencing back to what I believe is a corruption bug that manifests when using snapshots. I haven't had time to burrow into it yet, but I came away with this fairly minimal reproduction:
package main

import (
    "fmt"

    "github.com/luci/gkvlite"
)

func main() {
    // To trigger: secondKey < firstKey, secondKey < lookup
    firstKey, secondKey, lookup := "c", "a", "b"
    s, _ := gkvlite.NewStore(nil)
    c := s.SetCollection("", nil)
    c.Set([]byte(firstKey), []byte("foo"))
    // If these next two lines don't run, everything works great.
    snap1 := s.Snapshot()
    snap1.Close()
    c.Set([]byte(secondKey), []byte("bar"))
    v, err := c.Get([]byte(lookup))
    fmt.Println(v, err)
}
Prints: [] missing item after item.read() in GetItem()
If the triggering conditions aren't met, if I remove the two snap1 lines, or if I don't close the snapshot, everything works as expected. It looks like there is corruption of some sort caused by creating and closing snapshots.
I'll take a look deeper tomorrow; just reporting this since now I have a repro.
Hi there,
Our continuous build started failing occasionally with the following error:
store_test.go:1111: expected concurrent set to work on key: [65], got: concurrent mutation attempted
store_test.go:1111: expected concurrent set to work on key: [50], got: concurrent mutation attempted
So far I haven't been able to reproduce on a desktop machine, but perhaps there's a race that is being exposed by changes in Go1.5?
Hello :-)
Can you explain how to make this correct with gkvlite:
We have key "Key" with value "100". Two clients try to change the value of "Key" by subtracting 100 (both read the value simultaneously), but by application logic the value of "Key" cannot be less than 0, so only one of the two may perform the subtraction.
WBR Paul.
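One answer is to make the whole read-check-write sequence atomic with a lock, so the two clients cannot interleave between the read and the write. This is a sketch only: the map stands in for a collection, and in a real program the reads and writes inside the critical section would be the store's own Get/Set calls.

```go
package main

import (
    "errors"
    "fmt"
    "strconv"
    "sync"
)

// ErrInsufficient is returned when a subtraction would take the value below 0.
var ErrInsufficient = errors.New("value would drop below 0")

var mu sync.Mutex // one lock guarding the read-modify-write below

// withdraw atomically reads, checks, and writes the counter stored
// under key. Because the check and the write happen under the same
// lock, at most one of two concurrent callers can drain the balance.
func withdraw(db map[string][]byte, key string, amount int) error {
    mu.Lock()
    defer mu.Unlock()
    cur, err := strconv.Atoi(string(db[key]))
    if err != nil {
        return err
    }
    if cur-amount < 0 {
        return ErrInsufficient
    }
    db[key] = []byte(strconv.Itoa(cur - amount))
    return nil
}

func main() {
    db := map[string][]byte{"Key": []byte("100")}
    // Two clients both try to take 100; only the first can succeed.
    fmt.Println(withdraw(db, "Key", 100)) // <nil>
    fmt.Println(withdraw(db, "Key", 100)) // value would drop below 0
}
```

If the two clients are separate processes rather than goroutines, a single in-process mutex is not enough and some external coordination (or a single writer process) is needed.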
I have a large database I am trying to compact. For this I am using CopyTo as described in the readme.
The problem is that it blows the memory on the machine I am running it on (a Raspberry Pi).*
Some digging later, and I believe the issue is in
func (o *Store) visitNodes
It recursively calls itself, and while doing so keeps the pointers choiceT and choiceF both valid. This seems to mean that every itemLoc structure visitNodes traverses during the compact phase has to be held in memory until the end of the recursive traversal.
I have experimented with some fixes that reduce the memory required, but fundamentally this recursive process limits the size of the database that can be traversed to some function of available system memory.
I've attached a memory profile diagram of the program captured shortly before it crashed due to lack of memory.
*yes I know I could in theory run this on a machine with more memory - but it's an easy way to show the scalability issue in question here.
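The recursive hold described above can be avoided by walking the tree with an explicit stack: once a node's children have been pushed, the node itself is no longer referenced by any stack frame, so already-visited subtrees become collectable. This is a gkvlite-free sketch; the node type here is an illustrative stand-in, not gkvlite's actual nodeLoc:

```go
package main

import "fmt"

// node is a minimal binary-tree node standing in for gkvlite's
// on-disk node structure; the field names are illustrative only.
type node struct {
    left, right *node
    val         int
}

// visitIter walks the tree with an explicit stack instead of
// recursion. Unlike a recursive walk that keeps both child pointers
// live in every stack frame until the frame returns, a popped node
// here is dropped as soon as its children are pushed, so the GC can
// reclaim visited subtrees during the traversal.
func visitIter(root *node, visit func(int)) {
    stack := []*node{root}
    for len(stack) > 0 {
        n := stack[len(stack)-1]
        stack = stack[:len(stack)-1]
        if n == nil {
            continue
        }
        visit(n.val)
        stack = append(stack, n.left, n.right)
    }
}

func main() {
    root := &node{
        val:   2,
        left:  &node{val: 1},
        right: &node{val: 3},
    }
    // Visits the root first, then the pushed subtrees.
    visitIter(root, func(v int) { fmt.Println(v) })
}
```

Peak extra memory then scales with the depth of the pending-node stack rather than with every itemLoc ever touched, which matters on a memory-constrained machine like the Pi.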
I know they are similar to each other, and I think the design is superior to boltdb, but why did Couchbase and Bleve, etc., go with bolt over this?
I guess there were reasons, and I am a bit curious what they were.
I am asking because I am looking for a kv store.
I have written a tool to convert map-reduce computations so that you can write them in standard Go, parse the AST, and produce a distributed CFG (computation flow graph), allowing a computation to meet its SLA by throwing more hardware at it.
Now, for each node (process or server), I need a backing store, because each needs some memory / file store for long-running computations. It's the "bring the computation to the data" way of thinking.
It's for machine-learning-type event sourcing.
Hi,
I am doing something wrong when I use gkvlite in this example, but I'm not sure what. I thought that periodic calls to Collection.Write and Store.Flush would limit the in-memory size of the program, but that doesn't seem to be the case.
Could I ask you what I'm doing wrong in the following self-contained example?
http://play.golang.org/p/RHmZWRIDet
An example of the memory usage as the number of keys
inserted is increased:
$ for n in 100 200 400 800 1600 3200 6400 128000 256000 512000; do /usr/bin/time -f '%C %MkB' ./grow -n ${n}; done
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile047548036/mem.pprof
./grow -n 100 6752kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile335212671/mem.pprof
./grow -n 200 6928kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile506557173/mem.pprof
./grow -n 400 7424kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile956633517/mem.pprof
./grow -n 800 7840kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile050576843/mem.pprof
./grow -n 1600 9824kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile875078316/mem.pprof
./grow -n 3200 11776kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile891669337/mem.pprof
./grow -n 6400 15648kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile244818683/mem.pprof
./grow -n 128000 159392kB
2014/01/22 18:51:24 profile: memory profiling enabled, /tmp/profile906229215/mem.pprof
./grow -n 256000 280112kB
2014/01/22 18:51:37 profile: memory profiling enabled, /tmp/profile813873422/mem.pprof
./grow -n 512000 580768kB
The memory profile appears to indicate the memory is all in gkvlite:
Adjusting heap profiles for 1-in-4096 sampling rate
Welcome to pprof! For help, type 'help'.
(pprof) top10
Total: 16.0 MB
7.8 48.7% 48.7% 7.8 48.7% github.com/steveyen/gkvlite.(*Collection).mkNode
5.0 31.6% 80.3% 12.8 80.3% github.com/steveyen/gkvlite.(*Collection).Set
1.2 7.8% 88.1% 1.2 7.8% github.com/steveyen/gkvlite.(*itemLoc).write
1.2 7.8% 95.8% 1.2 7.8% github.com/steveyen/gkvlite.(*nodeLoc).write
0.7 4.2% 100.0% 15.8 99.3% main.main
I wrote some code to test the JSON output, but I got a strange result. Here is the code:
package main

import (
    "fmt"
    "os"

    "github.com/steveyen/gkvlite"
)

func main() {
    f, _ := os.Create("test.db")
    s, _ := gkvlite.NewStore(f)
    c := s.SetCollection("car", nil)
    c.Set([]byte("Honda"), []byte("good"))
    s.Flush()
    f.Sync()
    b, _ := c.MarshalJSON()
    fmt.Printf(string(b))
    s.Close()
    f.Close()
}
The output is:
{"o":23,"l":52}
What does this mean? I expected the MarshalJSON function to return JSON more like:
{"Honda":"good"}
What would you think about using Varints (http://golang.org/pkg/encoding/binary/) to encode all numerical values in gkvlite's datafile?
Advantages:
- Keys and values could grow to sizes up to 2**63-2 without bloating databases which don't use this functionality (2**63-2 because the header needs to encode len(key)+len(value)+len(priority)).
Disadvantages:
- Datafile incompatibility, and changes to the StoreCallbacks.ItemAlloc interface.
I also realize that this project hasn't really changed much in 2 years, so maybe this question will be unseen :). However, would you be interested in a pull request that would do this? I think the biggest hurdle would be the datafile incompatibility. If that's a hard stop for accepting the pull request, then I probably wouldn't work on it for quite a while.
For now, I've forked and simply bumped the key size up to 32 bits (luci/gkvlite@cf7fa95) (I needed up-to 2MB keys :/).
Storing a value under the empty string fails silently:
import (
    "bytes"
    "os"
    "testing"

    "github.com/steveyen/gkvlite"
)

func TestSimpleEmptyStore(t *testing.T) {
    f, err := os.Create("/tmp/test.gkvlite")
    if err != nil {
        t.Errorf("Failed to create Storage File: %v", err)
    }
    res, err := gkvlite.NewStore(f)
    if err != nil {
        t.Errorf("Failed to create Storage Object: %v", err)
    }
    db := res.SetCollection("test", nil)
    key := []byte{}
    text := []byte("fnord")
    db.Set(key, text)
    readback, err := db.Get(key)
    if err != nil {
        t.Errorf("Failed to read back: %v", err)
    }
    if bytes.Compare(text, readback) != 0 {
        t.Errorf("Failed to read back, got: %v expected: %v", readback, text)
    }
}
The test case fails after returning the empty string instead of the stored value:
--- FAIL: TestSimpleEmptyStore (0.00 seconds)
gkvlite_test.go:26: Failed to read back, got: [] expected: [102 110 111 114 100]
Hello,
I tried compiling the example program and got this error:
./g.go:31: cannot use func literal (type func(*gkvlite.Item) bool) as type bool in function argument
./g.go:31: not enough arguments in call to c.VisitItemsAscend
./g.go:48: undefined: bytes
I think the example program is out-of-sync with the present release.
The source for Collection.go indicates a different number of arguments than in the example:
func (t *Collection) VisitItemsAscend(target []byte, withValue bool, v ItemVisitor)
3 arguments
example:
c.VisitItemsAscend([]byte("ford"), func(i *gkvlite.Item) bool {
2 arguments
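If I read the quoted signature right, the fix is to pass the missing withValue argument. The stub types below merely mirror that signature so the corrected call shape compiles standalone; the real Item, ItemVisitor, and Collection live in gkvlite and behave differently internally.

```go
package main

import "fmt"

// Stub types mirroring the signature quoted above, so the corrected
// 3-argument call compiles without gkvlite itself.
type Item struct{ Key, Val []byte }
type ItemVisitor func(i *Item) bool

type Collection struct{ items []*Item } // items kept in ascending key order

// VisitItemsAscend calls v on every item whose key is >= target,
// stopping early if v returns false (matching the quoted signature).
func (t *Collection) VisitItemsAscend(target []byte, withValue bool, v ItemVisitor) {
    for _, i := range t.items {
        if string(i.Key) >= string(target) && !v(i) {
            return
        }
    }
}

func main() {
    c := &Collection{items: []*Item{
        {Key: []byte("ford"), Val: []byte("focus")},
        {Key: []byte("honda"), Val: []byte("civic")},
    }}
    // The corrected call: target, withValue, then the visitor func.
    c.VisitItemsAscend([]byte("ford"), true, func(i *Item) bool {
        fmt.Printf("%s = %s\n", i.Key, i.Val)
        return true // keep visiting
    })
}
```

The `undefined: bytes` error on line 48 is separate: the example presumably also needs `import "bytes"`.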
Any chance you could update the example program, if I am correct? :-)
Oh, I am running on Fedora 18, go 1.1
go version go1.1 linux/amd64
Thanks,
Glen