gkvlite's People

Contributors

d2g, dustin, steveyen

gkvlite's Issues

Any plans to add transactions?

Transactions (for any combination of reading and writing during the transaction) maintain a consistent view of the DB during some operation (typically upsert-like ones, i.e. read value V under key K, update part of V so that it's based on its previous value, and finally save V under K).

Having several goroutines (or even real threads from a pool), where one is a mutator and the others are readers, makes transactions a requirement.

If one needs just non-persistent transactions, then synchronization primitives might be enough (though this is not easy to grasp from the code). But if one expects transactions to take longer, then such blocking behavior is not an option and persistent transactions become a requirement.
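
To make the "synchronization primitives might be enough" case concrete, here is a minimal sketch of a non-persistent read-modify-write guarded by a lock. The `kv` type, `Update`, and `incrementConcurrently` are hypothetical stand-ins built on a plain map; none of them are part of gkvlite's API.

```go
package main

import (
	"fmt"
	"sync"
)

// kv is a hypothetical in-memory stand-in for a collection (not gkvlite).
type kv struct {
	mu   sync.RWMutex
	data map[string]int
}

func newKV() *kv { return &kv{data: make(map[string]int)} }

// Update reads the value under key, applies fn, and stores the result while
// holding the write lock, so the read and the write form one atomic step.
func (s *kv) Update(key string, fn func(old int, ok bool) int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	old, ok := s.data[key]
	s.data[key] = fn(old, ok)
}

// Get reads a value under the shared read lock, so readers do not block
// each other but never observe a half-finished Update.
func (s *kv) Get(key string) (int, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// incrementConcurrently runs n goroutines that each add 1 to the same key
// and returns the final value.
func incrementConcurrently(n int) int {
	s := newKV()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Update("counter", func(old int, _ bool) int { return old + 1 })
		}()
	}
	wg.Wait()
	v, _ := s.Get("counter")
	return v
}

func main() {
	fmt.Println(incrementConcurrently(100))
}
```

The lock is held for the whole callback, which is exactly the blocking behavior the paragraph above says stops being acceptable once transactions run long.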

EvictSomeItems does not count leaf node in numEvicted

If I understand correctly how it works, it should be:

diff --git a/collection.go b/collection.go
index 6e92191..1a91b0c 100644
--- a/collection.go
+++ b/collection.go
@@ -245,6 +245,7 @@ func (t *Collection) EvictSomeItems() (numEvicted uint64) {
        })
        if i != nil && err != nil {
                t.store.ItemDecRef(t, i)
+               numEvicted++
        }
        return numEvicted
 }

Reusing byte slice

Thanks for the awesome library!

I'm trying to reuse a byte slice as a key in a for loop, and it appears that gkvlite keeps using the same key.

Some quick code to show an example:

b := make([]byte, 4)

var i uint32 = 0
for {
    if i >= 10 {
        break
    }

    binary.LittleEndian.PutUint32(b, i)
    c.Set(b, []byte(""))

    i++
}

This seems to keep reusing the same key (the file size is equal to that of one entry). If you create a new byte slice though on each iteration and use that, it seems to work as expected:

var i uint32 = 0
for {
    if i >= 10 {
        break
    }

    b := make([]byte, 4)
    binary.LittleEndian.PutUint32(b, i)
    c.Set(b, []byte(""))

    i++
}

Any ideas?
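
One plausible explanation, not confirmed against gkvlite's source, is that the store keeps a reference to the slice it is handed instead of copying it, so every stored key aliases the same backing array. The sketch below reproduces that effect with the standard library only; `distinctKeys` is a hypothetical helper, and the slice of slices merely stands in for whatever the store retains.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// distinctKeys writes n keys into one reused buffer, "stores" each of them,
// and returns how many distinct keys end up stored. With copyKey false the
// stored slices all alias the same backing array.
func distinctKeys(n uint32, copyKey bool) int {
	buf := make([]byte, 4)
	stored := make([][]byte, 0, n)
	for i := uint32(0); i < n; i++ {
		binary.LittleEndian.PutUint32(buf, i)
		key := buf
		if copyKey {
			key = append([]byte(nil), buf...) // private copy of the 4 bytes
		}
		stored = append(stored, key)
	}
	seen := make(map[string]bool)
	for _, k := range stored {
		seen[string(k)] = true
	}
	return len(seen)
}

func main() {
	fmt.Println(distinctKeys(10, false)) // every entry aliases buf
	fmt.Println(distinctKeys(10, true))  // every entry owns its bytes
}
```

If that assumption holds, copying the key before each `Set` call (as the second loop above effectively does with `make`) is the safe pattern.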

manually evict items from memory

I would like to be able to evict an item (or at least its value) from memory.
Reopening a database as advised in #4 (comment) is suboptimal, as it introduces long pauses and memory usage is still rather high, because items are stored in memory all the time until I call Write, Flush and reopen the database.

Can you provide the following methods, please?

func (t *Collection) EvictValue(i *Item)
func (t *Collection) EvictItem(i *Item)

I would call one of them after I call Set or Get to reduce memory usage.

union code can't be correct

I haven't confirmed, but I can't see how this thing could possibly return an error.

In general, I think anything more than two or three levels should be a sign that something is not right. In this case, there's something like nine levels of conditionals nested, each declaring a distinct err var and throwing it away.

It seems like it should be possible to write an implementation that's correct and less... wide.

Data structure corruption caused by snapshot open/close.

Hey! While working on a downstream project, I traced a problem that I was experiencing back to what I believe is a corruption bug that manifests when using snapshots. I haven't had time to burrow into it yet, but I came away with this fairly minimal reproduction:

package main

import (
        "fmt"

        "github.com/luci/gkvlite"
)

func main() {
        // To trigger: secondKey < firstKey, secondKey < lookup
        firstKey, secondKey, lookup := "c", "a", "b"

        s, _ := gkvlite.NewStore(nil)

        c := s.SetCollection("", nil)
        c.Set([]byte(firstKey), []byte("foo"))

        // If these next two lines don't run, everything works great.
        snap1 := s.Snapshot()
        snap1.Close()

        c.Set([]byte(secondKey), []byte("bar"))
        v, err := c.Get([]byte(lookup))
        fmt.Println(v, err)
}

Prints: [] missing item after item.read() in GetItem()

If the triggering conditions aren't met, if I remove the two snap1 lines, or if I don't close the snapshot, everything works as expected. It looks like there is a corruption of some sort caused by creating and closing snapshots.

I'll take a look deeper tomorrow; just reporting this since now I have a repro.

flaky on Go1.5?

Hi there,

Our continuous build started failing occasionally with the following error:

store_test.go:1111: expected concurrent set to work on key: [65], got: concurrent mutation attempted
store_test.go:1111: expected concurrent set to work on key: [50], got: concurrent mutation attempted

So far I haven't been able to reproduce on a desktop machine, but perhaps there's a race that is being exposed by changes in Go1.5?

CAS

Hello :-)

Can you explain how to make this correct with gkvlite:
We have a key "Key" with value "100". Two clients try to change the value of "Key" by subtracting 100 (both read this value simultaneously), but by application logic the value of "Key" cannot be less than 0, so only one of the two may perform the subtraction.

WBR Paul.
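
A minimal sketch of one way to get this behavior at the application level: do the check and the write under a single lock, so the second client sees the already-reduced balance. The `account` type, `Withdraw`, and `race` are hypothetical names for illustration; gkvlite itself is not used here, and this says nothing about whether gkvlite offers a built-in compare-and-swap.

```go
package main

import (
	"fmt"
	"sync"
)

// account is a hypothetical wrapper around the value stored under "Key".
type account struct {
	mu      sync.Mutex
	balance int
}

// Withdraw subtracts amount only if the balance would stay non-negative.
// The check and the update happen under one lock, so they cannot interleave.
func (a *account) Withdraw(amount int) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.balance < amount {
		return false
	}
	a.balance -= amount
	return true
}

// race starts two clients that each try to withdraw 100 from a balance of
// 100 and returns how many succeeded plus the final balance.
func race() (succeeded, final int) {
	a := &account{balance: 100}
	results := make(chan bool, 2)
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- a.Withdraw(100)
		}()
	}
	wg.Wait()
	close(results)
	for ok := range results {
		if ok {
			succeeded++
		}
	}
	return succeeded, a.balance
}

func main() {
	succeeded, final := race()
	fmt.Println(succeeded, final)
}
```

Whichever goroutine takes the lock first succeeds; the other finds a balance of 0 and is rejected.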

Memory usage during CopyTo

I have a large database I am trying to compact. For this I am using CopyTo as described in the readme.
The problem is it blows the memory on the machine I am running it on (RaspberryPi)*

Some digging later and I believe the issue is in
func (o *Store) visitNodes
It calls itself recursively and, while doing so, keeps the pointers choiceT and choiceF both valid. This seems to mean that every itemLoc structure that visitNodes traverses during the compact phase has to be held in memory until the end of this recursive traversal.

I have experimented with some fixes that reduce the memory required, but fundamentally this recursive process limits the size of the database that can be traversed to some function of available system memory.

I've attached a memory profile diagram of the program captured shortly before it crashed due to lack of memory.

*yes I know I could in theory run this on a machine with more memory - but it's an easy way to show the scalability issue in question here.

curious about differences between boltdb and gkvlite

I know they are similar to each other, and I think the design is superior to BoltDB, but why did Couchbase, Bleve, etc. go with Bolt over this?
I guess there were reasons, and I am a bit curious what they were.

I am asking because I am looking for a KV store.
I have written a tool that converts map-reduce computations: you write them in standard Go, the tool parses the AST and produces a distributed CFG (computation flow graph). This allows a computation to meet its SLA by throwing more hardware at it.

Now, for each node (process or server), I need a backing store, because each needs some memory/file store for long-running computations. It is the "bring the computation to the data" way of thinking.
It's for machine-learning-type event sourcing.

Limiting memory use

Hi,

I am doing something wrong when I use gkvlite in this
example, but I'm not sure what. I thought that periodic
calls to the Collection.Write and Store.Flush would limit
the in-memory size of the program, but that doesn't seem
to be the case.

Could I ask you what I'm doing wrong in the following
self contained example?

http://play.golang.org/p/RHmZWRIDet

An example of the memory usage as the number of keys
inserted is increased:

$ for n in 100 200 400 800 1600 3200 6400 128000 256000 512000; do /usr/bin/time -f '%C %MkB' ./grow -n ${n}; done
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile047548036/mem.pprof
./grow -n 100 6752kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile335212671/mem.pprof
./grow -n 200 6928kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile506557173/mem.pprof
./grow -n 400 7424kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile956633517/mem.pprof
./grow -n 800 7840kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile050576843/mem.pprof
./grow -n 1600 9824kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile875078316/mem.pprof
./grow -n 3200 11776kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile891669337/mem.pprof
./grow -n 6400 15648kB
2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile244818683/mem.pprof
./grow -n 128000 159392kB
2014/01/22 18:51:24 profile: memory profiling enabled, /tmp/profile906229215/mem.pprof
./grow -n 256000 280112kB
2014/01/22 18:51:37 profile: memory profiling enabled, /tmp/profile813873422/mem.pprof
./grow -n 512000 580768kB

The memory profile appears to indicate the memory is all in gkvlite:

Adjusting heap profiles for 1-in-4096 sampling rate
Welcome to pprof! For help, type 'help'.
(pprof) top10
Total: 16.0 MB
7.8 48.7% 48.7% 7.8 48.7% github.com/steveyen/gkvlite.(*Collection).mkNode
5.0 31.6% 80.3% 12.8 80.3% github.com/steveyen/gkvlite.(*Collection).Set
1.2 7.8% 88.1% 1.2 7.8% github.com/steveyen/gkvlite.(*itemLoc).write
1.2 7.8% 95.8% 1.2 7.8% github.com/steveyen/gkvlite.(*nodeLoc).write
0.7 4.2% 100.0% 15.8 99.3% main.main

What does MarshalJSON return?

I wrote some code to test the JSON output, but I got a strange result. Here is the code:

package main

import(
        "os"
        "fmt"
        "github.com/steveyen/gkvlite"
)

func main(){
    f,_ := os.Create( "test.db" )
    s,_ := gkvlite.NewStore(f)
    c := s.SetCollection("car", nil)
    c.Set([]byte("Honda"), []byte("good"))
    s.Flush()
    f.Sync()

    b,_ := c.MarshalJSON()
    fmt.Printf( string(b) )
    s.Close()
    f.Close()

}

The output is:

{"o":23,"l":52}

What does this mean? I expected MarshalJSON to return JSON something like:

{"Honda":"good"}

Use Varints for persistent serialization format to unlimit the key/value lengths.

What would you think about using Varints (http://golang.org/pkg/encoding/binary/) to encode all numerical values in gkvlite's datafile?

Advantages:

  • would allow support for keys and values up to 2**63-2 without bloating databases which don't use this functionality (2**63-2 because the header needs to encode len(key)+len(value)+len(priority)).
    • [though if the priority were encoded as a fixed size, its size could be implied].
  • would save a bunch of bytes / item for databases with many small records

Disadvantages:

  • would change the binary format of the database, so old DB files wouldn't be loadable.
  • would change the StoreCallbacks.ItemAlloc interface
  • could add up to ~8 bytes per record for databases with large key or value records (e.g. records over 268,435,455 bytes would take more than 4 bytes to encode the length header), or with large Priority values (over the same quantity). From examining the code it looks like there are 4 of these integer quantities per item, and MaxVarintLen64 is 10 bytes (2 more than the fixed-width encoding).
  • variable-size integers are a bit trickier to deal with in the data file format (though looking at the code I don't think it would be too hard).

I also realize that this project hasn't really changed much in 2 years, so maybe this question will be unseen :). However, would you be interested in a pull request that would do this? I think the biggest hurdle would be the datafile incompatibility. If that's a hard-stop for accepting the pull request, then I probably wouldn't work on it for quite a while.

For now, I've forked and simply bumped the key size up to 32 bits (luci/gkvlite@cf7fa95) (I needed up to 2 MB keys :/).

Empty string is not a valid key

Storing a value under the empty string fails silently:

import "testing"
import "bytes"
import "os"
import "github.com/steveyen/gkvlite"

func TestSimpleEmptyStore(t *testing.T){
    f, err := os.Create("/tmp/test.gkvlite")
    if err != nil {
        t.Errorf("Failed to create Storage File: %v",err)
    }
    res, err := gkvlite.NewStore(f)
    if err != nil {
        t.Errorf("Failed to create Storage Object: %v",err)
    }
    db := res.SetCollection("test",nil)
    key :=  []byte{}
    text := []byte("fnord")
    db.Set(key,text)
    readback,err := db.Get(key)
    if err != nil {
        t.Errorf("Failed to read back %v",err)
    }
    if bytes.Compare(text,readback) != 0 {
        t.Errorf("Failed to read back, got: %v expected: %v",readback,text)
    }
}

The test case fails after returning the empty string instead of the stored value:

--- FAIL: TestSimpleEmptyStore (0.00 seconds)
    gkvlite_test.go:26: Failed to read back, got: [] expected: [102 110 111 114 100]

Compiling example program

Hello,

I tried compiling the example program and got this error:

./g.go:31: cannot use func literal (type func(*gkvlite.Item) bool) as type bool in function argument
./g.go:31: not enough arguments in call to c.VisitItemsAscend
./g.go:48: undefined: bytes

I think the example program is out-of-sync with the present release.
The source for Collection.go indicates a different number of arguments than in the example:
func (t *Collection) VisitItemsAscend(target []byte, withValue bool, v ItemVisitor)

3 arguments

example:
c.VisitItemsAscend([]byte("ford"), func(i *gkvlite.Item) bool {
2 arguments

Any chance you could update the example program, if I am correct? :-)

Oh, I am running on Fedora 18, go 1.1
go version go1.1 linux/amd64

Thanks,
Glen
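
For reference, a sketch of what the call looks like with the three-argument signature quoted above, plus the `bytes` import whose absence produced the third compiler error. To keep the sketch self-contained, `Item`, `ItemVisitor`, and `Collection` here are local stand-ins that only mirror the quoted signature; they are not gkvlite's types, and `keysFrom` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// Item, ItemVisitor and Collection are local stand-ins (not gkvlite's types).
type Item struct {
	Key, Val []byte
}

type ItemVisitor func(i *Item) bool

type Collection struct {
	items []*Item // kept sorted by Key
}

// Set inserts or replaces an item, keeping items sorted by key.
func (t *Collection) Set(key, val []byte) {
	idx := sort.Search(len(t.items), func(j int) bool {
		return bytes.Compare(t.items[j].Key, key) >= 0
	})
	if idx < len(t.items) && bytes.Equal(t.items[idx].Key, key) {
		t.items[idx].Val = val
		return
	}
	t.items = append(t.items, nil)
	copy(t.items[idx+1:], t.items[idx:])
	t.items[idx] = &Item{Key: key, Val: val}
}

// VisitItemsAscend visits items with Key >= target in ascending order until
// the visitor returns false; withValue controls whether Val is populated.
func (t *Collection) VisitItemsAscend(target []byte, withValue bool, v ItemVisitor) {
	start := sort.Search(len(t.items), func(j int) bool {
		return bytes.Compare(t.items[j].Key, target) >= 0
	})
	for _, it := range t.items[start:] {
		out := &Item{Key: it.Key}
		if withValue {
			out.Val = it.Val
		}
		if !v(out) {
			break
		}
	}
}

// keysFrom collects all keys >= target using the three-argument call.
func keysFrom(c *Collection, target string) []string {
	var keys []string
	c.VisitItemsAscend([]byte(target), true, func(i *Item) bool {
		keys = append(keys, string(i.Key))
		return true
	})
	return keys
}

func main() {
	c := &Collection{}
	for _, k := range []string{"tesla", "mercedes", "bmw", "ford"} {
		c.Set([]byte(k), []byte("car"))
	}
	fmt.Println(keysFrom(c, "ford"))
}
```

The only change relative to the two-argument call in the example is the `withValue` boolean between the target key and the visitor function.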
