philippgille / gokv

Simple key-value store abstraction and implementations for Go (Redis, Consul, etcd, bbolt, BadgerDB, LevelDB, Memcached, DynamoDB, S3, PostgreSQL, MongoDB, CockroachDB and many more)

License: Mozilla Public License 2.0

Go 98.54% PowerShell 0.58% Shell 0.88%
go golang key-value key-value-store library package abstraction simple redis bolt

gokv's Introduction

gokv

Simple key-value store abstraction and implementations for Go

Contents

  1. Features
    1. Simple interface
    2. Implementations
    3. Value types
    4. Marshal formats
    5. Roadmap
  2. Usage
    1. Examples
  3. Project status
  4. Motivation
  5. Design decisions
  6. Related projects
  7. License

Features

Simple interface

Note: The interface is not final yet! See Project status for details.

type Store interface {
    Set(k string, v any) error
    Get(k string, v any) (found bool, err error)
    Delete(k string) error
    Close() error
}

There are detailed descriptions of the methods in the docs and in the code. You should read them if you plan to write your own gokv.Store implementation or if you create a Go package with a method that takes a gokv.Store as parameter, so you know exactly what happens in the background.
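As a rough illustration of what implementing the interface involves, here is a minimal sketch of an in-memory store. It is not the code of any gokv package: the Store interface is redeclared locally so the example compiles on its own, and mapStore, newMapStore and the choice of JSON as marshal format are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
)

// Store mirrors the gokv.Store interface shown above, redeclared here so the
// sketch compiles without any third-party module.
type Store interface {
	Set(k string, v any) error
	Get(k string, v any) (found bool, err error)
	Delete(k string) error
	Close() error
}

// mapStore is a hypothetical in-memory implementation that keeps the
// JSON-encoded values in a mutex-guarded map.
type mapStore struct {
	mu sync.RWMutex
	m  map[string][]byte
}

var errEmptyKey = errors.New("key must not be empty")

func newMapStore() *mapStore {
	return &mapStore{m: make(map[string][]byte)}
}

// Set marshals v to JSON and stores the bytes under k.
func (s *mapStore) Set(k string, v any) error {
	if k == "" {
		return errEmptyKey
	}
	data, err := json.Marshal(v)
	if err != nil {
		return err
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = data
	return nil
}

// Get unmarshals the stored bytes into v, which must be a pointer.
func (s *mapStore) Get(k string, v any) (bool, error) {
	if k == "" {
		return false, errEmptyKey
	}
	s.mu.RLock()
	data, ok := s.m[k]
	s.mu.RUnlock()
	if !ok {
		return false, nil
	}
	return true, json.Unmarshal(data, v)
}

// Delete removes the key. Deleting a missing key is not an error.
func (s *mapStore) Delete(k string) error {
	if k == "" {
		return errEmptyKey
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.m, k)
	return nil
}

// Close has nothing to release for a plain map.
func (s *mapStore) Close() error { return nil }

func main() {
	var store Store = newMapStore()
	defer store.Close()

	if err := store.Set("foo123", map[string]string{"Bar": "baz"}); err != nil {
		panic(err)
	}
	var got map[string]string
	found, err := store.Get("foo123", &got)
	if err != nil {
		panic(err)
	}
	fmt.Println(found, got["Bar"]) // Prints `true baz`
}
```

Because Get unmarshals into the passed value, callers have to pass a pointer, just as with encoding/json.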

Implementations

Some of the following databases aren't specifically engineered for storing key-value pairs, but if someone's running them already for other purposes and doesn't want to set up one of the proper key-value stores due to administrative overhead etc., they can of course be used as well. In those cases let's focus on a few of the most popular though. This mostly goes for the SQL, NoSQL and NewSQL categories.

Feel free to suggest more stores by creating an issue or even add an actual implementation - PRs Welcome.

For differences between the implementations, see Choosing an implementation.
For the Godoc of specific implementations, see https://pkg.go.dev/github.com/philippgille/gokv#section-directories.

Value types

Most Go packages for key-value stores just accept a []byte as value, which requires developers for example to marshal (and later unmarshal) their structs. gokv is meant to be simple and make developers' lives easier, so it accepts any type (by using any/interface{} as parameter), including structs, and automatically (un-)marshals the value.

The kind of (un-)marshalling is left to the implementation. All implementations in this repository currently support JSON and gob by using the encoding subpackage in this repository, which wraps the core functionality of the standard library's encoding/json and encoding/gob packages. See Marshal formats for details.

For unexported struct fields to be (un-)marshalled to/from JSON/gob, the respective custom (un-)marshalling methods need to be implemented as methods of the struct (e.g. MarshalJSON() ([]byte, error) for custom marshalling into JSON). See Marshaler and Unmarshaler for JSON, and GobEncoder and GobDecoder for gob.

To improve performance you can also implement the custom (un-)marshalling methods so that no reflection is used by the encoding/json / encoding/gob packages. This is not a disadvantage of using a generic key-value store package, it's the same as if you would use a concrete key-value store package which only accepts []byte, requiring you to (un-)marshal your structs.

Marshal formats

This repository contains the subpackage encoding, which is an abstraction and wrapper for the core functionality of packages like encoding/json and encoding/gob. The currently supported marshal formats are JSON and gob.

More formats will be supported in the future (e.g. XML).

The stores use this encoding package to marshal and unmarshal the values when storing / retrieving them. The default format is JSON, but all gokv.Store implementations in this repository also support gob as alternative, configurable via their Options.

The marshal format is up to the implementations though, so package creators using the gokv.Store interface as parameter of a function should not make any assumptions about this. If they require any specific format they should inform the package user about this in the GoDoc of the function taking the store interface as parameter.
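To get a feeling for the two formats, here is a small sketch that round-trips the same struct through both, using only the standard library. The helper names (toJSON, fromJSON, toGob, fromGob) are assumptions for this example and are not meant to describe the API of the encoding subpackage.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"encoding/json"
	"fmt"
)

type foo struct {
	Bar string
}

// toJSON and fromJSON wrap encoding/json.
func toJSON(v any) ([]byte, error)      { return json.Marshal(v) }
func fromJSON(data []byte, v any) error { return json.Unmarshal(data, v) }

// toGob encodes v into a fresh buffer with encoding/gob.
func toGob(v any) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// fromGob decodes data into v, which must be a pointer.
func fromGob(data []byte, v any) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(v)
}

func main() {
	in := foo{Bar: "baz"}

	j, err := toJSON(in)
	if err != nil {
		panic(err)
	}
	g, err := toGob(in)
	if err != nil {
		panic(err)
	}

	var fromJ, fromG foo
	if err := fromJSON(j, &fromJ); err != nil {
		panic(err)
	}
	if err := fromGob(g, &fromG); err != nil {
		panic(err)
	}

	fmt.Println(string(j))      // Prints `{"Bar":"baz"}`
	fmt.Println(fromJ == fromG) // Prints `true`
	fmt.Println(len(j), len(g)) // The sizes depend on the format and on the struct
}
```

The JSON output is human-readable, while the gob output is binary and only meaningful to Go programs.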

Differences between the formats:

Roadmap

  • Benchmarks!
  • CLI: A simple command line interface tool that allows you to create, read, update and delete key-value pairs in all of the gokv storages
  • A combiner package that allows you to create a gokv.Store which forwards its calls to multiple implementations at the same time. So for example you can use memcached and s3 simultaneously to have 1) super fast access but also 2) durable redundant persistent storage.
  • A way to directly configure the clients via the options of the underlying used Go package (e.g. not the redis.Options struct in github.com/philippgille/gokv, but instead the redis.Options struct in github.com/go-redis/redis)
    • Will be optional and discouraged, because this will lead to compile errors in code that uses gokv when switching the underlying used Go package, but definitely useful for some people
  • More stores (see stores in Implementations list with unchecked boxes)
  • Maybe rename the project from gokv to SimpleKV?
  • Maybe move all implementation packages into a subdirectory, e.g. github.com/philippgille/gokv/store/redis?
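
The combiner idea from the list above could look roughly like the following sketch. Nothing here exists in gokv yet: combined and memStore are hypothetical names, the Store interface is redeclared locally, and the policy (write to all stores, read from the first store that has the key) is just one possible choice.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Store mirrors the gokv.Store interface from the Features section.
type Store interface {
	Set(k string, v any) error
	Get(k string, v any) (found bool, err error)
	Delete(k string) error
	Close() error
}

// memStore is a tiny, non-thread-safe stand-in for a real implementation.
type memStore map[string][]byte

func (m memStore) Set(k string, v any) error {
	data, err := json.Marshal(v)
	if err != nil {
		return err
	}
	m[k] = data
	return nil
}

func (m memStore) Get(k string, v any) (bool, error) {
	data, ok := m[k]
	if !ok {
		return false, nil
	}
	return true, json.Unmarshal(data, v)
}

func (m memStore) Delete(k string) error { delete(m, k); return nil }
func (m memStore) Close() error          { return nil }

// combined is a hypothetical combiner: writes go to every store, reads are
// answered by the first store that has the key.
type combined struct {
	stores []Store
}

func (c combined) Set(k string, v any) error {
	for _, s := range c.stores {
		if err := s.Set(k, v); err != nil {
			return err // Stops at the first failure; earlier stores keep the value.
		}
	}
	return nil
}

func (c combined) Get(k string, v any) (bool, error) {
	for _, s := range c.stores {
		found, err := s.Get(k, v)
		if err != nil || found {
			return found, err
		}
	}
	return false, nil
}

func (c combined) Delete(k string) error {
	for _, s := range c.stores {
		if err := s.Delete(k); err != nil {
			return err
		}
	}
	return nil
}

func (c combined) Close() error {
	var first error
	for _, s := range c.stores {
		if err := s.Close(); err != nil && first == nil {
			first = err
		}
	}
	return first
}

func main() {
	fast, durable := memStore{}, memStore{}
	store := combined{stores: []Store{fast, durable}}
	defer store.Close()

	if err := store.Set("foo123", "bar"); err != nil {
		panic(err)
	}
	delete(fast, "foo123") // Simulate eviction from the fast store

	var got string
	found, err := store.Get("foo123", &got)
	if err != nil {
		panic(err)
	}
	fmt.Println(found, got) // Prints `true bar`
}
```

A real combiner would also have to decide what happens when one store fails after another one already succeeded, which this sketch deliberately leaves open.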

Usage

First, download the module you want to work with:

  • For example when you want to work with the gokv.Store interface:
    • go get github.com/philippgille/gokv@latest
  • For example when you want to work with the Redis implementation:
    • go get github.com/philippgille/gokv/redis@latest

Then you can import and use it.

Every implementation has its own Options struct, but all implementations have a NewStore() / NewClient() function that returns an object of a struct that implements the gokv.Store interface. Let's take the implementation for Redis as an example, which is the most popular distributed key-value store.

package main

import (
    "fmt"

    "github.com/philippgille/gokv"
    "github.com/philippgille/gokv/redis"
)

type foo struct {
    Bar string
}

func main() {
    options := redis.DefaultOptions // Address: "localhost:6379", Password: "", DB: 0

    // Create client
    client, err := redis.NewClient(options)
    if err != nil {
        panic(err)
    }
    defer client.Close()

    // Store, retrieve, print and delete a value
    interactWithStore(client)
}

// interactWithStore stores, retrieves, prints and deletes a value.
// It's completely independent of the store implementation.
func interactWithStore(store gokv.Store) {
    // Store value
    val := foo{
        Bar: "baz",
    }
    err := store.Set("foo123", val)
    if err != nil {
        panic(err)
    }

    // Retrieve value
    retrievedVal := new(foo)
    found, err := store.Get("foo123", retrievedVal)
    if err != nil {
        panic(err)
    }
    if !found {
        panic("Value not found")
    }

    fmt.Printf("foo: %+v", *retrievedVal) // Prints `foo: {Bar:baz}`

    // Delete value
    err = store.Delete("foo123")
    if err != nil {
        panic(err)
    }
}

As described in the comments, that code does the following:

  1. Create a client for Redis
    • Some implementations' stores/clients don't need to be closed, but when working with the interface (for example as function parameter) you must call Close() because you don't know which implementation is passed. Even if you work with a specific implementation you should always call Close(), so you can easily change the implementation without the risk of forgetting to add the call.
  2. Call interactWithStore(), which requires a gokv.Store as parameter. This function then:
    1. Stores an object of type foo in the Redis server running on localhost:6379 with the key foo123
    2. Retrieves the value for the key foo123
      • The check if the value was found isn't needed in this example but is included for demonstration purposes
    3. Prints the value. It prints foo: {Bar:baz}, which is exactly what was stored before.
    4. Deletes the value

Now let's say you don't want to use Redis but Consul instead. You just have to make three simple changes:

  1. Replace the import of "github.com/philippgille/gokv/redis" by "github.com/philippgille/gokv/consul"
  2. Replace redis.DefaultOptions by consul.DefaultOptions
  3. Replace redis.NewClient(options) by consul.NewClient(options)

Everything else works the same way. interactWithStore() is completely unaffected.

Examples

See the examples directory for more code examples.

Project status

Note: gokv's API is not stable yet and is under active development. Upcoming releases are likely to contain breaking changes as long as the version is v0.x.y. This project adheres to Semantic Versioning and all notable changes to this project are documented in CHANGELOG.md.

Planned interface methods until v1.0.0:

  • List(any) error / GetAll(any) error or similar

The interface might even change until v1.0.0. For example one consideration is to change Get(string, any) (bool, error) to Get(string, any) error (no boolean return value anymore), with the error being something like gokv.ErrNotFound // "Key-value pair not found" to fulfill the additional role of indicating that the key-value pair wasn't found. But at the moment we prefer the current method signature.

Also, more interfaces might be added. For example so that there's a SimpleStore and an AdvancedStore, with the first one containing only the basic methods and the latter one with advanced features such as key-value pair lifetimes (deletion of key-value pairs after a given time), notification of value changes via Go channels etc. But currently the focus is simplicity, see Design decisions.

Motivation

When creating a package you want the package to be usable by as many developers as possible. Let's look at a specific example: You want to create a paywall middleware for the Gin web framework. You need some database to store state. You can't use a Go map, because its data is not persisted across web service restarts. You can't use an embedded DB like bbolt, BadgerDB or SQLite, because that would restrict the web service to one instance, but nowadays every web service is designed with high horizontal scalability in mind. If you use Redis, MongoDB or PostgreSQL though, you would force the package user (the developer who creates the actual web service with Gin and your middleware) to run and administrate the server, even if she might never have used it before and doesn't know how to configure it for high performance and security.

Any decision for a specific database would limit the package's usability.

One solution would be a custom interface where you would leave the implementation to the package user. But that would require the developer to dive into the details of the Go package of the chosen key-value store. And if the developer wants to switch the store, or maybe use one for local testing and another for production, she would need to write multiple implementations.

gokv is the solution for these problems. Package creators use the gokv.Store interface as parameter and can call its methods within their code, leaving the decision which actual store to use to the package user. Package users pick one of the implementations, for example github.com/philippgille/gokv/redis for Redis and pass the redis.Client created by redis.NewClient(...) as parameter. Package users can also develop their own implementations if they need to.

gokv doesn't just have to be used to satisfy some gokv.Store parameter. It can of course also be used by application / web service developers who just don't want to dive into the sometimes complicated usage of some key-value store packages.

Initially it was developed as storage package within the project ln-paywall to provide the users of ln-paywall with multiple storage options, but at some point it made sense to turn it into a repository of its own.

Before doing so I examined existing Go packages with a similar purpose (see Related projects), but none of them fit my needs. They either had too few implementations, or they didn't automatically marshal / unmarshal passed structs, or the interface had too many methods, making the project seem too complex to maintain and extend, proven by some that were abandoned or forked (splitting the community with it).

Design decisions

  • gokv is primarily an abstraction for key-value stores, not caches, so there's no need for cache eviction and timeouts.
    • It's still possible to have cache eviction. In some cases you can configure it on the server, or in case of Memcached it's even the default. Or you can have an implementation-specific Option that configures the key-value store client to set a timeout on some key-value pair when storing it in the server. But this should be implementation-specific and not be part of the interface methods, which would require every implementation to support cache eviction.
  • The package should be usable without having to write additional code, so structs should be (un-)marshalled automatically, without having to implement MarshalJSON() / GobEncode() and UnmarshalJSON() / GobDecode() first. It's still possible to implement these methods to customize the (un-)marshalling, for example to include unexported fields, or for higher performance (because the encoding/json / encoding/gob package doesn't have to use reflection).
  • It should be easy to create your own store implementations, as well as to review and maintain the code of this repository, so there should be as few interface methods as possible, but still enough so that functions taking the gokv.Store interface as parameter can do everything that's usually required when working with a key-value store. For example, a boolean return value for the Delete method that indicates whether a value was actually deleted (because it was previously present) can be useful, but isn't a must-have, and also it would require some Store implementations to implement the check by themselves (because the existing libraries don't support it), which would unnecessarily decrease performance for those who don't need it. Or as another example, a Watch(key string) (<-chan Notification, error) method that sends notifications via a Go channel when the value of a given key changes is nice to have for a few use cases, but in most cases it's not required.
    • Note: In the future we might add another interface, so that there's one for the basic operations and one for advanced uses.

  • Similar projects name the structs that are implementations of the store interface according to the backing store, for example boltdb.BoltDB, but this leads to so called "stuttering" that's discouraged when writing idiomatic Go. That's why gokv uses for example bbolt.Store and syncmap.Store. For easier differentiation between embedded DBs and DBs that have a client and a server component though, the first ones are called Store and the latter ones are called Client, for example redis.Client.
  • All errors are implementation-specific. We could introduce a gokv.StoreError type and define some constants like a SetError or something more specific like a TimeoutError, but non-specific errors don't help the package user, and specific errors would make it very hard to create and especially maintain a gokv.Store implementation. You would need to know exactly in which cases the package (that the implementation uses) returns errors, what the errors mean (to "translate" them) and keep up with changes and additions of errors in the package. So instead, errors are just forwarded. For example, if you use the dynamodb package, the returned errors will be errors from the "github.com/aws/aws-sdk-go" package.
  • Keep the terminology of used packages. This might be controversial, because an abstraction / wrapper unifies the interface of the used packages. But:
    1. Naming is hard. If one used package for an embedded database uses Path and another Directory, then how should we name the option for the database directory? Maybe Folder, to add to the confusion? Also, some users might already have used the packages we use directly and they would wonder about the "new" variable name which has the same meaning.
      Using the packages' variable names spares us the need to come up with unified, understandable variable names without alienating users who already used the packages we use directly.
    2. Only few users are going to switch back and forth between gokv.Store implementations, so most users won't even notice the differences in variable names.
  • Each gokv implementation is a Go module. This differs from repositories that contain a single Go module with many subpackages, but has the huge advantage that if you only want to work with the Redis client for example, the go get will only fetch the Redis dependencies and not the huge amount of dependencies that are used across the whole repository.

Related projects

  • libkv
    • Uses []byte as value, no automatic (un-)marshalling of structs
    • No support for Redis, BadgerDB, Go map, MongoDB, AWS DynamoDB, Memcached, MySQL, ...
    • Not actively maintained anymore (3 direct commits + 1 merged PR in the last 10+ months, as of 2018-10-13)
  • valkeyrie
    • Fork of libkv
    • Same disadvantage: Uses []byte as value, no automatic (un-)marshalling of structs
    • No support for BadgerDB, Go map, MongoDB, AWS DynamoDB, Memcached, MySQL, ...
  • gokvstores
    • Only supports Redis and local in-memory cache
    • Not actively maintained anymore (4 direct commits + 1 merged PR in the last 10+ months, as of 2018-10-13)
    • 13 stars (as of 2018-10-13)
  • gokv
    • Requires a json.Marshaler / json.Unmarshaler as parameter, so you always need to explicitly implement their methods for your structs, and also you can't use gob or other formats for (un-)marshaling.
    • No support for Consul, etcd, bbolt / Bolt, BadgerDB, MongoDB, AWS DynamoDB, Memcached, MySQL, ...
    • Separate repo for each implementation, which has advantages and disadvantages
    • No releases (makes it harder to use with package managers like dep)
    • 2-7 stars (depending on the repository, as of 2018-10-13)

Others:

  • gladkikhartem/gokv: No Delete() method, no Redis, embedded DBs etc., no Git tags / releases, no stars (as of 2018-11-28)
  • bradberger/gokv: Not maintained (no commits in the last 22 months), no Redis, Consul etc., no Git tags / releases, 1 star (as of 2018-11-28)
    • This package inspired me to implement something similar to its Codec.
  • ppacher/gokv: Not maintained (no commits in the last 22 months), no Redis, embedded DBs etc., no automatic (un-)marshalling, 1 star (as of 2018-11-28)
    • Nice CLI!
  • kapitan-k/gokvstore: Not actively maintained (no commits in the last 10+ months), RocksDB only, requires cgo, no automatic (un-)marshalling, no Git tags/ releases, 1 star (as of 2018-11-28)

License

gokv is licensed under the Mozilla Public License Version 2.0.

Dependencies might be licensed under other licenses.

gokv's Issues

Evaluate using wire for dependency injection

I haven't worked with wire yet and at first glance it doesn't look like it makes gokv easier than it already is (for example it looks like code generation is involved), but go-cloud uses wire and go-cloud is similar (at least in its goal to offer an abstraction layer to cloud storage) to gokv.

Also, GitHub user @gedw99 suggested using wire in #72.

Add implementation for MySQL

While PostgreSQL is more popular among Gophers and maybe generally among projects with higher requirements (performance, features), MySQL is still the most popular open source relational database (management system).

It's SQL, so not a key-value store, but that doesn't keep us from creating a table like Item with a k text column as primary key and v blob column, or something like that.
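
A possible shape of the table and statements is sketched below. Everything in it is an assumption for illustration, including the helper names setRaw and getRaw. One deviation from the idea above: the key column is VARCHAR(255) instead of text, because MySQL doesn't accept a TEXT column as primary key without a prefix length. Running the helpers requires a MySQL driver and a server, so main only prints the statements.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// SQL statements for a simple key-value table (hypothetical schema).
const (
	createTableStmt = `CREATE TABLE IF NOT EXISTS Item (k VARCHAR(255) NOT NULL PRIMARY KEY, v BLOB NOT NULL)`
	upsertStmt      = `INSERT INTO Item (k, v) VALUES (?, ?) ON DUPLICATE KEY UPDATE v = VALUES(v)`
	selectStmt      = `SELECT v FROM Item WHERE k = ?`
	deleteStmt      = `DELETE FROM Item WHERE k = ?`
)

// setRaw stores already marshalled bytes under k, overwriting any old value.
func setRaw(db *sql.DB, k string, data []byte) error {
	_, err := db.Exec(upsertStmt, k, data)
	return err
}

// getRaw returns the stored bytes for k. A missing row is reported via
// found == false instead of an error.
func getRaw(db *sql.DB, k string) (data []byte, found bool, err error) {
	err = db.QueryRow(selectStmt, k).Scan(&data)
	if errors.Is(err, sql.ErrNoRows) {
		return nil, false, nil
	}
	if err != nil {
		return nil, false, err
	}
	return data, true, nil
}

func main() {
	// No driver is imported here, so this only prints the statements.
	for _, stmt := range []string{createTableStmt, upsertStmt, selectStmt, deleteStmt} {
		fmt.Println(stmt)
	}
}
```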

It might be of use for people who already run MySQL and want to use gokv for simple key-value storage.

Also, TiDB is compatible with the MySQL protocol, so as long as there aren't any major differences (some required client-side configuration for example) and it works, this would be a plus (TiDB is a popular "NewSQL" database).

(Un-)marshal to/from gobs as alternative to JSON

Currently all gokv.Store implementations in this repo (un-)marshal to/from JSON. This is nice because:

  1. In case a distributed store is used, other applications can easily work with the same data (e.g. a Java web service can get a value from Redis and deal with the JSON with a simple JSON library)
  2. When you want to examine some values manually you can use some client (like some web admin dashboard for Redis for example) and see and understand the data without having to decode anything

But:

  1. The marshalled JSON data is probably bigger than gob in its size
  2. According to the gob documentation the (un-)marshalling to/from gob is extremely fast, so using gob should improve the (un-)marshalling performance

So: Implement gob as alternative to JSON for (un-)marshalling for all currently existing gokv.Store implementations in this repo. Make this optional (via the Options struct in each implementation package)!

The issue already existed back when gokv was still part of ln-paywall as package storage, so maybe have a look at the issue created for that repo as well: philippgille/ln-paywall#32

Add implementation for AWS DynamoDB

So far all gokv.Store implementations were for self-hosted open source databases. Some projects have all their infrastructure in AWS and instead of starting and administrating another couple of EC2 instances for a custom database installation they probably prefer to use the database-as-a-service offers by AWS.

There's SimpleDB, for which I'll probably create another issue in the future, but DynamoDB is pushed a lot by AWS and seems to be favored (by AWS at least), so the support, documentation etc. might be better, and customers' willingness to adopt it is likely to be higher.

Add option to keep JSON readable for some implementations

Currently, when JSON is used as MarshalFormat, it will always be converted to []byte. This makes the result unreadable for a human, unless interpreted as / converted back to string. So for example when using DynamoDB, and a struct is marshalled into a JSON {"Foo":"bar"}, saved to DynamoDB with gokv, and then you look at the value via the AWS console, it's just eyJGb28iOiJiYXIifQ==, the Base64 encoding of the JSON string.

That's because in the dynamodb implementation we use an awsdynamodb.AttributeValue for the value where we assign the value to a field B, which is for []byte, and has the Base64 encoding described in the comment:

// An attribute of type Binary. For example:
//
// "B": "dGhpcyB0ZXh0IGlzIGJhc2U2NC1lbmNvZGVk"
//
// B is automatically base64 encoded/decoded by the SDK.
B []byte `type:"blob"`

In this case, we could look at the option provided by the package user and instead of B, use S and assign the plain JSON string to it.

This should also be done in a similar way for other implementations. It only makes sense for some of them though!
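
The difference between the two representations can be reproduced with the standard library alone. The sketch below doesn't touch the AWS SDK; encodeBinary and encodeString are hypothetical helpers that only mimic what ends up being displayed for a binary attribute versus a string attribute.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

type item struct {
	Foo string
}

// encodeBinary mimics how a []byte attribute is displayed: the JSON bytes
// are Base64 encoded.
func encodeBinary(v any) (string, error) {
	data, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(data), nil
}

// encodeString keeps the JSON as a plain, human-readable string.
func encodeString(v any) (string, error) {
	data, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	v := item{Foo: "bar"}

	b, err := encodeBinary(v)
	if err != nil {
		panic(err)
	}
	s, err := encodeString(v)
	if err != nil {
		panic(err)
	}

	fmt.Println(b) // Prints `eyJGb28iOiJiYXIifQ==`
	fmt.Println(s) // Prints `{"Foo":"bar"}`
}
```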

BoltClient tests can't be executed repeatedly on the same machine

generateRandomTempDbPath() generates new numbers when being called multiple times within the same process, but when called in a new process it starts with the same number. This leads to the same DB being used when executing the tests again, so one of the tests that expects no result for a given key fails because a result is found (from a previous test run).

Errors occur when reading a value from the store implementation for bbolt / Bolt DB

The recent Travis CI builds have some errors:
Example 1 from https://travis-ci.org/philippgille/gokv/builds/445281693:

=== RUN   TestStoreConcurrent
--- FAIL: TestStoreConcurrent (3.38s)
	test.go:74: invalid character '\x00' looking for beginning of value
	test.go:74: invalid character '\x00' looking for beginning of value
	test.go:74: invalid character '\x00' looking for beginning of value
FAIL
coverage: 81.1% of statements
FAIL	github.com/philippgille/gokv/bolt	3.420s

Example 2 from https://travis-ci.org/philippgille/gokv/builds/445303063:

=== RUN   TestStoreConcurrent
unexpected fault address 0x7f361ca600ef
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0x7f361ca600ef pc=0x5c2444]
goroutine 25 [running]:
runtime.throw(0x627b33, 0x5)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/panic.go:619 +0x81 fp=0xc420053c48 sp=0xc420053c28 pc=0x453861
runtime.sigpanic()
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/signal_unix.go:395 +0x211 fp=0xc420053c98 sp=0xc420053c48 pc=0x468cf1
encoding/json.checkValid(0x7f361ca600ef, 0xa, 0xa, 0xc420192260, 0x631b60, 0xc420053d70)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/encoding/json/scanner.go:27 +0x144 fp=0xc420053d00 sp=0xc420053c98 pc=0x5c2444
encoding/json.Unmarshal(0x7f361ca600ef, 0xa, 0xa, 0x5e34e0, 0xc42002c860, 0xc4200c5500, 0x0)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/encoding/json/decode.go:102 +0xbd fp=0xc420053d70 sp=0xc420053d00 pc=0x5addfd
github.com/philippgille/gokv/util.FromJSON(0x7f361ca600ef, 0xa, 0xa, 0x5e34e0, 0xc42002c860, 0x0, 0xc42002c860)
	/home/travis/gopath/src/github.com/philippgille/gokv/util/util.go:12 +0x65 fp=0xc420053dc8 sp=0xc420053d70 pc=0x5c8e25
github.com/philippgille/gokv/bolt.Store.Get(0xc4200e0000, 0x627f2d, 0x7, 0x631832, 0x1, 0x5e34e0, 0xc42002c860, 0xc420053ee8, 0x437b08, 0x10)
	/home/travis/gopath/src/github.com/philippgille/gokv/bolt/bolt.go:55 +0x203 fp=0xc420053e80 sp=0xc420053dc8 pc=0x5c9423
github.com/philippgille/gokv/bolt.(*Store).Get(0xc4200f21e0, 0x631832, 0x1, 0x5e34e0, 0xc42002c860, 0x0, 0x0, 0x0)
	<autogenerated>:1 +0xb3 fp=0xc420053ef8 sp=0xc420053e80 pc=0x5ca403
github.com/philippgille/gokv/test.InteractWithStore(0x646980, 0xc4200f21e0, 0x631832, 0x1, 0xc4200d6000, 0xc4200da130)
	/home/travis/gopath/src/github.com/philippgille/gokv/test/test.go:72 +0x297 fp=0xc420053fb0 sp=0xc420053ef8 pc=0x5cc9a7
runtime.goexit()
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc420053fb8 sp=0xc420053fb0 pc=0x4837f1
created by github.com/philippgille/gokv/bolt_test.TestStoreConcurrent
	/home/travis/gopath/src/github.com/philippgille/gokv/bolt/bolt_test.go:44 +0x241

[...]

goroutine 244 [runnable]:
github.com/philippgille/gokv/test.InteractWithStore(0x646980, 0xc42000de20, 0xc420018bc9, 0x3, 0xc4200d6000, 0xc4200da130)
	/home/travis/gopath/src/github.com/philippgille/gokv/test/test.go:58
created by github.com/philippgille/gokv/bolt_test.TestStoreConcurrent
	/home/travis/gopath/src/github.com/philippgille/gokv/bolt/bolt_test.go:44 +0x241
FAIL	github.com/philippgille/gokv/bolt	0.089s

Example 3 from https://travis-ci.org/philippgille/gokv/builds/445310684:

=== RUN   TestStoreConcurrent
--- FAIL: TestStoreConcurrent (2.76s)
	test.go:74: invalid character '\x01' looking for beginning of value
FAIL
coverage: 81.1% of statements
FAIL	github.com/philippgille/gokv/bolt	2.791s

While reading about BadgerDB as one of the next gokv.Store implementations I read their warning about the validity of data from within transactions, which is:

Please note that values returned from Get() are only valid while the transaction is open. If you need to use a value outside of the transaction then you must use copy() to copy it to another byte slice.

And I remembered that I read something similar regarding bbolt, which is written in their GoDocs: https://godoc.org/go.etcd.io/bbolt#hdr-Caveats (or, to have a future working link, here).

To quote:

Keys and values retrieved from the database are only valid for the life of the transaction. When used outside the transaction, these byte slices can point to different data or can point to invalid memory which will cause a panic.

Yet, the current implementation in gokv doesn't take this into account:

var data []byte
c.db.View(func(tx *bolt.Tx) error {
	b := tx.Bucket([]byte(c.bucketName))
	data = b.Get([]byte(k))
	// [...]
	return nil
})
// ... continue to work with data

So this could very well be the reason for the errors we see in the Travis CI log.

This needs to be fixed as soon as possible!
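
The usual remedy is to copy the bytes while the transaction is still open. The following sketch demonstrates the effect without bbolt: fakeDB is a made-up stand-in whose view method scrubs every buffer it handed out as soon as the callback returns, which imitates memory that becomes invalid after a transaction. With a real bbolt DB, the copy would happen inside the View callback in the same way.

```go
package main

import "fmt"

// fakeDB simulates a store whose read buffers are only valid inside view.
type fakeDB struct {
	data map[string]string
}

// view passes a lookup function to fn and overwrites every returned buffer
// with zeros once fn returns.
func (db *fakeDB) view(fn func(get func(k string) []byte)) {
	var handedOut [][]byte
	get := func(k string) []byte {
		v, ok := db.data[k]
		if !ok {
			return nil
		}
		buf := []byte(v)
		handedOut = append(handedOut, buf)
		return buf
	}
	fn(get)
	for _, buf := range handedOut {
		for i := range buf {
			buf[i] = 0
		}
	}
}

// unsafeGet keeps a reference to the buffer beyond the callback, like the
// snippet above does.
func unsafeGet(db *fakeDB, k string) []byte {
	var data []byte
	db.view(func(get func(string) []byte) {
		data = get(k)
	})
	return data
}

// safeGet copies the bytes while the simulated transaction is still open.
func safeGet(db *fakeDB, k string) []byte {
	var data []byte
	db.view(func(get func(string) []byte) {
		if v := get(k); v != nil {
			data = append([]byte(nil), v...)
		}
	})
	return data
}

func main() {
	db := &fakeDB{data: map[string]string{"k": `{"Bar":"baz"}`}}

	fmt.Printf("%q\n", unsafeGet(db, "k")) // Only zero bytes are left
	fmt.Printf("%q\n", safeGet(db, "k"))   // Prints `"{\"Bar\":\"baz\"}"`
}
```

The zero bytes returned by unsafeGet resemble the invalid character '\x00' errors in the logs above, although real invalid memory can contain anything or cause a fault.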

Add benchmark for comparing all store implementations

Some people already have a strong preference for a specific store (for example because they're already running Redis for their web service, so they don't want to set up and manage anything else), but others are open for any new key-value store as long as its performance is great.

We should add a benchmark that compares the different gokv.Store implementations with each other. This will especially help in deciding between read vs write optimized stores, embedded vs remote stores and self-hosted vs cloud-hosted stores.
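
A starting point could be a benchmark helper that only depends on the method under test, so every implementation can be passed in. The sketch below uses testing.Benchmark from the standard library so it runs as a normal program. The setter interface, mapSetter and benchmarkSet are hypothetical names, and the map-based store only stands in for a real implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// setter is the part of the store interface that this benchmark needs.
type setter interface {
	Set(k string, v any) error
}

// mapSetter is a trivial stand-in for a real store implementation.
type mapSetter map[string]any

func (m mapSetter) Set(k string, v any) error {
	m[k] = v
	return nil
}

// benchmarkSet measures Set calls against the given implementation. The keys
// cycle through a fixed range so the store doesn't grow without bound.
func benchmarkSet(s setter) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			if err := s.Set(strconv.Itoa(i%1024), i); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	testing.Init() // Registers the testing flags when not running under "go test"

	res := benchmarkSet(mapSetter{})
	fmt.Println("map:", res)
}
```

For remote stores the numbers would mostly reflect network latency, so results for embedded and remote stores should probably be reported separately.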

Clean up stores after test

The DBs and data that are created during the tests are not cleaned up properly. Especially in the badgerdb package, when a store is created, it creates a 2 GB file on the filesystem. When running all tests this means that about 10 GB of space is filled up. But it's not only about these files. When using a client for one of the DB servers, the tests should also have a "tear down" phase where created data is deleted, so the DBs can be used further, instead of having to create a new Docker container for example.

For BadgerDB, one of the requirements was the ability to Close() the store, so the file handle is released and the file can be removed. Close() was recently implemented (#36).

Support Go modules

Go 1.13 will use Go modules by default, so all Go packages should be updated to support Go modules. But it shouldn't only be done for the sake of adhering to the best practices / up-to-date tools in the Go ecosystem; it's actually useful, too. Especially because there are so many gokv subpackages and almost all of them have dependencies on third-party packages, and currently no dependency is pinned to a specific version in any way. Go modules allow pinning dependencies to specific versions, without the need to vendor their source code, but with checksums to make sure the dependencies haven't been tampered with when fetching them anew.

Useful links:

Add more gokv.Store implementations

Instead of creating a new issue as soon as someone thinks a new implementation makes sense, let's use this ticket for collecting ideas which implementations could make sense in the future.

Only when getting more serious about a specific implementation and starting to more thoroughly evaluate the key-value store / DB, a new issue specifically for that store should be created.


Add implementation for Azure Table Storage

There's another ticket for Azure Cosmos DB: #41.

Azure Cosmos DB offers multiple APIs and because the Table Storage API is advertised as the proper one for key-value pairs, that's what we'll use.

So when implementing a Table Storage client anyway, then maybe let's start with this one and then reuse it in the Cosmos DB implementation.

Also, the "Azure SDK for Go" doesn't seem to have gotten much focus on the Table Storage part and there's no specific client for Cosmos DB, so it should definitely work for Table Storage itself, but maybe there are issues when using it for Cosmos DB.

Add implementation for Memcached

Memcached is meant to be used as a cache and not as a persistent key-value store, but some people might still like to use the simple gokv interface for Memcached, or maybe they have configured persistence (despite Memcached discouraging this).

Also, some other databases are compatible with the Memcached protocol, for example Couchbase (see here) and Apache Ignite (see here).

Add implementation for MongoDB

MongoDB is not a dedicated key-value store, but it's probably the most popular NoSQL database, so many projects already have a running instance. Instead of forcing developers in those projects to set up and administrate another database, it makes sense to utilize what's already there.

Instead of storing the value for the key, a wrapping type probably needs to be created which contains the key as _id and the value as value attribute. The type could be called KVpair, goKVpair or something similar.
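
A minimal sketch of such a wrapping type could look like the following. The name kvPair and the helpers wrap and unwrap are hypothetical, the value is assumed to be marshalled to JSON first, and the bson struct tag is only shown as the tag convention Go MongoDB drivers read; the example itself uses just the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kvPair is a hypothetical wrapper document: the key becomes the
// document's _id and the marshalled value is stored next to it.
type kvPair struct {
	Key   string `json:"_id" bson:"_id"`
	Value []byte `json:"value" bson:"value"`
}

// wrap builds the document that would be stored for key k.
func wrap(k string, v any) (kvPair, error) {
	data, err := json.Marshal(v)
	if err != nil {
		return kvPair{}, err
	}
	return kvPair{Key: k, Value: data}, nil
}

// unwrap decodes the stored value back into v, which must be a pointer.
func unwrap(p kvPair, v any) error {
	return json.Unmarshal(p.Value, v)
}

func main() {
	doc, err := wrap("someKey123", map[string]string{"Bar": "baz"})
	if err != nil {
		panic(err)
	}
	var out map[string]string
	if err := unwrap(doc, &out); err != nil {
		panic(err)
	}
	fmt.Println(doc.Key, out["Bar"])
}
```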

There's no official Go SDK for MongoDB, but the official documentation recommends a fork of an open source project:

Add implementation for Google Cloud Datastore

Amazon DynamoDB and Azure Table Storage are already supported, now GCP is missing for support of the "big three" cloud providers.

Note: Cloud Firestore (not Firebase) has a "Datastore mode" which makes it compatible with the Datastore API. Firestore might supersede Datastore in the future. But it's currently marked as beta:

Add implementation for Azure Cosmos DB

Azure Cosmos DB is described as "multi model" database, supporting the MongoDB API, Cassandra API, SQL queries, Gremlin (graph), and Azure Table Storage API.

Cosmos DB is the counterpart to AWS DynamoDB, but the "multi model" seems to be unique.

Microsoft describes Azure Table Storage as "NoSQL key-value store", so maybe that's the best API to work with (see here).

General info:

Implementation info:

Add gokv.Store implementation for OrientDB

OrientDB claims to be the fastest graph database (explicitly claiming to be faster than Neo4j). It's also a multi-model DB, with the website saying that it supports key-value pairs as well. Maybe just with a document that only has a single value, similar to MongoDB and ArangoDB - I didn't go through the documentation to find this out yet.

Add gokv.Store implementation for Hazelcast

Hazelcast is advertised as IMDG (in-memory data grid), but at its core is an in-memory cache. We should add support for it to have another distributed cache option next to Memcached.

The supported data types are not key-value directly, but instead a single distributed map can be used for storing key-value pairs.

Add gokv.Store implementation for Alibaba Cloud Table Store

Next to AWS, Azure and GCP, Alibaba Cloud is a big cloud provider, especially in Asia.

We should implement one of their database services as key-value store as well.

Their "Table Store" seems to fit the bill:

Evaluate adding a gokv.Store implementation for ElasticSearch

As with some other implementations, ElasticSearch is not meant to be used as a simple key-value store. It's for uploading and then indexing data that can later be searched. But the data are (usually?) JSON documents and there's an API for PUT ("Index"), GET and DELETE via an ID, which serves as key. So 1) it's probably usable as key-value storage, and 2) as mentioned, like with other gokv.Store implementations, if users are already running an ElasticSearch cluster, why not use it, instead of having to set up and administrate another service like Redis?

So, check out if ElasticSearch's PUT, GET and DELETE APIs are actually usable for our purpose.
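
As a starting point for that evaluation, the three calls could be sketched as plain HTTP requests. This is only a sketch under assumptions: the /<index>/_doc/<id> path shape follows recent ElasticSearch versions and may differ for older ones, and docRequest, the index name "gokv" and the base URL are placeholders. The requests are only built and printed, not sent.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/url"
)

// docRequest builds the request for a single key. The path shape
// /<index>/_doc/<id> is an assumption, see the note above.
func docRequest(method, baseURL, index, key string, doc []byte) (*http.Request, error) {
	u := fmt.Sprintf("%s/%s/_doc/%s", baseURL, url.PathEscape(index), url.PathEscape(key))
	req, err := http.NewRequest(method, u, bytes.NewReader(doc))
	if err != nil {
		return nil, err
	}
	if len(doc) > 0 {
		req.Header.Set("Content-Type", "application/json")
	}
	return req, nil
}

func main() {
	base := "http://localhost:9200" // placeholder address
	for _, m := range []string{http.MethodPut, http.MethodGet, http.MethodDelete} {
		var body []byte
		if m == http.MethodPut {
			// Only the "Index" call carries a JSON document.
			body = []byte(`{"value":"baz"}`)
		}
		req, err := docRequest(m, base, "gokv", "someKey123", body)
		if err != nil {
			panic(err)
		}
		fmt.Println(req.Method, req.URL.String())
	}
}
```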

Create CLI to examine existing key-value pairs

gokv will mostly be used within CLIs or web services, so developers can't easily examine existing key-value pairs without writing their own mini CLI with code similar to what they use in the actual project. There are of course management dashboards for most key-value stores, similar to phpMyAdmin for MySQL, but this requires additional setup and is often overkill for developers who just want to have a quick look at a value for a given key.

The CLI should be usable like this: gokv get "someKey123"
The configuration (which store implementation, URL / passwords (depending on implementation) etc.) should be located in a gokv.yml or something similar. Maybe use a library like viper to make this as flexible as possible.

The code should be located in a directory called "gokv" within this gokv repository, so that when installing the CLI via go get it can be called as gokv.
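
The intended usage (gokv get "someKey123") could be sketched roughly as follows. The run function is hypothetical, and a plain map stands in for the store that a real CLI would build from its configuration file; reading gokv.yml is left out entirely.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// run handles a hypothetical `gokv get <key>` invocation against an
// in-memory stand-in for the configured store.
func run(args []string, store map[string]string) (string, error) {
	if len(args) != 2 || args[0] != "get" {
		return "", errors.New(`usage: gokv get "<key>"`)
	}
	v, found := store[args[1]]
	if !found {
		return "", fmt.Errorf("key %q not found", args[1])
	}
	return v, nil
}

func main() {
	// Placeholder data; a real CLI would open the configured store here.
	store := map[string]string{"someKey123": `{"Bar":"baz"}`}

	args := os.Args[1:]
	if len(args) == 0 {
		// Fall back to the example invocation so the sketch runs as-is.
		args = []string{"get", "someKey123"}
	}
	out, err := run(args, store)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```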

Move every store implementation into its own package

When using go get github.com/philippgille/gokv, currently all its dependencies are downloaded as well, independent of the store implementation.

When each implementation is in its own package, this shouldn't be the case anymore, making initial go gets, as well as CI builds and Docker image builds much faster, and Docker images smaller.

Queries

GraphQL means you don't need a store-specific query language, because GraphQL is an agnostic one.

Just write non-join queries in each GraphQL resolver. Good match for KV stores!

Seems like a perfect match for this project.

It may be that these basic queries can be done agnostically in the resolvers too, at runtime using reflection. Storm led the way there. I don't think it's that huge a task if things are restricted like Storm does.

If a developer does not want the performance penalty of reflection-based query builders, they don't have to use them. In GraphQL the resolvers are very simple, so this developer choice has no side effects either.

https://github.com/asdine/storm

In summary, the two together get us an agnostic query layer.

Evaluate adding gokv.Store implementation for AWS S3

AWS S3 is made to store files and it might not support strong consistency (every read after a write contains the written data), but we marshal to []byte anyway and it might be one of the cheapest cloud storage solutions.

Also, S3 is not only for AWS. Many other cloud providers and also self-hosted open source products support the S3 protocol.

Add implementation for Consul

Consul is probably mostly known for being a service registry in a microservice deployment, but it's explicitly advertised as key-value store as well. And because it's one of the most popular service registries this means it's already running in a lot of deployments and software developers might prefer to use their existing infrastructure instead of having to set up something new like Redis.

So gokv should add an implementation for Consul as key-value store.

Add gokv.Store implementation for local file storage

In some cases it might be useful to store each key-value pair as a separate local file. The key is the filename, the value is the file content.

  • Key must be escaped
  • Concurrent use of the store must be possible
    • Keep one lock object per key
    • Lock for each file access
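
The requirements above could be sketched like this. It is not the code of gokv's file package: fileStore, escapeKey and lockFor are hypothetical names, values are assumed to be marshalled already, and the ".kv" suffix is an assumption added so that keys like ".." cannot resolve to a directory.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
	"path/filepath"
	"sync"
)

// fileStore is a hypothetical sketch that stores one file per key.
type fileStore struct {
	dir   string
	mu    sync.Mutex               // guards the locks map
	locks map[string]*sync.RWMutex // one lock object per key
}

func newFileStore(dir string) *fileStore {
	return &fileStore{dir: dir, locks: make(map[string]*sync.RWMutex)}
}

// escapeKey turns a key into a safe filename: "/" and similar
// characters are percent-encoded, and a suffix is appended.
func escapeKey(k string) string {
	return url.PathEscape(k) + ".kv"
}

// lockFor returns the lock belonging to k, creating it on first use.
func (s *fileStore) lockFor(k string) *sync.RWMutex {
	s.mu.Lock()
	defer s.mu.Unlock()
	l, ok := s.locks[k]
	if !ok {
		l = &sync.RWMutex{}
		s.locks[k] = l
	}
	return l
}

// Set writes the value while holding the key's write lock.
func (s *fileStore) Set(k string, data []byte) error {
	l := s.lockFor(k)
	l.Lock()
	defer l.Unlock()
	return os.WriteFile(filepath.Join(s.dir, escapeKey(k)), data, 0o600)
}

// Get reads the value while holding the key's read lock.
func (s *fileStore) Get(k string) (data []byte, found bool, err error) {
	l := s.lockFor(k)
	l.RLock()
	defer l.RUnlock()
	data, err = os.ReadFile(filepath.Join(s.dir, escapeKey(k)))
	if os.IsNotExist(err) {
		return nil, false, nil
	}
	return data, err == nil, err
}

func main() {
	dir, err := os.MkdirTemp("", "gokv-file-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	s := newFileStore(dir)
	if err := s.Set("users/42", []byte("x")); err != nil {
		panic(err)
	}
	data, found, err := s.Get("users/42")
	fmt.Println(string(data), found, err)
}
```

The lock map in this sketch only grows; a real implementation would probably have to decide when lock objects for deleted keys can be dropped.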

Add Close() method to interface and all implementations

While some implementations don't need a close method, many do, and to make the latter ones properly work, the Close() method must be added to the interface, so that package creators who use a gokv.Store can properly close it, no matter what the package users pass as implementation.
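
To illustrate the point for package creators, here is a small sketch. The Store interface mirrors the one from the README; mapStore and saveAndClose are hypothetical and only exist for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Store mirrors the interface shown in the README, including Close().
type Store interface {
	Set(k string, v any) error
	Get(k string, v any) (found bool, err error)
	Delete(k string) error
	Close() error
}

// mapStore is a hypothetical in-memory store that has nothing to release.
type mapStore struct {
	m      map[string][]byte
	closed bool
}

func (s *mapStore) Set(k string, v any) error {
	data, err := json.Marshal(v)
	if err != nil {
		return err
	}
	s.m[k] = data
	return nil
}

func (s *mapStore) Get(k string, v any) (bool, error) {
	data, ok := s.m[k]
	if !ok {
		return false, nil
	}
	return true, json.Unmarshal(data, v)
}

func (s *mapStore) Delete(k string) error {
	delete(s.m, k)
	return nil
}

// Close only records the call here, but it satisfies the interface.
func (s *mapStore) Close() error {
	s.closed = true
	return nil
}

// saveAndClose is what a package author could write: it can close the
// store without knowing which implementation the caller passed in.
func saveAndClose(s Store, k string, v any) (err error) {
	defer func() {
		if cerr := s.Close(); err == nil {
			err = cerr
		}
	}()
	return s.Set(k, v)
}

func main() {
	s := &mapStore{m: make(map[string][]byte)}
	err := saveAndClose(s, "someKey123", "x")
	fmt.Println(err, s.closed)
}
```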

Document for each package how important the call is. In some cases a developer uses gokv not to satisfy a gokv.Store parameter but just to have an easy way to interact with some key-value storage. That developer is in full control, and when using an implementation that doesn't need to be closed, they should know that they don't need to call Close().

Implement method to get all existing key-value pairs in a store

As mentioned in the README already, something like List(interface{}) error or GetAll(interface{}) error should be implemented.

Example of how a user could pass a slice which the List method then populates with values:

package main

import (
	"encoding/json"
	"fmt"
)

type foo struct {
	Bar string
}

// myFunc populates the slice that vals points to
func myFunc(vals interface{}) {
	j := []byte(`[{"Bar":"baz1"},{"Bar":"baz2"}]`)

	err := json.Unmarshal(j, vals)
	if err != nil {
		panic(err)
	}
}

func main() {
	fmt.Println("Hello world!")

	vals := make([]foo, 0)
	myFunc(&vals)
	fmt.Println("vals:")
	for _, v := range vals {
		fmt.Printf("%+v\n", v)
	}
}

Output:

Hello world!
vals:
{Bar:baz1}
{Bar:baz2}

Make file ending optional in file implementation

When using the file package, files are stored as <key>.json or <key>.gob. But in case the client switches back and forth between the marshal formats, while using the same keys, this can lead to redundant and/or stale data.
Redundant: Save "a":"x" as JSON, then as gob
Stale: Save "a":"x" as JSON, then change it to "a":"y" and save as gob, then switch back to JSON and load "a" -> results in "x", which is old

Add gokv.Store implementation for CockroachDB

Multi user

Embedded stores and non-embedded stores (like MySQL) are run differently.

I had an idea to make switching agnostic, in that boltdb or badger can be used from many processes, by wrapping access with a connection poller.

Also, using this lib with the Google go-cloud project and its Wire IoC mechanism would make this much cleaner too.

Add gokv.Store implementation for Apache Ignite

Apache Ignite seems to be one of the most popular multi-model open source databases. It has a key-value store mode, which seems to be meant to be used as cache, but Apache Ignite seems to be doing everything in-memory first, and then use their "durable memory" or "persistence" components to achieve durability.

The key-value store mode is JCache compliant, see:

Is Ignite a key-value store?

Yes. Ignite provides a feature rich key-value API, that is JCache (JSR-107) compliant and supports Java, C++, and .NET.

And: https://ignite.apache.org/use-cases/database/key-value-store.html

The latter link includes the following bullet point regarding the JCache specification:

  • Pluggable Persistence

So this seems to be the optimal way to use Ignite, but on the other hand there don't seem to be any Go packages for JCache. But then again, Ignite supports the Redis protocol (see here), has its own binary protocol (see here) and even a REST API (see here).
