mosuka / cete
Cete is a distributed key value store server written in Go built on top of BadgerDB.
License: Apache License 2.0
github.com/bbva/raft-badger has been renamed to github.com/BBVA/raft-badger. Note that BBVA is now upper case.
Cete looks pretty strong.
I have a few thoughts on some things that can be easily added.
Following the standard patterns for storage and databases written in Golang, the context should be passed per operation and not per connection.
The current setup works in the following way:
cete.NewGRPCClientWithContext(..., ctx)
and Get / Set / Delete / etc. operations are executed this way:
cli.Set(req)
The expected call is the following:
cli.Set(ctx, req)
This is useful when we want to limit, for example, how long a Set or Get should take before timing out:
ctx, cancel := context.WithTimeout(ctx, 3*time.Second)
defer cancel()
cli.Set(ctx, req)
This is useful to control timeouts, cascade cancellation, and similar operations.
Also, as a final note, the context variable in Golang is normally expected to be the first parameter of the function. This is not a written rule but a very common and popular convention:
cete.NewGRPCClientWithContext(ctx, ...)
See https://golang.org/src/net/dial.go#L369 as an example.
During our investigation of why the size of our database was endlessly growing, even when no data was being written to cete, we figured out that there is an important design flaw in how BadgerDB and Raft interact.
The flaw is explained as follows:
Here starts the issue:
Set()
which stores the data in Badger
TL;DR: every time the server is restarted, all KV pairs are replayed in Badger, causing a massive increase in the size of the database and eventually leading to a full disk.
Please note that while KV pairs are being replayed, the garbage collector is not useful. This also causes a massive consumption of resources (CPU, RAM, I/O) at startup time. The situation is even worse when a Kubernetes environment is taken into account, where probes could kill the process if it takes too long to start, causing an exponential growth of the issue.
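To make the growth mechanism concrete, here is a toy model of the behavior described above. It is not Badger's or Cete's code: `appendOnlyStore`, `logEntry`, and `replay` are hypothetical names, and the store simply assumes that every write appends a record that stays on disk until garbage collection removes it.

```go
package main

import "fmt"

// appendOnlyStore is a toy model: every Set appends a new record, and
// older versions of a key remain stored until something collects them.
type appendOnlyStore struct {
	records []string          // simulated on-disk records
	latest  map[string]string // live view of the data
}

func newStore() *appendOnlyStore {
	return &appendOnlyStore{latest: map[string]string{}}
}

func (s *appendOnlyStore) Set(key, value string) {
	s.records = append(s.records, key+"="+value)
	s.latest[key] = value
}

// logEntry is a hypothetical replicated-log entry carrying one write.
type logEntry struct{ Key, Value string }

// replay re-applies every entry to the store, which is what happens
// when a log is replayed into a store that already holds the data.
func replay(s *appendOnlyStore, entries []logEntry) {
	for _, e := range entries {
		s.Set(e.Key, e.Value)
	}
}

func main() {
	entries := []logEntry{{"a", "1"}, {"b", "2"}, {"c", "3"}}
	s := newStore()
	replay(s, entries) // the original writes

	for restart := 1; restart <= 3; restart++ {
		replay(s, entries) // simulated restart
		fmt.Printf("restart %d: live keys=%d, stored records=%d\n",
			restart, len(s.latest), len(s.records))
	}
}
```

In this model the number of live keys stays constant while the number of stored records grows with every restart, which mirrors the disk growth reported here.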
The three options that I could think of to solve this issue are the following:
config.NoSnapshotRestoreOnStart = true, but can be executed manually in order to recover from disasters (which is what we use since we are running on a single node)
db.DropAll() and the snapshot is used to re-populate the database (RAM, CPU, I/O intensive)
We are using a Helm chart in production, which deploys a healthy cete cluster. We would be happy to contribute with our code.
Our environment has been tested for a single-node cluster in production but can be scaled up.
Maybe migrating cete to its own organization along with its dependencies could be an idea (this is simply a personal suggestion).
An example of how to set up the repository is the following (from dgraph): https://github.com/dgraph-io/charts
You have the same bug in a lot of places, for example in kvs/raft_server.go.
The order is not correct. In case of error, the client variable is nil when the deferred Close() runs:
client, err := NewGRPCClient(string(node.GrpcAddr))
defer func() {
    err := client.Close()
    if err != nil {
        s.logger.Printf("[ERR] %v", err)
    }
}()
if err != nil {
    s.logger.Printf("[ERR] %v", err)
    return nil
}
Here you do it correctly, in cmd/cete/cluster.go and other places:
client, err := kvs.NewGRPCClient(grpcAddr)
if err != nil {
    return err
}
defer func() {
    err := client.Close()
    if err != nil {
        _, err = fmt.Fprintln(os.Stderr, err)
    }
}()
In the NewGRPCClient() function, in case of error you return a nil object and an error. Here is the relevant code:
func NewGRPCClient(address string) (*GRPCClient, error) {
    .........
    conn, err := grpc.DialContext(ctx, address, dialOpts...)
    if err != nil {
        cancel()
        return nil, err
    }
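The ordering problem can be reproduced in isolation. The sketch below uses hypothetical stand-ins (`client`, `newClient`, `useClient`) rather than Cete's `GRPCClient`; the only assumption carried over from the report is that the constructor returns a nil pointer together with an error, and that `Close` touches the receiver's fields.

```go
package main

import (
	"errors"
	"fmt"
)

// client is a hypothetical stand-in for a gRPC client wrapper.
type client struct{ closed bool }

// Close writes to a field of the receiver, so calling it on a nil
// *client panics with a nil pointer dereference.
func (c *client) Close() error {
	c.closed = true
	return nil
}

// newClient mimics a constructor that returns (nil, err) on failure.
func newClient(address string) (*client, error) {
	if address == "" {
		return nil, errors.New("empty address")
	}
	return &client{}, nil
}

// useClient checks the error first and only then defers Close, so the
// deferred function never runs against a nil client.
func useClient(address string) error {
	c, err := newClient(address)
	if err != nil {
		return err
	}
	defer func() {
		if cerr := c.Close(); cerr != nil {
			fmt.Println("close:", cerr)
		}
	}()
	// ... use c here ...
	return nil
}

func main() {
	fmt.Println("empty address:", useClient(""))
	fmt.Println("valid address:", useClient("localhost:9000"))
}
```

If the `defer` were placed before the `err` check, the failing call would return through the deferred function with `c == nil` and panic inside `Close`.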
I tried to launch cete with a docker-compose file.
I tried multiple solutions, but every time I run docker-compose up it gives a port error or a permission denied error when creating the folder for the external directory.
I also tried using volumes instead of a direct directory path, with the same result.
I am using a Mac M1.
version: '3.9'
services:
  node_1:
    platform: linux/amd64
    image: mosuka/cete:latest
    network_mode: host
    # command: "start --id=node_1 --raft-address=:7000 --grpc-address=:9000 --http-address=:8000"
    ports:
      - 2000:7000
      - 8000:8000
      - 9000:9000
    volumes:
      - data:/tmp/cete/data/node_1:delegated
    environment:
      - CETE_ID=node_1
      - CETE_DATA_DIRECTORY=node_1
      - CETE_RAFT_ADDRESS=7000
      - CETE_GRPC_ADDRESS=9000
      - CETE_HTTP_ADDRESS=8000
cete-v0.3.1.windows-amd64
Windows 7
cete start --id=node1 --raft-address=:7000 --grpc-address=:9000 --http-address=:8000 --data-directory=node1
It works when I initially start it: the cete server starts fine, and client put and get both work.
But it fails with an error when I restart cete.
The error output is:
I:\Cete\bin>cete start --id=node1 --raft-address=:7000 --grpc-address=:9000 --http-address=:8000 --data-directory=node1
{"level":"error","timestamp":"2022-08-17T11:19:30.762+0800","name":"cete",
"caller":"storage/kvs.go:28","message":"failed to open database","opts":{"Di
r":"node1\kvs","ValueDir":"node1\kvs","SyncWrites":false,"TableLoadingMode":2,
"ValueLogLoadingMode":2,"NumVersionsToKeep":1,"ReadOnly":false,"Truncate":false,
"Logger":null,"Compression":1,"EventLogging":true,"MaxTableSize":67108864,"Level
SizeMultiplier":10,"MaxLevels":7,"ValueThreshold":32,"NumMemtables":5,"BlockSize
":4096,"BloomFalsePositive":0.01,"KeepL0InMemory":true,"MaxCacheSize":1073741824
,"NumLevelZeroTables":5,"NumLevelZeroTablesStall":10,"LevelOneSize":268435456,"V
alueLogFileSize":1073741823,"ValueLogMaxEntries":1000000,"NumCompactors":2,"Comp
actL0OnClose":true,"LogRotatesToFlush":2,"VerifyValueChecksum":false,"Encryption
Key":"","EncryptionKeyRotationDuration":864000000000000,"ChecksumVerificationMod
e":0},"error":"During db.vlog.open: Value log truncate required to run DB. This
might result in data loss","errorVerbose":"Value log truncate required to run DB
. This might result in data loss\ngithub.com/dgraph-io/badger/v2.init\n\t/Users/
m-osuka/go/pkg/mod/github.com/dgraph-io/badger/[email protected]/errors.go:103\nruntime.
doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5414\nruntime.do
Init\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\nruntime.doIn
it\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\nruntime.doInit
\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\nruntime.doInit\n
\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\nruntime.main\n\t/u
sr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:190\nruntime.goexit\n\t/usr/
local/Cellar/go/1.14/libexec/src/runtime/asm_amd64.s:1373\nDuring db.vlog.open\n
github.com/dgraph-io/badger/v2/y.Wrapf\n\t/Users/m-osuka/go/pkg/mod/github.com/d
graph-io/badger/[email protected]/y/error.go:82\ngithub.com/dgraph-io/badger/v2.Open\n\t
/Users/m-osuka/go/pkg/mod/github.com/dgraph-io/badger/[email protected]/db.go:357\ngithu
b.com/mosuka/cete/storage.NewKVS\n\t/Users/m-osuka/go/src/github.com/mosuka/cete
/storage/kvs.go:26\ngithub.com/mosuka/cete/server.NewRaftFSM\n\t/Users/m-osuka/g
o/src/github.com/mosuka/cete/server/raft_fsm.go:36\ngithub.com/mosuka/cete/serve
r.NewRaftServer\n\t/Users/m-osuka/go/src/github.com/mosuka/cete/server/raft_serv
er.go:44\ngithub.com/mosuka/cete/cmd.glob..func11\n\t/Users/m-osuka/go/src/githu
b.com/mosuka/cete/cmd/start.go:55\ngithub.com/spf13/cobra.(*Command).execute\n\t
/Users/m-osuka/go/pkg/mod/github.com/spf13/[email protected]/command.go:838\ngithub.c
om/spf13/cobra.(*Command).ExecuteC\n\t/Users/m-osuka/go/pkg/mod/github.com/spf13
/[email protected]/command.go:943\ngithub.com/spf13/cobra.(*Command).Execute\n\t/User
s/m-osuka/go/pkg/mod/github.com/spf13/[email protected]/command.go:883\ngithub.com/mo
suka/cete/cmd.Execute\n\t/Users/m-osuka/go/src/github.com/mosuka/cete/cmd/root.g
o:16\nmain.main\n\t/Users/m-osuka/go/src/github.com/mosuka/cete/main.go:10\nrunt
ime.main\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:203\nruntime.g
oexit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/asm_amd64.s:1373"}
{"level":"error","timestamp":"2022-08-17T11:19:30.787+0800","name":"cete",
"caller":"server/raft_fsm.go:38","message":"failed to create key value store
","path":"node1\kvs","error":"During db.vlog.open: Value log truncate required
to run DB. This might result in data loss","errorVerbose":"Value log truncate re
quired to run DB. This might result in data loss\ngithub.com/dgraph-io/badger/v2
.init\n\t/Users/m-osuka/go/pkg/mod/github.com/dgraph-io/badger/[email protected]/errors.
go:103\nruntime.doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:
5414\nruntime.doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:54
09\nruntime.doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409
\nruntime.doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\n
runtime.doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\nru
ntime.main\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:190\nruntime
.goexit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/asm_amd64.s:1373\nDurin
g db.vlog.open\ngithub.com/dgraph-io/badger/v2/y.Wrapf\n\t/Users/m-osuka/go/pkg/
mod/github.com/dgraph-io/badger/[email protected]/y/error.go:82\ngithub.com/dgraph-io/ba
dger/v2.Open\n\t/Users/m-osuka/go/pkg/mod/github.com/dgraph-io/badger/[email protected]/
db.go:357\ngithub.com/mosuka/cete/storage.NewKVS\n\t/Users/m-osuka/go/src/github
.com/mosuka/cete/storage/kvs.go:26\ngithub.com/mosuka/cete/server.NewRaftFSM\n\t
/Users/m-osuka/go/src/github.com/mosuka/cete/server/raft_fsm.go:36\ngithub.com/m
osuka/cete/server.NewRaftServer\n\t/Users/m-osuka/go/src/github.com/mosuka/cete/
server/raft_server.go:44\ngithub.com/mosuka/cete/cmd.glob..func11\n\t/Users/m-os
uka/go/src/github.com/mosuka/cete/cmd/start.go:55\ngithub.com/spf13/cobra.(*Comm
and).execute\n\t/Users/m-osuka/go/pkg/mod/github.com/spf13/[email protected]/command.
go:838\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/Users/m-osuka/go/pkg/mod/
github.com/spf13/[email protected]/command.go:943\ngithub.com/spf13/cobra.(*Command).
Execute\n\t/Users/m-osuka/go/pkg/mod/github.com/spf13/[email protected]/command.go:88
3\ngithub.com/mosuka/cete/cmd.Execute\n\t/Users/m-osuka/go/src/github.com/mosuka
/cete/cmd/root.go:16\nmain.main\n\t/Users/m-osuka/go/src/github.com/mosuka/cete/
main.go:10\nruntime.main\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.g
o:203\nruntime.goexit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/asm_amd64
.s:1373"}
{"level":"error","timestamp":"2022-08-17T11:19:30.789+0800","name":"cete",
"caller":"server/raft_server.go:46","message":"failed to create FSM","path":
"node1\kvs","error":"During db.vlog.open: Value log truncate required to run DB
. This might result in data loss","errorVerbose":"Value log truncate required to
run DB. This might result in data loss\ngithub.com/dgraph-io/badger/v2.init\n\t
/Users/m-osuka/go/pkg/mod/github.com/dgraph-io/badger/[email protected]/errors.go:103\nr
untime.doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5414\nrun
time.doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\nrunti
me.doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\nruntime
.doInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\nruntime.d
oInit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:5409\nruntime.mai
n\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:190\nruntime.goexit\n
\t/usr/local/Cellar/go/1.14/libexec/src/runtime/asm_amd64.s:1373\nDuring db.vlog
.open\ngithub.com/dgraph-io/badger/v2/y.Wrapf\n\t/Users/m-osuka/go/pkg/mod/githu
b.com/dgraph-io/badger/[email protected]/y/error.go:82\ngithub.com/dgraph-io/badger/v2.O
pen\n\t/Users/m-osuka/go/pkg/mod/github.com/dgraph-io/badger/[email protected]/db.go:357
\ngithub.com/mosuka/cete/storage.NewKVS\n\t/Users/m-osuka/go/src/github.com/mosu
ka/cete/storage/kvs.go:26\ngithub.com/mosuka/cete/server.NewRaftFSM\n\t/Users/m-
osuka/go/src/github.com/mosuka/cete/server/raft_fsm.go:36\ngithub.com/mosuka/cet
e/server.NewRaftServer\n\t/Users/m-osuka/go/src/github.com/mosuka/cete/server/ra
ft_server.go:44\ngithub.com/mosuka/cete/cmd.glob..func11\n\t/Users/m-osuka/go/sr
c/github.com/mosuka/cete/cmd/start.go:55\ngithub.com/spf13/cobra.(*Command).exec
ute\n\t/Users/m-osuka/go/pkg/mod/github.com/spf13/[email protected]/command.go:838\ng
ithub.com/spf13/cobra.(*Command).ExecuteC\n\t/Users/m-osuka/go/pkg/mod/github.co
m/spf13/[email protected]/command.go:943\ngithub.com/spf13/cobra.(*Command).Execute\n
\t/Users/m-osuka/go/pkg/mod/github.com/spf13/[email protected]/command.go:883\ngithub
.com/mosuka/cete/cmd.Execute\n\t/Users/m-osuka/go/src/github.com/mosuka/cete/cmd
/root.go:16\nmain.main\n\t/Users/m-osuka/go/src/github.com/mosuka/cete/main.go:1
0\nruntime.main\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/proc.go:203\nru
ntime.goexit\n\t/usr/local/Cellar/go/1.14/libexec/src/runtime/asm_amd64.s:1373"}
Error: During db.vlog.open: Value log truncate required to run DB. This might result in data loss
Usage:
  cete start [flags]
Flags:
      --certificate-file string    path to the client server TLS certificate file
      --common-name string         certificate common name
      --config-file string         config file. if omitted, cete.yaml in /etc and home directory will be searched
      --data-directory string      data directory which store the key-value store data and Raft logs (default "/tmp/cete/data")
      --grpc-address string        gRPC server listen address (default ":9000")
  -h, --help                       help for start
      --http-address string        HTTP server listen address (default ":8000")
      --id string                  node ID (default "node1")
      --key-file string            path to the client server TLS key file
      --log-compress               compress a log file
      --log-file string            log file (default "/dev/stderr")
      --log-level string           log level (default "INFO")
      --log-max-age int            max age of a log file in days (default 30)
      --log-max-backups int        max backup count of log files (default 3)
      --log-max-size int           max size of a log file in megabytes (default 500)
      --peer-grpc-address string   listen address of the existing gRPC server in the joining cluster
      --raft-address string        Raft server listen address (default ":7000")
Cete is really awesome and I am using it as a store for JSON data.
It would be amazing if it offered SQL SELECT functionality.
API: https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-glacier-select-sql-reference-select.html
Implementation example here: https://github.com/minio/minio/blob/master/pkg/s3select/select.go
Just like redigo/go-redis or aerospike-client-go
As a distributed KV store, I think distributed transactions are needed.
Will cete implement this?
Support Basic Auth.
Refer to following issue:
mosuka/blast#45
Cete works really well as a KV store; however, I'd like to be able to do basic queries as well. BadgerHold provides querying for a Badger database. Could this be integrated into Cete? Thank you.
Where("field").Eq(value)
Where("field").Ne(value)
Where("field").Gt(value)
Where("field").Lt(value)
Where("field").Le(value)
Where("field").Ge(value)
Where("field").In(val1, val2, val3)
Where("field").IsNil()
Where("field").RegExp(regexp.MustCompile("ea"))
Where("field").MatchFunc(func(ra *RecordAccess) (bool, error))
Where("field").Eq(value).Skip(10)
Where("field").Eq(value).Limit(10)
Where("field").Eq(value).SortBy("field1", "field2")
Where("field").Eq(value).SortBy("field").Reverse()
Where("field").Eq(value).Index("indexName")
Allow CLI options to be read from the configuration file.
First of all, I would like to thank you for the amazing work you did here, outstanding!
This ticket is related to issues we encountered in a cloud environment, and more specifically in Kubernetes deployments.
A little background about our setup: we are migrating around 10 million records from ETCD to CETE for a total of ca. 30 GB of data. The data migrated contains JSON, HTML, and other formats.
I would like to address the issue of logs as currently too much traffic/data is being generated and logs are too verbose:
I will update this ticket with more information about other logs as we encounter them.
Hi,
Is it possible to use Cete instead of Redis without changing the client library, i.e. by speaking the Redis protocol (https://redis.io/topics/protocol)?
Basic Set, Get, Del commands are enough.
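For context on what supporting existing Redis clients would involve, the sketch below shows how a client encodes the three commands mentioned above in the Redis serialization protocol (RESP): each command is sent as an array of bulk strings. `encodeCommand` is a hypothetical helper written for this illustration; nothing here is part of Cete.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// encodeCommand serializes a command and its arguments as a RESP array
// of bulk strings: "*<count>\r\n" followed by "$<byte length>\r\n<arg>\r\n"
// for each argument.
func encodeCommand(args ...string) string {
	var b strings.Builder
	b.WriteString("*" + strconv.Itoa(len(args)) + "\r\n")
	for _, a := range args {
		b.WriteString("$" + strconv.Itoa(len(a)) + "\r\n")
		b.WriteString(a + "\r\n")
	}
	return b.String()
}

func main() {
	// %q makes the CRLF separators visible in the output.
	fmt.Printf("%q\n", encodeCommand("SET", "key", "value"))
	fmt.Printf("%q\n", encodeCommand("GET", "key"))
	fmt.Printf("%q\n", encodeCommand("DEL", "key"))
}
```

A server that wants to be a drop-in target for Redis clients would need to parse this framing and answer in the matching RESP reply types.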
I am looking for a tool to replace Celery Broker RabbitMQ / Redis
https://docs.celeryproject.org/en/latest/getting-started/brokers/index.html#broker-overview
I think Cete is great for this. Only currently there is no client library to easily integrate Cete with Celery :/
Hello,
I am interested in Cete for possible use in a highly scalable P2P system and was wondering whether Cete does replication, horizontal scaling, or both.
I will soon start working again on a cete cluster on Kubernetes and was wondering if anyone here has already achieved something of the sort.
Against standard Redis and Aerospike for example.
Hi!
Is there any way to secure the cluster? Because running it all plain open is a complete dealbreaker...