
tinykv's People

Contributors

connor1996, dependabot[bot], fengzixu, huaouo, jldataku, jychen7, kikimo, learndatastorage, michaeldesteven, mind1949, moonm3n, nasnoisaac, niebayes, ninglin-p, nothatdinger, nrc, onesizefitsquorum, oocococo, rapiz1, smith-cruise, super-long, trafalgarricardolu, unconsolable, wenyxu, willendless, xiongjiwei, yangkeao, yanguwan, zhuo1angt, zyfzyf


tinykv's Issues

should heartbeat RPC update commit?

TestHandleHeartbeat2AA expects the commit index to change.

But later I encountered a bug like this:

  • an old leader is partitioned away and receives a log entry (say, index 10) from a client, with no chance to commit it.
  • meanwhile, the majority elects a new leader and the commit index advances to 15.
  • when the old leader rejoins the cluster and receives a heartbeat RPC saying commit is 15, it will apply the wrong log entry at index 10 (that entry should be overwritten).
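
For context, etcd's raft (which TinyKV's raft is modeled on) avoids exactly this on the leader side: a heartbeat's Commit field is capped at the follower's Match index, so a lagging peer is never told to commit an entry it has not replicated. A sketch in tinykv style, where the field names (Prs, Match, RaftLog.committed) are assumptions about the surrounding skeleton:

func (r *Raft) sendHeartbeat(to uint64) {
	// Never advertise a commit index beyond what this follower is known to
	// have replicated; a stale peer therefore cannot commit index 10 in the
	// scenario above.
	commit := r.RaftLog.committed
	if match := r.Prs[to].Match; match < commit {
		commit = match
	}
	r.msgs = append(r.msgs, pb.Message{
		MsgType: pb.MessageType_MsgHeartbeat,
		From:    r.id,
		To:      to,
		Term:    r.Term,
		Commit:  commit,
	})
}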

Possible typo in the README

the "overview of the code section" starts with

Same as the architect TiDB + TiKV + PD

Should it not be this?

Same as the architecture of TiDB + TiKV + PD

Failed to perform `make` command under macOS with go 1.13.6

After I ran the make command to try out this project, I hit a Go compilation problem. The error message is below:

GO111MODULE=on go build  -tags codes -ldflags '-L/usr/local/opt/icu4c/lib -X "main.gitHash=`git rev-parse HEAD`" ' -o bin/tinykv-server kv/tinykv-server/main.go
# command-line-arguments
flag provided but not defined: -L/usr/local/opt/icu4c/lib

Then I checked the Makefile under the root directory

LDFLAGS             += -X "main.gitHash=`git rev-parse HEAD`" 

It seems to just add version info; the -L/usr/local/opt/icu4c/lib part wasn't added by tinykv. So I checked the value of the LDFLAGS environment variable on my host.

The answer is here

# echo $LDFLAGS                                                                                                                                                                                               
-L/usr/local/opt/icu4c/lib

LDFLAGS had an initial value in my environment, which affected this project.

My solution is easy:

export LDFLAGS=''

Actually, I am not sure whether tinykv needs to inherit this initial value at all. Maybe the Makefile could clear LDFLAGS to avoid such errors; that would be friendlier for students.

region meta in Scheduler and TinyKV has the same address

https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/test_raftstore/scheduler.go#L345-L348
From the code, it seems that the region meta in the Scheduler and in TinyKV should be independent.
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/raftstore/peer.go#L345-L352
But the code sends a pointer to the scheduler, which will cause problems when changing peers. Maybe the region should be cloned first.
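
One possible remedy, sketched below, is to deep-copy the region before handing it to the scheduler client. The import paths and the availability of proto.Clone are assumptions; any equivalent deep copy would do.

import (
	"github.com/gogo/protobuf/proto"

	"github.com/pingcap-incubator/tinykv/proto/pkg/metapb"
)

// cloneRegion is a hypothetical helper: it deep-copies the region metadata so
// the scheduler's copy can never alias raftstore's in-memory region.
func cloneRegion(region *metapb.Region) *metapb.Region {
	return proto.Clone(region).(*metapb.Region)
}

Sending cloneRegion(region) in the heartbeat task instead of the raw pointer would keep the two copies independent.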

eraftpb.pb.go fileDescriptor inconsistency

I ran make proto locally, then used git status to check whether any files had been modified; eraftpb.pb.go was the only one. As shown in the figure below, the fileDescriptor of eraftpb.proto generated by protoc changed from 6ff387e2d76c3034 to 2f2e0bcef614736b.
[figure: git diff of the fileDescriptor change in eraftpb.pb.go]
Actually, I think this generated code should not be pushed to the git repo, but since these files are already committed, would it be better to keep them consistent? Do I need to fix this, or is it just due to a difference in environments?

`newTestPeerStorageFromEnts` does not modify applyState in memory, only in the DB

It only changes applyState in the DB:
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/raftstore/peer_storage_test.go#L34-L40

But the provided functions PeerStorage.truncatedIndex and PeerStorage.truncatedTerm use the in-memory applyState:
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/raftstore/peer_storage.go#L219-L221

So in TestPeerStorageTerm, tests 2 and 3, Term will return ErrCompacted, since truncatedIndex is initially 5.

Should we modify the PeerStorage.truncatedIndex and PeerStorage.truncatedTerm functions?
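
If the helper is meant to keep the two in sync, one small fix could be to update the in-memory applyState alongside the DB write. This is only a sketch: peerStore stands for the PeerStorage the helper constructs, and the field names follow the linked peer_storage.go, assuming ents[0] carries the truncated index and term as the DB write suggests.

// In newTestPeerStorageFromEnts, after persisting the apply state, also keep
// the in-memory copy consistent so truncatedIndex()/truncatedTerm() see it.
peerStore.applyState.TruncatedState.Index = ents[0].Index
peerStore.applyState.TruncatedState.Term = ents[0].Term
peerStore.applyState.AppliedIndex = ents[len(ents)-1].Index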

`TestRawNodeRestart2AC` & `TestRawNodeRestartFromSnapshot2C` expected results don't have SoftState and HardState

DeepEqual will fail because want and rd differ in SoftState and HardState.

Maybe we need something like:

want := Ready{
	SoftState: &SoftState{
		Lead:      None,
		RaftState: StateFollower,
	},
	HardState: pb.HardState{
		Term:   st.Term,
		Commit: st.Commit,
	},
	Entries:  []pb.Entry{},
	Snapshot: pb.Snapshot{},
	// commit up to the commit index in st
	CommittedEntries: entries[:st.Commit],
	Messages:         make([]pb.Message, 0), // should Messages be an empty slice or nil?
}

transactions: multi-threaded transactions

Currently, the scheduler executes all requests sequentially on a single thread. We should execute them concurrently. This will require synchronization of the storage layer using latches.
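
For concreteness, here is a minimal, self-contained sketch of per-key latching (not the actual tinykv latches package): commands touching overlapping keys wait for one another, while disjoint commands run in parallel. Callers would need to acquire keys in a consistent (e.g. sorted) order to avoid deadlock.

package latches

import "sync"

// Latches maps each latched key to a channel that is closed when the key is
// released, so waiters can block until the current holder finishes.
type Latches struct {
	mu      sync.Mutex
	latches map[string]chan struct{}
}

func NewLatches() *Latches {
	return &Latches{latches: map[string]chan struct{}{}}
}

// Acquire blocks until all keys are latched by the caller.
func (l *Latches) Acquire(keys [][]byte) {
	for _, key := range keys {
		for {
			l.mu.Lock()
			wait, held := l.latches[string(key)]
			if !held {
				l.latches[string(key)] = make(chan struct{})
				l.mu.Unlock()
				break
			}
			l.mu.Unlock()
			<-wait // wait for the current holder to release this key
		}
	}
}

// Release unlatches the keys and wakes any waiting commands.
func (l *Latches) Release(keys [][]byte) {
	l.mu.Lock()
	defer l.mu.Unlock()
	for _, key := range keys {
		if wait, ok := l.latches[string(key)]; ok {
			delete(l.latches, string(key))
			close(wait)
		}
	}
}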

storage: refactor tests

The command tests are very repetitive, with lots of boilerplate; there must be a better way.
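
One common Go approach, sketched here with purely illustrative names (none of these helpers exist in the repo), is to move the repeated setup and assertion steps into a table-driven test:

package commands_test

import "testing"

// TestRawGetTable is a sketch of the table-driven structure, not a real test.
func TestRawGetTable(t *testing.T) {
	cases := []struct {
		name     string
		key      []byte
		seed     []byte // value written before the get; nil means no write
		want     []byte
		notFound bool
	}{
		{name: "existing key", key: []byte{99}, seed: []byte{42}, want: []byte{42}},
		{name: "missing key", key: []byte{100}, notFound: true},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			// Shared setup/teardown, the actual request, and the assertions
			// against tc.want / tc.notFound would live here, written once
			// instead of being repeated in every test function.
		})
	}
}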

creating a peer twice may be fatal

  1. It may create a peer twice when a region splits:
  • when a region splits, it should call createPeer
  • when applying AdminCmdType_Split, there is no way to find out whether the new peer is already being created
  • store_worker may call replicatePeer to create the new peer when it receives a msg from another node

So it may create the peer twice at the same time.

  2. An error occurs when the peer is created twice at the same time:
  • createPeer will init raftLocalState.LastIndex and raftLocalState.LastTerm to 5, while replicatePeer will init them to 0, and both write them to storage.

Imagine this: createPeer writes raftLocalState.LastIndex and raftLocalState.LastTerm as 5 to storage, and replicatePeer reads them; but when initializing applyState, replicatePeer finds nothing and initializes applyState to 0. It will then hit a bug, since raftLocalState and applyState do not match.

Am I right? Or can this situation never happen, and am I missing something?

Add comments for runner handler

Scope/requirements

I would like to define what is in and out of scope for the initial iteration of TinyKV. I'll start a list here; please comment if I got something wrong or if there are other things which should be in or out.

cc @siddontang, @Connor1996

In scope

  • Test framework
  • Distribution using Raft (Raft implemented as part of TinyKV, not using a library)
  • Raw and transactional API
  • Optimistic transactions (Percolator-style)
  • PD to balance regions and provide timestamps
  • Interoperable with TinySQL

Out of scope

  • Any kind of engine abstraction
  • Debug or diagnostics services (for development only)
  • Metrics/statistics
  • Any GC
  • Pessimistic transactions
  • backup/restore; sst import
  • profiler
  • status server
  • coprocessor
  • batched commands
  • security
  • interoperable with real TiDB and real PD

raft_test doesn't Step through messages generated by Raft.tick()?

From what I understand, both local messages and network messages should be appended to Raft.msgs, and some code is responsible for looping through Raft.msgs and calling Raft.Step.
Is that correct?

So the local messages (MessageType_MsgHup and MessageType_MsgBeat) generated by Raft.tick() just sit in msgs waiting to be processed, and until then the node's state will not change.

If that is the case, the following code will not terminate:
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/raft/raft_test.go#L797-L799
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/raft/raft_test.go#L853-L855
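
For reference, etcd's raft (and the behaviour these tests seem to assume) does not queue local messages in Raft.msgs at all: tick steps them immediately, so the node's state changes without any external message pump. A sketch, where electionElapsed and electionTimeout follow the tinykv skeleton and the randomized timeout is elided:

func (r *Raft) tickElection() {
	r.electionElapsed++
	if r.electionElapsed >= r.electionTimeout {
		r.electionElapsed = 0
		// Local message: handled synchronously via Step, never appended to
		// r.msgs and never sent over the network.
		r.Step(pb.Message{From: r.id, MsgType: pb.MessageType_MsgHup})
	}
}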

raftstore: add integration tests

LAB2: kvraft

PartA: raft

  • raft related test

PartB: raftstore + apply

  • port MIT6.824 3A #63

PartC: snapshot + raftloggc

  • port MIT6.824 3B #72

LAB3: multiraft

PartA: support conf change in raft

  • raft related test

PartB: implement conf change in raftstore

  • test_simple_conf_change #75
  • test_split_brain

PartC: split

  • test_base_split_region #83
  • test_split_epoch_not_match

storage: rename or split up kvstore package

The idea of the package is to provide abstractions for interacting with a low-level KV store. However, transaction and lock/write do somewhat different things, and it is really a package for lowering TinySQL abstractions to KV abstractions.

Simplify Scheduler code base

Schedule Part

At present, all sent messages follow these flows:

request -> (get store/bind stream/get region) -> raft cluster HandleRegionHeartbeat -> Dispatch

drivePushOperator -> PushOperators -> Dispatch -> SendScheduleCommand

PromoteWaitingOperator -> addOperatorLocked -> SendScheduleCommand

All operators are created in these flows:

Scheduler runScheduler -> AddWaitingOperator 

patrolRegions -> AddWaitingOperator

We can simplify the scheduler on these points:

  • Remove the merge operator and all related changes (because the merge operation involves two operators)

  • Simplify the overall routine. Maybe the simplest way is: request -> operator. Delete the waiting queue, pushDriver, etc. Should we keep a waiting queue (or simply cancel all operators that exceed their time limit)?

schedule:

  • checker: make the replica checker only check down peers and offline peers (or just remove replica_checker) #61
  • filter: remove unnecessary filters #62
  • operator: only keep TransferLeader, AddPeer, RemovePeer #61
  • opt: remove these options #62
  • placement: remove it #61
  • remove waiting queue, pushDriver #61

schedulers:

  • remove label scheduler #61
  • simplify balance region scheduler #62

Corrupted snap files due to hard linking received snapshots

While working on a solution, I sometimes got errors related to applying a snapshot, complaining that some of the sst files have an invalid size of 0.

This is strange, as snapshots are validated on reception; if a snapshot passes that validation, it should not have an invalid size.

After digging into the snapshot applying code path, I noticed that DB.IngestExternalFiles is just hardlinking the sst files into badger without removing the received file here:
https://github.com/pingcap-incubator/tinykv/blob/fc36c301fbf92d05e30e629453a7c3282399af75/kv/raftstore/snap/snap.go#L692

So what can happen is that the sst files get compacted as part of LSM compaction and their size is reduced to 0 (so compaction literally unmaps/truncates to free space, I assume?). When the snapshot is re-sent, the follower finds that the snapshot already exists. Then, when the follower applies the snapshot, it blows up on the size mismatch.

we should either:

  1. make a copy when applying the snapshot, or better
  2. remove the link after ingesting the snapshot file (and the entire snapshot file info including the manifest)?

I did approach 1) in my solution:
https://github.com/junlong-gao/tinykv/blob/0fc6c28acffee911a38b2c043809e6977d7d2822/kv/raftstore/snap/snap.go#L692
and the issue no longer surfaces.
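
For illustration, a minimal copy helper in the spirit of approach 1), using only standard library calls and assuming nothing about the surrounding snap.go code:

import (
	"io"
	"os"
)

// copySnapFile copies src to dst so badger ingests a private copy; later
// compaction of the ingested file then cannot corrupt the received snapshot.
func copySnapFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()

	if _, err := io.Copy(out, in); err != nil {
		return err
	}
	return out.Sync()
}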

Could the PingCAP team please take a look at this and validate whether my theory here makes sense?

transaction layer

This issue is for the layer between the raft store and the server API, i.e., the server and storage modules in TiKV.

This layer must support the following:

  • translate protobuf data into internal data and commands
  • schedule and run commands
  • ensure that commands executing in parallel are serializable
  • handle locks and locking of data
  • manage locks (handle lock timeouts, etc)
  • delegate actual storage of key/values to raftstore
  • provide consistent reads (from snapshots)
  • handle rollbacks in the event of conflicts

At the highest level, this should be implemented in kv/tikv/server.go by handling gRPC messages and delegating to the methods of InnerServer.
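
A rough sketch of that top-level shape (the innerServer field and the exact method names here are assumptions, not the actual API):

func (s *Server) KvGet(ctx context.Context, req *kvrpcpb.GetRequest) (*kvrpcpb.GetResponse, error) {
	// The delegate is expected to translate the protobuf request into an
	// internal command, schedule it (taking latches on its keys), run it
	// against a consistent snapshot from the raft store, and translate any
	// locks or conflicts back into kvrpcpb errors on the response.
	return s.innerServer.KvGet(ctx, req)
}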

What does storage.Start() do?

I think this function initializes storage, but it is not called in kv/server/server_test.go. So what does this function do?

should a message from an old term be ignored or not?

In raft_test.go (line 535), TestHandleMessageType_MsgAppend2AB, test cases 8 and 9:
The message term is 1, the messages are sent to a follower which is in term 2, and the follower is expected to append them and update its state. From what I understand, these messages should be ignored because their term is lower than the follower's. Can someone explain this? Thanks. :)

Possible typo in the README

Course material
Please follow the course material to learn the background knowledge and finish code step by step.

Should it be "Course Materials"?

Consider implementing error for KeyError

Just like RegionError does, we can implement the error interface for kvrpcpb.KeyError like this:

type KeyError struct {
	Err *kvrpcpb.KeyError
}

func (ke *KeyError) Error() string {
	return ke.Err.String()
}

So functions can unify the kvrpcpb.KeyError return value into error. For example:

func (p *Prewrite) prewriteMutation(txn *mvcc.MvccTxn, mut *kvrpcpb.Mutation) (*kvrpcpb.KeyError, error)  {}
->
func (p *Prewrite) prewriteMutation(txn *mvcc.MvccTxn, mut *kvrpcpb.Mutation) error {}
func (scan *Scanner) Next() ([]byte, []byte, interface{}) {}
-> 
func (scan *Scanner) Next() ([]byte, []byte, error) {}

panic in NewTestServer

I encountered a bug in the test testClusterSuite.TestConcurrentHandleRegion3C. When NewTestServer is called, it panics. Error stack:

PANIC: cluster_test.go:376: testClusterSuite.TestConcurrentHandleRegion3C

... Panic: invalid page type: 0: 4 (PC=0x43F645)

/usr/lib/go-1.14/src/runtime/panic.go:969
  in gopanic
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/cursor.go:250
  in Cursor.search
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/cursor.go:159
  in Cursor.seek
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/bucket.go:165
  in Bucket.CreateBucket
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/tx.go:108
  in Tx.CreateBucket
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/backend/batch_tx.go:72
  in batchTx.UnsafeCreateBucket
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/kvstore.go:149
  in NewStore
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/watchable_store.go:78
  in newWatchableStore
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/watchable_store.go:73
  in New
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/etcdserver/server.go:543
  in NewServer
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/embed/etcd.go:211
  in StartEtcd
server.go:163
  in Server.startEtcd
server.go:300
  in Server.Run
testutil.go:48
  in NewTestServer
cluster_test.go:379
  in testClusterSuite.TestConcurrentHandleRegion3C
/usr/lib/go-1.14/src/reflect/value.go:321
  in Value.Call
/home/xjw/go/pkg/mod/github.com/pingcap/[email protected]/check.go:850
  in suiteRunner.forkTest.func1
/home/xjw/go/pkg/mod/github.com/pingcap/[email protected]/check.go:739
  in suiteRunner.forkCall.func1
/usr/lib/go-1.14/src/runtime/asm_amd64.s:1373
  in goexit

I do not know whether it is my fault.

shall we expose the ErrKeyNotFound error when we invoke the GetCF method?

I encountered a pitfall in kv/server/server_test.go

Firstly

The test case TestRawGetAfterRawDelete1 at line 169 says we need to expose the ErrKeyNotFound error when we try to get a non-existent key:

	resp, err := server.RawGet(nil, get)
	assert.Nil(t, err)
	assert.True(t, resp.NotFound)

Secondly

The code in TestRawDelete1 says we should return a nil error when we try to get a key that has already been deleted:

	_, err := server.RawDelete(nil, req)
	assert.Nil(t, err)
	val, err := Get(s, cf, []byte{99})
	assert.Equal(t, nil, err)
	assert.Equal(t, []byte(nil), val)

In the end

I am able to pass make project1 when I change this line
from

	assert.Equal(t, nil, err)

to

	assert.Equal(t, badger.ErrKeyNotFound, err)

What I want to know

So, I wonder: is this a bug, or is there a missing part of my code that I should continue to implement?
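
For what it's worth, here is a sketch of how a RawGet handler can satisfy the first test regardless of which convention GetCF follows: treat a missing key as a normal response with NotFound set, rather than as an RPC error. Method and field names follow the kvrpcpb/storage skeleton, but treat the details as assumptions.

func (server *Server) RawGet(_ context.Context, req *kvrpcpb.RawGetRequest) (*kvrpcpb.RawGetResponse, error) {
	reader, err := server.storage.Reader(req.Context)
	if err != nil {
		return nil, err
	}
	defer reader.Close()

	val, err := reader.GetCF(req.Cf, req.Key)
	if err == badger.ErrKeyNotFound || (err == nil && val == nil) {
		// A missing key is not an RPC error: report it via NotFound.
		return &kvrpcpb.RawGetResponse{NotFound: true}, nil
	}
	if err != nil {
		return nil, err
	}
	return &kvrpcpb.RawGetResponse{Value: val}, nil
}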

Cleanup implementation

Clean up the implementation that should be filled in by students and move it to a new branch:

  • lab1
  • lab2
  • lab3
  • lab4

Accessing the global context is racy

For example, storeMeta is accessed in both newRaftWorker (mutated for conf changes) and newStoreWorker (read for heartbeating to PD):

https://github.com/pingcap-incubator/tinykv/blob/fc36c301fbf92d05e30e629453a7c3282399af75/kv/raftstore/raftstore.go#L91

I was working on a solution and noticed that inconsistent region info was sent to the scheduler (the peer and region epoch fields in storeMeta were inconsistent), probably because I was mutating it in place, updating fields one after another, while the heartbeat worker's reads saw one field from before the mutation and one from after.

I avoided this issue in my solution by doing a deep copy:

https://github.com/junlong-gao/tinykv/blob/0fc6c28acffee911a38b2c043809e6977d7d2822/kv/raftstore/peer_msg_handler.go#L79

yet it is still racy.

Perhaps we should add a mutex to the global context so folks working on this will notice the race condition here?
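
To illustrate the suggestion, here is a sketch of the shape such a fix could take (this is not the actual storeMeta type, and proto.Clone from the gogo protobuf package is an assumed dependency): guard the shared metadata with a mutex and hand deep copies to readers such as the heartbeat path.

import (
	"sync"

	"github.com/gogo/protobuf/proto"
	"github.com/pingcap-incubator/tinykv/proto/pkg/metapb"
)

// guardedStoreMeta is an illustration: a mutex-protected region map.
type guardedStoreMeta struct {
	mu      sync.RWMutex
	regions map[uint64]*metapb.Region
}

// setRegion would be called by the raft worker when applying a conf change or split.
func (m *guardedStoreMeta) setRegion(region *metapb.Region) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.regions[region.Id] = region
}

// regionClone would be called by the store worker when building a heartbeat;
// the deep copy guarantees the scheduler never sees a half-applied mutation.
func (m *guardedStoreMeta) regionClone(id uint64) *metapb.Region {
	m.mu.RLock()
	defer m.mu.RUnlock()
	region, ok := m.regions[id]
	if !ok {
		return nil
	}
	return proto.Clone(region).(*metapb.Region)
}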
