
tinykv's People

Contributors

connor1996, dependabot[bot], fengzixu, huaouo, jldataku, jychen7, kikimo, learndatastorage, michaeldesteven, mind1949, moonm3n, nasnoisaac, niebayes, ninglin-p, nothatdinger, nrc, onesizefitsquorum, oocococo, rapiz1, smith-cruise, super-long, trafalgarricardolu, unconsolable, wenyxu, willendless, xiongjiwei, yangkeao, yanguwan, zhuo1angt, zyfzyf


tinykv's Issues

should heartbeat RPC update commit?

TestHandleHeartbeat2AA expects the commit index to change.

But later I encountered a bug like this:

  • an old leader is partitioned away and receives a log entry (say, index 10) from a client, with no chance to commit it.
  • meanwhile, the majority elects a new leader and the commit index advances to 15.
  • when the old leader rejoins the cluster and receives a heartbeat RPC saying commit is 15, it will apply the wrong log entry at index 10 (that entry should be overwritten).
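
For context, etcd's raft (which TinyKV's raft is modeled on) avoids exactly this on the leader side: a heartbeat's Commit field is capped at the follower's Match index, so a lagging peer is never told to commit an entry it has not replicated. A sketch in tinykv style, where the field names (Prs, Match, RaftLog.committed) are assumptions about the surrounding skeleton:

func (r *Raft) sendHeartbeat(to uint64) {
	// Never advertise a commit index beyond what this follower is known to
	// have replicated; a stale peer therefore cannot commit index 10 in the
	// scenario above.
	commit := r.RaftLog.committed
	if match := r.Prs[to].Match; match < commit {
		commit = match
	}
	r.msgs = append(r.msgs, pb.Message{
		MsgType: pb.MessageType_MsgHeartbeat,
		From:    r.id,
		To:      to,
		Term:    r.Term,
		Commit:  commit,
	})
}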

Possible typo in the README

the "overview of the code section" starts with

Same as the architect TiDB + TiKV + PD

Should it not be this?

Same as the architecture of TiDB + TiKV + PD

Failed to perform `make` command under macOS with go 1.13.6

After I ran the make command to try out this project, I hit a Go compilation problem. The error message is below:

GO111MODULE=on go build  -tags codes -ldflags '-L/usr/local/opt/icu4c/lib -X "main.gitHash=`git rev-parse HEAD`" ' -o bin/tinykv-server kv/tinykv-server/main.go
# command-line-arguments
flag provided but not defined: -L/usr/local/opt/icu4c/lib

Then I checked the Makefile under the root directory

LDFLAGS             += -X "main.gitHash=`git rev-parse HEAD`" 

It seems to just add version info; the -L/usr/local/opt/icu4c/lib part wasn't added by tinykv. So I checked the value of the LDFLAGS environment variable on my host.

The answer is here

# echo $LDFLAGS                                                                                                                                                                                               
-L/usr/local/opt/icu4c/lib

LDFLAGS had an initial value in my environment, which affected this project.

My solution is easy:

export LDFLAGS=''

Actually, I am not sure whether tinykv needs to inherit this initial value at all. Maybe the Makefile could clear LDFLAGS to avoid such errors; that would be friendlier for students.

region meta in Scheduler and TinyKV has the same address

https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/test_raftstore/scheduler.go#L345-L348
From the code, it seems that the region meta in the Scheduler and in TinyKV should be independent.
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/raftstore/peer.go#L345-L352
But the code sends a pointer to the scheduler, which will cause problems when changing peers. Maybe the region should be cloned first.
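
One possible remedy, sketched below, is to deep-copy the region before handing it to the scheduler client. The import paths and the availability of proto.Clone are assumptions; any equivalent deep copy would do.

import (
	"github.com/gogo/protobuf/proto"

	"github.com/pingcap-incubator/tinykv/proto/pkg/metapb"
)

// cloneRegion is a hypothetical helper: it deep-copies the region metadata so
// the scheduler's copy can never alias raftstore's in-memory region.
func cloneRegion(region *metapb.Region) *metapb.Region {
	return proto.Clone(region).(*metapb.Region)
}

Sending cloneRegion(region) in the heartbeat task instead of the raw pointer would keep the two copies independent.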

eraftpb.pb.go fileDescriptor inconsistency

I ran make proto locally, then used git status to check whether any files had been modified; eraftpb.pb.go was the only one. As shown in the figure below, the fileDescriptor of eraftpb.proto generated by protoc changed from 6ff387e2d76c3034 to 2f2e0bcef614736b.
[figure: git diff of the fileDescriptor change in eraftpb.pb.go]
Actually, I think this generated code should not be pushed to the git repo, but since these files are already committed, would it be better to keep them consistent? Do I need to fix this, or is it just due to a difference in environments?

`newTestPeerStorageFromEnts` does not modify applyState in memory, only in the DB

It only changes applyState in the DB:
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/raftstore/peer_storage_test.go#L34-L40

But the provided functions PeerStorage.truncatedIndex and PeerStorage.truncatedTerm use the in-memory applyState:
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/raftstore/peer_storage.go#L219-L221

So in TestPeerStorageTerm, tests 2 and 3, Term will return ErrCompacted, since truncatedIndex is initially 5.

Should we modify the PeerStorage.truncatedIndex and PeerStorage.truncatedTerm functions?
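
If the helper is meant to keep the two in sync, one small fix could be to update the in-memory applyState alongside the DB write. This is only a sketch: peerStore stands for the PeerStorage the helper constructs, and the field names follow the linked peer_storage.go, assuming ents[0] carries the truncated index and term as the DB write suggests.

// In newTestPeerStorageFromEnts, after persisting the apply state, also keep
// the in-memory copy consistent so truncatedIndex()/truncatedTerm() see it.
peerStore.applyState.TruncatedState.Index = ents[0].Index
peerStore.applyState.TruncatedState.Term = ents[0].Term
peerStore.applyState.AppliedIndex = ents[len(ents)-1].Index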

`TestRawNodeRestart2AC` & `TestRawNodeRestartFromSnapshot2C` expected results don't have SoftState and HardState

DeepEqual will fail because want and rd differ in SoftState and HardState.

Maybe we need something like:

want := Ready{
	SoftState: &SoftState{
		Lead:      None,
		RaftState: StateFollower,
	},
	HardState: pb.HardState{
		Term:   st.Term,
		Commit: st.Commit,
	},
	Entries:  []pb.Entry{},
	Snapshot: pb.Snapshot{},
	// commit up to the commit index in st
	CommittedEntries: entries[:st.Commit],
	Messages:         make([]pb.Message, 0), // should Messages be an empty slice or nil?
}

transactions: multi-threaded transactions

Currently, the scheduler executes all requests sequentially on a single thread. We should execute them concurrently. This will require synchronization of the storage layer using latches.
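
For concreteness, here is a minimal, self-contained sketch of per-key latching (not the actual tinykv latches package): commands touching overlapping keys wait for one another, while disjoint commands run in parallel. Callers would need to acquire keys in a consistent (e.g. sorted) order to avoid deadlock.

package latches

import "sync"

// Latches maps each latched key to a channel that is closed when the key is
// released, so waiters can block until the current holder finishes.
type Latches struct {
	mu      sync.Mutex
	latches map[string]chan struct{}
}

func NewLatches() *Latches {
	return &Latches{latches: map[string]chan struct{}{}}
}

// Acquire blocks until all keys are latched by the caller.
func (l *Latches) Acquire(keys [][]byte) {
	for _, key := range keys {
		for {
			l.mu.Lock()
			wait, held := l.latches[string(key)]
			if !held {
				l.latches[string(key)] = make(chan struct{})
				l.mu.Unlock()
				break
			}
			l.mu.Unlock()
			<-wait // wait for the current holder to release this key
		}
	}
}

// Release unlatches the keys and wakes any waiting commands.
func (l *Latches) Release(keys [][]byte) {
	l.mu.Lock()
	defer l.mu.Unlock()
	for _, key := range keys {
		if wait, ok := l.latches[string(key)]; ok {
			delete(l.latches, string(key))
			close(wait)
		}
	}
}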

storage: refactor tests

The command tests are very repetitive, with lots of boilerplate; there must be a better way.
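
One common Go approach, sketched here with purely illustrative names (none of these helpers exist in the repo), is to move the repeated setup and assertion steps into a table-driven test:

package commands_test

import "testing"

// TestRawGetTable is a sketch of the table-driven structure, not a real test.
func TestRawGetTable(t *testing.T) {
	cases := []struct {
		name     string
		key      []byte
		seed     []byte // value written before the get; nil means no write
		want     []byte
		notFound bool
	}{
		{name: "existing key", key: []byte{99}, seed: []byte{42}, want: []byte{42}},
		{name: "missing key", key: []byte{100}, notFound: true},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			// Shared setup/teardown, the actual request, and the assertions
			// against tc.want / tc.notFound would live here, written once
			// instead of being repeated in every test function.
		})
	}
}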

creating a peer twice may be fatal

  1. It may create a peer twice when a region splits:
  • when a region splits, it should call createPeer
  • when applying AdminCmdType_Split, there is no way to find out whether the new peer is already being created
  • store_worker may call replicatePeer to create the new peer when it receives a msg from another node

So it may create the peer twice at the same time.

  2. An error occurs when the peer is created twice at the same time:
  • createPeer will init raftLocalState.LastIndex and raftLocalState.LastTerm to 5, while replicatePeer will init them to 0, and both write them to storage.

Imagine this: createPeer writes raftLocalState.LastIndex and raftLocalState.LastTerm as 5 to storage, and replicatePeer reads them; but when initializing applyState, replicatePeer finds nothing and initializes applyState to 0. It will then hit a bug, since raftLocalState and applyState do not match.

Am I right? Or can this situation never happen, and am I missing something?

Add comments for runner handler

Scope/requirements

I would like to define what is in and out of scope for the initial iteration of TinyKV. I'll start a list here; please comment if I got something wrong or if there are other things which should be in or out.

cc @siddontang, @Connor1996

In scope

  • Test framework
  • Distribution using Raft (Raft implemented as part of TinyKV, not using a library)
  • Raw and transactional API
  • Optimistic transactions (Percolator-style)
  • PD to balance regions and provide timestamps
  • Interoperable with TinySQL

Out of scope

  • Any kind of engine abstraction
  • Debug or diagnostics services (for development only)
  • Metrics/statistics
  • Any GC
  • Pessimistic transactions
  • backup/restore; sst import
  • profiler
  • status server
  • coprocessor
  • batched commands
  • security
  • interoperable with real TiDB and real PD

raft_test doesn't Step through messages generated by Raft.tick()?

From what I understand, both local messages and network messages should be appended to Raft.msgs, and some code is responsible for looping through Raft.msgs and calling Raft.Step.
Is that correct?

So the local messages (MessageType_MsgHup and MessageType_MsgBeat) generated by Raft.tick() just sit in msgs waiting to be processed, and until then the node's state will not change.

If that is the case, the following code will not terminate:
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/raft/raft_test.go#L797-L799
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/raft/raft_test.go#L853-L855
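
For reference, etcd's raft (and the behaviour these tests seem to assume) does not queue local messages in Raft.msgs at all: tick steps them immediately, so the node's state changes without any external message pump. A sketch, where electionElapsed and electionTimeout follow the tinykv skeleton and the randomized timeout is elided:

func (r *Raft) tickElection() {
	r.electionElapsed++
	if r.electionElapsed >= r.electionTimeout {
		r.electionElapsed = 0
		// Local message: handled synchronously via Step, never appended to
		// r.msgs and never sent over the network.
		r.Step(pb.Message{From: r.id, MsgType: pb.MessageType_MsgHup})
	}
}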

raftstore: add integration tests

LAB2: kvraft

PartA: raft

  • raft related test

PartB: raftstore + apply

  • port MIT6.824 3A #63

PartC: snapshot + raftloggc

  • port MIT6.824 3B #72

LAB3: multiraft

PartA: support conf change in raft

  • raft related test

PartB: implement conf change in raftstore

  • test_simple_conf_change #75
  • test_split_brain

PartC: split

  • test_base_split_region #83
  • test_split_epoch_not_match

storage: rename or split up kvstore package

The idea of the package is to provide abstractions for interacting with a low-level KV store. However, transaction and lock/write do somewhat different things, and it is really a package for lowering TinySQL abstractions to KV abstractions.

Simplify Scheduler code base

Schedule Part

At present, all sent messages follow these flows:

request -> (get store/bind stream/get region) -> raft cluster HandleRegionHeartbeat -> Dispatch

drivePushOperator -> PushOperators -> Dispatch -> SendScheduleCommand

PromoteWaitingOperator -> addOperatorLocked -> SendScheduleCommand

All operators are created in these flows:

Scheduler runScheduler -> AddWaitingOperator 

patrolRegions -> AddWaitingOperator

We can simplify the scheduler on these points:

  • Remove the merge operator and all related changes (because the merge operation involves two operators)

  • Simplify the overall routine. Maybe the simplest way is: request -> operator. Delete the waiting queue, pushDriver, etc. Should we keep a waiting queue (or simply cancel all operators that exceed their time limit)?

schedule:

  • checker: make the replica checker only check down peers and offline peers (or just remove replica_checker) #61
  • filter: remove unnecessary filters #62
  • operator: only keep TransferLeader, AddPeer, RemovePeer #61
  • opt: remove these options #62
  • placement: remove it #61
  • remove waiting queue, pushDriver #61

schedulers:

  • remove label scheduler #61
  • simplify balance region scheduler #62

Corrupted snap files due to hard linking received snapshots

While working on a solution, I sometimes got errors related to applying a snapshot, complaining that some of the sst files have an invalid size of 0.

This is strange, as snapshots are validated on reception; if a snapshot passes that validation, it should not have an invalid size.

After digging into the snapshot applying code path, I noticed that DB.IngestExternalFiles is just hardlinking the sst files into badger without removing the received file here:
https://github.com/pingcap-incubator/tinykv/blob/fc36c301fbf92d05e30e629453a7c3282399af75/kv/raftstore/snap/snap.go#L692

So what can happen is that the sst files get compacted as part of LSM compaction and their size is reduced to 0 (so compaction literally unmaps/truncates to free space, I assume?). When the snapshot is re-sent, the follower finds that the snapshot already exists. Then, when the follower applies the snapshot, it blows up on the size mismatch.

we should either:

  1. make a copy when applying the snapshot, or better
  2. remove the link after ingesting the snapshot file (and the entire snapshot file info including the manifest)?

I did approach 1) in my solution:
https://github.com/junlong-gao/tinykv/blob/0fc6c28acffee911a38b2c043809e6977d7d2822/kv/raftstore/snap/snap.go#L692
and the issue no longer surfaces.
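
For illustration, a minimal copy helper in the spirit of approach 1), using only standard library calls and assuming nothing about the surrounding snap.go code:

import (
	"io"
	"os"
)

// copySnapFile copies src to dst so badger ingests a private copy; later
// compaction of the ingested file then cannot corrupt the received snapshot.
func copySnapFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()

	if _, err := io.Copy(out, in); err != nil {
		return err
	}
	return out.Sync()
}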

Could the PingCAP team please take a look at this and validate whether my theory here makes sense?

transaction layer

This issue is for the layer between the raft store and the server API, i.e., the server and storage modules in TiKV.

This layer must support the following:

  • translate protobuf data into internal data and commands
  • schedule and run commands
  • ensure that commands executing in parallel are serializable
  • handle locks and locking of data
  • manage locks (handle lock timeouts, etc)
  • delegate actual storage of key/values to raftstore
  • provide consistent reads (from snapshots)
  • handle rollbacks in the event of conflicts

At the highest level, this should be implemented in kv/tikv/server.go by handling gRPC messages and delegating to the methods of InnerServer.
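
A rough sketch of that top-level shape (the innerServer field and the exact method names here are assumptions, not the actual API):

func (s *Server) KvGet(ctx context.Context, req *kvrpcpb.GetRequest) (*kvrpcpb.GetResponse, error) {
	// The delegate is expected to translate the protobuf request into an
	// internal command, schedule it (taking latches on its keys), run it
	// against a consistent snapshot from the raft store, and translate any
	// locks or conflicts back into kvrpcpb errors on the response.
	return s.innerServer.KvGet(ctx, req)
}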

What does storage.Start() do?

I think this function initializes storage, but it is not called in kv/server/server_test.go. So what does this function do?

should a message from an old term be ignored or not?

In raft_test.go (line 535), TestHandleMessageType_MsgAppend2AB, test cases 8 and 9:
The message term is 1, the messages are sent to a follower which is in term 2, and the follower is expected to append them and update its state. From what I understand, these messages should be ignored because their term is lower than the follower's. Can someone explain this? Thanks. :)

Possible typo in the README

Course material
Please follow the course material to learn the background knowledge and finish code step by step.

Should it be "Course Materials"?

Consider implementing error for KeyError

Just like RegionError does, we can implement the error interface for kvrpcpb.KeyError like this:

type KeyError struct {
	Err *kvrpcpb.KeyError
}

func (ke *KeyError) Error() string {
	return ke.Err.String()
}

So functions can unify the kvrpcpb.KeyError return value into error. For example:

func (p *Prewrite) prewriteMutation(txn *mvcc.MvccTxn, mut *kvrpcpb.Mutation) (*kvrpcpb.KeyError, error)  {}
->
func (p *Prewrite) prewriteMutation(txn *mvcc.MvccTxn, mut *kvrpcpb.Mutation) error {}
func (scan *Scanner) Next() ([]byte, []byte, interface{}) {}
-> 
func (scan *Scanner) Next() ([]byte, []byte, error) {}

panic in NewTestServer

I encountered a bug in the test testClusterSuite.TestConcurrentHandleRegion3C. When NewTestServer is called, it panics. Error stack:

PANIC: cluster_test.go:376: testClusterSuite.TestConcurrentHandleRegion3C

... Panic: invalid page type: 0: 4 (PC=0x43F645)

/usr/lib/go-1.14/src/runtime/panic.go:969
  in gopanic
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/cursor.go:250
  in Cursor.search
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/cursor.go:159
  in Cursor.seek
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/bucket.go:165
  in Bucket.CreateBucket
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/tx.go:108
  in Tx.CreateBucket
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/backend/batch_tx.go:72
  in batchTx.UnsafeCreateBucket
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/kvstore.go:149
  in NewStore
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/watchable_store.go:78
  in newWatchableStore
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/watchable_store.go:73
  in New
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/etcdserver/server.go:543
  in NewServer
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/embed/etcd.go:211
  in StartEtcd
server.go:163
  in Server.startEtcd
server.go:300
  in Server.Run
testutil.go:48
  in NewTestServer
cluster_test.go:379
  in testClusterSuite.TestConcurrentHandleRegion3C
/usr/lib/go-1.14/src/reflect/value.go:321
  in Value.Call
/home/xjw/go/pkg/mod/github.com/pingcap/[email protected]/check.go:850
  in suiteRunner.forkTest.func1
/home/xjw/go/pkg/mod/github.com/pingcap/[email protected]/check.go:739
  in suiteRunner.forkCall.func1
/usr/lib/go-1.14/src/runtime/asm_amd64.s:1373
  in goexit

I do not know whether it is my fault.

shall we expose the ErrKeyNotFound error when we invoke the GetCF method?

I encountered a pitfall in kv/server/server_test.go

Firstly

The test case TestRawGetAfterRawDelete1 at line 169 says we need to expose the ErrKeyNotFound error when we try to get a non-existent key:

	resp, err := server.RawGet(nil, get)
	assert.Nil(t, err)
	assert.True(t, resp.NotFound)

Secondly

The code in TestRawDelete1 says we should return a nil error when we try to get a key that has already been deleted:

	_, err := server.RawDelete(nil, req)
	assert.Nil(t, err)
	val, err := Get(s, cf, []byte{99})
	assert.Equal(t, nil, err)
	assert.Equal(t, []byte(nil), val)

In the end

I am able to pass make project1 when I change this line
from

	assert.Equal(t, nil, err)

to

	assert.Equal(t, badger.ErrKeyNotFound, err)

What I want to know

So, I wonder: is this a bug, or is there a missing part of my code that I should continue to implement?
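
For what it's worth, here is a sketch of how a RawGet handler can satisfy the first test regardless of which convention GetCF follows: treat a missing key as a normal response with NotFound set, rather than as an RPC error. Method and field names follow the kvrpcpb/storage skeleton, but treat the details as assumptions.

func (server *Server) RawGet(_ context.Context, req *kvrpcpb.RawGetRequest) (*kvrpcpb.RawGetResponse, error) {
	reader, err := server.storage.Reader(req.Context)
	if err != nil {
		return nil, err
	}
	defer reader.Close()

	val, err := reader.GetCF(req.Cf, req.Key)
	if err == badger.ErrKeyNotFound || (err == nil && val == nil) {
		// A missing key is not an RPC error: report it via NotFound.
		return &kvrpcpb.RawGetResponse{NotFound: true}, nil
	}
	if err != nil {
		return nil, err
	}
	return &kvrpcpb.RawGetResponse{Value: val}, nil
}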

Cleanup implementation

Clean up the implementation that should be filled in by students and move it to a new branch:

  • lab1
  • lab2
  • lab3
  • lab4

Accessing the global context is racy

For example, storeMeta is accessed in both newRaftWorker (mutated for conf changes) and newStoreWorker (read for heartbeating to PD):

https://github.com/pingcap-incubator/tinykv/blob/fc36c301fbf92d05e30e629453a7c3282399af75/kv/raftstore/raftstore.go#L91

I was working on a solution and noticed that inconsistent region info was sent to the scheduler (the peer and region epoch fields in storeMeta were inconsistent), probably because I was mutating it in place, updating fields one after another, while the heartbeat worker's reads saw one field from before the mutation and one from after.

I avoided this issue in my solution by doing a deep copy:

https://github.com/junlong-gao/tinykv/blob/0fc6c28acffee911a38b2c043809e6977d7d2822/kv/raftstore/peer_msg_handler.go#L79

yet it is still racy.

Perhaps we should add a mutex to the global context so folks working on this will notice the race condition here?
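
To illustrate the suggestion, here is a sketch of the shape such a fix could take (this is not the actual storeMeta type, and proto.Clone from the gogo protobuf package is an assumed dependency): guard the shared metadata with a mutex and hand deep copies to readers such as the heartbeat path.

import (
	"sync"

	"github.com/gogo/protobuf/proto"
	"github.com/pingcap-incubator/tinykv/proto/pkg/metapb"
)

// guardedStoreMeta is an illustration: a mutex-protected region map.
type guardedStoreMeta struct {
	mu      sync.RWMutex
	regions map[uint64]*metapb.Region
}

// setRegion would be called by the raft worker when applying a conf change or split.
func (m *guardedStoreMeta) setRegion(region *metapb.Region) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.regions[region.Id] = region
}

// regionClone would be called by the store worker when building a heartbeat;
// the deep copy guarantees the scheduler never sees a half-applied mutation.
func (m *guardedStoreMeta) regionClone(id uint64) *metapb.Region {
	m.mu.RLock()
	defer m.mu.RUnlock()
	region, ok := m.regions[id]
	if !ok {
		return nil
	}
	return proto.Clone(region).(*metapb.Region)
}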
