
copr-test's Introduction

Coprocessor Tests

Copr-test is a collection of integration tests for the Coprocessor module of TiKV. Coprocessor executes TiDB push-down executors to improve performance.

Push Down Test

./push-down-test

Currently, push-down-test is the only test suite for the Coprocessor module. It exists to make sure that execution results are consistent between TiDB and Coprocessor. It works by executing test cases on a standalone TiDB (with mocktikv) and on a TiDB cluster (with TiKV Coprocessor) and comparing the two execution results.

Currently, TiKV Coprocessor supports two execution models: the traditional non-vectorized model and the newer vectorized model. Requests that cannot be processed by the vectorized model fall back to the traditional model. Push-down-test therefore also makes sure that the two execution models produce the same results, by comparing the following three combinations:

  • TiDB
  • TiDB + TiKV (Non-Vectorized)
  • TiDB + TiKV (Vectorized)
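
The comparison across the three combinations can be pictured with a small Go sketch. This is not the actual push-down-test code: the function name `mismatches`, the use of plain strings as outputs, and the sample values are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// baseline is the name used here for the standalone TiDB run (an assumption).
const baseline = "TiDB"

// mismatches returns the names of the configurations whose output differs
// from the baseline output, sorted for stable reporting.
func mismatches(outputs map[string]string) []string {
	want := outputs[baseline]
	var diff []string
	for name, got := range outputs {
		if name != baseline && got != want {
			diff = append(diff, name)
		}
	}
	sort.Strings(diff)
	return diff
}

func main() {
	// Hypothetical outputs of one statement under the three combinations.
	outputs := map[string]string{
		"TiDB":                         "field1\n1\n",
		"TiDB + TiKV (Non-Vectorized)": "field1\n1\n",
		"TiDB + TiKV (Vectorized)":     "field1\n2\n",
	}
	fmt.Println(mismatches(outputs)) // prints [TiDB + TiKV (Vectorized)]
}
```

Treating the standalone TiDB output as the reference means that a single statement can report a failure in either execution model independently.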

Function Whitelist

The Coprocessor module does not support all TiDB functions (and some are implemented but not fully tested), so TiDB maintains a whitelist of the functions that can be pushed down to TiKV Coprocessor. In push-down-test, however, we use a failpoint to control the whitelist. The whitelist can be found in push-down-test/functions.txt in this repository. This means that you can test whether a specific TiKV Coprocessor function implementation produces results identical to TiDB's by adding the function name to the push-down-test/functions.txt file, instead of modifying TiDB's source code. This is extremely useful when we want to add an implementation on the TiKV side without enabling the push-down on the TiDB side for now.
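
As a rough illustration of how a name-based whitelist can be consulted, the Go sketch below parses whitelist text into a set. It assumes one function name per line, which is a guess about the layout of functions.txt; the function `parseWhitelist` and the names `FuncA` and `FuncB` are hypothetical, and the failpoint wiring inside TiDB is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// parseWhitelist turns whitelist text into a set of function names.
// Assumption: one name per line; blank lines are ignored.
func parseWhitelist(content string) map[string]bool {
	allowed := make(map[string]bool)
	for _, line := range strings.Split(content, "\n") {
		name := strings.TrimSpace(line)
		if name != "" {
			allowed[name] = true
		}
	}
	return allowed
}

func main() {
	// Hypothetical file content; the names are placeholders, not real entries.
	allowed := parseWhitelist("FuncA\n\nFuncB\n")
	fmt.Println(allowed["FuncA"], allowed["FuncC"]) // prints true false
}
```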

Test Cases

We have already added some test cases generated by randgen to push-down-test. Feel free to add new ones. Your test case should be placed in the push-down-test/sql directory, and its file name should end with the .sql suffix. Subdirectories are also supported.
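
The layout rule above (files with the .sql suffix, subdirectories included) can be sketched in Go as a recursive directory walk. This only illustrates the rule and is not how the harness necessarily discovers test cases; `isTestCase` and `collectTestCases` are hypothetical names.

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
	"strings"
)

// isTestCase reports whether path names a file with the .sql suffix.
func isTestCase(path string) bool {
	return strings.HasSuffix(path, ".sql")
}

// collectTestCases walks root recursively and returns every .sql file,
// so test cases placed in subdirectories are picked up too.
func collectTestCases(root string) ([]string, error) {
	var cases []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // for example, root does not exist
		}
		if !d.IsDir() && isTestCase(path) {
			cases = append(cases, path)
		}
		return nil
	})
	return cases, err
}

func main() {
	cases, err := collectTestCases("push-down-test/sql")
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	for _, c := range cases {
		fmt.Println(c)
	}
}
```

Run from the repository root, this lists the test case files in the lexical order that `filepath.WalkDir` visits them.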

Run Tests Locally

Push-down-test runs automatically on our CI platform for TiKV PRs. You can also run it locally to make debugging easier. Before that, make sure that you have set up an environment that can successfully build TiDB, PD and TiKV.

Sample steps:

mkdir my_workspace
cd my_workspace

git clone --depth=1 https://github.com/pingcap/pd.git
git clone --depth=1 https://github.com/pingcap/tidb.git
git clone --depth=1 https://github.com/tikv/tikv.git
git clone https://github.com/tikv/copr-test.git

cd pd
make

cd ../tikv
make build  # This may take a while. Be patient.

cd ../copr-test
export pd_bin=`realpath ../pd/bin/pd-server`
export tikv_bin=`realpath ../tikv/target/debug/tikv-server`
export tidb_src_dir=`realpath ../tidb`
# Run all tests
make push-down-test

# You can also run other targets, such as `push-down-without-vec`,
# which will start a `mysql` client to run `push-down-without-vec` only.
# This makes it more convenient to reproduce a failed SQL statement.
make push-down-without-vec

# If you want to clean the `push-down-test/build` directory and kill all the tidb/pd/tikv processes, run:
make clean

If you want to filter some test cases:

include=1_arith_1.sql make push-down-test
exclude=1_arith_1.sql make push-down-test


copr-test's Issues

expectNoErr PANIC

[2020-04-18T16:43:31.746Z] 2020/04/19 00:43:31 Error 1690: DOUBLE value is out of range in 'pow(0, -1)'
[2020-04-18T16:43:31.747Z] panic: Error 1690: DOUBLE value is out of range in 'pow(0, -1)'
[2020-04-18T16:43:31.747Z] 
[2020-04-18T16:43:31.747Z] 
[2020-04-18T16:43:31.747Z] goroutine 40 [running]:
[2020-04-18T16:43:31.747Z] log.Panicln(0xc015be1f38, 0x1, 0x1)
[2020-04-18T16:43:31.747Z] 	/usr/local/go/src/log/log.go:352 +0xac
[2020-04-18T16:43:31.747Z] main.expectNoErr(...)
[2020-04-18T16:43:31.747Z] 	/home/jenkins/agent/workspace/tidb_ghpr_integration_copr_test/copr-test/push-down-test/src/util.go:111
[2020-04-18T16:43:31.747Z] main.runSingleStatement(0xc002649f6f, 0x122, 0x792, 0xc007a0cfc0, 0xc0145e80c0, 0x1)
[2020-04-18T16:43:31.747Z] 	/home/jenkins/agent/workspace/tidb_ghpr_integration_copr_test/copr-test/push-down-test/src/main.go:170 +0x3b4
[2020-04-18T16:43:31.747Z] main.runStatements(0xc0145e80c0, 0x7ffe3fce030d, 0x37, 0xc0175ee000, 0xbb3, 0xbb3)
[2020-04-18T16:43:31.747Z] 	/home/jenkins/agent/workspace/tidb_ghpr_integration_copr_test/copr-test/push-down-test/src/main.go:151 +0xb2
[2020-04-18T16:43:31.747Z] created by main.runTestCase
[2020-04-18T16:43:31.747Z] 	/home/jenkins/agent/workspace/tidb_ghpr_integration_copr_test/copr-test/push-down-test/src/main.go:144 +0x350
[2020-04-18T16:43:31.747Z] + Test finished
[2020-04-18T16:43:31.747Z]   - /home/jenkins/agent/workspace/tidb_ghpr_integration_copr_test/copr-test/push-down-test/build/push_down_test_bin exit code is 2

Some errors do not occur immediately after Query:

rows, err := db.Query(stmt)
if err != nil {

Instead, they will occur during the Next calls:
for rows.Next() {
	tmp := make([][]byte, len(cols))
	for i := 0; i < len(args); i++ {
		args[i] = &tmp[i]
	}
	err := rows.Scan(args...)
	if err != nil {
		return nil, errors.Trace(err)
	}
	data = append(data, ByteRow{tmp})
}
err = rows.Err()
if err != nil {
	return nil, errors.Trace(err)
}

then PANIC here:
byteRows, err := SqlRowsToByteRows(rows)
expectNoErr(err)
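
One possible direction, sketched below in Go, is to treat an error that surfaces during iteration the same way as an error returned by Query: record it as the statement's output so it can be compared, instead of panicking. This is only a sketch under assumptions, not the project's fix: `rowIterator`, `fakeRows`, `newFakeRows`, `drain` and `outcome` are hypothetical names, and `rowIterator` merely stands in for the part of `*sql.Rows` used above.

```go
package main

import (
	"errors"
	"fmt"
)

// rowIterator is a hypothetical stand-in for the subset of *sql.Rows used here.
type rowIterator interface {
	Next() bool
	Row() string
	Err() error
}

// drain collects all rows and then checks Err, because an execution error
// may only be reported after iteration has started.
func drain(it rowIterator) ([]string, error) {
	var rows []string
	for it.Next() {
		rows = append(rows, it.Row())
	}
	if err := it.Err(); err != nil {
		return nil, err
	}
	return rows, nil
}

// outcome renders either the rows or the error text, so that a late error
// becomes comparable output instead of a panic.
func outcome(it rowIterator) string {
	rows, err := drain(it)
	if err != nil {
		return "Error: " + err.Error()
	}
	return fmt.Sprint(rows)
}

// fakeRows yields its rows and then reports failWith from Err, if set.
type fakeRows struct {
	rows     []string
	pos      int
	failWith error
}

// newFakeRows builds a fake iterator; an empty msg means no late error.
func newFakeRows(rows []string, msg string) *fakeRows {
	f := &fakeRows{rows: rows}
	if msg != "" {
		f.failWith = errors.New(msg)
	}
	return f
}

func (f *fakeRows) Next() bool {
	if f.pos < len(f.rows) {
		f.pos++
		return true
	}
	return false
}

func (f *fakeRows) Row() string { return f.rows[f.pos-1] }
func (f *fakeRows) Err() error  { return f.failWith }

func main() {
	it := newFakeRows([]string{"1"}, "DOUBLE value is out of range")
	fmt.Println(outcome(it)) // prints Error: DOUBLE value is out of range
}
```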

ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1:4007' (61)

When following the instructions to run copr-test locally, I fail to connect to TiDB (WithPushDown).

...
...
+ Launching TiDB for WithPushDown test using config /Users/lijiarui/Works/pingcap/internship/tikv-test/copr-test/push-down-test/build/config/with_push_down/tidb.toml

  - Config content:
port = 4007
store = "tikv"
path = "127.0.0.1:4479"
socket = ""

[status]
report-status = false
  - Starting process...
/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/bin/tidb-server -config /Users/lijiarui/Works/pingcap/internship/tikv-test/copr-test/push-down-test/build/config/with_push_down/tidb.toml -log-file /Users/lijiarui/Works/pingcap/internship/tikv-test/copr-test/push-down-test/build/tidb_with_push_down.log -L warn
  - Sleep 10s to wait for TiDB to start

+ Waiting TiDB start up (NoPushDown)
+--------------------+
| Database           |
+--------------------+
| INFORMATION_SCHEMA |
| METRICS_SCHEMA     |
| PERFORMANCE_SCHEMA |
| mysql              |
| test               |
+--------------------+
  - TiDB startup successfully (NoPushDown)

+ Waiting TiDB start up (WithPushDown)
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1:4007' (61)
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1:4007' (61)
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1:4007' (61)

// pd log

[2022/07/17 20:16:12.722 -04:00] [WARN] [server.go:298] ["exceeded recommended request limit"] [max-request-bytes=157286400] [max-request-size="157 MB"] [recommended-request-bytes=1$
[2022/07/17 20:16:12.913 -04:00] [WARN] [store.go:1317] ["simple token is not cryptographically signed"]
[2022/07/17 20:16:12.992 -04:00] [WARN] [metrics.go:193] ["failed to get file descriptor usage"] [error="cannot get FDUsage on darwin"]
[2022/07/17 20:16:13.785 -04:00] [FATAL] [versioninfo.go:57] ["version string is illegal"] [error="[PD:semver:ErrSemverNewVersion]51ad29c is not in dotted-tri format: 51ad29c is not$

// tidb log

...
...
[2022/07/17 20:17:56.024 -04:00] [WARN] [base_client.go:251] ["[pd] failed to get cluster id"] [url=http://127.0.0.1:4479] [error="[PD:client:ErrClientGetMember]error:rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\" target:127.0.0.1:4479 status:TRANSIENT_FAILURE: error:rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\" target:127.0.0.1:4479 status:TRANSIENT_FAILURE"]
[2022/07/17 20:17:57.029 -04:00] [WARN] [base_client.go:251] ["[pd] failed to get cluster id"] [url=http://127.0.0.1:4479] [error="[PD:client:ErrClientGetMember]error:rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\" target:127.0.0.1:4479 status:TRANSIENT_FAILURE: error:rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\" target:127.0.0.1:4479 status:TRANSIENT_FAILURE"]
[2022/07/17 20:17:58.034 -04:00] [WARN] [base_client.go:251] ["[pd] failed to get cluster id"] [url=http://127.0.0.1:4479] [error="[PD:client:ErrClientGetMember]error:rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\" target:127.0.0.1:4479 status:TRANSIENT_FAILURE: error:rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\" target:127.0.0.1:4479 status:TRANSIENT_FAILURE"]
[2022/07/17 20:17:59.035 -04:00] [WARN] [store.go:82] ["new store with retry failed"] [error="[pd] failed to get cluster id"]
[2022/07/17 20:17:59.035 -04:00] [FATAL] [terror.go:298] ["unexpected error"] [error="[pd] failed to get cluster id"] [stack="github.com/pingcap/tidb/parser/terror.MustNil\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/parser/terror/terror.go:298\nmain.createStoreAndDomain\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/tidb-server/main.go:299\nmain.main\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/tidb-server/main.go:204\nruntime.main\n\t/opt/homebrew/Cellar/go/1.18.2/libexec/src/runtime/proc.go:250"] [stack="github.com/pingcap/tidb/parser/terror.MustNil\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/parser/terror/terror.go:298\nmain.createStoreAndDomain\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/tidb-server/main.go:299\nmain.main\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/tidb-server/main.go:204\nruntime.main\n\t/opt/homebrew/Cellar/go/1.18.2/libexec/src/runtime/proc.go:250"]

I tried to modify server/versioninfo/MustParseVersion to directly return semver.New(featuresDict[Version5_0]). However, I still cannot connect to TiDB, and the logs are as follows:

// pd log

[2022/07/17 20:26:40.544 -04:00] [WARN] [server.go:298] ["exceeded recommended request limit"] [max-request-bytes=157286400] [max-request-size="157 MB"] [recommended-request-bytes=10485760] [recommended-request-size="10 MB"]
[2022/07/17 20:26:40.738 -04:00] [WARN] [store.go:1317] ["simple token is not cryptographically signed"]
[2022/07/17 20:26:40.819 -04:00] [WARN] [metrics.go:193] ["failed to get file descriptor usage"] [error="cannot get FDUsage on darwin"]
[2022/07/17 20:26:45.041 -04:00] [WARN] [dynamic_config_manager.go:164] ["Dynamic config does not exist in etcd"]
[2022/07/17 20:27:03.442 -04:00] [WARN] [grpclog.go:60] ["transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:4479->127.0.0.1:58610: use of closed network connection"]
[2022/07/17 20:27:03.443 -04:00] [WARN] [grpclog.go:60] ["transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:4479->127.0.0.1:58609: use of closed network connection"]
[2022/07/17 20:27:03.443 -04:00] [WARN] [grpclog.go:60] ["transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:4479->127.0.0.1:58625: use of closed network connection"]
[2022/07/17 20:27:03.443 -04:00] [WARN] [grpclog.go:60] ["transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:4479->127.0.0.1:58626: use of closed network connection"]

// tidb log

...
[2022/07/17 20:27:23.480 -04:00] [ERROR] [kv.go:243] ["fail to load safepoint from pd"] [error="context deadline exceeded"]
[2022/07/17 20:27:23.481 -04:00] [ERROR] [client.go:907] ["[pd] update connection contexts failed"] [dc=global] [error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\""]
[2022/07/17 20:27:23.951 -04:00] [ERROR] [base_client.go:144] ["[pd] failed updateMember"] [error="[PD:client:ErrClientGetLeader]get leader from [http://127.0.0.1:4479] error"]
[2022/07/17 20:27:26.819 -04:00] [ERROR] [base_client.go:144] ["[pd] failed updateMember"] [error="[PD:client:ErrClientGetLeader]get leader from [http://127.0.0.1:4479] error"]
[2022/07/17 20:27:29.481 -04:00] [ERROR] [kv.go:243] ["fail to load safepoint from pd"] [error="context deadline exceeded"]
[2022/07/17 20:27:29.482 -04:00] [ERROR] [client.go:907] ["[pd] update connection contexts failed"] [dc=global] [error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\""]
[2022/07/17 20:27:29.482 -04:00] [ERROR] [client.go:792] ["[pd] create tso stream error"] [dc-location=global] [error="[PD:client:ErrClientCreateTSOStream]create TSO stream failed, retry timeout"]
[2022/07/17 20:27:29.482 -04:00] [ERROR] [pd.go:236] ["updateTS error"] [txnScope=global] [error="[PD:client:ErrClientCreateTSOStream]create TSO stream failed, retry timeout"]
[2022/07/17 20:27:29.482 -04:00] [ERROR] [base_client.go:144] ["[pd] failed updateMember"] [error="[PD:client:ErrClientGetLeader]get leader from [http://127.0.0.1:4479] error"]
[2022/07/17 20:27:29.746 -04:00] [ERROR] [base_client.go:144] ["[pd] failed updateMember"] [error="[PD:client:ErrClientGetLeader]get leader from [http://127.0.0.1:4479] error"]
[2022/07/17 20:27:32.737 -04:00] [ERROR] [base_client.go:144] ["[pd] failed updateMember"] [error="[PD:client:ErrClientGetLeader]get leader from [http://127.0.0.1:4479] error"]
[2022/07/17 20:27:32.737 -04:00] [WARN] [backoff.go:158] ["pdRPC backoffer.maxSleep 40000ms is exceeded, errors:\nloadRegion from PD failed, key: \"6D426F6F7473747261FF704B657900000000FB0000000000000073\", err: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\" at 2022-07-17T20:27:23.951276-04:00\nloadRegion from PD failed, key: \"6D426F6F7473747261FF704B657900000000FB0000000000000073\", err: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\" at 2022-07-17T20:27:26.819438-04:00\nloadRegion from PD failed, key: \"6D426F6F7473747261FF704B657900000000FB0000000000000073\", err: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4479: connect: connection refused\" at 2022-07-17T20:27:29.746725-04:00\nlongest sleep type: pdRPC, time: 29039ms"]
[2022/07/17 20:27:32.737 -04:00] [FATAL] [session.go:2951] ["check bootstrapped failed"] [error="[tikv:9001]PD server timeout"] [stack="github.com/pingcap/tidb/session.getStoreBootstrapVersion\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/session/session.go:2951\ngithub.com/pingcap/tidb/session.BootstrapSession\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/session/session.go:2719\nmain.createStoreAndDomain\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/tidb-server/main.go:301\nmain.main\n\t/Users/lijiarui/Works/pingcap/internship/tikv-test/tidb/tidb-server/main.go:204\nruntime.main\n\t/opt/homebrew/Cellar/go/1.18.2/libexec/src/runtime/proc.go:250"]

ScalarFunction TimestampDiff is not supported in batch mode

The SQL

SELECT DATE( `col_char_2_key` ) AS field1 FROM `table0_int_autoinc` WHERE TIMESTAMPDIFF( MICROSECOND, `col_char_2_key`, ( UNIX_TIMESTAMP( NULL ) ) );

in sql/randgen/6_date_1.sql fails when executed manually, but never fails the whole test.

Expected:

[2020-04-16T09:57:55.737Z] 2020/04/16 17:57:55 2020/04/16 17:57:34 Test fail: Outputs are not matching.
[2020-04-16T09:57:55.737Z] Test case: sql/randgen-limit/6_date_1.sql
[2020-04-16T09:57:55.737Z] Statement: #151 -  SELECT DATE( `col_char_2_key` ) AS field1 FROM `table0_int_autoinc` WHERE TIMESTAMPDIFF( MICROSECOND, `col_char_2_key`, ( UNIX_TIMESTAMP( NULL ) ) ) LIMIT 1 /* QNO 154 CON_ID 164 */ ;
[2020-04-16T09:57:55.737Z] NoPushDown Output: 
[2020-04-16T09:57:55.737Z] field1
[2020-04-16T09:57:55.737Z] 
[2020-04-16T09:57:55.737Z] 
[2020-04-16T09:57:55.737Z] WithPushDown Output: 
[2020-04-16T09:57:55.737Z] Error 1105: other error: [components/tidb_query/src/batch/runner.rs:83]: BatchSelectionExecutor: Evaluate error: [components/tidb_query/src/rpn_expr/mod.rs:526]: ScalarFunction TimestampDiff is not supported in batch mode

Provide a better comparison utility to check query results

The current check does not work correctly for ORDER BY + LIMIT queries:
The MySQL specification doesn't guarantee the order of rows with identical values.
ORDER BY alone generates a Sort operator, which executes in TiDB only.
ORDER BY + LIMIT, in contrast, generates two levels of TopN operators, one in Coprocessor and one in TiDB, which might introduce implementation differences. All of these results are correct according to the MySQL specification, but different orders of identical values will fail our copr test.
https://dev.mysql.com/doc/refman/5.6/en/limit-optimization.html

One current workaround is PR #162, which removed the limit operator to pass the test.
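
A minimal Go sketch of one possible comparison utility is given below: it compares two result sets as multisets of rows, ignoring row order. The names `canonical` and `sameRowsIgnoringOrder` and the representation of a row as `[]string` are assumptions for illustration. Note the limitation: with LIMIT and tied sort keys, the two sides may legitimately return different rows, which this sketch would still report as a mismatch.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonical joins each row into one key and sorts the keys, so that result
// sets holding the same rows in a different order become identical.
// Assumption: column values never contain the NUL separator used here.
func canonical(rows [][]string) []string {
	keys := make([]string, len(rows))
	for i, r := range rows {
		keys[i] = strings.Join(r, "\x00")
	}
	sort.Strings(keys)
	return keys
}

// sameRowsIgnoringOrder reports whether a and b contain the same rows with
// the same multiplicities, regardless of row order.
func sameRowsIgnoringOrder(a, b [][]string) bool {
	ka, kb := canonical(a), canonical(b)
	if len(ka) != len(kb) {
		return false
	}
	for i := range ka {
		if ka[i] != kb[i] {
			return false
		}
	}
	return true
}

func main() {
	// Two rows tie on the first column and come back in different orders.
	noPushDown := [][]string{{"1", "x"}, {"1", "y"}}
	withPushDown := [][]string{{"1", "y"}, {"1", "x"}}
	fmt.Println(sameRowsIgnoringOrder(noPushDown, withPushDown)) // prints true
}
```

Sorting the joined keys keeps the sketch short; a stricter utility would compare only the rows that tie on the ORDER BY columns without regard to order and keep the order of everything else significant.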
