gemini issues from scylladb

gemini-launcher silently continues even if compilation fails

This can be very frustrating to say the least.

Invalid clustering range queries generated

We get these errors:
Clustering column "ck1" cannot be restricted (preceding column "ck0" is restricted by a non-EQ relation)

This is a result of queries like:
SELECT * FROM ks1.table1 WHERE pk0 = ? AND ck0 > ? AND ck0 < ? AND ck1 > ? AND ck1 < ? AND ck2 > ? AND ck2 < ?

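We need to be smarter and generate a variety of queries with proper EQ restrictions on the leading clustering columns: a clustering column may only be restricted if every column before it is restricted by an EQ relation. Many combinations can and should be generated.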

We haven't seen these errors before because error handling was missing, which we should also fix.
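
One way to generate only valid combinations is sketched below. This is a minimal illustration, not Gemini's actual generator: the function name clusteringWhere and its parameters are hypothetical. It puts EQ restrictions on every clustering column before a chosen index and a range restriction on the column at that index.

```go
package main

import (
	"fmt"
	"strings"
)

// clusteringWhere (hypothetical helper) builds a WHERE clause in which every
// clustering column before rangeIdx gets an EQ restriction, the column at
// rangeIdx gets a range restriction, and later columns stay unrestricted.
// Passing rangeIdx == len(cks) yields EQ restrictions on all columns.
func clusteringWhere(pk string, cks []string, rangeIdx int) string {
	parts := []string{pk + " = ?"}
	for i, ck := range cks {
		if i < rangeIdx {
			parts = append(parts, ck+" = ?")
		} else if i == rangeIdx {
			parts = append(parts, ck+" > ?", ck+" < ?")
			break
		}
	}
	return strings.Join(parts, " AND ")
}

func main() {
	cks := []string{"ck0", "ck1", "ck2"}
	// Enumerate every combination, from a range on ck0 alone to EQ
	// restrictions on all clustering columns.
	for i := 0; i <= len(cks); i++ {
		fmt.Println("SELECT * FROM ks1.table1 WHERE " + clusteringWhere("pk0", cks, i))
	}
}
```

A generator could pick rangeIdx at random for each query to cover all of these shapes over a test run.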

Gemini version in the result file

This is very convenient and allows for easy inclusion of the version in reports.

Duration limit

Add a command line option to run the tool for a specified duration.
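
A duration limit could look roughly like the sketch below. It assumes the standard library flag package and a hypothetical runFor helper; how Gemini actually wires its options and jobs is not shown here.

```go
package main

import (
	"context"
	"flag"
	"fmt"
	"time"
)

// runFor (hypothetical helper) calls step repeatedly until d has elapsed and
// returns the number of completed steps.
func runFor(d time.Duration, step func()) int {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel()
	ops := 0
	for ctx.Err() == nil {
		step()
		ops++
	}
	return ops
}

func main() {
	// The flag name and default are assumptions made for this sketch.
	duration := flag.Duration("duration", time.Second, "maximum test run duration")
	flag.Parse()
	// The step stands in for one generated mutation or read.
	ops := runFor(*duration, func() { time.Sleep(time.Millisecond) })
	fmt.Printf("completed %d operations in %s\n", ops, *duration)
}
```

Passing the context down to worker goroutines would let a multi-threaded run stop at the same deadline.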

Passing --non-interactive flag causes a panic

Passing the --non-interactive flag to gemini:

$ ./scripts/gemini-launcher --non-interactive

causes the following panic:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x7250a0]

goroutine 150 [running]:
main.runJob.func1(0xc00026ac40, 0xc0002629c0, 0x8416c0, 0xc000271240, 0xc0002f6180, 0x8bb2c97000)
        /home/penberg/go/src/github.com/scylladb/gemini/cmd/gemini/root.go:195 +0x2f0
created by main.runJob
        /home/penberg/go/src/github.com/scylladb/gemini/cmd/gemini/root.go:171 +0x36b

Reported by @aleksbykov.

Partition key type is restricted to int

We restricted the partition key type to int in commit 0f8e0dc because we suspect a bug in how Gemini partitions the keyspace between threads.

Switch to scylla/gocql driver

It is good to use our own software, and more and more users are choosing it over the regular driver.

Complete primitive CQL type support

We currently support the following subset of CQL primitive types:

int
bigint
blob
uuid
text
varchar
timestamp

Let's add support for the rest of the primitive types enumerated here:

http://cassandra.apache.org/doc/latest/cql/types.html#native-types

Add version number to gemini binary

Command line option for schema configuration file

Currently, if you launch gemini in the wrong directory, the tool complains that:

cannot create schema: open schema.json: no such file or directory

Let's add a command line option to specify the schema configuration file.

Generate schema updates

Let's generate schema updates (e.g. ALTER TABLE) as part of the test run to find Scylla bugs in that area.

CQL SELECT JSON support

Add support for CQL SELECT JSON statements:

http://cassandra.apache.org/doc/latest/cql/json.html#select-json

Column count configuration

We currently generate rather few columns, and there are more types than fit in them.
Making the column count configurable would allow more diverse data types to be generated.

Secondary indexes not supported for durations

We need a blacklist or something similar for types on which secondary indexes are not supported.
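
Such a blacklist could be as simple as the sketch below. The names unindexable, column, and indexableColumns are hypothetical, and only the duration entry comes from this report.

```go
package main

import "fmt"

// unindexable lists CQL types that must not be picked for secondary indexes.
// Only "duration" comes from the report above; other entries would be added
// as they are discovered.
var unindexable = map[string]bool{"duration": true}

// column is a simplified stand-in for a generated column definition.
type column struct {
	Name string
	Type string
}

// indexableColumns returns the columns whose type is not blacklisted.
func indexableColumns(cols []column) []column {
	var out []column
	for _, c := range cols {
		if !unindexable[c.Type] {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	cols := []column{{"col0", "int"}, {"col1", "duration"}, {"col2", "text"}}
	fmt.Println(indexableColumns(cols))
}
```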

Result of gemini contains only 0

If gemini is run several times against the same nodes, it returns the following result:
{
    "result": {
        "write_ops": 0,
        "write_errors": 0,
        "read_ops": 0,
        "read_errors": 0,
        "errors": null
    }
}
The command that was run:
./gemini -d --duration 60s -c 20 -p 100 -m mixed -f -v --test-cluster=34.201.251.194 --oracle-cluster=3.93.183.245
The gemini version is 0.9.2.
The debug log, however, shows that a lot of operations were executed:
INSERT INTO ks1.table1 (pk0,ck0,ck1,ck2,col0,col1,col2,col3,col4,col5) VALUES (?,?,?,?,?,?,?,?,?,(?,?)) (values=[861 2010-10-14 831 aAQ1fWYPK map[udt_672245080_3:52 udt_672245080_0:X19J2m9kx udt_672245080_1:true udt_672245080_2:5594391364288805258] 2006-12-15 22:54:25 +0600 +06 0.812 c51db500-53ed-11d4-99b8-b06ebf2a6a60 map[false:14h17m0s true:14h22m0s] TavRBi29 859])
INSERT INTO ks1.table1 (pk0,ck0,ck1,ck2,col0,col1,col2,col3,col4,col5) VALUES (?,?,?,?,?,?,?,?,?,(?,?)) (values=[561 2010-10-14 531 aAQ1fWzfc map[udt_672245080_0:XsJs udt_672245080_1:false udt_672245080_2:8730011856623748374 udt_672245080_3:5831394a326e684678] 2006-12-15 22:54:25 +0600 +06 0.512 c51db500-53ed-11d4-99b9-b06ebf2a6a60 map[false:9h17m0s true:9h22m0s] TavRBi7K 559])
INSERT INTO ks1.table1 (pk0,ck0,ck1,ck2,col0,col1,col2,col3,col4,col5) VALUES (?,?,?,?,?,?,?,?,?,(?,?)) (values=[6 2010-10-14 31 aAQ1fRxcU map[udt_672245080_1:true udt_672245080_2:5594391364288805258 udt_672245080_3:52 udt_672245080_0:X19J2obzb] 2006-12-15 22:54:25 +0600 +06 0.012 c51db500-53ed-11d4-99ba-b06ebf2a6a60 map[true:1h2m0s false:57m0s] TavRBmqU 3])
INSERT INTO ks1.table1 (pk0,ck0,ck1,ck2,col0,col1,col2,col3,col4,col5) VALUES (?,?,?,?,?,?,?,?,?,(?,?)) (values=[1061 2010-10-14 1031 aAQ1fUMZ1 map[udt_672245080_1:true udt_672245080_2:5594391364288805258 udt_672245080_3:52 udt_672245080_0:X19J2kB7o] 2006-12-15 22:54:25 +0600 +06 1.012 c51db500-53ed-11d4-99bb-b06ebf2a6a60 map[false:17h37m0s true:17h42m0s] TavRBnWw 1059])
INSERT INTO ks1.table1 (pk0,ck0,ck1,ck2,col0,col1,col2,col3,col4,col5) VALUES (?,?,?,?,?,?,?,?,?,(?,?)) (values=[961 2010-10-14 931 aAQ1fTj4G map[udt_672245080_0:XsJs udt_672245080_1:false udt_672245080_2:8730011856623748374 udt_672245080_3:5831394a326d465349] 2006-12-15 22:54:25 +0600 +06 0.912 c51db500-53ed-11d4-99bc-b06ebf2a6a60 map[false:15h57m0s true:16h2m0s] TavRBjXB 959])
INSERT INTO ks1.table1 (pk0,ck0,ck1,ck2,col0,col1,col2,col3,col4,col5) VALUES (?,?,?,?,?,?,?,?,?,(?,?)) (values=[1661 2010-10-14 1631 aAQ1fVynJ map[udt_672245080_2:8730011856623748374 udt_672245080_3:5831394a3270337261 udt_672245080_0:XsJs udt_672245080_1:false] 2006-12-15 22:54:25 +0600 +06 1.612 c51db500-53ed-11d4-99bd-b06ebf2a6a60 map[false:27h37m0s true:27h42m0s] TavRBl7O 1659])
INSERT INTO ks1.table1 (pk0,ck0,ck1,ck2,col0,col1,col2,col3,col4,col5) VALUES (?,?,?,?,?,?,?,?,?,(?,?)) (values=[1961 2010-10-14 1931 aAQ1fVIYV map[udt_672245080_1:false udt_672245080_2:8730011856623748374 udt_672245080_3:5831394a326e45684a udt_672245080_0:XsJs] 2006-12-15 22:54:25 +0600 +06 1.912 c51db500-53ed-11d4-99be-b06ebf2a6a60 map[false:32h37m0s true:32h42m0s] TavRBggJ 1959])
SELECT * FROM ks1.table1 WHERE pk0 IN (?) AND col2=? (values=[0.416])
SELECT * FROM ks1.table1 WHERE pk0 IN (?) AND col2=? (values=[1.716])
SELECT * FROM ks1.table1 WHERE pk0 IN (?) AND col2=? (values=[0.716])
SELECT * FROM ks1.table1 WHERE pk0 IN (?) AND col2=? (values=[0.616])
{ "result": { "write_ops": 0, "write_errors": 0, "read_ops": 0, "read_errors": 0, "errors": null } }
Test run completed. Exiting.

JSON Schema file input too complex

With the introduction of the complex types, the schema input file was left behind. It needs to be tidied up and made understandable. The complex types need a better and more descriptive JSON notation.

Support list of hosts for the clusters

This is a convenient thing and can possibly help with bootstrapping issues.
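
A comma-separated host list could be parsed as in the sketch below. The parseHosts helper and the comma-separated format are assumptions for illustration; the addresses are documentation examples.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHosts (hypothetical helper) splits a comma-separated host list,
// trimming whitespace and dropping empty entries.
func parseHosts(s string) []string {
	var hosts []string
	for _, h := range strings.Split(s, ",") {
		if h = strings.TrimSpace(h); h != "" {
			hosts = append(hosts, h)
		}
	}
	return hosts
}

func main() {
	fmt.Println(parseHosts("192.0.2.1, 192.0.2.2,192.0.2.3"))
}
```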

Secondary index support

Add support for generating schema and query with secondary indexes.

Insertion failures because of time.Time marshaling problem

I am seeing the following error:

Failed! Mutation 'INSERT INTO ks1.table1 (pk0,ck0,ck1,ck2,col0) VALUES (?,?,?,?,?)' (values=[ZhtcLiAhrobsZP7Y5f9kFocDbDhPo/9jeHrF059htGLYGB126jCxOYiet 5 1994-03-12 14:31:51 +0200 EET 0cTaXQ5la32tSTixuiyUEknNFRqn5VNL3b5hh+hWEF+n7zvhpCMSu8jkfzz Uwtgq0ax18drFRuRCDPtlE6ApdMRegzcEgDUoplOWbzWgjIBSB7FhO]) caused an error: 'can not marshal time.Time into int [cluster = test, query = 'INSERT INTO ks1.table1 (pk0,ck0,ck1,ck2,col0) VALUES (?,?,?,?,?)']'

git bisect blames the following commit:

13d308e4fafaa967ce0d97992cf682c244a0cc1b is the first bad commit
commit 13d308e4fafaa967ce0d97992cf682c244a0cc1b
Author: Henrik Johansson
Date:   Tue Mar 12 14:07:46 2019 +0100

    schema: Clustering keys can now have the same types as partition keys.

:100644 100644 a6d99a514ad1231420c0ff3e1c19e2b9531a15e8 1285d83f682e5fe9ac325f666a11de776d74299c M      CHANGELOG.md
:100644 100644 9bbefe97444f4404a2ea50f5d6e74f994823d89e d4cdada91f145b4e880f42f15c3d407717be69e5 M      go.mod
:100644 100644 9ca24e213568b35b43472ae8298e84b324ae567e 87967a39f921bdaacc732dd09606e2e09da59467 M      go.sum
:100644 100644 084890688da9a70a822c951092855a9aa81c9c34 5d3084a942089a890bbb10142fe368d464e3f186 M      schema.go

Result output to a file

Make it easier for machines to parse Gemini results by supporting result output to a file.
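
Writing the result as JSON could look like the sketch below. The field names mirror the result JSON shown elsewhere in these issues; the Status and Report type names, the type of the errors field, and the helpers are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Status mirrors the counters in the result JSON. The element type of
// Errors is an assumption; a nil slice encodes as null.
type Status struct {
	WriteOps    int      `json:"write_ops"`
	WriteErrors int      `json:"write_errors"`
	ReadOps     int      `json:"read_ops"`
	ReadErrors  int      `json:"read_errors"`
	Errors      []string `json:"errors"`
}

// Report wraps the status under the "result" key.
type Report struct {
	Result Status `json:"result"`
}

// render encodes the report as compact JSON.
func render(r Report) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

// writeReport writes the report to path, or to stdout when path is empty.
func writeReport(path string, r Report) error {
	out, err := render(r)
	if err != nil {
		return err
	}
	if path == "" {
		_, err = fmt.Println(out)
		return err
	}
	return os.WriteFile(path, []byte(out+"\n"), 0o644)
}

func main() {
	if err := writeReport("", Report{Result: Status{WriteOps: 1, ReadOps: 2}}); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```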

Error upon using schema parameter

If the --schema parameter is used, gemini stops with an error:
./gemini -d --duration 60s -c 20 -p 100 -m mixed -f --test-cluster=34.201.251.194 --oracle-cluster=3.93.183.245 --schema scheme.json
Seed: 1
Maximum duration: 1m0s
Concurrency: 20
Number of partitions per thread: 100
Test cluster: 34.201.251.194
Oracle cluster: 3.93.183.245
Output file:
cannot create schema: json: cannot unmarshal string into Go struct field ColumnDef.Type of type gemini.Type

The schema file is located here: https://s3.amazonaws.com/scylla-gemini/Binaries/schema.json

Support different consistency levels

Gemini currently uses the default QUORUM consistency level. Let's add support for different consistency levels.

Gemini launcher does not fetch new dependencies

When new dependencies are added, gemini-launcher does not seem to fetch them automatically:

[penberg@nero gemini]$ ./scripts/gemini-launcher --duration 10s --drop-schema
session.go:14:2: cannot find package "go.uber.org/multierr" in any of:
	/usr/lib/golang/src/go.uber.org/multierr (from $GOROOT)
	/home/penberg/go/src/go.uber.org/multierr (from $GOPATH)
Compilation failed

Primary key columns can only be of type int

We generate range queries for both partition and clustering keys, which limits the types of the columns to int due to limitations in value generation.

Collections CQL support

Refactor concurrency model

Currently, each job is intended to work on an isolated set of partitions. This brings benefits such as easy coordination and simple program state management. It also comes with problems: for example, range queries that cross partitions don't work. This is an unacceptable limitation in the long run, so we have to find another way to handle this.

refactoring: create query builder

Creating string-based queries is error-prone, and the resulting code is hard to change. With a flexible builder we could make this much safer and easier.
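
To illustrate the idea, a very small builder might look like the sketch below. SelectBuilder and its methods are hypothetical names for this example, not an existing Gemini or driver API.

```go
package main

import (
	"fmt"
	"strings"
)

// SelectBuilder (hypothetical) collects conditions and their bound values so
// that the statement text and the value list cannot drift apart.
type SelectBuilder struct {
	table  string
	conds  []string
	values []interface{}
}

// Select starts a SELECT * statement on the given table.
func Select(table string) *SelectBuilder {
	return &SelectBuilder{table: table}
}

// Where adds one "column op ?" condition together with its bound value.
func (b *SelectBuilder) Where(column, op string, value interface{}) *SelectBuilder {
	b.conds = append(b.conds, column+" "+op+" ?")
	b.values = append(b.values, value)
	return b
}

// Build returns the statement text and the values in placeholder order.
func (b *SelectBuilder) Build() (string, []interface{}) {
	q := "SELECT * FROM " + b.table
	if len(b.conds) > 0 {
		q += " WHERE " + strings.Join(b.conds, " AND ")
	}
	return q, b.values
}

func main() {
	stmt, values := Select("ks1.table1").Where("pk0", "=", 1).Where("ck0", ">", 10).Build()
	fmt.Println(stmt, values)
}
```

Keeping each condition and its value in one call is the main point: a placeholder can no longer be added without its value, or the other way around.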

Support different compaction strategies

Make Gemini generate schema using different compaction strategies.

Make printed out CQL statements executable from cqlsh

When Gemini test fails, the tool prints the CQL statement as follows:

SELECT * FROM ks1.table1 WHERE pk0 = ?' (values=[W]

Let's make the CQL statement executable from cqlsh without cumbersome copy-paste by substituting the question marks as follows:

SELECT * FROM ks1.table1 WHERE pk0 = textasblob('W');

(In this scenario, the type of pk0 is a blob hence the textasblob conversion function.)
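
The substitution could be sketched as follows. The helper names literal and inline are hypothetical, and the sketch renders blobs as hex literals rather than through textasblob, since arbitrary bytes are not always valid text. It covers only strings, byte slices, and values whose default Go formatting is already valid CQL.

```go
package main

import (
	"fmt"
	"strings"
)

// literal (hypothetical helper) renders v as a CQL literal. Strings are
// single-quoted with embedded quotes doubled; byte slices become 0x hex
// blobs; everything else uses Go's default formatting.
func literal(v interface{}) string {
	switch x := v.(type) {
	case string:
		return "'" + strings.ReplaceAll(x, "'", "''") + "'"
	case []byte:
		return fmt.Sprintf("0x%x", x)
	default:
		return fmt.Sprint(x)
	}
}

// inline replaces each ? placeholder in stmt with the matching literal and
// appends a semicolon. It does not handle question marks that appear inside
// string literals in stmt itself.
func inline(stmt string, values ...interface{}) string {
	var sb strings.Builder
	i := 0
	for _, r := range stmt {
		if r == '?' && i < len(values) {
			sb.WriteString(literal(values[i]))
			i++
			continue
		}
		sb.WriteRune(r)
	}
	return sb.String() + ";"
}

func main() {
	fmt.Println(inline("SELECT * FROM ks1.table1 WHERE pk0 = ?", []byte("W")))
}
```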

--concurrency=1
--max-tests=10000

The result is:

thread 0: write ops: 1 | read ops: 0 | write errors: 0 | read errors: 0
thread 0: write ops: 529 | read ops: 0 | write errors: 0 | read errors: 0
thread 0: write ops: 1017 | read ops: 0 | write errors: 0 | read errors: 0
thread 0: write ops: 1515 | read ops: 0 | write errors: 0 | read errors: 0
thread 0: write ops: 2010 | read ops: 0 | write errors: 0 | read errors: 0
thread 0: write ops: 2498 | read ops: 0 | write errors: 0 | read errors: 0
thread 0: write ops: 2998 | read ops: 0 | write errors: 0 | read errors: 0
thread 0: write ops: 3490 | read ops: 0 | write errors: 0 | read errors: 0
thread 0: write ops: 3978 | read ops: 0 | write errors: 0 | read errors: 0
thread 0: write ops: 4486 | read ops: 0 | write errors: 0 | read errors: 0
Results:
	write ops:    4988
	read ops:     0
	write errors: 0
	read errors:  0

This seems way too short; essentially half the requested runtime is cut.
