tencent / tendbcluster-tspider

TSpider is the proxy node of TenDB Cluster.

License: GNU General Public License v2.0

Shell 0.86% CMake 0.82% C 42.94% C++ 48.47% Perl 2.30% Roff 0.85% Objective-C 0.56% Batchfile 0.03% Pascal 0.35% PHP 0.81% HTML 0.89% Assembly 0.06% NASL 0.91% SourcePawn 0.01% Pawn 0.01% Raku 0.01% VBA 0.01% POV-Ray SDL 0.01% M4 0.09% Makefile 0.03%

tendbcluster-tspider's People

Contributors

bjornmu, bkandasa, blaudden, buggynours, cvicentiu, dr-m, elenst, esmet, fizzfaldt, gkodinov, grooverdan, gurusami, igorbabaev, jhauglid, kevgs, knielsen, leifwalsh, marcalff, midenok, montywi, mydanny, pastcompute, philip-galera, prohaska, sanja-byelkin, spetrunia, vaintroub, vasild, vuvova, willhan123


tendbcluster-tspider's Issues

New variable spider_direct_insert_ignore

When we use INSERT IGNORE in Spider:

  1. If spider_direct_dup_insert is 1, Spider sends the INSERT along with the IGNORE to the remote backends, so if there is a duplicate, Spider is not notified. This is usually problematic when dealing with auto-increment columns.
  2. If spider_direct_dup_insert is 0, Spider does not send INSERT IGNORE to the remote backends and handles duplicates at the Server layer.

If we view Spider and the remote backends together as a single MySQL instance, then the value 0 is more appropriate.

However, this variable affects not only the behavior of INSERT IGNORE but also REPLACE and INSERT ... ON DUPLICATE KEY UPDATE, which is a little messy.

In version 3.5.3, we added a new variable named spider_direct_insert_ignore with a default value of 0. If the value is 0, Spider sends the INSERT without the IGNORE to the remote backends and handles duplicates at the Server layer. If the value is 1, Spider sends the INSERT together with the IGNORE to the backends.
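
For illustration, a minimal usage sketch (the table and values are hypothetical; per the 3.5.3 change log below, the variable is settable at session scope):

SET SESSION spider_direct_insert_ignore = 0;  -- default: TSpider strips the IGNORE, handles duplicates itself and returns a warning
INSERT IGNORE INTO t1 VALUES (1, 'a');

SET SESSION spider_direct_insert_ignore = 1;  -- push the IGNORE down to the remote backends
INSERT IGNORE INTO t1 VALUES (1, 'a');        -- the backend swallows the duplicate; TSpider returns no warning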

JSON Boolean Bug on Spider

If the remote backends are MySQL rather than MariaDB, a SELECT that compares against the JSON boolean value true returns an empty set:

INSERT INTO tbl VALUES (1, json_array(1,'abc',NULL,true));
SELECT * FROM tbl WHERE (j = json_array(1,'abc',NULL,true));

We added a new global variable print_bool_as_literal.
If it is ON:

SET GLOBAL print_bool_as_literal = ON;
SELECT * FROM t1 WHERE (j = json_array(1,'abc',NULL,true));
SELECT * FROM t1 WHERE (j = json_array(1,'abc',NULL,false));
-- TSpider will send to backends:
select `i`,`j` from `auto_test_remote`.`t1` where (`j` = /* <cache> */ (json_array(1,'abc',NULL,true)));
select `i`,`j` from `auto_test_remote`.`t1` where (`j` = /* <cache> */ (json_array(1,'abc',NULL,false)));

Otherwise:

SET GLOBAL print_bool_as_literal = OFF;
SELECT * FROM t1 WHERE (j = json_array(1,'abc',NULL,true));
SELECT * FROM t1 WHERE (j = json_array(1,'abc',NULL,false));
-- TSpider will send to backends:
select `i`,`j` from `auto_test_remote`.`t1` where (`j` = /* <cache> */ (json_array(1,'abc',NULL,1)));
select `i`,`j` from `auto_test_remote`.`t1` where (`j` = /* <cache> */ (json_array(1,'abc',NULL,0)));

TSpider quick multi-range read (mrr) BUG

Summary

The multi-range read execution plan is used in both rr_quick and join optimization.
In TSpider, the multi-range read plan is problematic; see the bug described below.
This is also reproducible in MariaDB Spider.

Bug

On backends,

CREATE TABLE `tbl` (
  `id` int(11) NOT NULL,
  `flag` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
)ENGINE=INNODB;

On TSpider,

CREATE TABLE `tbl` (
  `id` int(11) NOT NULL,
  `flag` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=SPIDER DEFAULT CHARSET=utf8
PARTITION BY LIST (id % 4)
(PARTITION `pt0` VALUES IN (0) COMMENT = 'database "bugtest", table "tbl", server "bk0"' ENGINE = SPIDER,
 PARTITION `pt1` VALUES IN (1) COMMENT = 'database "bugtest", table "tbl", server "bk1"' ENGINE = SPIDER,
 PARTITION `pt2` VALUES IN (2) COMMENT = 'database "bugtest", table "tbl", server "bk2"' ENGINE = SPIDER,
 PARTITION `pt3` VALUES IN (3) COMMENT = 'database "bugtest", table "tbl", server "bk3"' ENGINE = SPIDER);

CREATE TABLE `m` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `tbl` VALUES (101624109, 1);
INSERT INTO `tbl` VALUES (91272871, 1);
INSERT INTO `tbl` VALUES (94933594, 1);
INSERT INTO `tbl` VALUES (98646655, 1);
INSERT INTO `tbl` VALUES (13914947, 0);
INSERT INTO `tbl` VALUES (45051390, 0);
INSERT INTO `tbl` VALUES (30864034, 1);
INSERT INTO `tbl` VALUES (33671239, 0);
INSERT INTO `tbl` VALUES (39109063, 1);
INSERT INTO `tbl` VALUES (91137966, 1);
INSERT INTO `tbl` VALUES (95897689, 0);
INSERT INTO `tbl` VALUES (571307512, 1);
INSERT INTO `tbl` VALUES (35706155, 1);
INSERT INTO `tbl` VALUES (34044708, 0);

set GLOBAL spider_ignore_single_select_index = OFF;
insert into `m`(id) select `id` from `tbl` where `id` not in (94933594) and `flag` = 1;

Solution

Our solution is simply to prohibit the use of MRR. We also cap join_cache_level at 4, so the BKA join algorithm does not take effect in TSpider.
In fact, many optimizations in the Server layer provide little benefit for Spider performance but bring too much uncertainty to Spider. Therefore, in general, we avoid using indices; in other words, we use a full-table scan for most single queries (a single query is one without a join).
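
A rough illustration of the cap described above (the statement is only illustrative; the point is that levels above 4, which correspond to the BKA variants, no longer apply to Spider tables):

SET GLOBAL join_cache_level = 8;  -- on TSpider this behaves as if capped at 4, so BKA is never chosen for Spider tables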

Todo

  1. When spider_ignore_single_update_index=ON and sql_safe_updates=ON, we cannot perform UPDATE and DELETE statements whose WHERE condition relies on an index.

Wrong number of rows changed when using update join

When using UPDATE ... JOIN, if the value before the update is the same as the value after the update, the number of changed rows should be 0. However, TSpider returns the number of changed rows equal to the number of matched rows.

example:

MariaDB [(none)]> select actor_rid, level from aaa.db_actor where actor_rid in 
(9223654611343214324,9223654611343244360,9223654611343214270);
+---------------------+-------+
| actor_rid           | level |
+---------------------+-------+
| 9223654611343214270 |     1 |
| 9223654611343244360 |    74 |
| 9223654611343214324 |     3 |
+---------------------+-------+
3 rows in set (0.007 sec)

MariaDB [(none)]> select actor_rid, level from test.db_actor_test where actor_rid in 
(9223654611343214324,9223654611343244360,9223654611343214270);
+---------------------+-------+
| actor_rid           | level |
+---------------------+-------+
| 9223654611343214324 |     2 |
| 9223654611343244360 |    74 |
+---------------------+-------+
2 rows in set (0.000 sec)

Execute the following SQL on TSpider.

MariaDB [(none)]> update aaa.db_actor t1  inner join test.db_actor_test t2 on t1.actor_rid = t2.actor_rid set     
t1.level=t2.level where t1.actor_rid  in (9223654611343214324,9223654611343244360,9223654611343214270);
Query OK, 2 rows affected (0.006 sec)
Rows matched: 2  Changed: 2  Warnings: 0

MariaDB [(none)]> select actor_rid, level from aaa.db_actor where actor_rid in 
(9223654611343214324,9223654611343244360,9223654611343214270);
+---------------------+-------+
| actor_rid           | level |
+---------------------+-------+
| 9223654611343214270 |     1 |
| 9223654611343244360 |    74 |
| 9223654611343214324 |     2 |
+---------------------+-------+
3 rows in set (0.002 sec)

Only one row actually changed, but TSpider returns Changed: 2.

Parallel SELECT with LIMIT

Added a global variable spider_parallel_limit with the default value FALSE. When spider_parallel_limit is true, Spider dispatches "SELECT ... LIMIT" queries to the shards in parallel.
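
A minimal usage sketch (the table name is illustrative):

SET GLOBAL spider_parallel_limit = ON;
SELECT * FROM tbl WHERE flag = 1 LIMIT 10;  -- the per-shard "select ... limit" statements are now dispatched in parallel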

TSpider AUTO-INCREMENT fixes

BUG

  1. SHOW TABLE STATUS and SHOW CREATE TABLE can affect our auto-increment value and result in duplicates.
  2. If a duplicate key is found, in some scenarios even more duplicates will be generated.

We fixed the problem by resolving some multi-thread conflicts and by preventing the information_schema.tables auto_increment value from affecting our auto-increment value. In other words, we no longer maintain that value.
As of 3.6.4, users should not rely on the auto_increment value in information_schema.tables. (We think maintaining this value in a distributed system is a waste of effort and of little use.)
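
For illustration, the kind of interleaving that previously caused trouble (names are hypothetical); since 3.6.4 these statements no longer disturb the auto-increment value:

SHOW TABLE STATUS LIKE 't1';
SHOW CREATE TABLE t1;
SELECT auto_increment FROM information_schema.tables WHERE table_name = 't1';  -- no longer maintained by TSpider
INSERT INTO t1 (data) VALUES ('x');  -- before the fix, concurrent inserts here could be assigned duplicate auto-increment values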

[Documentation] TSpider 3.5.3 Change Log

English
中文

Change Log


Please refer to https://github.com/Tencent/TenDBCluster-TSpider/releases/tag/tspider-3.5.3

Change Log

  1. Added the session variable spider_direct_insert_ignore. If it is 0, for insert ignore into ... statements TSpider does not push the ignore down to the backends; in this case a unique-key/primary-key duplicate returns a warning. If it is 1, TSpider lets the backends handle the ignore, and a unique-key/primary-key duplicate returns no warning. The default value is 0.
  2. Fixed a case where TSpider returned duplicate results when executing multi-range read. For stability, TSpider no longer uses the BKA join algorithm, i.e. setting join_cache_level above 4 has no effect for TSpider. (This resolves the earlier problem where INSERT INTO m(id) SELECT id FROM tbl WHERE id NOT IN (94933594) AND flag = 1; produced a duplicate key error.)
  3. Fixed a crash when TSpider executes create server ... statements.
  4. Fixed a minor issue that could prevent TSpider from compiling with VS2019.

TSpider Optimized Connection Pool

TL;DR

We re-implemented the Spider connection pool using a read-write-lock hash map with multiple mutex stacks.

In our tests, it reduces p99 latency and increases overall QPS. The performance is significantly better than the original design when there are far more query threads than the connection pool capacity.

Introduction

The Spider SE creates and maintains a connection pool that caches connection objects (pointers) to multiple remote backends.

Spider can send multiple SQL statements to one backend simultaneously, so there may be numerous connection objects to the same backend in the connection pool. These connection objects are interchangeable as long as they point to the same backend.

When receiving a request with a specified backend target bk, a Spider thread searches the connection pool for a connection object to bk. If one exists, Spider uses the object to connect to the backend and send SQL to it. After the SQL execution is complete, Spider puts the connection object back into the connection pool.

Otherwise, Spider either creates a new connection or waits for another thread to put one back. The variable spider_max_connection determines which (create or wait), and spider_conn_wait_timeout specifies the maximum waiting time.

Periodically, Spider recycles connection objects that have not been used for a certain interval.

The Original Connection Pool

Spider used a hash map to represent the connection pool, in which the hash key is remote_name#version and the hash value is a connection object pointer.
For example, given the following configuration:

mysql> select * from mysql.servers;
+-------------+-----------+----+----------+----------+-------+--------+---------+-------+
| Server_name | Host      | Db | Username | Password | Port  | Socket | Wrapper | Owner |
+-------------+-----------+----+----------+----------+-------+--------+---------+-------+
| SPT0        | 127.0.0.1 |    | mysql    | mysql    | 20000 |        | mysql   |       |
| SPT1        | 127.0.0.1 |    | mysql    | mysql    | 20001 |        | mysql   |       |
| SPT2        | 127.0.0.1 |    | mysql    | mysql    | 20002 |        | mysql   |       |
| SPT3        | 127.0.0.1 |    | mysql    | mysql    | 20003 |        | mysql   |       |
+-------------+-----------+----+----------+----------+-------+--------+---------+-------+
4 rows in set (0.00 sec)

The name of the remote server 127.0.0.1:20000 is SPT0. By default the version is 0, so the hash key is 'SPT0#0'. If 127.0.0.1:20000 goes down, we perform an active/standby failover: the standby server inherits the name SPT0 but the version is bumped to 1, so the new hash key is 'SPT0#1'.

To get a connection pointer from the connection pool, Spider searches the hash map for the server name, then deletes and returns the item if one exists. Spider puts a connection pointer back by constructing a new hash item from the ('server_name#version', pointer) pair and inserting it into the hash map.

Therefore, the hash map is a multi-map, meaning one hash key can correspond to multiple values.

Spider uses a mutex lock to guarantee thread safety whenever it accesses the hash map, including search, insertion, and deletion. This is not efficient, since the whole hash structure is locked on every access. It is especially slow when recycling connections, because we have to iterate over the entire hash map and delete the connection objects that exceed the idle threshold.

Optimization

We propose a new way to represent the connection pool that essentially splits one mutex lock into multiple mutex locks to reduce locking time. In C++ pseudocode:

typedef std::string key_type; /* backend_name#version */
typedef void * conn_ptr;      /* connection pointer */

typedef struct {
  std::stack<conn_ptr> stk;
  pthread_mutex_t mutex;
} mutex_stack;  /* a stack with a mutex lock */

typedef struct {
  pthread_rwlock_t rwlock;
  std::unordered_map<key_type, mutex_stack> hashmap;
} spider_connection_pool; /* a hashmap with a rwlock */

The key of the hash map remains unchanged (remote_name#version), whereas the hash value is now a stack that stores connection pointers. In our design, the hash map is not a multi-map but a unique hash map, meaning one hash key corresponds to exactly one value.
The hash map is guarded by a read-write lock, and each stack is guarded by a mutex lock.
For instance, if we have four remote backends, the connection pool should contain four items; each remote backend corresponds to one mutex stack.

The stack stores connection pointers. To put a connection pointer back, we push it onto the corresponding stack; to get a connection pointer, we pop it from the corresponding stack. The push and pop operations are wrapped in mutex locking and unlocking.

Finding the 'corresponding' stack works as follows: first acquire a read lock, then search the hash map by key, then release the read lock. If the key does not exist in the hash map, we initialize a new mutex stack, construct a hash item <key, mutex_stack>, acquire a write lock, and try to insert it into the hash map. Note that several threads could try to insert hash items with the same key; since the hash map is unique, the insertion can fail, in which case we re-search the hash map.

We define two essential functions: put_conn (put a connection pointer back into the pool) and get_conn (get a connection pointer for a given key).
The write lock is acquired only in put_conn. If get_conn cannot find a corresponding hash item, it simply returns "not found", and the caller creates a new connection instead of taking one from the connection pool.

Unless a remote backend fails, we only need to acquire the write lock n times (assuming we have n remote backends). Therefore, at run time nearly 100 percent of the operations on the hash map are reads, which are effectively non-blocking. Compared with the original design, we split one big mutex lock into n mutex locks for the n stacks, greatly reducing contention.

Performance

We test the performance of point selection on an architecture of one Spider node with 16 remote backends. The Spider machine has a 12-core Intel Xeon Cascade Lake 8255C (2.5 GHz) with 15 GB of RAM. Other physical resources are fast and large enough that they cannot become the bottleneck.
The target table sbtest1 holds 10 GB of data and has the following structure.

CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT 0,
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=SPIDER DEFAULT CHARSET=utf8

We perform point selections using Sysbench:

SELECT * FROM `sbtest1` WHERE id = ?;

Test 1

We use the perf profiling tool to trace the CPU time spent in the concurrency-related functions of the connection pool. The results are as follows.

         pthread_mutex_lock  pthread_getspecific  pthread_rwlock_lock  Total
Before   0.355996%           0.060372%            N/A                  0.416368%
After    0.138516%           0.07392%             0.00003%             0.212466%

If the remote backends do not fail, all rwlock operations are reads, so the rwlock is almost non-blocking. Although we spend more time in pthread_getspecific (because we have more mutex locks than before), the considerable reduction in pthread_mutex_lock time brings the total CPU share down to 51.03% of the original, which is a significant improvement.

Test 2

We keep the number of Sysbench client threads fixed at 300 and run point-selection stress tests with different values of spider_max_connection.
spider_max_connection specifies the maximum number of Spider connections to a single remote backend, i.e. the per-backend capacity of the connection pool.
QPS results:

spider_max_connection  Before     After      After / Before
100                    124912.36  125323.45  100.32%
50                     124804.67  125407.97  100.48%
20                     124437.48  125411.6   100.78%
10                     98681.96   117253.38  118.81%
5                      83200.37   98597.90   118.50%

It can be seen from the above test that when the number of client threads exceeds the total capacity of the connection pool (16 remotes × 10 = 160 < 300 client threads), the performance improvement of the optimized design is significant.

Test 3

We keep spider_max_connection fixed at 100, run stress tests with 50, 100, 200, and 300 Sysbench client threads, and use the Sysbench summary to record QPS and p99 latency.
p99 latency results:

             Before     After      After / Before
50 threads   1.04       1.04       100%
100 threads  1.96       1.96       100%
200 threads  8.13       7.84       96.43%
300 threads  124644.19  125723.51  89.76%

It can be seen from the results that the greater the number of threads, the more pronounced the reduction in p99 latency, indicating that the overall query time is more stable.

QPS tests:

             Before     After      After / Before
50 threads   57168.74   57670.93   100.88%
100 threads  99769.4    99937.68   100.17%
200 threads  122404.09  123153.73  100.61%
300 threads  124644.19  125723.51  100.87%

The overall QPS is slightly higher than with the original design when the connection pool capacity is large enough.

[Documentation] TSpider 3.6.5 Change Log

English
中文

Change Log

Bugfix

  1. Fixed a bug where unsigned values in PREPARE/EXECUTE were parsed as signed integers, causing overflow.

Change Log

Bugfixes

  1. Fixed a bug where, when TSpider executes statements via PREPARE + EXECUTE, an unsigned value in the EXECUTE is parsed as a signed value, causing the unsigned number to overflow into a negative value.

Release date: May. 25, 2021

JSON_EXTRACT Bug on Spider

JSON_EXTRACT() bug

mysql> CREATE TABLE t1 (i INT PRIMARY KEY, jdoc JSON); # create table 
Query OK, 0 rows affected (0.11 sec)

mysql> INSERT INTO t1 VALUES(1, '{"Name":"Zhangsan", "Age":18}'); # insert
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO t1 VALUES(2, '{"Name":"Lisi", "Age":35}'); # insert
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM t1 WHERE json_extract(jdoc, '$.Age')=35;
+---+-----------------------------+
| i | jdoc                        |
+---+-----------------------------+
| 2 | {"Age": 35, "Name": "Lisi"} |
+---+-----------------------------+
1 row in set (0.00 sec)

On Spider, the same SELECT generates an error.

mysqladmin shutdown fails with error

Description

Under some circumstances, a crash occurs when using ./mysqladmin shutdown to stop the TSpider server.

This seems to be a long-standing MariaDB problem that has never been completely solved.

A temporary workaround that significantly reduces the chance of a crash was added in tspider-3.5.2.

GDB

We noticed that the crash was related to the following two functions.

void mysql_audit_release(THD *thd)
static void *kill_server(void *sig_ptr)

kill_server is a key step of the main thread; mysql_audit_release is a step in which a child thread closes its THD.

The relevant part of mysql_audit_release:

void mysql_audit_release(THD *thd)
{
  ...
  for (; plugins < plugins_last; plugins++)
  {
    st_mysql_audit *data= plugin_data(*plugins, struct st_mysql_audit *);  ==> break point 1
	
    /* Check to see if the plugin has a release method */
    if (!(data->release_thd))  ==> break point 2
      continue;

    /* Tell the plugin to release its resources */
    data->release_thd(thd);
  }
  ...
}

During execution, two threads are involved in the mysql_audit_release function. According to the order in which they hit the breakpoints, the two threads are named thread_A and thread_B, respectively. We call the main thread, which runs kill_server, M.
We tried all execution orders of these threads and found that only the B -> M -> A order leads to the bug.
More specifically:

  1. Freeze thread A when it reaches breakpoint 1 or breakpoint 2.
  2. Freeze the main thread.
  3. Let thread B finish.
  4. Thaw thread M.
  5. When thread M finishes unireg_end(), freeze M, then thaw A.
  6. Resume thread A: the content of data has been changed, and dereferencing it accesses an illegal memory area.

Analysis

M calls kill_server when executing the shutdown. In the meantime, A kills itself by calling THD::free_connection and frees some plugin resources; at this point the status of the plugin switches to DYING. However, thread M forcibly shuts down any plugin whose status is not Uninitialized, Freed, or Disabled. In other words, thread A's plugin is shut down forcibly by thread M.

If the child thread runs after the main thread has done this, it accesses resources that have already been freed, and a core dump occurs.

The stack of thread M:

finalize_audit_plugin(...)
plugin_deinitialize(...)
plugin_shutdown(...)
clean_up(...)
unireg_end(...)
kill_server(...)

Solution

Our solution is to sleep for 2 seconds when thread_count > 0, waiting for the child threads to release the plugin. However, thread_count is not accurate, so this only greatly reduces the chance of failure rather than completely solving the problem.

Two JSON_OBJECT with the same keys but in different orders is not equal

According to RFC 7159, two JSON objects with the same keys but in different orders should be considered equal.
However, if the remote backends are MySQL rather than MariaDB:

mysql> INSERT INTO tbl VALUES(1, JSON_OBJECT('id', 87, 'name', 'carrot', 'flag', true));
Query OK, 1 row affected (0.22 sec)

mysql> SELECT * FROM tbl;
+----+--------------------------------------------+
| id | j                                          |
+----+--------------------------------------------+
|  1 | {"id": 87, "flag": true, "name": "carrot"} |
+----+--------------------------------------------+
1 row in set (0.01 sec)

mysql> SELECT * FROM tbl WHERE j = JSON_OBJECT('id', 87, 'name', 'carrot', 'flag', true);
Empty set (0.03 sec)

Fix: b3084cf

A new variable to enable TSpider ignore some log output

Users can set spider_log_ignore_err_numbers to a list of error numbers.
When Spider prints errors, it looks up whether the current error number
is in the list; if so, the error is not written to the log.
The default value of this variable is "" (empty).
For example:
set global spider_log_ignore_err_numbers = "1062,12701,1067";
After this, Spider will ignore errors 1062, 12701, and 1067 when writing its log.

The errors that can currently be ignored, and their bitmap positions, are:

static const ulonglong SPD_ERR_DUPLICATE       = 1U << 0; // 1062
static const ulonglong SPD_ERR_GONE_AWAWY      = 1U << 1; // 12701
static const ulonglong SPD_ERR_TOO_MANY_CONN   = 1U << 2; // 12723
static const ulonglong SPD_ERR_NO_REMOTE_EXIST = 1U << 3; // 1477
static const ulonglong SPD_ERR_CONN_REMOTE     = 1U << 4; // 1429
static const ulonglong SPD_ERR_INVALID_DEFAULT = 1U << 5; // 1067
static const ulonglong SPD_ERR_INVALID_DATE    = 1U << 6; // 1292
static const ulonglong SPD_ERR_TRUNCATE_VALUE  = 1U << 7; // 1366
static const ulonglong SPD_ERR_BAD_TYPE_VALUE  = 1U << 8; // 1411
static const ulonglong SPD_ERR_NET_TIMEOUT     = 1U << 9; // 1159
static const ulonglong SPD_ERR_COM_OUT_OF_SYNC = 1U << 10;// 2014

Update with Float Fields Fails on TSpider

How to Repeat

Reproducible in MariaDB.
If we update a shard key using the shard key as a condition, re-sharding may be involved. TSpider splits the update into the following three steps:
a. A selection (if there is a where condition): select ... from ... where id = ...
b. An insertion of the new record: insert high_priority into ... values (new_val, ...)
c. A deletion of the outdated previous record: delete from ... where id = old_val and a = ... and b = ...

In step c, TSpider adds all the fields to the where condition to locate the record before deletion.
If there happens to be a float field f in the table, the final delete becomes delete from ... where id = old_val and f = f_old_val.
However, because MySQL FLOAT/DOUBLE is implemented according to IEEE 754, equality comparison on these columns is unreliable, so the delete can fail to match the row.

CREATE TABLE t (
    id INT PRIMARY KEY, -- primary key, also shard key
    a  INT,
    b  BLOB,
    f  FLOAT
)ENGINE=SPIDER PARTITION BY LIST (id MOD 4)
INSERT INTO t VALUES(61,76,'b1',4.86947), (60,33,'b2',0), (24,75,'b3',4.61755), (63,76,'b4',4.86947);

--bug 
UPDATE t SET id = 25 WHERE id = 24;
-- tspider will send the following statements to backends
select `id`,`a`,`b`,`f` from `update_test`.`t` where (`id` = 24) for update
insert high_priority into `update_test`.`t`(`id`,`a`,`b`,`f`)values(25,75,'b3',4.61755)
delete from `update_test`.`t` where `id` = 24 and `a` = 75 and `b` = 'b3' and `f` = 4.61755 limit 1

Solution

  1. If there is a primary key or a unique key, then when doing delete from update_test.t where ... we only append the primary key or unique key columns to the where condition. (FLOAT/DOUBLE cannot be the primary key / unique key.)
  2. If no primary key or unique key is found and float or double fields are involved, we use like instead of '=' for comparisons on these fields:
select `id`,`a`,`b`,`f` from `update_test`.`t` where (`id` = 24) for update
insert high_priority into `update_test`.`t`(`id`,`a`,`b`,`f`)values(25,75,'b3',4.61755)
delete from `update_test`.`t` where `id` = 24 and `a` = 75 and `b` = 'b3' and `f` like 4.61755 limit 1
  3. New GLOBAL variable spider_update_with_primary_key_first with default value ON.
    If it is ON, TSpider will not send an update SQL with a float field as a condition to the backends (when the update uses the shard key in the where condition).
    If it is OFF, TSpider might send an update SQL with float fields as a condition, which may trigger the bug.

e.g.

CREATE TABLE t1(id INT PRIMARY KEY, a INT, b BLOB, c CHAR(20), f FLOAT, v VARCHAR(200))ENGINE=SPIDER...
INSERT INTO t1 VALUES(61,76,'b','c',4.86947,'v'), (60,33,'b','c',0,'v'), (24,75,'b','c',4.61755,'v');
update t1 set id = 25 where id = 24;   -- update primary key, requires change of partition
--BUG, TSpider will dispatch `update ... set ... where ... and t1.f = 4.61755` to backend
--since float cannot be compared with equal function, it may return no row matched.
--TO AVOID THIS BUG, it is recommended to `SET GLOBAL spider_update_with_primary_key_first = ON;`

Distinct with condition and limitation BUG

This is not reproducible in MariaDB, or I haven't tested all situations.

This has been fixed in version 3.5.2.

CREATE TABLE t1 (
    uid INT PRIMARY KEY,
    id INT,
    name CHAR(20)
)ENGINE=SPIDER COMMENT='wrapper "mysql", table "t1"'
PARTITION BY LIST (uid%2)
(
    PARTITION pt0 VALUES IN (0) COMMENT = 'srv "bk0"',
    PARTITION pt1 VALUES IN (1) COMMENT = 'srv "bk1"'
);

INSERT INTO t1 VALUES (1, 1, 'a'), (3, 3, 'c'), (5, 3, 'd'), (7, 3, 'f'), (9, 3, 'g'), (11, 3, 'h'), (13, 3, 'i'), (15, 4, 'd');
INSERT INTO t1 VALUES (2, 2, 'b'), (4, 2, 'c');

If we execute select distinct id from t1 where name > 'a' limit 3 on TSpider (schema tspider1),

Spider will send

select distinct `uid`,`id`,`name` from `tspider1`.`t1` where (`name` > 'a') limit 3 

to both remotes, which gets id 3 from the first remote and id 2 from the second remote, so id 4 on the first remote is missed.

MariaDB Spider instead splits the read of the result across the two remotes, like:

# for remote 1
select distinct `id`,`name` from `test1_0`.`t1` where (`name` > 'a') limit 6
select distinct `id`,`name` from `test1_1`.`t1` where (`name` > 'a') limit 6,6
# for remote 2
select distinct `id`,`name` from `test1_1`.`t1` where (`name` > 'a') limit 6

This is closely related to the following variables

mysql> show variables like "spider%split%";
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| spider_multi_split_read      | -1    |
| spider_semi_split_read       | -1    |
| spider_semi_split_read_limit | -1    |
| spider_split_read            | -1    |
+------------------------------+-------+
4 rows in set (0.01 sec)

[Documentation] TSpider 3.6.3 Change Log

English
中文

Change Log

Features

  1. New global variable spider_parallel_limit: if it is ON, TSpider uses multiple threads to process SELECT ... LIMIT ... queries. The default value of this variable is OFF.

Bugfixes

  1. Continue to fix the bug of auto-increment duplicate keys.
  2. Avoid a memory leak when using mysqldump with a lot of tables.

Change Log

Features

  1. Added the global variable spider_parallel_limit, which controls whether SELECT + LIMIT queries are processed in parallel with multiple threads. It is OFF by default; set it to ON to enable parallel LIMIT.

Bugfixes

  1. Continued fixing the auto-increment duplicate key problem.
  2. Fixed a memory leak when running mysqldump against a large number of tables.

Optimizations on Point Select

Introduction

TSpider runs on thousands of instances every day, so it is of great importance to keep it stable, error-free, and robust. However, many execution plans are not completely implemented or tested for every case in TSpider. In other words, some execution optimizations could lead to wrong results on TSpider.

Also, due to the distributed architecture of TSpider, many MySQL Server logical and physical optimizations may have little effect or even bring extra overhead on TSpider.

Therefore, we usually do a full-table scan for all single (SELECT and UPDATE) queries on TSpider to guarantee correct execution and avoid the extra time spent computing an "optimized" execution plan.

Optimization 1

When we tested and traced the performance of point selection by primary key, we found that the performance was below our expectations.

After diving into the code, we found that allowing the index to be used, which yields the const join type in this case, boosts point-selection performance by 7.65%. This is because the const plan is one of the fastest execution plans: once the server decides to use it, it fetches the row at the JOIN::optimize stage and skips all further execution-plan computation.
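
For illustration, a point select by primary key that the optimizer resolves with the const join type (sbtest1 is the benchmark table from the connection-pool section above):

EXPLAIN SELECT * FROM sbtest1 WHERE id = 42;
-- the "type" column shows const: the row is fetched during JOIN::optimize and no further plan computation is needed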

Optimization 2

Also, we found that TSpider allocates space for 1000 result rows even for a const-plan selection. However, a const plan returns at most one row, so there is no need to allocate such a big buffer and then free the entries one by one.

Therefore, if we detect that a const plan is used, we allocate space for only one result row.

Together, the two optimizations increase QPS by 20.96%.

Commits

58e2751
4c10388

Insert into ... select ... on duplicate key update BUG

The new query may use a freed trx (spider->trx) and cause a crash, because spider_get_trx is not called beforehand.

case:

insert into t1(c1,c2) select c21,c22 from t2 on duplicate key update c1=values(c1),c2=values(c2);

This was fixed and released in 3.5.1

How to handle data inconsistency between the Spider node and the DB nodes

While creating a database and tables through TSpider, the server crashed unexpectedly. After the service was restored, the database from that operation was inconsistent between the Spider node and the DB nodes.
[screenshots]
The database cannot be dropped directly.
[screenshot]
After setting the TSpider global variable ddl_execute_by_ctl=OFF and retrying, another error is reported.
[screenshot]
How does this situation usually arise, and how can it be handled?

[Documentation] TSpider 3.7.1 Change Log

English
中文

Change Log

Feature

  1. Redesigned the connection pool between a single compute node and multiple storage nodes for TSpider, using a read-write-lock hash map and mutex stacks, which
    a. increased overall QPS by around 20%, and
    b. reduced p99 latency by about 10% when the concurrency is much higher than max_connection times the number of storage nodes.

Change Log

Features

  1. Re-implemented the connection pool from TSpider to the storage nodes (backends) using a read-write-lock hash map; overall QPS improved by about 20%, and p99 latency dropped by about 10% when the concurrency is far higher than max_connection.

Release date: May. 26, 2021

Query crossing shards with a LIMIT clause leads to slightly higher latency

query1: 
select * from tb_test where field1 = xxx limit 1;
1 row in set (0.034 sec)

query2:
select * from tb_test where field1 = xxx order by id limit 1;
1 row in set (0.005 sec)

This is a known issue. TSpider tries to optimize query1 by accessing the shards sequentially: since the requested rows may already be satisfied by the first shard, there is no need to ask all backend shards for rows in parallel.

But in the bad case, the only matching row is located in the last shard, so the query may take shard_num * one_response_time to finish.

What problems can occur if a table has a multi-column primary key?

For example, with the following definitions on the current latest version:
CREATE TABLE aaa (
id1 int(10) NOT NULL,
id2 int(10) NOT NULL,
id3 int(10) NOT NULL,
PRIMARY KEY (id1,id2,id3)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE bbb (
id1 int(10) NOT NULL,
id2 int(10) NOT NULL,
id3 int(10) NOT NULL,
PRIMARY KEY (id1,id2,id3)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='shard_key "id1"';

CREATE TABLE ccc (
id1 int(10) NOT NULL,
id2 int(10) NOT NULL,
id3 int(10) NOT NULL,
PRIMARY KEY (id1),
KEY id1_2_index (id1,id2) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='shard_key "id1"';

CREATE TABLE ddd (
id1 int(10) NOT NULL,
id2 int(10) NOT NULL,
id3 int(10) NOT NULL,
PRIMARY KEY (id1,id2),
KEY id2_index (id2) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='shard_key "id1"';

CREATE TABLE eee (
id1 int(10) NOT NULL,
id2 int(10) NOT NULL,
id3 int(10) NOT NULL,
PRIMARY KEY (id1),
KEY id2_index (id2) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='shard_key "id1"';

Miscellaneous Changes on 3.6.1

  1. Bugfix: Spider SELECT with GROUP BY and MAX()/MIN() returned wrong results.
  2. Bugfix: SELECT + DISTINCT + ORDER BY (shard key) + LIMIT.
  3. Bugfix: dangling pointer in spider_get_time.
  4. Bugfix: circular list in the two-phase commit (ha_register). (We did not reproduce the bug; instead we applied a change that avoids it.)
  5. The default of spider_max_connections is set to 500, which enables the connection pool. (In previous versions this value was 0, which could lead to OOM under high concurrency.)

[Documentation] TSpider 3.6.4 Change Log

English
中文

Change Log

Feature

  1. New global variable spider_log_ignore_err_numbers with default value '' (empty string). The variable specifies error numbers that the Spider log will not output. For example, if we do not want the Spider error log to record errors 1062, 12701 and 2014, we can run set global spider_log_ignore_err_numbers='1062,12701,2014'. Currently, the following error numbers can be ignored: 1062, 12701, 12723, 1477, 1429, 1067, 1292, 1366, 1411, 1159, 2014.

Bugfix

  1. Fix auto-increment duplicates that might happen when inserting after SHOW TABLE STATUS/SHOW CREATE TABLE/SELECT * FROM information_schema.tables.

Change Log

Features

  1. Added the global variable spider_log_ignore_err_numbers. When Spider writes its log, errors with the numbers specified by this variable are ignored (no related log output). The default is an empty string, meaning all errors are logged. To ignore errors 1062, 12701 and 2014, set it as: set global spider_log_ignore_err_numbers='1062,12701,2014'. The error numbers that can currently be ignored are: 1062, 12701, 12723, 1477, 1429, 1067, 1292, 1366, 1411, 1159, 2014.

Bugfixes

  1. Fixed an issue where SHOW TABLE STATUS / SHOW CREATE TABLE / SELECT * FROM information_schema.tables could, in certain cases, reset the auto-increment value and thus lead to duplicate keys.

Release date: Apr. 22, 2021

[Documentation] TSpider 3.6.2 Change Log

English
中文

Change Log

Bugfixes

  1. Fix a bug where, in previous versions, one auto-increment duplicate error could cause other threads to hit duplicate errors under high concurrency.
  2. Fix a bug where UPDATE JOIN returned a wrong affected-rows count.
  3. Solve the problem of empty data / extra data / no matching rows when TSpider needs to move a row to a new shard under UPDATE / UPDATE JOIN. The cause is related to FLOAT columns and the execution plan.
  4. Avoid an "empty query" bug when TSpider executes SQL in specific scenarios.

Change Log

Bugfixes

  1. Fixed an issue where, under high concurrency, one thread producing a duplicate error for a specified auto-increment value could cause other threads running at the same time to also produce duplicate primary-key errors. (Keywords: AUTO INCREMENT, DUPLICATE)
  2. Fixed an issue where, in certain cases, TSpider returned an unexpected affected-rows count when executing UPDATE JOIN. (Keywords: UPDATE JOIN, AFFECTED ROWS)
  3. Fixed errors of empty data / extra data / no matching rows when TSpider needs to move a row to a new shard under UPDATE / UPDATE JOIN. The cause is related to FLOAT columns and the execution plan. (Keywords: UPDATE, FLOAT, execution plan)
  4. Fixed a bug where TSpider returned "empty query" when executing SQL in specific scenarios. (Keywords: EMPTY QUERY)

Resharding with modulo-based hashing

If the hash of UID was originally taken modulo 64 and is now changed to modulo 256, how does TSpider migrate the data? Can it migrate data automatically, like MongoDB?

Benchmark comparison

I read your stress-test report, but it contains no comparison against the official MariaDB Spider.

Also, how much performance is lost compared with connecting to the DB directly?

[Documentation] TSpider 3.6.1 Change Log

English
中文

Change Log

Features

  1. Use indices in query plans for the const and system join types, and optimize the memory strategy to avoid redundant checks and memory allocation. This boosts performance considerably.
  2. When dealing with JSON_OBJECT(), TSpider sorts the object by key, as MySQL does (MariaDB does not perform a key sort). Therefore, two JSON objects with exactly the same keys and values but in a different order are regarded as equal.
  3. sql_big_result is ignored when using the Spider SE.
  4. The default value of the system variable spider_max_connection is set to 500.
  5. New global variable print_bool_as_literal. When it is OFF, TSpider dispatches 1/0 for boolean values in JSON functions; otherwise it dispatches the boolean literals true/false. The default value is OFF.

Bugfixes

  1. Fix a dead-loop scenario in ha_list to prevent a hang during the commit phase.
  2. Avoid potential errors when processing JSON_OBJECT(), JSON_EXTRACT() and JSON_REPLACE().
  3. Fix a bug of wrong results for SELECT with GROUP BY and aggregation functions under some circumstances.
  4. Fix a bug of wrong results for SELECT with DISTINCT and LIMIT under some circumstances.

Change Log

Features

  1. For single-table SQL statements that use the const join type (for example point queries by primary key or unique key), optimized the execution steps and memory allocation strategy to reduce unnecessary checks and memory allocation, which significantly improves performance in this scenario.
  2. When processing JSON_OBJECT(), TSpider sorts the object by key to stay compatible with MySQL backends (MariaDB does not sort keys). Therefore, two JSON_OBJECTs with the same <key, value> pairs in different orders are considered equal.
  3. For queries using the Spider storage engine, a user-specified sql_big_result is ignored.
  4. The default value of spider_max_connections is changed to 500 (the default in previous versions was 0).
  5. Added the GLOBAL variable print_bool_as_literal. If it is OFF, TSpider converts true/false in JSON functions to 1/0 when forwarding to the backends; if it is ON, TSpider forwards true/false unchanged. The default is OFF.

Bugfixes

  1. Fixed a dead-loop bug in ha_list in the core code, preventing TSpider transactions from hanging in some scenarios.
  2. Fixed problems that could occur when TSpider uses JSON_OBJECT(), JSON_EXTRACT(), or JSON_REPLACE().
  3. Fixed wrong results for SELECT + GROUP BY + MIN()/MAX() in certain scenarios.
  4. Fixed wrong results for SELECT + DISTINCT + LIMIT in certain scenarios. Also some minor code cleanups and compiler-warning fixes.

JSON_REPLACE Bug on Spider

Update using JSON_REPLACE makes the server crash

INSERT INTO tbl VALUES (1, '{ "a": 1, "b": [2, 3]}');
UPDATE tbl SET j = JSON_REPLACE(j, '$.a', 10, '$.c', '[true, false]');

[Documentation] TSpider Pre-release Change Log

English
中文

Change Log

  1. Allow TSpider to do index reads for the const/system join types to improve query performance.
  2. Only allocate one result row for const/system plans to improve memory utilization.
  3. New feature: customize shard count, shard function and shard type in TDBCTL: Tencent/TenDBCluster-Tdbctl#2.
  4. Add comments and todos in the code.
  5. MySQL-test fixes.

Change Log

  1. Allow TSpider to use the index for the const/system join types to speed up queries.
  2. For the intermediate results of const/system join types, allocate only one result row to improve memory utilization.
  3. New feature: customizable shard count, shard function, shard type and other parameters: https://github.com/Tencent/TenDBCluster-Tdbctl/issues/2.
  4. Added code comments and TODOs.
  5. Fixed some MySQL-test issues.
