
baidu / baikaldb


BaikalDB, A Distributed HTAP Database.

License: Apache License 2.0

C++ 92.64% Shell 1.20% Lex 0.32% Yacc 1.72% CMake 0.98% Makefile 0.96% Lua 1.21% C 0.49% Starlark 0.33% Python 0.13% PLpgSQL 0.02%
baikaldb database htap mysql raft sql

baikaldb's Introduction

house.baidu.com

baikaldb's People

Contributors

baoxuezhao, ehds, fankux, hollowman6, ketor, lgqss, lilithliu1, lmsreborn, luobuda, lvxinup, phantom9999, slsjnp, tarang11, tullyliu, wy1433, xiaozizhu, yz-cyz


baikaldb's Issues

After truncating a table, the auto-increment id of newly inserted rows does not restart from 1

Description

After a table is emptied with truncate and data is inserted again, the auto-increment id continues from the value it had before the truncate.

Steps to reproduce

1. Create a table (with an auto-increment id)
2. Insert data
3. Query the inserted data
4. Run a truncate statement to empty the table
5. Insert data
6. Query the inserted data
Detailed session:
mysql> 

mysql> CREATE TABLE `TestBK`.`info`  (
    -> `id` INT NOT NULL AUTO_INCREMENT,
    -> `name` CHAR(25) NOT NULL,
    -> PRIMARY KEY (`id`))ENGINE=Rocksdb DEFAULT CHARSET=utf8 AVG_ROW_LENGTH=100 COMMENT='{"comment":"", "resource_tag":"", "namespace":"TEST"}';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into TestBK.info (name) values ('XE9HE'), ('RYF86'), ('GKJ79');
Query OK, 3 rows affected (0.00 sec)

mysql> select id, name from TestBK.info;
+----+-------+
| id | name  |
+----+-------+
|  1 | XE9HE |
|  2 | RYF86 |
|  3 | GKJ79 |
+----+-------+
3 rows in set (0.01 sec)

mysql> truncate table TestBK.info;
Query OK, 0 rows affected (0.00 sec)

mysql> select id, name from TestBK.info;
Empty set (0.01 sec)

mysql> insert into TestBK.info (name) value ('HULLZ'), ('DPDJE'), ('LY36M');
Query OK, 3 rows affected (0.00 sec)

mysql> select id, name from TestBK.info;
+----+-------+
| id | name  |
+----+-------+
|  4 | HULLZ |
|  5 | DPDJE |
|  6 | LY36M |
+----+-------+
3 rows in set (0.01 sec)

mysql> 

Suggestion

The truncate statement should both delete the data and reset the auto-increment id.
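For reference, a minimal hypothetical sketch (not BaikalDB's actual code) of the behavior being asked for: the TRUNCATE path would also reset the table's auto-increment allocator, so the next insert starts from 1 again. The AutoIncrementAllocator name and layout below are purely illustrative.

// Hypothetical sketch: reset the auto-increment counter as part of TRUNCATE handling.
#include <cstdint>
#include <mutex>
#include <unordered_map>

class AutoIncrementAllocator {
public:
    // Hand out the next id for a table.
    int64_t next_id(int64_t table_id) {
        std::lock_guard<std::mutex> lock(_mu);
        return ++_max_id[table_id];
    }
    // Called from the (hypothetical) TRUNCATE path; this is what would make
    // the next INSERT after TRUNCATE start from 1 again.
    void reset(int64_t table_id) {
        std::lock_guard<std::mutex> lock(_mu);
        _max_id[table_id] = 0;
    }
private:
    std::mutex _mu;
    std::unordered_map<int64_t, int64_t> _max_id;
};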

bazel build error

ERROR
zlm@instance-dq9qt6ci:~/BaikalDB$ bazel build //:all
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
/home/zlm/BaikalDB/tools/bazel.rc
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
... still trying to connect to local Bazel server after 10 seconds ...
INFO: Invocation ID: a6a0ccc4-0e41-49da-baec-7309799f72ef
ERROR: error loading package '': Encountered error while reading extension file 'boost/boost.bzl': no such package '@com_github_nelhage_rules_boost//': The native git_repository rule is deprecated. load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository") for a replacement.
Use --incompatible_remove_native_git_repository=false to temporarily continue using the native rule.
ERROR: error loading package '': Encountered error while reading extension file 'boost/boost.bzl': no such package '@com_github_nelhage_rules_boost//': The native git_repository rule is deprecated. load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository") for a replacement.
Use --incompatible_remove_native_git_repository=false to temporarily continue using the native rule.
INFO: Elapsed time: 37.252s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
Fetching @com_github_nelhage_rules_boost; fetching

OS
zlm@instance-dq9qt6ci:~/BaikalDB$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.1 LTS
Release: 16.04
Codename: xenial

I have never used bazel before, so I am not sure what the problem is (it looks like a third-party dependency source is no longer valid).

Advantages compared to TiDB

In my view, after reading BaikalDB's README, BaikalDB has the same architecture as TiDB. I would like to know what advantages BaikalDB has compared to TiDB.

Transaction isolation levels

Which transaction isolation levels are supported?
Is the Serializable isolation level supported?

Is this build error related to protobuf?

ERROR: /home/happen/mycode/BaikalDB/BUILD:506:1: Linking of rule '//:baikalMeta' failed (Exit 1) gcc failed: error executing command /usr/bin/gcc -o bazel-out/k8-fastbuild/bin/baikalMeta -pthread '-fuse-ld=gold' -Wl,-no-as-needed -Wl,-z,relro,-z,now -B/usr/bin -pass-exit-codes -Wl,-S ... (remaining 1 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
bazel-out/k8-fastbuild/bin/external/com_github_brpc_brpc/_objs/brpc/gzip_compress.pic.o:gzip_compress.cpp:function brpc::policy::GzipCompress(google::protobuf::Message const&, butil::IOBuf*): error: undefined reference to 'google::protobuf::io::GzipOutputStream::Options::Options()'

During a region split, the corresponding raft group initially has only one node; is there a risk of data loss?

From reading the code, a region split first creates a new region that at that point has only a single peer; for a non-tail split this peer is usually the leader node of the original region. The leader writes the new region's data to its local RocksDB, briefly blocks reads and writes so that the new region stays consistent with the old one, and then updates the table version, so subsequent reads and writes use the new view (the new region info) and the new region starts serving traffic. However, when the new region becomes readable and writable there is no guarantee that its raft group already has enough peers; if a node fails at that moment, data loss seems possible.
From the code, peers are added to a region through heartbeat processing: RegionManager::check_peer_count adds a new instance to the raft group only when it finds the replica count insufficient, and that process is independent of the region split. I could not find any coordination that guarantees a raft group has enough peers before it starts accepting new writes.
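Purely as an illustration of the coordination the report says is missing (hypothetical, not BaikalDB's API): the split path could keep the new region read-only until its raft group reports the configured replica count. The RaftNodeView and SplitRegionGate names below are made up for the sketch.

// Hypothetical sketch: gate writes on a freshly split region until the raft
// group has enough peers.
#include <atomic>
#include <chrono>
#include <thread>

struct RaftNodeView {                 // stand-in for whatever exposes peer membership
    std::atomic<int> peer_count{1};   // a freshly split region starts with one peer
};

class SplitRegionGate {
public:
    SplitRegionGate(RaftNodeView* node, int replica_num)
        : _node(node), _replica_num(replica_num) {}

    // Block (with a timeout) until the raft group has enough peers,
    // then allow the new region to serve writes.
    bool wait_until_safe(std::chrono::seconds timeout) {
        auto deadline = std::chrono::steady_clock::now() + timeout;
        while (_node->peer_count.load() < _replica_num) {
            if (std::chrono::steady_clock::now() > deadline) {
                return false;  // caller decides: stay read-only or roll back the split
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
        }
        _writable.store(true);
        return true;
    }

    bool writable() const { return _writable.load(); }

private:
    RaftNodeView* _node;
    int _replica_num;
    std::atomic<bool> _writable{false};
};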

How many regions can a single node support in practice?

1. In Baidu's internal production environment, how many regions does a single node typically hold? In other words, how many raft groups can one node support?
2. Since braft keeps the resources of each raft group essentially independent, do too many raft groups lead to excessive resource consumption, for example too many heartbeats? (See the back-of-envelope sketch below.)
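A back-of-envelope illustration of the heartbeat concern in point 2; the region count and the 100 ms heartbeat interval are assumptions for the example, not measured or documented braft defaults:

// Rough arithmetic only: if every raft group sends its own heartbeats, the
// per-node heartbeat rate grows linearly with the number of regions hosted.
#include <cstdio>

int main() {
    const int regions_per_node = 10000;      // assumed number of regions on one store
    const int heartbeat_interval_ms = 100;   // assumed heartbeat interval
    const double heartbeats_per_sec =
        regions_per_node * (1000.0 / heartbeat_interval_ms);
    // With these assumptions: 10000 * 10 = 100000 heartbeat messages per second
    // from the leaders hosted on a single node, which is why multi-raft systems
    // commonly batch or merge heartbeats across groups.
    std::printf("%.0f heartbeats/s\n", heartbeats_per_sec);
    return 0;
}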

Questions about snapshot installation in BaikalDB

In BaikalDB, regions do not use braft's built-in periodic snapshot feature; instead, a dedicated thread allows only a fixed number of regions to take snapshots at any one time.

void Store::snapshot_thread() {
    BthreadCond concurrency_cond(-5); // initializing with -n means n bthreads run concurrently
    while (_is_running) {
        bthread_usleep(10 * 1000 * 1000);
        traverse_copy_region_map([&concurrency_cond](SmartRegion& region) {
            concurrency_cond.increase_wait();
            SnapshotClosure* done = new SnapshotClosure(concurrency_cond, region.get());
            // each region decides for itself whether to take a snapshot
            region->snapshot(done);
        });
    }
    // wait for all snapshots to finish
    concurrency_cond.wait(-5);
}
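As an aside on the BthreadCond(-5) idiom above: it acts like a counting semaphore that caps the number of in-flight snapshots at 5. A minimal standard C++20 sketch of the same bounded-concurrency pattern (using std::counting_semaphore and std::thread instead of bthread primitives) looks like this:

// Illustrative only: run at most 5 tasks concurrently, analogous to the
// BthreadCond initialized with -n in the snippet above.
#include <cstdio>
#include <semaphore>
#include <thread>
#include <vector>

int main() {
    std::counting_semaphore<5> slots(5);   // at most 5 snapshot tasks at once
    std::vector<std::thread> workers;
    for (int region_id = 0; region_id < 20; ++region_id) {
        slots.acquire();                   // like increase_wait(): block when 5 are running
        workers.emplace_back([&slots, region_id] {
            std::printf("snapshotting region %d\n", region_id);
            slots.release();               // like the closure signalling completion
        });
    }
    for (auto& t : workers) {
        t.join();                          // like the final wait for all snapshots
    }
    return 0;
}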

In other words, snapshots are not actually real-time, although I recall a document that seemed to say they were.

Even within a Region, a snapshot is not taken on demand either; it is only allowed once the elapsed time and the gap in log entries reach certain thresholds.

BaikalDB/src/store/region.cpp

Lines 1779 to 1805 in 90a7ef6

void Region::snapshot(braft::Closure* done) {
    brpc::ClosureGuard done_guard(done);
    bool need_snapshot = false;
    if (_shutdown) {
        return;
    }
    if (_snapshot_time_cost.get_time() < FLAGS_snapshot_interval_s * 1000 * 1000) {
        return;
    }
    if (_applied_index - _snapshot_index > FLAGS_snapshot_diff_logs) {
        need_snapshot = true;
    } else if (abs(_snapshot_num_table_lines - _num_table_lines.load()) > FLAGS_snapshot_diff_lines) {
        need_snapshot = true;
    } else if ((_applied_index - _snapshot_index) * _average_cost.load()
            > FLAGS_snapshot_log_exec_time_s * 1000 * 1000) {
        need_snapshot = true;
    }
    if (!need_snapshot) {
        return;
    }
    DB_WARNING("region_id: %ld do snapshot, snapshot_num_table_lines:%ld, num_table_lines:%ld "
               "snapshot_index:%ld, applied_index:%ld, snapshot_time_cost:%ld",
               _region_id, _snapshot_num_table_lines, _num_table_lines.load(),
               _snapshot_index, _applied_index, _snapshot_time_cost.get_time());
    done_guard.release();
    _node.snapshot(done);
}

So why, during install snapshot, when a node fetches the snapshot from the leader, is the RocksDB get_snapshot interface used so that the current snapshot is taken directly? Each SnapshotContext is only constructed at install-snapshot time.

struct SnapshotContext {
    SnapshotContext()
        : snapshot(RocksWrapper::get_instance()->get_snapshot()) {}
    ~SnapshotContext() {
        if (data_context != nullptr) {
            delete data_context;
        }
        if (meta_context != nullptr) {
            delete meta_context;
        }
        if (snapshot != nullptr) {
            RocksWrapper::get_instance()->relase_snapshot(snapshot);
        }
    }
    const rocksdb::Snapshot* snapshot = nullptr;
    IteratorContext* data_context = nullptr;
    IteratorContext* meta_context = nullptr;
};

Is it because _applied_index is saved in on_apply, so duplicated entries can be filtered out by if (iter.index() <= _applied_index) { continue; }?

BaikalDB/src/store/region.cpp

Lines 1336 to 1362 in 90a7ef6

void Region::on_apply(braft::Iterator& iter) {
    Concurrency::get_instance()->service_write_concurrency.increase_wait();
    ON_SCOPE_EXIT([]() {
        Concurrency::get_instance()->service_write_concurrency.decrease_broadcast();
    });
    for (; iter.valid(); iter.next()) {
        braft::Closure* done = iter.done();
        brpc::ClosureGuard done_guard(done);
        butil::IOBuf data = iter.data();
        butil::IOBufAsZeroCopyInputStream wrapper(data);
        pb::StoreReq request;
        if (!request.ParseFromZeroCopyStream(&wrapper)) {
            DB_FATAL("parse from protobuf fail, region_id: %ld", _region_id);
            if (done) {
                ((DMLClosure*)done)->response->set_errcode(pb::PARSE_FROM_PB_FAIL);
                ((DMLClosure*)done)->response->set_errmsg("parse from protobuf fail");
                braft::run_closure_in_bthread(done_guard.release());
            }
            continue;
        }
        pb::OpType op_type = request.op_type();
        _region_info.set_log_index(iter.index());
        if (iter.index() <= _applied_index) {
            //DB_WARNING("this log entry has been executed, log_index:%ld, applied_index:%ld, region_id: %ld",
            //           iter.index(), _applied_index, _region_id);
            continue;
        }

That does filter them out, but then the snapshot the node installs is no longer the snapshot that the leader saved in on_snapshot_save.

A database that has been successfully dropped can still be queried

Description

A database that has already been dropped successfully still shows up in the result set of show databases in the mysql client.

Steps to reproduce

1. Drop a database and confirm that the drop succeeded
2. Run "show databases" and check the result set


Suggestion

To avoid confusion about whether a database was really dropped, it is suggested that the show command only list databases that are currently in a normal state.

Build error on CentOS 7

ERROR: /home/BaikalDB/BUILD:453:1: C++ compilation of rule '//:sqlparser' failed (Exit 1)
include/sqlparser/sql_parse.y:18:1: error: expected unqualified-id before '%' token
In file included from /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/new:40:0,
                 from /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/ext/new_allocator.h:33,
                 from /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/x86_64-redhat-linux/bits/c++allocator.h:33,
                 from /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/bits/allocator.h:46,
                 from /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/string:41,
                 from include/sqlparser/parser.h:16,
                 from include/sqlparser/sql_parse.y:22:
/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/exception:35:37: error: expected declaration before end of line
 #pragma GCC visibility push(default)

Environment:

OS: CentOS 7
bazel version : 0.16.0
gcc version: 4.8.5

I have followed the instructions in the README.md.

Performance report

Hi, when will a performance report be available?

Suggestions

A few small suggestions:

  1. Precompiled binaries could be published under Releases
  2. Some tutorials and explanations of the internals would be helpful
  3. More case studies and experience sharing would be welcome

Wishing baikaldb success!

With one machine running multiple store instances (one per SSD), how can placement be configured so that peers of the same raft group do not end up on the same machine?

From the code, whether_legal_for_select_instance only selects instances by the following rules: same resource_tag, not in exclude_stores, and same logical_room. In a fairly common scenario, however, a machine has multiple SSDs and one store instance is started per SSD. When regions are allocated, the peers of a raft group need to be placed on different machines to keep the data safe. What configuration is needed to achieve that?
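Purely as an illustration of the constraint being asked for (hypothetical, not BaikalDB's selection logic): an additional host-level check during instance selection would reject candidates whose physical machine already hosts a peer of the same region. The field names below (Instance, physical_host) are illustrative.

// Hypothetical sketch of a host-aware placement check.
#include <set>
#include <string>
#include <vector>

struct Instance {
    std::string address;        // e.g. "10.0.0.1:8110", one store per SSD
    std::string physical_host;  // e.g. "10.0.0.1", shared by all SSDs on one machine
};

// True if the candidate may be picked for this raft group, i.e. its physical
// host does not already carry one of the group's existing peers.
bool legal_for_region(const Instance& candidate,
                      const std::vector<Instance>& existing_peers) {
    std::set<std::string> used_hosts;
    for (const auto& peer : existing_peers) {
        used_hosts.insert(peer.physical_host);
    }
    return used_hosts.count(candidate.physical_host) == 0;
}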

SST backup and restore issue

curl -v -o 5_1584106435_60936.sst http://$1/StoreService/backup_region/download/5/data/0
With a small amount of data the GET works fine, but with large files it fails with status_code:500.

TIME type anomaly

Problem description: values in a TIME column are displayed incorrectly after insertion.

Steps to reproduce:

mysql> CREATE TABLE `TestDB`.`test` (
    -> `id`  INTEGER NOT NULL,
    -> `Pro_time` TIME default '20:27:07',
    -> PRIMARY KEY (`id`))ENGINE=Rocksdb;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test(id) values(1);                
Query OK, 1 row affected (0.00 sec)

mysql> insert into test values(2,"20:07:03");
Query OK, 1 row affected (0.00 sec)

mysql> select * from test;
+----+-----------+
| id | Pro_time  |
+----+-----------+
|  1 | -01:06:36 |
|  2 | -01:01:34 |
+----+-----------+
2 rows in set (0.00 sec)

Not all unit tests are built

As the title says: among the unit tests, only test_date_time produces a binary, and that test does not pass. The other unit tests produce no build output.

MySQL Compatibility?

Hi admin,

I tried to test:


mysql --host=127.0.0.1 --port=5000 -u root -p:
CREATE DATABASE menagerie
show databases; => empty


How can I use BaikalDB from my source code? Do I have to use the "baikal-client" library?
Is there any documentation to guide usage? I do not understand the concepts of namespace, database, and table.
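Not an authoritative answer, but since the session above already talks to BaikalDB over the MySQL protocol on port 5000, any MySQL connector should work as an alternative to baikal-client. A minimal sketch with the standard libmysqlclient C API; host and port are taken from the session above, the credentials are placeholders, and error handling is kept minimal:

// Illustrative only: connect to BaikalDB through its MySQL-protocol front end.
// Build with: g++ demo.cpp $(mysql_config --cflags --libs)
#include <mysql/mysql.h>
#include <cstdio>

int main() {
    MYSQL* conn = mysql_init(nullptr);
    if (!mysql_real_connect(conn, "127.0.0.1", "root", "password",
                            /*db=*/nullptr, /*port=*/5000, nullptr, 0)) {
        std::fprintf(stderr, "connect failed: %s\n", mysql_error(conn));
        return 1;
    }
    if (mysql_query(conn, "show databases") == 0) {
        MYSQL_RES* res = mysql_store_result(conn);
        MYSQL_ROW row;
        while ((row = mysql_fetch_row(res)) != nullptr) {
            std::printf("%s\n", row[0]);
        }
        mysql_free_result(res);
    }
    mysql_close(conn);
    return 0;
}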

cmake build failure

Build environment:
centos 7.7.1908
gcc 8.3.1

In file included from /home/work/workspace/BaikalDB/include/common/log.h:22,
                 from /home/work/workspace/BaikalDB/include/common/common.h:48,
                 from /home/work/workspace/BaikalDB/include/engine/rocks_wrapper.h:25,
                 from /home/work/workspace/BaikalDB/include/raft/log_entry_reader.h:17,
                 from /home/work/workspace/BaikalDB/src/raft/log_entry_reader.cpp:15:
/usr/local/include/braft/storage.h: In function 'int braft::gc_dir(const string&)':
/usr/local/include/braft/storage.h:116:9: error: 'COMPACT_GOOGLE_LOG_NOTICE' was not declared in this scope
         LOG(NOTICE) << "Target path not exist, so no need to gc, path: "
         ^~~
/usr/local/include/braft/storage.h:116:9: note: suggested alternative: 'COMPACT_GOOGLE_LOG_FATAL'
In file included from /home/work/workspace/BaikalDB/src/raft/log_entry_reader.cpp:16:
/home/work/workspace/BaikalDB/include/raft/my_raft_log_storage.h: At global scope:
/home/work/workspace/BaikalDB/include/raft/my_raft_log_storage.h:99:9: error: 'int baikaldb::MyRaftLogStorage::append_entries(const std::vector<braft::LogEntry*>&)' marked 'override', but does not override
     int append_entries(const std::vector<braft::LogEntry*>& entries
         ^~~~~~~~~~~~~~
make[2]: *** [CMakeFiles/baikaldb.dir/build.make:245: CMakeFiles/baikaldb.dir/src/raft/log_entry_reader.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:76: CMakeFiles/baikaldb.dir/all] Error 2

It looks like glog and braft are not being found. I have already installed both libraries into /usr/local/include and /usr/local/lib.
Could someone advise how to deal with this?

Also, on line 15 of CMakeLists.txt, shouldn't find_library(RAPIDJSON_LIB NAMES glog) be changed to find_library(RAPIDJSON_LIB NAMES rapidjson)?

Current stability of the project

The project has been open source for some time now. Does the open-source version, with features such as node scale-out, share the same foundation as what is used internally? Can this open-source version be used in production? And does its performance have any advantage over TiDB?

Build error on Ubuntu 18

hi Admin,
I compiled BaikalDB from source
git clone --recurse https://github.com/baidu/BaikalDB.git
cd BaikalDB
mkdir -p _build && cd _build
cmake -DWITH_BAIKAL_CLIENT=ON -DWITH_SYSTEM_LIBS=OFF -DCMAKE_BUILD_TYPE:STRING=Release ../
make all -j 4

I see these errors:

/mnt/data/temp/baidu/BaikalDB/_build/third-party/rocksdb/src/extern_rocksdb/./util/compression.h:115: undefined reference to `ZSTD_freeDCtx'
/mnt/data/temp/baidu/BaikalDB/_build/third-party/rocksdb/src/extern_rocksdb/./util/compression.h:367: undefined reference to `ZSTD_freeCCtx'
/mnt/data/temp/baidu/BaikalDB/_build/third-party/rocksdb/src/extern_rocksdb/./util/compression.h:191: undefined reference to `ZSTD_freeCDict'
collect2: error: ld returned 1 exit status
CMakeFiles/baikalMeta.dir/build.make:419: recipe for target 'baikalMeta' failed
make[2]: *** [baikalMeta] Error 1
CMakeFiles/Makefile2:131: recipe for target 'CMakeFiles/baikalMeta.dir/all' failed
make[1]: *** [CMakeFiles/baikalMeta.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
