
zeromicro / cds


Data syncing in golang for ClickHouse.

License: MIT License

go golang clickhouse bigdata kafka-consumer

cds's Introduction

ClickHouse Data Synchromesh


Based on go-zero.

ARCH

Data workflow of a typical data warehouse architecture


Design of Data Sync

Automatically synchronizes data from MySQL/MongoDB data sources to a ClickHouse cluster in near real time (< 2 min).


start up

git clone https://github.com/zeromicro/cds.git
cd cds
make up

After the build, check whether any container has exited abnormally.

Open http://localhost:3414/cds.html to access the web interface.

Log in with the username and password below:

user: [email protected]
password: 123456

Create a table in ClickHouse for syncing MySQL (or MongoDB) data

Choose the "create table" tab

1. Click "Target ClickHouse Database Info"
2. Click "connect"
3. Select the schema ("default") to synchronize to in ClickHouse
4. Switch to "Data Source"
5. For MySQL, enter the connection string root:root@tcp(mysql:3306)/test_mysql
or
for MongoDB, enter the connection string mongodb://mongo1:30001/test_mongo
6. Click "connect"
7. Select the table, such as the default example_mysql (MySQL) or example (MongoDB)
8. Click "Generate create Table SQL"
 Note: select the partition field as needed. Here 'PARTITION BY toYYYYMM()' can be deleted or replaced with 'PARTITION BY toYYYYMM(dt)'
9. Click "send SQL to ClickHouse"; a pop-up below confirms successful execution
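
The note in step 8 leaves two choices for the empty 'PARTITION BY toYYYYMM()' placeholder: drop it or fill in a date column. The sketch below shows both as a plain string edit and also mirrors what ClickHouse's toYYYYMM computes for a date. The DDL text and the helper names (to_yyyymm, set_partition) are hypothetical and only for illustration; the SQL that cds actually generates may look different.

```python
import re
from datetime import date
from typing import Optional


def to_yyyymm(d: date) -> int:
    """Mirror ClickHouse's toYYYYMM: year * 100 + month."""
    return d.year * 100 + d.month


def set_partition(ddl: str, column: Optional[str] = None) -> str:
    """Fill in the empty PARTITION BY placeholder, or drop it if no column is given."""
    pattern = r"[ \t]*PARTITION BY toYYYYMM\(\)[ \t]*\n?"
    if column is None:
        return re.sub(pattern, "", ddl)
    return re.sub(pattern, f"PARTITION BY toYYYYMM({column})\n", ddl)


# Hypothetical DDL, not copied from cds.
ddl = (
    "CREATE TABLE default.example_mysql (id Int64, dt DateTime)\n"
    "ENGINE = MergeTree\n"
    "PARTITION BY toYYYYMM()\n"
    "ORDER BY id\n"
)

print(set_partition(ddl, "dt"))   # placeholder replaced with toYYYYMM(dt)
print(set_partition(ddl))         # placeholder removed
print(to_yyyymm(date(2020, 11, 18)))
```

With a month-based partition key, all rows whose dt falls in the same calendar month land in the same partition.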

One-time full data synchronization:

Choose the "full sync" tab

1. Click "+" in the upper right corner
2. For MySQL, enter the connection string root:root@tcp(mysql:3306)/test_mysql
or
for MongoDB, enter the connection string mongodb://mongo1:30001/test_mongo
3. Click "connect"
4. Select the table, such as the default example_mysql (MySQL) or example (MongoDB)
5. Click "Target ClickHouse Database Info"
6. Click "connect"
7. Select the schema ("default") to synchronize to in ClickHouse
8. Click "Add"; a pop-up below confirms successful execution
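
The two connection strings used in the steps above follow different formats: the MySQL one uses the user:password@tcp(host:port)/database layout, and the MongoDB one is a URI. The sketch below splits each into its parts, which can help when adapting the defaults to another environment. The function names and the regular expression are illustrative assumptions, not part of cds, and the regex only covers the simple form shown in this document.

```python
import re
from urllib.parse import urlparse

# Matches the simple form user:password@tcp(host:port)/database
MYSQL_DSN = re.compile(
    r"^(?P<user>[^:@]+):(?P<password>[^@]*)"
    r"@tcp\((?P<host>[^:)]+):(?P<port>\d+)\)"
    r"/(?P<database>[^?]+)"
)


def parse_mysql_dsn(dsn: str) -> dict:
    """Split a MySQL connection string of the form used above into its parts."""
    match = MYSQL_DSN.match(dsn)
    if match is None:
        raise ValueError(f"unsupported MySQL connection string: {dsn!r}")
    parts = match.groupdict()
    parts["port"] = int(parts["port"])
    return parts


def parse_mongo_uri(uri: str) -> dict:
    """Split a single-host mongodb:// URI into host, port and database."""
    parsed = urlparse(uri)
    if parsed.scheme != "mongodb":
        raise ValueError(f"not a mongodb URI: {uri!r}")
    return {
        "host": parsed.hostname,
        "port": parsed.port,
        "database": parsed.path.lstrip("/"),
    }


print(parse_mysql_dsn("root:root@tcp(mysql:3306)/test_mysql"))
print(parse_mongo_uri("mongodb://mongo1:30001/test_mongo"))
```

The host names mysql and mongo1 in the defaults are names inside the Docker network started by make up, so they are expected to resolve only from containers on that network.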

check task status

Refresh the page

Confirm data in ClickHouse


Turn on real-time incremental synchronization

Take MySQL as an example.

Choose the "Connector" tab

1. Click "+" in the upper right corner
2. For MySQL, enter the connection string root:root@tcp(mysql:3306)/test_mysql
3. Select the table
4. Click "Add"

Choose the "Incremental Sync" tab

1. Click "+" in the upper right corner
2. For MySQL, enter the connection string root:root@tcp(mysql:3306)/test_mysql
3. Select the table
4. Click "Target ClickHouse Database Info"
5. Click "connect"
6. Select the schema ("default") to synchronize to in ClickHouse
7. Click "Add"; a pop-up below confirms successful execution

Refresh the page

Verify incremental update

Run the database initialization script again to insert another 100000 rows of data.

cd sit/docker/
sh ./init.sh

Verify that the incremental MySQL data has arrived in ClickHouse:


clean up

To clean up all the Docker containers started above and restore the initial state, run:

cd cds
make down

To clean only:

cd cds
make docker_clean

data model in ClickHouse

Table creation scheme used for ClickHouse in CDS

help

How To Ask Questions The Smart Way

How to Report Bugs Effectively


If you like this project and want to support it, please star it 🤝

cds's People

Contributors

ahmczsy, alexey-milovidov, icy4ever, kevwan, org0000h, zxc111


cds's Issues

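Would you consider supporting ReplicatedCollapsingMergeTree?

Hello, members of the development team.
While studying this project I ran into a question that I hope you can answer.
Background: in the ClickHouse DDL generated by Galaxy, a "ck_is_delete" column is created for every table, and the replicated tables use the ReplicatedMergeTree engine.
Question: why was ReplicatedCollapsingMergeTree not chosen as the engine for the replicated tables, with "sign" replacing "ck_is_delete"?

Looking forward to your answer.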
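
To make the two options in this question concrete, here is a small in-memory sketch of how a soft-delete flag and a CollapsingMergeTree-style sign column can each yield the current rows. It is a simplified model for illustration only: the row layout, the "last write wins" rule and both function names are assumptions, and it does not describe how cds or ClickHouse actually processes these columns.

```python
def live_rows_soft_delete(rows):
    """Rows arrive in write order; the last write per id wins, deleted ids are dropped."""
    latest = {}
    for row in rows:
        latest[row["id"]] = row
    return [row for row in latest.values() if row["ck_is_delete"] == 0]


def live_rows_collapsing(rows):
    """Simplified collapsing: sign=1 adds a state, sign=-1 cancels one.

    An id survives if its state rows outnumber its cancel rows;
    the most recent state row is kept.
    """
    balance = {}
    last_state = {}
    for row in rows:
        balance[row["id"]] = balance.get(row["id"], 0) + row["sign"]
        if row["sign"] == 1:
            last_state[row["id"]] = row
    return [last_state[key] for key, total in balance.items() if total > 0]


# id 1 is inserted and then updated; id 2 is inserted and then deleted.
soft = [
    {"id": 1, "name": "a", "ck_is_delete": 0},
    {"id": 2, "name": "b", "ck_is_delete": 0},
    {"id": 1, "name": "a2", "ck_is_delete": 0},
    {"id": 2, "name": "b", "ck_is_delete": 1},
]
collapsing = [
    {"id": 1, "name": "a", "sign": 1},
    {"id": 2, "name": "b", "sign": 1},
    {"id": 1, "name": "a", "sign": -1},   # cancel the old state of id 1
    {"id": 1, "name": "a2", "sign": 1},   # write the new state of id 1
    {"id": 2, "name": "b", "sign": -1},   # cancel id 2 (delete)
]

print(live_rows_soft_delete(soft))
print(live_rows_collapsing(collapsing))
```

One visible difference in this model: with a sign column, an update needs the previous row's values to write the cancel row, while a delete flag only needs the key of the row being removed.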

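ERROR: error during full synchronization

Environment: CentOS 7 | ClickHouse server version 20.12.5 | cds latest version (adding a version number is recommended; commit id: 28c51c5)

Symptom: after configuring full synchronization, the following error occurs while the sync runs:

{"@timestamp":"2021-01-04T17:03:44.268+08","level":"error","content":"mysqltypeconv.go:59 sql: Scan error on column index 4, name \"Default\": converting NULL to string is unsupported"}

Possible cause: the CREATE TABLE statement specifies the column types, so converting NULL fails

Questions:

* Can the CREATE TABLE statement support Nullable?
* Does syncing a whole database require selecting every table?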

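MySQL disaster recovery

Can this be used as an off-site primary/replica backup for MySQL? When the other MySQL instance fails, switch over to this one.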

mac quickstart

ERROR: for canal-server Container "4252636c3eef" is unhealthy.
ERROR: Encountered errors while bringing up the project.

make up mongo error

mongodb test_mongo.example inserted 10000 lines
Traceback (most recent call last):
File "/tmp/init_db.py", line 74, in <module>
init_mongo()
File "/tmp/init_db.py", line 69, in init_mongo
collection.insert_many(result)
File "/usr/local/lib/python3.10/site-packages/pymongo/collection.py", line 615, in insert_many
blk.execute(write_concern, session=session)
File "/usr/local/lib/python3.10/site-packages/pymongo/bulk.py", line 459, in execute
return self.execute_command(generator, write_concern, session)
File "/usr/local/lib/python3.10/site-packages/pymongo/bulk.py", line 351, in execute_command
with client._tmp_session(session) as s:
File "/usr/local/lib/python3.10/contextlib.py", line 135, in __enter__
return next(self.gen)
File "/usr/local/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1656, in _tmp_session
s = self._ensure_session(session)
File "/usr/local/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1643, in _ensure_session
return self.__start_session(True, causal_consistency=False)
File "/usr/local/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1594, in __start_session
server_session = self._get_server_session()
File "/usr/local/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1629, in _get_server_session
return self._topology.get_server_session()
File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 534, in get_server_session
session_timeout = self._check_session_support()
File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 520, in _check_session_support
self._select_servers_loop(
File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 223, in _select_servers_loop
raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: mongo1:30001: [Errno -2] Name or service not known,mongo2:30002: [Errno -2] Name or service not known,mongo3:30003: [Errno -2] Name or service not known, Timeout: 30s, Topology Description: <TopologyDescription id: 624a83e9fd437d56bbaaf7a3, topology_type: ReplicaSetNoPrimary, servers: [<ServerDescription ('mongo1', 30001) server_type: Unknown, rtt: None, error=AutoReconnect('mongo1:30001: [Errno -2] Name or service not known')>, <ServerDescription ('mongo2', 30002) server_type: Unknown, rtt: None, error=AutoReconnect('mongo2:30002: [Errno -2] Name or service not known')>, <ServerDescription ('mongo3', 30003) server_type: Unknown, rtt: None, error=AutoReconnect('mongo3:30003: [Errno -2] Name or service not known')>]>
make: *** [init] Error 1
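
The "[Errno -2] Name or service not known" entries in the traceback above mean that the names mongo1, mongo2 and mongo3 could not be resolved where init_db.py ran. A quick way to check this from the same environment is sketched below. The helper name unresolvable is hypothetical, and the sketch only tests name resolution, not whether MongoDB is reachable or healthy.

```python
import socket


def unresolvable(hosts):
    """Return the hosts whose names cannot be resolved from this machine."""
    failed = []
    for host in hosts:
        try:
            socket.getaddrinfo(host, None)
        except socket.gaierror:
            # Same error family as "[Errno -2] Name or service not known"
            failed.append(host)
    return failed


# Example: run this where init_db.py runs.
# print(unresolvable(["mongo1", "mongo2", "mongo3"]))
```

An empty list means all names resolve. If the names come back in the list, the script is probably running outside the network where the MongoDB containers are registered, or those containers are not up.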

Error when running make up

Traceback (most recent call last):
File "/tmp/init_db.py", line 74, in <module>
init_mongo()
File "/tmp/init_db.py", line 69, in init_mongo
collection.insert_many(result)
File "/usr/local/lib/python3.9/site-packages/pymongo/collection.py", line 761, in insert_many
blk.execute(write_concern, session=session)
File "/usr/local/lib/python3.9/site-packages/pymongo/bulk.py", line 528, in execute
return self.execute_command(generator, write_concern, session)
File "/usr/local/lib/python3.9/site-packages/pymongo/bulk.py", line 359, in execute_command
client._retry_with_session(
File "/usr/local/lib/python3.9/site-packages/pymongo/mongo_client.py", line 1384, in _retry_with_session
return self._retry_internal(retryable, func, session, bulk)
File "/usr/local/lib/python3.9/site-packages/pymongo/mongo_client.py", line 1414, in _retry_internal
raise last_error
File "/usr/local/lib/python3.9/site-packages/pymongo/mongo_client.py", line 1416, in _retry_internal
return func(session, sock_info, retryable)
File "/usr/local/lib/python3.9/site-packages/pymongo/bulk.py", line 353, in retryable_bulk
self._execute_command(
File "/usr/local/lib/python3.9/site-packages/pymongo/bulk.py", line 309, in _execute_command
result, to_send = bwc.execute(ops, client)
File "/usr/local/lib/python3.9/site-packages/pymongo/message.py", line 907, in execute
result = self.write_command(request_id, msg, to_send)
File "/usr/local/lib/python3.9/site-packages/pymongo/message.py", line 999, in write_command
reply = self.sock_info.write_command(request_id, msg)
File "/usr/local/lib/python3.9/site-packages/pymongo/pool.py", line 771, in write_command
helpers._check_command_response(result, self.max_wire_version)
File "/usr/local/lib/python3.9/site-packages/pymongo/helpers.py", line 151, in _check_command_response
raise NotMasterError(errmsg, response)
pymongo.errors.NotMasterError: operation was interrupted, full error: {'errorLabels': ['RetryableWriteError'], 'topologyVersion': {'processId': ObjectId('6071d9eecc09c5721c8f196f'), 'counter': 6}, 'operationTime': Timestamp(1618074848, 500), 'ok': 0.0, 'errmsg': 'operation was interrupted', 'code': 11602, 'codeName': 'InterruptedDueToReplStateChange', '$clusterTime': {'clusterTime': Timestamp(1618074848, 500), 'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 'keyId': 0}}}
make: *** [init] Error 1
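
The error above carries the label RetryableWriteError and the code name InterruptedDueToReplStateChange, which suggests the replica set changed state while the insert was in flight. One common way to handle such transient failures is to retry with a growing delay. The sketch below is a generic helper, not code from init_db.py; the name retry and its parameters are assumptions made for illustration.

```python
import time


def retry(func, attempts=5, delay=1.0, backoff=2.0,
          retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying on the given exception types with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, surface the last error
            sleep(delay)
            delay *= backoff


# Hypothetical usage with pymongo (not taken from init_db.py):
#   from pymongo.errors import AutoReconnect
#   retry(lambda: collection.insert_many(result), retry_on=(AutoReconnect,))
```

NotMasterError is a subclass of pymongo's AutoReconnect, so retrying on AutoReconnect would cover the failure shown here. Note that retrying a bulk insert can write some documents twice if the first attempt was partially applied, so this fits test data better than production loads.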
