This project forked from bytedance/bitsail

BitSail is a distributed, high-performance data integration engine that supports batch, streaming, and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of records every day.

License: Apache License 2.0

BitSail

Introduction

BitSail is ByteDance's open-source data integration engine, built on a distributed architecture and designed for high performance. It supports data synchronization between multiple heterogeneous data sources and provides global data integration solutions for batch, streaming, and incremental scenarios. At present, it serves almost all business lines at ByteDance, such as Douyin and Toutiao, and synchronizes hundreds of trillions of records every day.

Why Do We Use BitSail

BitSail is widely used in production and sustains traffic at the scale of hundreds of trillions of records. It has also been verified in a range of environments, including the cloud-native environment of Volcano Engine and on-premises private clouds.

We have accumulated a lot of experience and made a number of optimizations to improve data integration:

  • Global Data Integration, covering batch, streaming and incremental scenarios

  • Distributed and cloud-native architecture, supporting horizontal scaling

  • High maturity in terms of accuracy, stability and performance

  • Rich basic functions, such as type conversion, dirty data processing, flow control, data lake integration, and automatic parallelism calculation

  • Task running status monitoring, such as traffic, QPS, dirty data, latency, etc.

BitSail Use Scenarios

  • Synchronization of massive data volumes across heterogeneous data sources

  • Unified stream and batch data processing

  • Unified data lake and data warehouse processing

  • High-performance, highly reliable data synchronization

  • Data integration on a distributed, cloud-native architecture

Features of BitSail

  • Low start-up cost and high flexibility

  • Stream-batch integration and data lake-warehouse integration in a single architecture, so one framework covers almost all data synchronization scenarios

  • High-performance, massive data processing capabilities

  • DDL automatic synchronization

  • A type system that converts between the types of different data sources

  • Engine-independent reading and writing interfaces, which keep development costs low

  • Real-time display of task progress (under development)

  • Real-time monitoring of task status

Architecture of BitSail

Source[Input Sources] -> Framework[Data Transmission] -> Sink[Output Sinks]

The data processing pipeline works as follows. First, the source data is pulled through Input Sources. It is then processed by the intermediate framework layer, and finally written to the target through Output Sinks.
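The three stages described above can be pictured with a minimal Java sketch. The `Reader` and `Writer` interfaces and the `run` method here are hypothetical names chosen for illustration; they are not BitSail's actual connector interfaces, and the in-memory list stands in for a real source and sink.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of a Source -> Framework -> Sink pipeline.
// None of these names come from BitSail itself.
class PipelineSketch {

    /** Hypothetical input source: exposes records as an iterator. */
    interface Reader<T> {
        Iterator<T> read();
    }

    /** Hypothetical output sink: accepts one processed record at a time. */
    interface Writer<T> {
        void write(T record);
    }

    /**
     * Framework step: pulls each record from the reader, applies the
     * transform, and hands the result to the writer.
     * Returns the number of records moved.
     */
    static <I, O> int run(Reader<I> reader, Function<I, O> transform, Writer<O> writer) {
        int count = 0;
        Iterator<I> it = reader.read();
        while (it.hasNext()) {
            writer.write(transform.apply(it.next()));
            count++;
        }
        return count;
    }

    /** Example wiring: an in-memory source, an upper-casing step, an in-memory sink. */
    static List<String> upperCaseAll(List<String> input) {
        List<String> sink = new ArrayList<>();
        Reader<String> reader = input::iterator;
        Writer<String> writer = sink::add;
        run(reader, String::toUpperCase, writer);
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(upperCaseAll(List.of("a", "b", "c")));
    }
}
```

Keeping the reader and writer behind small interfaces is what would let the middle step stay the same regardless of which source or sink is plugged in.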

At the framework layer, we provide rich functions that apply to all synchronization scenarios, such as dirty data collection, automatic parallelism calculation, and task monitoring.
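As an illustration of what a framework-level function such as dirty data collection might look like, here is a minimal Java sketch. It assumes that "dirty data" means records that fail type conversion, and that such records are set aside rather than failing the whole job. The class and method names are hypothetical and do not reflect BitSail's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of dirty data collection during type conversion.
// Names are illustrative only, not BitSail's API.
class DirtyDataSketch {

    /** Holds records that converted cleanly and the raw records that did not. */
    static final class Result {
        final List<Integer> clean = new ArrayList<>();
        final List<String> dirty = new ArrayList<>();
    }

    /**
     * Converts each raw string to an int. A record that cannot be parsed
     * is routed to the dirty list instead of aborting the run.
     */
    static Result convert(List<String> rawRecords) {
        Result result = new Result();
        for (String raw : rawRecords) {
            try {
                result.clean.add(Integer.parseInt(raw.trim()));
            } catch (NumberFormatException e) {
                result.dirty.add(raw);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = convert(List.of("1", " 2 ", "x3"));
        System.out.println("clean=" + r.clean + " dirty=" + r.dirty);
    }
}
```

Because the collection logic sits in the conversion step rather than in any particular connector, the same behavior would apply to every source and sink pair.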

In terms of data synchronization scenarios, it covers batch, streaming, and incremental data synchronization.

At the runtime layer, it supports multiple execution modes, such as YARN and local. Kubernetes (k8s) support is under development.

Supported Connectors

DataSource    Sub Modules    Reader    Writer
Hive          -
Hadoop        -
Hbase         -
Hudi          -
Kafka         -
RocketMQ      -
Redis         -
Doris         -
MongoDB       -
JDBC          MySQL
              Oracle
              PostgreSQL
              SqlServer
Fake          -
Print         -

Documentation for Connectors.

Community Support

Slack

Join BitSail Slack channel via this link

Mailing List

Currently, the BitSail community uses Google Groups as its mailing list provider. You need to subscribe to the mailing list before starting a conversation.

Subscribe: Email to this address [email protected]

Start a conversation: Email to this address [email protected]

Unsubscribe: Email to this address [email protected]

WeChat Group

Scan this QR code to join the WeChat group chat.

[QR code image]

Environment Setup

Link to Environment Setup.

Deployment Guide

Link to Deployment Guide.

BitSail Configuration

Link to Configuration Guide.

Contributing Guide

Link to Contributing Guide.

Contributors

Thanks to all contributors.

License

Apache 2.0 License.
