yinchuanwang / clickhouse_sinker

This project forked from housepower/clickhouse_sinker

Easily load data from kafka to ClickHouse

License: Apache License 2.0

clickhouse_sinker

clickhouse_sinker is a sinker program that transfers Kafka messages into ClickHouse.

Features

  • Uses the native ClickHouse client-server TCP protocol, which performs better than HTTP.
  • Easy to use and deploy: you don't need to write any code, just edit the configuration file.
  • Supports multiple parsers: csv, fastjson, gjson.
  • Supports multiple Kafka clients: sarama, kafka-go.
  • Custom parser support.
  • Supports multiple sinker tasks, each running in parallel.
  • Supports multiple Kafka and ClickHouse clusters.
  • Bulk insert (configured by bufferSize and flushInterval).
  • Parses messages concurrently (configured by concurrentParsers).
  • Writes batches concurrently.
  • Every batch is sharded to a determined ClickHouse node. Exits if the write loop fails.
  • Custom sharding policy (configured by shardingKey and shardingPolicy).
  • At-least-once delivery guarantee (more info about it).
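
As a rough illustration of the bulk-insert feature above, the following Go sketch buffers rows and hands back a batch once bufferSize rows have accumulated, or whenever Flush is called (which a timer firing every flushInterval would do). The Batcher type and its methods are hypothetical names chosen for this sketch, not the project's actual implementation.

```go
package main

import "fmt"

// Batcher is a hypothetical buffer, not a type from clickhouse_sinker.
// It collects rows until bufferSize is reached.
type Batcher struct {
	bufferSize int
	rows       []string
}

// NewBatcher creates a Batcher that flushes after bufferSize rows.
func NewBatcher(bufferSize int) *Batcher {
	return &Batcher{bufferSize: bufferSize, rows: make([]string, 0, bufferSize)}
}

// Add appends a row and returns a non-nil batch when the buffer is full.
func (b *Batcher) Add(row string) []string {
	b.rows = append(b.rows, row)
	if len(b.rows) >= b.bufferSize {
		return b.Flush()
	}
	return nil
}

// Flush returns the buffered rows (nil if there are none) and resets the
// buffer. A ticker firing every flushInterval would call this.
func (b *Batcher) Flush() []string {
	if len(b.rows) == 0 {
		return nil
	}
	batch := b.rows
	b.rows = make([]string, 0, b.bufferSize)
	return batch
}

func main() {
	b := NewBatcher(3)
	for _, r := range []string{"r1", "r2", "r3", "r4"} {
		if batch := b.Add(r); batch != nil {
			fmt.Println("flushed by size:", batch)
		}
	}
	// Simulates the interval-based flush of the remaining rows.
	fmt.Println("flushed by interval:", b.Flush())
}
```

Returning the batch from Add keeps the sketch free of goroutines and timers; a real sinker would additionally guard the buffer with a mutex or a channel.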

Install && Run

By binary files (suggested)

Download the binary files from the release page, choose the executable that matches your environment, modify the conf files, then run ./clickhouse_sinker -conf conf

By source

  • Install Golang

  • Go Get

go get -u github.com/housepower/clickhouse_sinker/...

  • Build && Run

make build
## Modify the config files, set the configuration directory, then run it
./dist/clickhouse_sinker -conf conf

Examples

Supported data types

  • UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64
  • Float32, Float64
  • String
  • FixedString
  • Date, DateTime, DateTime64 (Custom Layout parser)
  • Array(UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64)
  • Array(Float32, Float64)
  • Array(String)
  • Array(FixedString)
  • Nullable
  • ElasticDateTime => Int64 (2019-12-16T12:10:30Z => 1576498230)
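
The ElasticDateTime conversion in the last item can be reproduced with Go's standard library. This is a minimal sketch assuming the input is an RFC 3339 timestamp, as in the example; the function name elasticToUnix is hypothetical, and the project's own parser may accept other layouts.

```go
package main

import (
	"fmt"
	"time"
)

// elasticToUnix parses an RFC 3339 timestamp and returns Unix seconds.
// The name is made up for this sketch.
func elasticToUnix(s string) (int64, error) {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return 0, err
	}
	return t.Unix(), nil
}

func main() {
	ts, err := elasticToUnix("2019-12-16T12:10:30Z")
	if err != nil {
		panic(err)
	}
	fmt.Println(ts) // prints 1576498230, matching the example above
}
```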

Configuration

See config example

Authentication with Kafka

clickhouse_sinker supports the following three authentication mechanisms:

  • No authentication

An example clickhouse_sinker config:

    "kfk1": {
      "brokers": "127.0.0.1:9092",
      "Version": "0.10.2.1"
    }
  • SASL/PLAIN

An example clickhouse_sinker config:

    "kfk2": {
      "brokers": "127.0.0.1:9093",
      "sasl": {
        "enable": true,
        "username": "username",
        "password": "password"
      },
      "Version": "0.10.2.1"
    }
  • SASL/GSSAPI(Kerberos)

You need to configure a Kerberos client on the host where clickhouse_sinker runs. Refer to Kafka Security.

  1. Install the krb5-libs package on every client machine.
$ sudo yum install -y krb5-libs
  2. Supply a valid /etc/krb5.conf file for each client. Usually this can be the same krb5.conf file used by the Key Distribution Center (KDC).

An example clickhouse_sinker config:

    "kfk3": {
      "brokers": "127.0.0.1:9094",
      "sasl": {
        "enable": true,
        "gssapi": {
          "authtype": 2,
          "keytabpath": "/home/keytab/zhangtao.keytab",
          "kerberosconfigpath": "/etc/krb5.conf",
          "servicename": "kafka",
          "username": "zhangtao/localhost",
          "realm": "ALANWANG.COM"
        }
      },
      "Version": "0.10.2.1"
    }

FYI, the equivalent configuration for a Java Kafka client looks like this:

security.protocol:SASL_PLAINTEXT
sasl.kerberos.service.name:kafka
sasl.mechanism:GSSAPI
sasl.jaas.config:com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true debug=true keyTab=\"/home/keytab/zhangtao.keytab\" principal=\"zhangtao/localhost@ALANWANG.COM\";
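
To see how the three config shapes above differ, here is a small Go sketch that decodes one Kafka entry (the object that follows a key such as "kfk1") and reports which mechanism it selects. The KafkaConfig struct and the mechanism function are hypothetical, and the selection rule is an assumption inferred from the examples: SASL disabled or absent means no authentication, a gssapi block means Kerberos, anything else means PLAIN.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// KafkaConfig mirrors only fields that appear in the examples above.
// It is a hypothetical type, not the project's own config struct.
type KafkaConfig struct {
	Brokers string `json:"brokers"`
	Version string `json:"Version"`
	Sasl    struct {
		Enable   bool   `json:"enable"`
		Username string `json:"username"`
		Password string `json:"password"`
		GSSAPI   *struct {
			AuthType    int    `json:"authtype"`
			KeytabPath  string `json:"keytabpath"`
			ServiceName string `json:"servicename"`
			Realm       string `json:"realm"`
		} `json:"gssapi"`
	} `json:"sasl"`
}

// mechanism decodes one Kafka entry and names the mechanism it implies.
// The decision rule is an assumption made for this sketch.
func mechanism(raw string) (string, error) {
	var c KafkaConfig
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		return "", err
	}
	switch {
	case !c.Sasl.Enable:
		return "none", nil
	case c.Sasl.GSSAPI != nil:
		return "SASL/GSSAPI", nil
	default:
		return "SASL/PLAIN", nil
	}
}

func main() {
	raw := `{"brokers": "127.0.0.1:9093",
	         "sasl": {"enable": true, "username": "username", "password": "password"},
	         "Version": "0.10.2.1"}`
	m, err := mechanism(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(m)
}
```

Decoding with encoding/json also catches structural mistakes, such as a missing closing brace in the sasl block, before the sinker is started.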

Prometheus Metrics

All metrics are in statistics.go.

  • Push to Prometheus

Add a section like the following to config.json:

  "statistics": {
    "enable": true,
    "pushGatewayAddrs": ["http://node3:33445/cks/prom/write"]
  },

Alternatively, set the environment variable METRICS_PUSH_GATEWAY_ADDR to a comma-separated list of URLs. The environment variable takes precedence over the config file.

  • Pull with Prometheus

The command line parameter --http-addr expects an ip:port. Metrics are exposed at http://ip:port/metrics.
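
The precedence rule for push gateway addresses (environment variable over config file) can be sketched in Go as follows. The function name pushGatewayAddrs is hypothetical, and trimming whitespace around each URL is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// pushGatewayAddrs returns the comma-separated addresses in envValue if it
// contains any, otherwise the addresses from the config file.
// Hypothetical helper, not part of clickhouse_sinker.
func pushGatewayAddrs(envValue string, fromConfig []string) []string {
	var addrs []string
	for _, a := range strings.Split(envValue, ",") {
		if a = strings.TrimSpace(a); a != "" {
			addrs = append(addrs, a)
		}
	}
	if len(addrs) == 0 {
		return fromConfig
	}
	return addrs
}

func main() {
	// Value taken from the "statistics" config example above.
	cfg := []string{"http://node3:33445/cks/prom/write"}
	fmt.Println(pushGatewayAddrs(os.Getenv("METRICS_PUSH_GATEWAY_ADDR"), cfg))
}
```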

Custom metric parser

  • You only need to implement the Parser interface yourself:
type Parser interface {
	Parse(bs []byte) model.Metric
}

See json parser
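
As a self-contained illustration of implementing the interface, the sketch below defines a parser for space-separated key=value messages. Because the method set of model.Metric is not shown here, the Metric interface with a single GetString method is a stand-in assumed for this example; KVParser and kvMetric are likewise hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Metric is a stand-in for the project's model.Metric. Its real method set
// is not shown in this README, so this one-method interface is an assumption.
type Metric interface {
	GetString(key string) string
}

// Parser has the same shape as the interface above, using the stand-in Metric.
type Parser interface {
	Parse(bs []byte) Metric
}

// kvMetric holds the fields of one parsed message.
type kvMetric map[string]string

// GetString returns the value for key, or "" if the key is absent.
func (m kvMetric) GetString(key string) string { return m[key] }

// KVParser is a hypothetical parser for "key=value key=value" messages.
type KVParser struct{}

// Parse splits the message on whitespace and keeps well-formed key=value pairs.
func (KVParser) Parse(bs []byte) Metric {
	m := kvMetric{}
	for _, field := range strings.Fields(string(bs)) {
		kv := strings.SplitN(field, "=", 2)
		if len(kv) == 2 {
			m[kv[0]] = kv[1]
		}
	}
	return m
}

func main() {
	var p Parser = KVParser{}
	metric := p.Parse([]byte("level=info message=msg4"))
	fmt.Println(metric.GetString("level"), metric.GetString("message"))
}
```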

Debugging

echo '{"date": "2019-07-11T12:10:30Z", "level": "info", "message": "msg4"}' | kafkacat -b 127.0.0.1:9093 -P -t logstash

clickhouse-client -q 'select * from default.logstash'
2019-12-16	info	msg4
2019-07-11	info	msg4
2015-05-11	info	msg4
2019-07-11	info	msg4
