
influx-proxy's Introduction

InfluxDB Proxy

This project adds a basic high availability layer to InfluxDB.

NOTE: influx-proxy must be built with Go 1.7+, and it does not implement UDP.

Why

We used InfluxDB Relay before, but it does not cover some of our requirements. We use Grafana to visualize time series data, so we need to add datasources to Grafana, and we had to change the datasource configuration whenever an InfluxDB instance went down. We also need to transfer data across IDCs, but Relay does not support gzip. Finally, it is inconvenient to analyze data by connecting to several different InfluxDB instances. Therefore, we made InfluxDB Proxy.

Features

  • Supports gzip.
  • Supports queries.
  • Filters some dangerous InfluxQL statements.
  • Transparent to clients; it looks like a single cluster to them.
  • Caches data to a file when a write fails, then rewrites it to the backend later.

Requirements

  • Golang >= 1.7
  • Redis-server
  • Python >= 2.7

Usage

$ # install redis-server
$ yum install redis
$ # start redis-server on 6379 port
$ redis-server --port 6379 &
$ # Install influxdb-proxy to your $GOPATH/bin
$ go get -u github.com/shell909090/influx-proxy/service
$ go install github.com/shell909090/influx-proxy/service
$ mv $GOPATH/bin/service $GOPATH/bin/influxdb-proxy
$ # Edit config.py and execute it
$ python config.py
$ # Start influx-proxy!
$ $GOPATH/bin/influxdb-proxy -redis localhost:6379 [--redis-pwd xxx --redis-db 0]

Configuration

An example configuration file is provided as config.py. Running config.py generates the configuration and stores it in redis.
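The snippet below is a minimal, illustrative sketch of what such a configuration looks like, using the BACKENDS / KEYMAPS / NODES dictionaries that appear in the example configs quoted in the issues further down this page; the real config.py additionally pushes these dictionaries into redis, which is omitted here.

# config_sketch.py -- illustrative only; field names follow the examples on this page
import json

BACKENDS = {                      # backend InfluxDB instances
    'local': {
        'url': 'http://localhost:8086',
        'db': 'test',
        'zone': 'local',
        'interval': 1000,
        'timeout': 10000,
    },
    'local2': {
        'url': 'http://influxdb-test:8086',
        'db': 'test2',
        'interval': 200,
    },
}

KEYMAPS = {                       # measurement name -> list of backend names
    'cpu': ['local', 'local2'],
}

NODES = {                         # proxy node definitions
    'l1': {
        'listenaddr': ':6666',
        'db': 'test',
        'zone': 'local',
    },
}

if __name__ == '__main__':
    # the real config.py stores these structures in redis; here we only print them
    print(json.dumps({'BACKENDS': BACKENDS, 'KEYMAPS': KEYMAPS, 'NODES': NODES}, indent=2))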

Description

The architecture is fairly simple: one InfluxDB Proxy process and two or more InfluxDB processes. The proxy routes HTTP requests to the InfluxDB servers according to the measurements they reference.

The setup should look like this:

        ┌─────────────────┐
        │writes & queries │
        └─────────────────┘
                 │
                 ▼
         ┌───────────────┐
         │               │
         │InfluxDB Proxy │
         │  (only http)  │
         │               │
         └───────────────┘
                 │
                 ▼
        ┌─────────────────┐
        │   measurements  │
        └─────────────────┘
          │              │
        ┌─┼──────────────┘
        │ └──────────────┐
        ▼                ▼       
  ┌──────────┐      ┌──────────┐  
  │          │      │          │  
  │ InfluxDB │      │ InfluxDB │
  │          │      │          │
  └──────────┘      └──────────┘

Measurements are matched according to the following rules (a sketch follows this list):

  • Exact match first. For instance, if the measurement name is cpu.load and KEYMAPS contains both cpu and cpu.load keys, the backends configured for cpu.load are used.

  • Prefix match second. For instance, if the measurement name is cpu.load and KEYMAPS contains only a cpu key, the backends configured for cpu are used.
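The following Python sketch only illustrates this matching order; the proxy itself implements it in Go (GetBackends in cluster.go), and details such as the iteration order over prefix keys are assumptions here, not taken from the source.

# matching_sketch.py -- illustrative only, not the proxy's actual code
KEYMAPS = {
    'cpu': ['local'],
    'cpu.load': ['local', 'local2'],
}

def backends_for(measurement):
    # 1. an exact match on the measurement name wins
    if measurement in KEYMAPS:
        return KEYMAPS[measurement]
    # 2. otherwise fall back to a prefix match
    for key, backends in KEYMAPS.items():
        if measurement.startswith(key):
            return backends
    return []

print(backends_for('cpu.load'))     # ['local', 'local2']  -- exact match
print(backends_for('cpu.percent'))  # ['local']            -- prefix match on 'cpu'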

Query Commands

Unsupported commands

The following commands are forbidden:

  • DELETE
  • DROP
  • GRANT
  • REVOKE

Supported commands

Only queries matching one of the following patterns are supported (a small sketch follows this list):

  • .*where.*time
  • show.*from
  • show.*measurements
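As a rough illustration of how such a filter behaves, here is a small Python sketch; the proxy's actual implementation is in Go, and the case-insensitive matching shown here is an assumption, not taken from the source.

# query_filter_sketch.py -- illustrative only
import re

FORBIDDEN = ['delete', 'drop', 'grant', 'revoke']
OBLIGATED = ['.*where.*time', 'show.*from', 'show.*measurements']

def query_allowed(q):
    q = q.lower()
    if any(re.search(p, q) for p in FORBIDDEN):
        return False                                   # dangerous statement: rejected
    return any(re.search(p, q) for p in OBLIGATED)     # must match an allowed pattern

print(query_allowed("select * from cpu where time > now() - 1m"))  # True
print(query_allowed("select * from cpu"))                          # False (no time clause)
print(query_allowed("drop measurement cpu"))                       # False (forbidden)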

License

MIT.

influx-proxy's People

Contributors

qshine, shell909090


influx-proxy's Issues

Installation problem

When I run the command below I get the following error. Are the commands in the README still usable? Thanks.

[root@vm1 influx-proxy]# go get -u github.com/shell909090/influx-proxy
cd /root/go/src/github.com/shell909090/influx-proxy; git pull --ff-only
error: while accessing https://github.com/shell909090/influx-proxy/info/refs
fatal: HTTP request failed
package github.com/shell909090/influx-proxy: exit status 1

Question about measurement prefix matching

Measurements are matched by prefix; with this design, if two measurement names partially overlap, won't they get mixed up?

func (ic *InfluxCluster) GetBackends(key string) (backends []BackendAPI, ok bool) {
    ...
    if strings.HasPrefix(key, k) {}
    ...

For example, cpu and cpu_percent would then be written to the same backends.

Write requests do not forward HTTP headers, so HTTP Basic access authentication fails

If InfluxDB has authentication enabled, the client must send a username and password; with HTTP Basic Authentication that means a request header. However, HandlerWrite in influx-proxy's service/http.go only keeps req.Body for retrying. Does this mean write requests that authenticate via HTTP Basic access authentication cannot be forwarded?

How can influx-proxy be installed without internet access?

One of our environments has no internet access; how can influx-proxy be installed there?
Is there a way to build it from source?
Or can we copy files over from a machine with internet access where influx-proxy is already installed? Which files would need to be copied?

Thanks.

Write timestamp format problem

Timestamps written from Prometheus are all missing a few digits; where should they be padded up to nanosecond precision? Writing through InfluxDB's native /api/v1/prom/write?db=prometheus produces correct timestamps.

influx-proxy high availability questions

      nginx ------ nginx
              |
 influxProxy-1    influxProxy-2
              |
influxDB01  influxDB02  influxDB03

In an HA setup with several influx-proxy servers, do all the influxdb-proxy servers need to be listed under NODES in config.py? Are they distinguished as IP:6666? Do all the influx-proxy instances share exactly the same config.py, or should the first proxy use l1 and the second use l2?
If one node corresponds to one influx-proxy, what is the db field in NODES for? BACKENDS already specifies the db of each backend InfluxDB, and BACKENDS can contain several different dbs (e.g. test and test2 in the example), so how should db in NODES be set?
All of these influx-proxy instances sit behind nginx load balancers; how should DEFAULT_NODE be configured?
Which nginx load-balancing method (scheduling algorithm) is recommended?
Thanks.

NODES = {
    'l1': {
        'listenaddr': '192.168.1.31:6666',
        'db': 'test',
        'zone': 'local',
        'interval': 10,
        'idletimeout': 10,
        'writetracing': 0,
        'querytracing': 0,
    },
    'l2': {
        'listenaddr': '192.168.1.32:6666',
        'db': 'test',
        'zone': 'local',
        'interval': 10,
        'idletimeout': 10,
        'writetracing': 0,
        'querytracing': 0,
    },
}

# the influxdb default cluster node
DEFAULT_NODE = {
    'listenaddr': ':6666'
}

I started influx-proxy, but queries return an error and writes store no data

write: although it returns 204 (success), neither of the two InfluxDB instances contains the data
[root@localhost ~]# curl -i -XPOST 'http://20.21.1.129:6666/write?db=sanchen' --data-binary 'cpu_load_short2,host=server01,region=us-west value=0.64 1539825713536000000'
HTTP/1.1 204 No Content
X-Influxdb-Version: 1.1
Date: Fri, 26 Oct 2018 01:47:06 GMT

query: returns forbidden
[root@localhost ~]# curl -v -G "http://20.21.1.129:6666/query?db=sanchen&u=root&p=1qaz@WSX" --data-urlencode "q=select * from cpu_load_short "

  • About to connect() to 20.21.1.129 port 6666 (#0)
  • Trying 20.21.1.129...
  • Connected to 20.21.1.129 (20.21.1.129) port 6666 (#0)

GET /query?db=sanchen&u=root&p=1qaz@WSX&q=select%20%2A%20from%20cpu_load_short%20 HTTP/1.1
User-Agent: curl/7.29.0
Host: 20.21.1.129:6666
Accept: */*

< HTTP/1.1 400 Bad Request
< X-Influxdb-Version: 1.1
< Date: Fri, 26 Oct 2018 01:48:43 GMT
< Content-Length: 15
< Content-Type: text/plain; charset=utf-8
<

  • Connection #0 to host 20.21.1.129 left intact
    query forbidden

How can I fix this?

[doubts] Some doubts on how to work with this promising tool

Hi @shell909090, I'm looking for an InfluxDB proxy / write cache / write filter tool, and I found your influx-proxy.

I have some doubts about how it works and why. (Perhaps I could help you add some new features if you like.)

  1. Why are you using redis as a config repository? It seems easier to work with a toml file, like influx-proxy does?
  2. Do you also use redis as a metric cache in case the remote InfluxDB services are down?
  3. Does influx-proxy detect configuration changes automatically and reconfigure itself online?
  4. Does influx-proxy support a "db to destination" keymap, as it already supports "measurement to destination"?

Thank you very much

Horizontal sharding request

Hello, is it possible to implement "horizontal sharding"? For example, storing data into multiple InfluxDB instances according to custom rules such as IP address / region / subnet / business line.
As far as I can see, what is implemented so far is "vertical partitioning" at the measurement level. But if a single data center has too much to monitor, the data volume of a single measurement can also become too large. How does your company avoid this situation?

Prometheus remote_write URL format

remote_write:
  - url: "http://proxy:6666/api/v1/prom/write?db=k8s"
remote_read:
  - url: "http://proxy:6666/api/v1/prom/read?db=k8s"

How should Prometheus remote storage be used with the proxy? The configuration above returns 404.

How to scale horizontally?

Hi @shell909090, I see that all of the InfluxDB nodes are hard-coded here:
BACKENDS = {
    'local': {
        'url': 'http://localhost:8086',
        'db': 'test',
        'zone': 'local',
        'interval': 1000,
        'timeout': 10000,
        'timeoutquery': 600000,
        'maxrowlimit': 10000,
        'checkinterval': 1000,
        'rewriteinterval': 10000,
    },
    'local2': {
        'url': 'http://influxdb-test:8086',
        'db': 'test2',
        'interval': 200,
    },
}
If I want to scale horizontally, how should I do it?

go get -u github.com/shell909090/influx-proxy/service fails!!!

[root@localhost src]# go get -u github.com/shell909090/influx-proxy/service

package gopkg.in/redis.v5: unrecognized import path "gopkg.in/redis.v5" (https fetch: Get https://gopkg.in/redis.v5?go-get=1: dial tcp 35.196.143.184:443: getsockopt: connection refused)

package gopkg.in/natefinch/lumberjack.v2: unrecognized import path "gopkg.in/natefinch/lumberjack.v2" (https fetch: Get https://gopkg.in/natefinch/lumberjack.v2?go-get=1: dial tcp 35.196.143.184:443: getsockopt: connection refused)

How can I fix this?

Installation: running python config.py reports an error

Following the installation steps, redis and influxdb are already installed.
When I get to the python config.py step, it reports an error:
Traceback (most recent call last):
  File "config.py", line 13, in <module>
    import redis
ImportError: No module named redis
I know the cause, I just don't know how to handle it. Your instructions are rather terse; my skills are limited and I'm working through them slowly. I hope you can address this when you have time.

query forbidden! Please point out what I've missed.

[hadoop@localhost influxdb-1.6.4-1]$ curl -G 'http://localhost:6666/query?pretty=true' --data-urlencode "db=telegraf" --data-urlencode "q=select * from cpu where time > now() - 1m"
query forbidden

First, if I change the IP in this command to point directly at the backend, the query succeeds.
As far as I can tell it also matches the ".*where.*time" pattern.
What have I overlooked that causes the request to be forbidden?

About the use of RWMutex in the proxy (a design question)

Regarding the use of RWMutex in cluster.go:

func (ic *InfluxCluster) ForbidQuery(s string) (err error) {
	r, err := regexp.Compile(s)
	if err != nil {
		return
	}

	ic.lock.Lock()
	defer ic.lock.Unlock()
	ic.ForbiddenQuery = append(ic.ForbiddenQuery, r)
	return
}

func (ic *InfluxCluster) EnsureQuery(s string) (err error) {
	r, err := regexp.Compile(s)
	if err != nil {
		return
	}

	ic.lock.Lock()
	defer ic.lock.Unlock()
	ic.ObligatedQuery = append(ic.ObligatedQuery, r)
	return
}

func (ic *InfluxCluster) AddNext(ba BackendAPI) {
	ic.lock.Lock()
	defer ic.lock.Unlock()
	ic.bas = append(ic.bas, ba)
	return
}

Of the three lock/unlock usages above, AddNext is never called, and ForbidQuery and EnsureQuery are only used at program startup, within the same goroutine. So is the locking here redundant, or is there some other consideration behind it? Thanks.

Problems with "go get"

[root@192 local]# go get -u github.com/shell909090/influx-proxy/service
# github.com/influxdata/influxdb/models
/root/go/src/github.com/influxdata/influxdb/models/points.go:1885: syntax error: unexpected = in type declaration
/root/go/src/github.com/influxdata/influxdb/models/points.go:1896: syntax error: unexpected = in type declaration

[root@192 ~]# go version
go version go1.7.4 linux/amd64

Why does it report syntax errors...?

The source files downloaded successfully, but the build fails

I checked that the source files were fully downloaded, but compilation cannot complete:
[root@rhel6 bin]# go get -u github.com/shell909090/influx-proxy
package github.com/shell909090/influx-proxy: no buildable Go source files in /mygo/src/github.com/shell909090/influx-proxy

Data written through the proxy is inconsistent between the two configured locals

1. I inserted 100 records in a loop via the Java API. In KEYMAPS the cpu measurement is mapped to both locals, yet the cpu measurement in one local holds 10 records and in the other 50. What causes this?
2. From earlier issues it seems a node can be made write-only; how should only_write be configured?

If I start two proxy instances that share the same buffer file, will there be concurrency problems?

Suppose I have two InfluxDB instances, influx1 and influx2, and two proxies, proxy1 and proxy2, both connected to influx1 and influx2 and providing the same functionality, so that the proxy layer is highly available.

My idea is to have the two proxies share a single locally persisted buffer file via NFS.
The problem is that I don't know whether concurrent writes to it are safe.
Also, if influx1 restarts, both proxies would presumably replay the buffer file into influx1 at the same time.
Could you take a look at whether this approach is feasible?

Which machine does a query go to?

There are two InfluxDB machines, A and B.
All data is written to both machines; for example, KEYMAPS is configured as:

KEYMAPS = {
    'temperature': ["A", "B"]
}

When writing temperature, both machines are written to, but at query time the current behavior is to fetch data from the first backend, i.e. A, even if that query fails.
Is that the intended behavior?

Question about BACKENDS

Hello, if I have two BACKENDS and configure KEYMAPS as 'cpu': ['local1', 'local2'], will data be written to both BACKENDS? Will the two BACKENDS then hold exactly the same data, effectively serving as a backup?

Error when telegraf's output is set to influx-proxy

When telegraf's output is set to influx-proxy, telegraf's startup log reports the following 404 error:
Dec 3 13:16:09 yangzantest telegraf: 2018-12-03T05:16:09Z W! [outputs.influxdb] when writing to [http://172.17.11.85:6666]: database "telegraf" creation failed: 404 Not Found

The telegraf configuration is as follows:
[[outputs.influxdb]]
urls = ["http://172.17.11.85:6666"] # required
database = "telegraf" # required
retention_policy = ""
write_consistency = "any"
timeout = "10s"

If I point it directly at the InfluxDB address, there is no error:
urls = ["http://172.17.11.83:8086"]

Is there some configuration I need to change?
Thanks.

What is redis used for?

After a quick look at the code, it seems redis is currently only used to store node information, not to cache query results. What considerations led to this design?

Querying through influx-proxy fails with: query forbidden

After installing and configuring influx-proxy, writes work fine and data is written to both InfluxDB instances, but queries keep failing with "query forbidden" and no further detail. Could you help me figure out what the problem is? Thanks!

The query:

curl -G 'http://172.16.6.104:6666/query' --data-urlencode "db=telegraf" --data-urlencode "epoch=s" --data-urlencode "q=SELECT value FROM cpu_load_short WHERE region='us-west'"

The core configuration is as follows:

BACKENDS = {
    'local': {
        'url': 'http://172.16.6.103:8086', 
        'db': 'telegraf', 
        'zone':'l1', 
        'interval': 1000,
        'timeout': 10000, 
        'timeoutquery':600000, 
        'maxrowlimit':10000,  
        'checkinterval':1000, 
        'rewriteinterval':10000
    },
    'local2': {
        'url': 'http://172.16.6.104:8086',
        'db': 'telegraf',
        'zone':'l1',
        'interval': 1000
    },
}

KEYMAPS = {
    'cpu': ['local','local2']
}

NODES = {
    'l1': { 
        'listenaddr': ':6666',
        'db': 'telegraf',
        'zone': 'l1',
        'interval':10,
        'idletimeout':10,
        'writetracing':1,
        'querytracing':1,
    }
}

InfluxDB performance metrics

I'd like to ask how your company measures InfluxDB's performance. Thanks.

How can different databases be routed to different instances?

For example, I have two databases, k8s and heapster,
and two InfluxDB instances, influxdb0 and influxdb1.

I'd like requests to the proxy to be routed to different InfluxDB instances by database name, e.g. data for the k8s database goes to influxdb0 and data for the heapster database goes to influxdb1.

database not exist error

Proxy configuration (deployed on 10.1.201.201):
BACKENDS = {
    'local': {
        'url': 'http://10.1.201.201:8086',
        'db': 'pard',
        'zone': 'local',
        'interval': 1000,
        'timeout': 10000,
        'timeoutquery': 600000,
        'maxrowlimit': 10000,
        'checkinterval': 1000,
        'rewriteinterval': 10000,
    },
    'local2': {
        'url': 'http://10.1.201.202:8086',
        'db': 'pard',
        'interval': 200,
    },
}

KEYMAPS = {
    'cpu': ['local'],
    'temperature': ['local2'],
    'default': ['local']
}

On the local host:
show databases;
name: databases
name
_internal
aTimeSeries
pard

Running in a terminal:
curl -G 'http://10.1.201.201:6666/query?pretty=true' --data-urlencode "db=pard" --data-urlencode "q=SELECT idle FROM cpu"
returns:
database not exist.

Installation error, please help

Hello Shell,

I followed the installation steps in the README, and at the go get -u github.com/shell909090/influx-proxy/service step I got this error:

$ go get -u github.com/shell909090/influx-proxy/service
package github.com/influxdata/influxdb/client/v2: cannot find package "github.com/influxdata/influxdb/client/v2" in any of:
/usr/lib/go/src/github.com/influxdata/influxdb/client/v2 (from $GOROOT)
/home/test/go/src/github.com/influxdata/influxdb/client/v2 (from $GOPATH)

I'm not very familiar with Go. How should I deal with this kind of problem?

Thank you!

A small question

After a quick look at the code, the logic routes writes to the backend InfluxDB instances by measurement; there is no shard-splitting logic yet, which would actually be worth having.

@shell909090 I recall that Qiniu implemented its own InfluxDB proxy quite a while ago... you were involved in that too, right?
