pruepei / influx-proxy
This project is forked from shell909090/influx-proxy
License: Other
func (hs *HttpService) HandlerPing(w http.ResponseWriter, req *http.Request) {
	defer req.Body.Close()
	version, err := hs.ic.Ping()
	if err != nil {
		panic("WTF")
		return // this line is unreachable code!
	}
	...
}
The hypothesis that regex matching causes a performance loss still needs to be verified by profiling.
cluster.go:
func (ic *InfluxCluster) CheckQuery(q string) (err error) {
	ic.lock.RLock()
	defer ic.lock.RUnlock()
	if len(ic.ForbiddenQuery) != 0 {
		for _, fq := range ic.ForbiddenQuery { // amplification factor
			if fq.MatchString(q) { // hot spot
				return ErrQueryForbidden
			}
		}
	}
	if len(ic.ObligatedQuery) != 0 {
		for _, pq := range ic.ObligatedQuery { // same as above
			if pq.MatchString(q) {
				return
			}
		}
		return ErrQueryForbidden
	}
	return
}
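In the business logic, every incoming SQL statement is first checked to decide whether it may be executed (the unit tests for statements such as drop are commented out with //). This function is implemented with regex matching, which according to the analysis in [1] costs performance, and that cost is amplified when traffic is heavy. Verifying the hypothesis requires more complete benchmark tests.
Reference:
1: Go代码调优利器-火焰图 (Flame graphs: a powerful tool for tuning Go code)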
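A benchmark along the following lines could be a starting point for testing the hypothesis. It is a sketch, not the project's code: `checkQuery` copies the shape of `CheckQuery` without the lock, and the patterns in `forbidden` and `obligated` are invented for illustration, so the numbers only become meaningful with the real configured patterns and real queries.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"testing"
)

var errQueryForbidden = errors.New("query forbidden")

// checkQuery mirrors the structure of CheckQuery, minus the RWMutex.
func checkQuery(forbidden, obligated []*regexp.Regexp, q string) error {
	for _, fq := range forbidden {
		if fq.MatchString(q) {
			return errQueryForbidden
		}
	}
	if len(obligated) == 0 {
		return nil
	}
	for _, oq := range obligated {
		if oq.MatchString(q) {
			return nil
		}
	}
	return errQueryForbidden
}

// Hypothetical patterns, chosen only for illustration.
var (
	forbidden = []*regexp.Regexp{
		regexp.MustCompile(`(?i)^\s*drop\s`),
		regexp.MustCompile(`(?i)^\s*delete\s`),
	}
	obligated = []*regexp.Regexp{
		regexp.MustCompile(`(?i)^\s*(select|show)\s`),
	}
)

func main() {
	q := "select cpu_load from cpu where time > now() - 1m"
	// testing.Benchmark runs a benchmark without the go test command.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_ = checkQuery(forbidden, obligated, q)
		}
	})
	fmt.Println(res.String())
}
```

Inside the repository the same loop would normally live in a `BenchmarkCheckQuery` function in `cluster_test.go` and be run with `go test -bench`, optionally with `-cpuprofile` to produce input for a flame graph.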
The function declaration below has a return value err, but the implementation does not appear to pass the lower-level err up to the caller.
func (ic *InfluxCluster) LoadConfig() (err error) {
	backends, bas, err := ic.loadBackends()
	if err != nil {
		return
	}
	m2bs, err := ic.loadMeasurements(backends)
	if err != nil {
		return
	}
	ic.lock.Lock()
	orig_backends := ic.backends
	ic.backends = backends
	ic.bas = bas
	ic.m2bs = m2bs
	ic.lock.Unlock()
	for name, bs := range orig_backends {
		err = bs.Close()
		if err != nil {
			log.Printf("fail in close backend %s", name)
		}
	}
	return
}
One call site also looks like this:
func main() {
	initLog()
	...
	ic.LoadConfig()
	...
}
Since the call site does not use the return value, why is the function declared this way?
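For context, in Go a bare `return` in a function with a named result returns the current value of that result, so the early returns in `LoadConfig` do hand the error from `loadBackends` or `loadMeasurements` back to the caller, and the final `return` hands back whatever the last `bs.Close()` produced. What is missing is the check at the call site. A minimal sketch, with `loadBackends` and `loadConfig` as hypothetical stand-ins:

```go
package main

import (
	"errors"
	"fmt"
)

// loadBackends is a hypothetical stand-in that always fails.
func loadBackends() (backends []string, err error) {
	return nil, errors.New("redis unreachable")
}

// loadConfig uses a named result like LoadConfig does. The := below
// reuses the named result err because it lives in the same scope.
func loadConfig() (err error) {
	backends, err := loadBackends()
	if err != nil {
		return // returns the error from loadBackends
	}
	fmt.Printf("loaded %d backends\n", len(backends))
	return
}

func main() {
	// The caller decides what a failed load means; here it is only logged.
	if err := loadConfig(); err != nil {
		fmt.Println("load config failed:", err)
	}
}
```

Running it prints `load config failed: redis unreachable`.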
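Because most of our cases are not sharded, the processing path can simply forward the request. In that case, however, the query is no longer checked for errors on the proxy; errors are handled only after the backend server reports them.
Pro: excellent query dispatch performance.
Con: the user is responsible for the query being correct, and errors are handled only after they occur on the backend server (pending review).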
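Write logic:
Get the hash ring for the measurement
Get the backend groups for the measurement
Use measurement + sortedkey to find the backend group index on the hash ring
Use the index to find the backends in that group and write the data to each of them in a loop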
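The write path above could be sketched roughly as follows. Everything here is an assumption for illustration: the `ring` type, the CRC32 hash, the number of virtual points, and the backend names are invented, and a single ring is used where the proposal keeps one ring per measurement.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// ring maps hash points to backend group indexes.
type ring struct {
	points []uint32
	owner  map[uint32]int
}

// newRing places `replicas` virtual points per group on the ring.
func newRing(groups, replicas int) *ring {
	r := &ring{owner: make(map[uint32]int)}
	for g := 0; g < groups; g++ {
		for v := 0; v < replicas; v++ {
			p := crc32.ChecksumIEEE([]byte(fmt.Sprintf("group-%d#%d", g, v)))
			r.points = append(r.points, p)
			r.owner[p] = g
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// locate returns the group owning key: the first point at or after the
// key's hash, wrapping around to the first point at the end of the ring.
func (r *ring) locate(key string) int {
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owner[r.points[i]]
}

func main() {
	// Two backend groups (shards), each holding replica backends.
	groups := [][]string{{"a1", "a2"}, {"b1", "b2"}}
	r := newRing(len(groups), 16)

	// measurement + sorted tag key, as in the third step above.
	key := "cpu" + "host=web01,region=cn"
	for _, backend := range groups[r.locate(key)] {
		fmt.Println("write to", backend)
	}
}
```

The same key always lands on the same group, which is what lets the query path find the data again.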
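Query logic:
Find the backend groups for the measurement
If there is only one backend group, there is only one shard: send the request to the first available backend, return the data, and the request is done. If that request fails, use the second available backend. If the requests to all backends fail, return a failure.
If there are several backend groups, there are several shards: copy the request to every shard, each handled as above. If the request to any one shard fails, the whole request is considered failed. After the data has been fetched from the shards, a second round of computation is done on the proxy.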
Functions to implement first: count, mean, sum, min, max
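The second round of computation on the proxy might look like the sketch below. The `partial` type is hypothetical, and it assumes each shard can be asked for its count, sum, min, and max. The detail worth noting is that mean has to be rebuilt from the merged sum and count; averaging the per-shard means would give the wrong answer when shards hold different numbers of points.

```go
package main

import (
	"fmt"
	"math"
)

// partial is the per-shard result the proxy would need (hypothetical).
type partial struct {
	Count int64
	Sum   float64
	Min   float64
	Max   float64
}

// merge combines per-shard partial aggregates into one result.
func merge(parts []partial) partial {
	out := partial{Min: math.Inf(1), Max: math.Inf(-1)}
	for _, p := range parts {
		if p.Count == 0 {
			continue // an empty shard contributes nothing
		}
		out.Count += p.Count
		out.Sum += p.Sum
		out.Min = math.Min(out.Min, p.Min)
		out.Max = math.Max(out.Max, p.Max)
	}
	return out
}

// mean is derived from the merged sum and count; ok is false when
// there are no points at all.
func (p partial) mean() (m float64, ok bool) {
	if p.Count == 0 {
		return 0, false
	}
	return p.Sum / float64(p.Count), true
}

func main() {
	total := merge([]partial{
		{Count: 2, Sum: 3, Min: 1, Max: 2},
		{Count: 3, Sum: 12, Min: 3, Max: 5},
	})
	m, _ := total.mean()
	fmt.Println(total.Count, total.Sum, total.Min, total.Max, m) // prints 5 15 1 5 3
}
```

In this example the per-shard means are 1.5 and 4, whose plain average is 2.75, while the correct mean over all five points is 3.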
Project goals:
The goals of the project are high performance and reliability.
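From reading the code, the function below appears to be the core entry point for queries. Could we consider a design with one goroutine per query instance to improve performance? What does everyone think?
cluster.go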
func (ic *InfluxCluster) Query(w http.ResponseWriter, req *http.Request) (err error) {
	atomic.AddInt64(&ic.stats.QueryRequests, 1)
	defer func(start time.Time) {
		atomic.AddInt64(&ic.stats.QueryRequestDuration, time.Since(start).Nanoseconds())
	}(time.Now())
	switch req.Method {
	case "GET", "POST":
	default:
		w.WriteHeader(400)
		w.Write([]byte("illegal method"))
		atomic.AddInt64(&ic.stats.QueryRequestsFail, 1)
		return
	}
	// TODO: all query in q?
	q := strings.TrimSpace(req.FormValue("q"))
	if q == "" {
		w.WriteHeader(400)
		w.Write([]byte("empty query"))
		atomic.AddInt64(&ic.stats.QueryRequestsFail, 1)
		return
	}
	err = ic.query_executor.Query(w, req)
	if err == nil {
		return
	}
	err = ic.CheckQuery(q)
	if err != nil {
		w.WriteHeader(400)
		w.Write([]byte("query forbidden"))
		atomic.AddInt64(&ic.stats.QueryRequestsFail, 1)
		return
	}
	key, err := GetMeasurementFromInfluxQL(q)
	if err != nil {
		log.Printf("can't get measurement: %s\n", q)
		w.WriteHeader(400)
		w.Write([]byte("can't get measurement"))
		atomic.AddInt64(&ic.stats.QueryRequestsFail, 1)
		return
	}
	apis, ok := ic.GetBackends(key)
	if !ok {
		log.Printf("unknown measurement: %s,the query is %s\n", key, q)
		w.WriteHeader(400)
		w.Write([]byte("unknown measurement"))
		atomic.AddInt64(&ic.stats.QueryRequestsFail, 1)
		return
	}
	// same zone first, other zone. pass non-active.
	// TODO: better way?
	for _, api := range apis {
		if api.GetZone() != ic.Zone {
			continue
		}
		if !api.IsActive() || api.IsWriteOnly() {
			continue
		}
		err = api.Query(w, req)
		if err == nil {
			return
		}
	}
	for _, api := range apis {
		if api.GetZone() == ic.Zone {
			continue
		}
		if !api.IsActive() {
			continue
		}
		err = api.Query(w, req)
		if err == nil {
			return
		}
	}
	w.WriteHeader(400)
	w.Write([]byte("query error"))
	atomic.AddInt64(&ic.stats.QueryRequestsFail, 1)
	return
}
2017/08/15 14:05:18 handler any get url: /ping
2017/08/15 14:05:18 http error: Get http://127.0.0.1:53643/ping: dial tcp 127.0.0.1:53643: getsockopt: connection refused
2017/08/15 14:05:18 read meta error: EOF
2017/08/15 14:05:18 read meta error: EOF
2017/08/15 14:05:18 read meta error: EOF
2017/08/15 14:05:18 new measurement: load.cpu
2017/08/15 14:05:18 new measurement: test
2017/08/15 14:05:18 handler any get url: /ping
2017/08/15 14:05:18 handler any get url: /ping
2017/08/15 14:05:18 handler any get url: /ping
2017/08/15 14:05:18 handler any get url: /write?db=test1
2017/08/15 14:05:18 handler any get url: /write?db=write_only
2017/08/15 14:05:18 handler any get url: /write?db=test2
2017/08/15 14:05:19 http error: Get http://127.0.0.1:53643/ping: dial tcp 127.0.0.1:53643: getsockopt: connection refused
2017/08/15 14:05:19 handler any get url: /ping
2017/08/15 14:05:19 handler any get url: /ping
2017/08/15 14:05:19 handler any get url: /ping
2017/08/15 14:05:19 read meta error: EOF
2017/08/15 14:05:19 read meta error: EOF
2017/08/15 14:05:19 read meta error: EOF
2017/08/15 14:05:19 handler any get url: /ping
2017/08/15 14:05:19 handler any get url: /ping
2017/08/15 14:05:19 handler any get url: /ping
2017/08/15 14:05:20 http error: Get http://127.0.0.1:53643/ping: dial tcp 127.0.0.1:53643: getsockopt: connection refused
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 read meta error: EOF
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 read meta error: EOF
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 read meta error: EOF
2017/08/15 14:05:20 unknown measurement: test,the query is SELECT cpu_load from test WHERE time > now() - 1m
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /query?db=test1&q=+select+cpu_load+from+cpu+WHERE+time+%3E+now%28%29+-+1m
2017/08/15 14:05:20 handler any get url: /query?db=test1&q=+select+cpu_load+from+%22cpu.load%22+WHERE+time+%3E+now%28%29+-+1m
2017/08/15 14:05:20 unknown measurement: load.cpu,the query is select cpu_load from "load.cpu" WHERE time > now() - 1m
2017/08/15 14:05:20 handler any get url: /query?db=test1&q=SHOW+tag+keys+from+%22cpu%22+
2017/08/15 14:05:20 write meta: 8
2017/08/15 14:05:20 write meta: 0
2017/08/15 14:05:20 http backend write test
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /write?db=test
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /write?db=test
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /ping
2017/08/15 14:05:20 handler any get url: /query?db=test
2017/08/15 14:05:20 handler any get url: /ping
PASS
ok github.com/eleme/influx-proxy/backend 5.030s
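Should the unit tests be simplified by removing the parts that perform network operations, so that they only describe the interfaces and test function performance?
Do we need end-to-end tests to cover what the unit tests fail to describe about the interaction between the proxy and InfluxDB?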
$ grep -rin LoadJson .
Binary file ./bin/influx-proxy matches
./service/main.go:45:func LoadJson(configfile string, cfg interface{}) (err error) {
./service/main.go:81: err = LoadJson(ConfigFile, &cfg)
As shown above, the LoadJson function is used only during initialization, so Go's exported-function naming style is not recommended for it.
While reading cluster_test.go, I found the following code snippet:
func CreateTestInfluxCluster() (ic *InfluxCluster, err error) {
	redisConfig := &RedisConfigSource{}
	...
	cfg.WriteOnly = 1
	...
	return
}
In the snippet above, wouldn't bool be a more reasonable type for the field set by cfg.WriteOnly = 1?