megaease / easeprobe

A simple, standalone, and lightweight tool that can do health/status checking, written in Go.

License: Apache License 2.0

Go 99.01% Dockerfile 0.11% Makefile 0.16% Shell 0.73%
alerting go golang monitoring notifications probe prometheus

easeprobe's People

Contributors

actions-user, allenxuxu, cuishuang, dependabot[bot], douglarek, gelleson, haoel, haoqixu, hellojukay, icpd, jiacheo, jordy1024, ken8203, lostsquirrel, muicoder, nullsimon, proditis, qdongxu, samanhappy, shawyeok, suchen-sci, testwill, tg123, xiaomao87, xiaoxuanzi, xiekeyi98, youniverse-zhao, yunyouu, zhangjunjie6b, zhangtaomox

easeprobe's Issues

Can you add a Feature like cURL's "--resolve" option to pin a request to an IP address

Background & X Problem
A single domain may be served by several CDN suppliers.
I want to check the SSL certificate of every supplier.

Proposal and Expectation
A clear and concise description of what you want to happen.

Solutions & Y Problems
cURL's "--resolve" option is ok
https://stackoverflow.com/questions/40624248/golang-force-http-request-to-specific-ip-similar-to-curl-resolve

Additional context
Use cURL's "--resolve" option to pin a request to an IP address
https://support-acquia.force.com/s/article/360005257154-Use-cURL-s-resolve-option-to-pin-a-request-to-an-IP-address
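
One possible way to get a "--resolve"-like behaviour in Go is to override the dialer of the HTTP transport, as the Stack Overflow link above suggests. The sketch below is only an illustration and not EaseProbe's implementation: newPinnedClient and the host cdn.example.invalid are hypothetical names, and a local test server stands in for one CDN node.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// newPinnedClient returns an HTTP client that always connects to pinnedAddr
// ("ip:port"), whatever the URL host would resolve to. The URL host is still
// used for the Host header (and for the TLS server name on https URLs).
func newPinnedClient(pinnedAddr string, timeout time.Duration) *http.Client {
	dialer := &net.Dialer{Timeout: timeout}
	transport := &http.Transport{
		DialContext: func(ctx context.Context, network, _ string) (net.Conn, error) {
			// Ignore the address derived from the URL and dial the pinned one.
			return dialer.DialContext(ctx, network, pinnedAddr)
		},
	}
	return &http.Client{Transport: transport, Timeout: timeout}
}

func main() {
	// Local stand-in for one CDN node; it echoes the Host header it received.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "served host=%s", r.Host)
	}))
	defer srv.Close()

	client := newPinnedClient(srv.Listener.Addr().String(), 2*time.Second)
	resp, err := client.Get("http://cdn.example.invalid/")
	if err != nil {
		fmt.Println("probe failed:", err)
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```

Running one such client per supplier IP would let each CDN certificate be checked separately, assuming the probe exposes an option for the pinned address.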

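Can a proxy option be added to the HTTP Probe Configuration?

Hello,
When configuring the HTTP Probe Configuration, we have a few intranet pages that can only be reached through a specific proxy.
Could you find the time to add an option for specifying a proxy (for example, an HTTP proxy) to the HTTP probe?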
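
As a rough sketch of what such an option could do internally, Go's standard library already supports a per-client proxy through http.Transport. The function name newProxyClient and the address proxy.internal:3128 are placeholders for illustration, not part of EaseProbe.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// newProxyClient builds an HTTP client that sends every request through the
// given proxy URL, for example "http://proxy.internal:3128".
func newProxyClient(proxyAddr string, timeout time.Duration) (*http.Client, error) {
	u, err := url.Parse(proxyAddr)
	if err != nil {
		return nil, fmt.Errorf("invalid proxy %q: %w", proxyAddr, err)
	}
	if u.Scheme == "" || u.Host == "" {
		return nil, fmt.Errorf("invalid proxy %q: scheme and host are required", proxyAddr)
	}
	transport := &http.Transport{Proxy: http.ProxyURL(u)}
	return &http.Client{Transport: transport, Timeout: timeout}, nil
}

func main() {
	// proxy.internal:3128 is a placeholder address.
	client, err := newProxyClient("http://proxy.internal:3128", 5*time.Second)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("client ready, timeout=%s\n", client.Timeout)
}
```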

An XSS vulnerability in this project

Environment:

  • OS: [Linux: Ubuntu 20.04]
  • EaseProbe Version [2.0.0]

Describe the bug
ba1a05f6c8b21500c87338d337d3352

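Suggested fix:
Escape the following characters in user input on the server side:

[1] | (vertical bar)
[2] && (double ampersand)
[3] ; (semicolon)
[4] $ (dollar sign)
[5] % (percent sign)
[6] @ (at sign)
[7] ' (single quote)
[8] " (double quote)
[9] \' (backslash-escaped single quote)
[10] \" (backslash-escaped double quote)
[11] <> (angle brackets)
[12] () (parentheses)
[13] + (plus sign)
[14] CR (carriage return, ASCII 0x0d)
[15] LF (line feed, ASCII 0x0a)
[16] , (comma)
[17] \ (backslash)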

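Native MongoDB Client: how to configure a replica set

Thank you for your hard work. How do I configure monitoring for a MongoDB replica set?

The documentation only covers the standalone setup.

  - name: MongoDB Native Client
    driver: "mongo"
    host: "192.168.28.171:27117"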

Add default config.yaml to Docker Image

Background & X Problem
When I follow the readme to run Docker, it fails to start as follows:

Using config file: /opt/config.yaml
ERRO[0000] error: stat config.yaml: no such file or directory
ERRO[0000] Fatal: Cannot read the YAML configuration file!

I think we can add a default configuration file to Docker Image, and make docker st

Proposal and Expectation
Simplify the startup process when using Docker

Solutions & Y Problems
Add a default configuration file to the Docker image

Additional context
N/A

Accidentally started another instance of easeprobe on the same host and CTRL+C didn't seem to work

Hi,

I accidentally started a second easeprobe process (from a different binary and location), and after pressing Ctrl+C I had to wait a long time for it to exit.

At first I thought it was stuck, but after further testing I found that it exits only after the probes already in progress have completed.

During the tests, the infrastructure that easeprobe was probing was unresponsive, so I suspect it was waiting for the connection timeout to kick in before finishing up.

I believe active probes should stop immediately once SIGINT is received, but I may have overlooked something.

Another workaround could be to "release" the signal handler once the signal is received, so that subsequent Ctrl+C / SIGINT invocations can kill the process without waiting if needed.

After investigating a bit more, it seems this needs a bit more than just signal.Ignore().
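
A minimal sketch of both ideas, using only the standard library: the first SIGINT cancels a context that in-flight probes watch, and the handler is then released so a second Ctrl+C falls back to Go's default behaviour and terminates the process. runProbe is a hypothetical stand-in for a slow probe, not EaseProbe code.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"time"
)

// runProbe stands in for a slow probe. It returns true when the probe ran to
// its timeout and false when it was aborted because done was closed.
func runProbe(done <-chan struct{}, timeout time.Duration) bool {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-timer.C:
		return true
	case <-done:
		return false
	}
}

func main() {
	// The context is cancelled on the first SIGINT.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stop()

	go func() {
		<-ctx.Done()
		// Release the handler: a second Ctrl+C now kills the process at once.
		stop()
	}()

	if runProbe(ctx.Done(), 3*time.Second) {
		fmt.Println("probe completed")
	} else {
		fmt.Println("probe aborted by SIGINT")
	}
}
```

Real probes would need to pass the context down to their network calls (for example with http.NewRequestWithContext or net.Dialer.DialContext) for the cancellation to interrupt a pending connection.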

Introduce repeatable alert

Currently, notification in EaseProbe is edge-triggered: a notification is sent only once, when the status changes. People may miss notifications in some situations, for example:

  • someone is busy with something else when an important failure notification arrives, and by the time they come back the alert has been forgotten
  • in an emergency, someone receives many notifications in a very short period; these cover two different problems, one gets resolved, and the other is simply neglected

Other notification systems that I know of use a level-triggered mode; Aliyun, for example, uses a silence duration to control repeated alerts.

So is it necessary for us to introduce a similar mechanism? What is your opinion? @haoel @proditis
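
To make the proposal concrete, here is a small sketch of how edge-triggered and level-triggered rules could be combined with a silence duration. The function shouldNotify and its parameters are hypothetical and only illustrate the decision rule discussed above.

```go
package main

import (
	"fmt"
	"time"
)

// shouldNotify combines the two modes:
//   - edge-triggered: always notify when the status has just changed;
//   - level-triggered: while the target stays down, notify again once the
//     silence duration has elapsed since the last notification.
//
// A silence of 0 disables repeats and keeps the current behaviour.
func shouldNotify(statusChanged, down bool, sinceLast, silence time.Duration) bool {
	if statusChanged {
		return true
	}
	return down && silence > 0 && sinceLast >= silence
}

func main() {
	silence := 10 * time.Minute
	fmt.Println(shouldNotify(true, true, 0, silence))               // status just changed
	fmt.Println(shouldNotify(false, true, 3*time.Minute, silence))  // still inside the silence window
	fmt.Println(shouldNotify(false, true, 10*time.Minute, silence)) // silence elapsed, repeat
	fmt.Println(shouldNotify(false, false, time.Hour, silence))     // steady "up" never repeats
}
```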

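DingTalk notifications with signing fail to send

I tried two different DingTalk notification robots with signing enabled, and sending failed for both.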

time="2022-08-26T15:57:06+08:00" level=warning msg="[dingtalk / dingtalk alert service / Notification] Retried to send 1/3 - Error response from Dingtalk [%!d(float64=310000)] - [{"errcode":310000,"errmsg":"description:机器人发送签名不匹配;solution:请确认签名和生成签名的时间戳必须都放在调用的网址中,请确认机器人的密钥加密和填写正确;link:请参考本接口对应文档获得具体要求,或者在https://open.dingtalk.com/document/ 搜索相关文档;"}]"

Module-wise installation?

Background & X Problem
Selective installation to reduce container image.

Proposal and Expectation
Is it possible to install only a given set of probes and notifiers? For example, I want to monitor a MySQL database and send notifications to Slack and Teams. Such an installation would reduce the container image size and make deployment more efficient.

Solutions & Y Problems
I am not a Go developer. However, I would expect a config.yaml key for probes and notifiers that are arrays for the chosen probes and notifiers. Or, at least, instructions for custom build that can pick specific probes and notifiers would be helpful.

Additional context
None.

[feature request]Hot reloading support

Background & X Problem
Currently (2022-05-05), easeprobe does not support hot reloading, so the problem is that if I just modify the config.yaml file (which can be a frequent action), then I need to restart the application for it to take effect.

Proposal and Expectation
We can add a reload option to the configuration file to enable it:

reload:
  enabled: true # default is false
  period: 10s # scan interval, default is 10s

Solutions & Y Problems
When reload is turned on, we must not lose the SLA information of probes whose configuration has not changed.

Adding a helm repository returns 404

Describe the bug
The helm repository seems to be invalid

> helm repo add easeprobe https://megaease.github.io/easeprobe
Error: looks like "https://megaease.github.io/easeprobe" is not a valid chart repository or cannot be reached: failed to fetch https://megaease.github.io/easeprobe/index.yaml : 404 Not Found

EASEPROBE_TIME is only provided in JSON format

Environment (please complete the following information):

  • OS: Linux VM-8-5-centos 3.10.0-1160.76.1.el7.x86_64 # 1 SMP Wed Aug 10 16:21:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • EaseProbe Version EaseProbe v1.7.0 @ a4255478e017

Describe the bug
In the shell notification script, values are provided both as JSON and as env variables, but the two sets are not aligned: EASEPROBE_TIME is part of the JSON format but is not available as an env variable.

To Reproduce

  1. set shell script as https://github.com/megaease/easeprobe/blob/main/resources/scripts/notify/notify.sh
  2. see output

Expected behavior
EASEPROBE_TIME env variable should also exist.

The reason I need this is that the Docker image is based on Alpine, so very few utilities are available out of the box, and I cannot run jq without building my own image. EASEPROBE_TIMESTAMP is available as an env variable, but it is a millisecond epoch, so date -d cannot parse it unless I do the math first.
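
For illustration, deriving a formatted time from the millisecond epoch is a one-liner on the Go side, so both variables could be exported together. The helper notifyEnv is hypothetical and the layout string is only an assumption (it matches the timeformat setting seen in other configs on this page); the sample timestamp comes from a data file quoted elsewhere on this page.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// notifyEnv builds the time-related environment entries for a shell notifier
// from a millisecond Unix epoch. Only the variable names come from the issue.
func notifyEnv(timestampMs int64, layout string) []string {
	formatted := time.UnixMilli(timestampMs).UTC().Format(layout)
	return []string{
		"EASEPROBE_TIMESTAMP=" + strconv.FormatInt(timestampMs, 10),
		"EASEPROBE_TIME=" + formatted,
	}
}

func main() {
	for _, kv := range notifyEnv(1658720409881, "2006-01-02 15:04:05 UTC") {
		fmt.Println(kv)
	}
	// Prints:
	// EASEPROBE_TIMESTAMP=1658720409881
	// EASEPROBE_TIME=2022-07-25 03:40:09 UTC
}
```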

Error in the TLS example in the user manual

Environment (please complete the following information):

  • OS: [Linux: Ubuntu 20.04]
  • EaseProbe Version [2.0.0]

https://github.com/megaease/easeprobe/blob/main/docs/Manual.md#76-tls-probe-configuration
The format shown there:

tls:
- name: expired test
    host: expired.badssl.com:443
    proxy: socks5://proxy.server:1080 # Optional. Only support socks5.
                                    # Also support the `ALL_PROXY` environment.
    insecure_skip_verify: true # dont check cert validity
    expire_skip_verify: true # dont check cert expire date
    alert_expire_before: 168h # alert if cert expire date is before X, the value is a Duration, see https://pkg.go.dev/time#ParseDuration. example: 1h, 1m, 1s. expire_skip_verify must be false to use this feature.
    # root_ca_pem_path: /path/to/root/ca.pem # ignore if root_ca_pem is present
    # root_ca_pem: |
    #   -----BEGIN CERTIFICATE-----
- name: untrust test
    host: untrusted-root.badssl.com:443
    # insecure_skip_verify: true # dont check cert validity
    # expire_skip_verify: true # dont check cert expire date
    # root_ca_pem_path: /path/to/root/ca.pem # ignore if root_ca_pem is present
    # root_ca_pem: |
    #   -----BEGIN CERTIFICATE-----

Pasting this into any YAML editor shows that it is invalid. The correct version should be:

tls:
  - name: expired test
    host: expired.badssl.com:443
    proxy: socks5://proxy.server:1080 # Optional. Only support socks5.
                                    # Also support the `ALL_PROXY` environment.
    insecure_skip_verify: true # dont check cert validity
    expire_skip_verify: true # dont check cert expire date
    alert_expire_before: 720h # alert if cert expire date is before X, the value is a Duration, see https://pkg.go.dev/time#ParseDuration. example: 1h, 1m, 1s. expire_skip_verify must be false to use this feature.
    # root_ca_pem_path: /path/to/root/ca.pem # ignore if root_ca_pem is present
    # root_ca_pem: |
    #   -----BEGIN CERTIFICATE-----
  - name: untrust test
    host: untrusted-root.badssl.com:443    
    # insecure_skip_verify: true # dont check cert validity
    # expire_skip_verify: true # dont check cert expire date
    # root_ca_pem_path: /path/to/root/ca.pem # ignore if root_ca_pem is present
    # root_ca_pem: |
    #   -----BEGIN CERTIFICATE-----    

Add version option for executable and configuration files

Background & X Problem
I wanted to update my systems from v1.4.0 to the latest version from the repo. As part of the process I wanted to confirm the version of the binary, but there is currently no command-line option that simply displays the version.
As it stands, anyone who wants to track the version has to compare checksums of the binary, which is not optimal IMHO.

Similarly, I could not determine which version of the binary my config was written for.

Proposal and Expectation
I propose the following,

  • addition of a command line option -v or -V that will just display the version of the current binary
  • addition of the version string to the displayed Usage
  • addition of a config.yaml top key version: that will allow us to determine what version of binary this config works with
  • (optionally) include a version key also on the data.yaml so that future binaries can detect older versions of the data and accommodate any potential changes that need to take place

edit: If anyone wants to take on this please be my guest
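
The first bullet of the proposal could look roughly like the sketch below, which uses only the standard flag package. The version value is a placeholder (real builds usually inject it with -ldflags), and parseVersionFlag is a hypothetical helper, not the project's code.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// version is a placeholder; a real build would typically set it at link time,
// e.g. go build -ldflags "-X main.version=v1.4.0".
var version = "v0.0.0-example"

// parseVersionFlag reports whether -v or -version was passed in args.
func parseVersionFlag(args []string) (bool, error) {
	fs := flag.NewFlagSet("easeprobe", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on parse errors
	var show bool
	fs.BoolVar(&show, "v", false, "print the version and exit")
	fs.BoolVar(&show, "version", false, "print the version and exit")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return show, nil
}

func main() {
	show, err := parseVersionFlag(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	if show {
		fmt.Println(version)
		return
	}
	fmt.Println("starting probes...")
}
```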

Lack of http.refresh and probe.interval settings from config cause panic

Environment:

  • OS: [Linux: Linux urandom 5.4.0-109-generic #123-Ubuntu SMP Fri Apr 8 09:10:54 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux]
  • EaseProbe Version [main branch]

Describe the bug
I removed some options from the config (expecting easeprobe defaults to kick in) and this caused easeprobe to panic.

image

I will investigate further to see if i can spot where the bug lies exactly.

To Reproduce
Steps to reproduce the behavior:

  1. Use the provided config (see below)
  2. Start easeprobe: easeprobe -f test.yaml
  3. See error

Expected behavior
I would expect easeprobe to fail gracefully.

  • test.yaml
settings:
  sla:
    backups: 3
    debug: false

  notify:
    dry: true
    retry: # Global settings for retry
      times: 5
      interval: 10s

  loglevel: "info"
  timeformat: "2006-01-02 15:04:05 UTC"

http:
 - name: test
   url: http://localhost:8181/

notify:
  discord:
    - name: "Server #Alert"
      webhook: "https://discord.com/api/webhooks//"
      avatar: "https://img.icons8.com/ios/72/appointment-reminders--v1.png"
      thumbnail: "https://freeiconshop.com/wp-content/uploads/edd/notification-flat.png"
      dry: true
      retry:
        times: 3
        interval: 10s

shell content error, no dingding message send

Environment (please complete the following information):

  • OS: linux
  • EaseProbe Version: latest version

Describe the bug
A shell script is configured and the script content has an error, but no DingTalk notification is sent, and the log shows no attempt to send a DingTalk notification either.

image

To Reproduce

Expected behavior
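A DingTalk message should be sent when the script fails. If the DingTalk configuration itself is wrong, the error should be written to the log.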

easeprobe start error

/opt # easeprobe -f config.yaml
INFO[0000] The data file data/data.yaml, was not found!
INFO[0000] Load the configuration file successfully!
INFO[0000] Successfully created the PID file: /opt/easeprobe.pid
INFO[0000] Application Log File [Stdout] - Self-Rotate
INFO[0000] Web Access Log File [Stdout] - Self-Rotate
INFO[2022-07-05T03:44:22Z] [Web] HTTP server is listening on 0.0.0.0:8181
INFO[2022-07-05T03:44:22Z] Probe [http] - [ElasticSearch] base options are configured!
INFO[2022-07-05T03:44:22Z] [Metric] Counter <EaseProbe_http_total> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_status> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_sla> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Counter <EaseProbe_http_status_code> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_content_len> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_dns_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_connect_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_tls_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_send_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_wait_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_transfer_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_total_duration> is created!
FATA[2022-07-05T03:44:22Z] No notifies configured, exiting...

config.yaml

http:
  - name: test
    url: http://xxxx.com

HTTP server bind already in use should cause the daemon to exit

When the port is already in use, easeprobe displays the following error but continues to run.

ERRO[2022-05-16T00:56:00Z] [Web] HTTP server error: listen tcp 127.0.0.1:8181: bind: address already in use 

I believe that it would be better if we exited with a non zero status code when this happens.
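
One way to get this behaviour, sketched under the assumption that the web server is started from a goroutine, is to bind the listener synchronously before serving, so the bind error surfaces on the main path. The helper bind is hypothetical; the demo binds a free port twice to reproduce the error without exiting.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// bind opens the listening socket up front, so that "address already in use"
// is reported before any serving goroutine is started.
func bind(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, fmt.Errorf("[Web] HTTP server cannot bind %s: %w", addr, err)
	}
	return ln, nil
}

func main() {
	// Port 0 lets the OS pick a free port for the demonstration.
	first, err := bind("127.0.0.1:0")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer first.Close()

	// A second bind on the same address fails, as in the report above.
	_, err = bind(first.Addr().String())
	fmt.Println("second bind error:", err)
	// A daemon would call os.Exit(1) at this point instead of continuing;
	// on success it would hand the listener to http.Serve.
}
```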

Support keyword found/not found for HTTP probe

When using http probing, it is common to check for the existence or absence of a keyword. I would love to have such a feature in easeprobe.

Implementation suggestion:

http:
- name: ...
  url: ...
  success_keyword: "Welcome to my site"
  failure_keyword: "Unauthorized"
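
A check behind those two options might be as simple as the following sketch. The function checkKeywords is hypothetical; the option names success_keyword and failure_keyword are the ones suggested above, and an empty keyword is assumed to disable the corresponding check.

```go
package main

import (
	"fmt"
	"strings"
)

// checkKeywords returns an error when the failure keyword is present in the
// body or when the success keyword is missing. Empty keywords are ignored.
func checkKeywords(body, successKeyword, failureKeyword string) error {
	if failureKeyword != "" && strings.Contains(body, failureKeyword) {
		return fmt.Errorf("failure keyword %q found", failureKeyword)
	}
	if successKeyword != "" && !strings.Contains(body, successKeyword) {
		return fmt.Errorf("success keyword %q not found", successKeyword)
	}
	return nil
}

func main() {
	fmt.Println(checkKeywords("<h1>Welcome to my site</h1>", "Welcome to my site", "Unauthorized"))
	fmt.Println(checkKeywords("401 Unauthorized", "Welcome to my site", "Unauthorized"))
}
```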

Reset the up and down time statistics every hour

$CMD/data/data.yaml
I want the up and down time statistics to start from zero every hour.
How should this be configured?

---
name: EaseProbe
version: v1.7.0
---
"":
    name: ""
    endpoint: http://xxxxx:8000/accounts
    time: 2022-07-25T03:40:09.881474243Z
    timestamp: 1658720409881
    rtt: 354.042µs
    status: down
    prestatus: down
    message: 'Error (http): Error: Get "http://xxxx:8000/accounts": dial tcp xxxxx:8000: connect: connection refused'
    latestdowntime: 2022-07-25T02:44:48.241836769Z
    recoverytime: 0s
    stat:
        since: 2022-07-07T02:49:31.152596481Z
        total: 1454
        status:
            up: 1412
            down: 42
        uptime: 23h32m0s
        downtime: 42m0s
    timeformat: 2006-01-02 15:04:05 UTC

Questions about using the unified disable symbol

Describe the bug

  1. Code documentation error:

    • On line 129 of config.go, the description attribute of the PIDFile field does not match the documentation.
  2. Which symbol should indicate "disabled" for SLA data persistence, the PID file, and other options that can be turned off:

    • I think we can use "-" everywhere. A unified symbol would reduce user errors and make the tool easier to use.

Add openbsd to the release binaries

Background
I would love to be able to run easeprobe on my OpenBSD server which comes with go 1.17, however easeprobe requires go 1.18+ 😭

Proposal and Expectation
I was able to cross compile the binary from one of my linux boxes with GOOS=openbsd GOARCH=amd64 but it would be really awesome if there was a release available to download directly from github instead.

Would it be possible to add an openbsd target to the releases of easeprobe 🥺 🙏

Thank you in advance

The content is truncated when sending a message to the WeCom webhook

Environment (please complete the following information):

  • OS: [macos Big Sur]
  • EaseProbe Version [v1.4.0]

Describe the bug
If the double quotes in the markdown message sent to WeCom are not escaped, the content is truncated.

To Reproduce

curl 'https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key={your-wecom-key}' \
   -H 'Content-Type: application/json' \
-d '
{
    "msgtype": "markdown",
    "markdown": {
        "content": "My ElasticSearch Failure** ❌
http://localhost:9200/ - ⏱ 6ms
Error (http): Error: Get "http://localhost:9200/": dial tcp [::1]:9200: connect: connection refused
> Jordy-probe v1.5.0 @ JordyMacBook-Pro.local at 2022-05-20 13:36:41 UTC"
    }
}' |  jq  .  

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   370  100    57  100   313    136    750 --:--:-- --:--:-- --:--:--   887
{
  "errcode": 0,
  "errmsg": "ok. Warning: wrong json format. "
}

The actual received message content:

My ElasticSearch Failure** ❌
http://localhost:9200/ - ⏱ 6ms
Error (http): Error: Get

Expected behavior

curl 'https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key={your-wecom-key}' \
-H 'Content-Type: application/json' \
-d '
{
    "msgtype": "markdown",
    "markdown": {
        "content": "My ElasticSearch Failure** ❌
http://localhost:9200/ - ⏱ 6ms
Error (http): Error: Get \"http://localhost:9200/\": dial tcp [::1]:9200: connect: connection refused
> Jordy-probe v1.5.0 @ JordyMacBook-Pro.local at 2022-05-20 13:36:41 UTC"
    }
}' | jq . 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   370  100    57  100   313    136    750 --:--:-- --:--:-- --:--:--   887
{
  "errcode": 0,
  "errmsg": "ok"
}
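
On the Go side, building the payload with encoding/json instead of string concatenation escapes the quotes (and the newlines) automatically. The struct names below are hypothetical; the field names msgtype, markdown, and content are the ones in the request shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wecomMarkdown and wecomMessage mirror the JSON body used in the curl example.
type wecomMarkdown struct {
	Content string `json:"content"`
}

type wecomMessage struct {
	MsgType  string        `json:"msgtype"`
	Markdown wecomMarkdown `json:"markdown"`
}

// buildPayload serializes the message; json.Marshal escapes double quotes,
// backslashes, and control characters such as newlines inside the content.
func buildPayload(content string) (string, error) {
	msg := wecomMessage{MsgType: "markdown", Markdown: wecomMarkdown{Content: content}}
	b, err := json.Marshal(msg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	payload, err := buildPayload("Error (http): Error: Get \"http://localhost:9200/\": connection refused")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(payload)
}
```

Note that json.Marshal also writes characters such as > as \u003e by default; that is still valid JSON and decodes back to the original character.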

How to support multiple configuration files

The config file looks ugly when I have a lot of targets to probe.
I saw in this issue that this function seems to be supported already, but I didn't see any mention of it in the user documentation or the sample configuration files. If it is already supported, can you mention it in the documentation?

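Plaintext passwords in the configuration file?

Hello, and thanks again: both of my requests went live in under 24 hours.
I have a suggestion, though I'm not sure how good it is. I noticed that many passwords have to be written in plaintext in the configuration. Could we encrypt them, for example with base64,
or first provide a command that encrypts a password with a custom key, for example ./bin/easeprobe -a "encryption key" password > encrypted password,
then write the encrypted password into the configuration file and pass the encryption key at startup for decryption, for example ./bin/easeprobe -k "encryption key"?

That way, even if someone else sees the configuration file, they will not learn the passwords of the production services.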
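
A rough sketch of the encrypt/decrypt pair such commands would need, using AES-GCM from the standard library. Base64 on its own is only an encoding and hides nothing, so it is used here just to make the ciphertext fit into a YAML string. All function names are hypothetical, and deriving the key with a single SHA-256 is a simplification to keep the example stdlib-only; a real tool should use a proper password KDF.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM derives a 32-byte AES key from the passphrase (simplified: SHA-256).
func newGCM(passphrase string) (cipher.AEAD, error) {
	key := sha256.Sum256([]byte(passphrase))
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns base64(nonce || ciphertext) for use in a config file.
func encrypt(passphrase, plaintext string) (string, error) {
	gcm, err := newGCM(passphrase)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(plaintext), nil)
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// decrypt reverses encrypt and fails if the key is wrong or data was altered.
func decrypt(passphrase, encoded string) (string, error) {
	gcm, err := newGCM(passphrase)
	if err != nil {
		return "", err
	}
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("ciphertext too short")
	}
	nonce, data := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, data, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	enc, err := encrypt("encryption key", "db-password")
	if err != nil {
		fmt.Println(err)
		return
	}
	dec, err := decrypt("encryption key", enc)
	fmt.Println(enc, dec, err)
}
```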

Native Postgresql connection without ssl enabled

I've followed the manual on how to configure the native probe for PostgreSQL (https://github.com/megaease/easeprobe/blob/main/docs/Manual.md#78-native-client-probe-configuration)

Here's my config:

client:
  - name: Posgresql
    driver: postgres
    host: postgis:5432
    username: xxxxx
    password: xxxxx

  - name: Pgbouncer
    driver: postgres
    host: pgbouncer:5432
    username: xxxxxx
    password: xxxxxx

And here's the report:

Posgresql Failure ❌
postgis:5432 - ⏱ 4ms
Error (client/postgres): pgdriver: SSL is not enabled on the server
> EaseProbe v2.0.0 @ c0d124fc3faa at 2023-02-10 10:49:53 UTC

Pgbouncer Failure ❌
pgbouncer:5432 - ⏱ 2ms
Error (client/postgres): pgdriver: SSL is not enabled on the server
> EaseProbe v2.0.0 @ c0d124fc3faa at 2023-02-10 10:49:54 UTC

I'm using the Docker deployment method.

The time in WeCom notification messages is always in the UTC time zone

Environment (please complete the following information):

  • OS: Linux ZRB-Base 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • EaseProbe Version: V1.6.0

Describe the bug
When a notification message is sent by the WeCom bot, the time at the end of the message does not follow the server's time zone setting. It appears to be in the UTC time zone.

To Reproduce
Steps to reproduce the behavior:

  1. Trigger a notification message sent by the WeCom bot and check the time at the end of the message.

Expected behavior
The time format (including the time zone) at the end of the notification message can be configured in the config file, either in the global settings or per notification channel.
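
As a sketch of what such an option could do, Go can render the same instant in any named zone with time.LoadLocation. The helper formatMillisIn and its parameters are hypothetical; note that named zones other than UTC and Local require time zone data to be available on the host or in the image.

```go
package main

import (
	"fmt"
	"time"
)

// formatMillisIn renders a millisecond Unix epoch in the named time zone.
// The MST token in a layout prints the zone abbreviation.
func formatMillisIn(timestampMs int64, zone, layout string) (string, error) {
	loc, err := time.LoadLocation(zone)
	if err != nil {
		return "", fmt.Errorf("unknown time zone %q: %w", zone, err)
	}
	return time.UnixMilli(timestampMs).In(loc).Format(layout), nil
}

func main() {
	const layout = "2006-01-02 15:04:05 MST"
	for _, zone := range []string{"UTC", "Asia/Shanghai"} {
		s, err := formatMillisIn(1658720409881, zone, layout)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(s)
	}
}
```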

support kubernetes installation

Background & X Problem
I would like to have a Kubernetes deployment so that easeprobe can become an alternative to other synthetic checks on Kubernetes.

Proposal and Expectation
Add a deployment option as a Helm chart or as manifests.

Is it necessary to send notifications when the status changes from down to up?

Hi, when I looked at the source code I had a question.

We know that the initial state is init. It is currently handled like this:

❌ no notification
✅ send notification

  • init -> up Initial Status ❌
  • init -> down
  • up -> down
  • down -> up ✅ // here
  • up -> down
  • down -> down
  • up -> up

So my question is do we need to send a notification when the status changes from down to up?

Cannot be compiled with Go 1.19

I'm not sure whether the "1.18+" in the documentation includes 1.19.
With Go 1.19, running make does not succeed.
Running go tool compile --help shows that the -G option has been removed from gcflags.

# go tool compile --help
usage: compile [options] file.go...
  -% int
    	debug non-static initializers
  -+	compiling runtime
  -B	disable bounds checking
  -C	disable printing of columns in error messages
  -D path
    	set relative path for local imports
  -E	debug symbol export
  -I directory
    	add directory to import search path
  -K	debug missing line numbers
  -L	also show actual source file names in error messages for positions affected by //line directives
  -N	disable optimizations
  -S	print assembly listing
  -V	print version and exit
  -W	debug parse tree after type checking
.......

build Error on mac

commit: a1126c2
os:
ProductName: macOS
ProductVersion: 12.4
BuildVersion: 21F79

make
mkdir -p /private/tmp/easeprobe//build/bin
go mod tidy
CGO_ENABLED=0 go build -a -ldflags '-s -w -extldflags "-static"' -gcflags=-G=3 -o /private/tmp/easeprobe//build/bin/easeprobe /private/tmp/easeprobe/cmd/easeprobe
directory /private/tmp/easeprobe/cmd/easeprobe outside main module or its selected dependencies
make: *** [/private/tmp/easeprobe//build/bin/easeprobe] Error 1

mysql client probe fails to close connections

Environment:

  • OS: OpenBSD tester 6.5 GENERIC#13 amd64
  • EaseProbe Version v1.3.0
  • MySQL Ver 15.1 Distrib 10.0.38-MariaDB, for OpenBSD (amd64) using readline 4.3

Describe the bug
When the mysql client probe is activated, the connection stays open, which leads to exhaustion of the database server's max connections.

With the config below, I get one new connection per probe interval.

To Reproduce
Steps to reproduce the behavior:
Use the mysql client probe on a remote database. I used the following config

settings:
  http:
    ip: 127.0.0.1
    port: 8181
    refresh: 5s

  sla:
    schedule : "daily"
    time: "23:59"
    debug: false

  notify:
    dry: true
    retry: # Global settings for retry
      times: 5
      interval: 10s

  probe:
    timeout: 30s
    interval: 1m # probe every minute for all probes
    logfile: "/var/logs/easyprobe.log"

  loglevel: "info"
  timeformat: "2006-01-02 15:04:05 UTC"

client:
  - name: MariaDB Client
    driver: "mysql"
    host: "10.7.0.253:3306"
    username: "user"
    password: "pass"

Run easeprobe and observe the connections staying open, one for each minute of uptime.

Expected behavior
I expect the connection from each probe to close after it is done.

Extra details
I have looked online for this kind of behavior, and it seems the go-mysql driver has seen quite a few similar reports (of connections not closing). Based on what I have found so far, the problem is mostly observed in long-running Go applications.

My Go knowledge is limited, and I could only try a few simple "suggestions" I saw online, without success. These included:

  • Adding these to mysql.go right after sql.Open(): no success
        db.SetConnMaxIdleTime(10 * time.Second)
        db.SetConnMaxLifetime(30 * time.Second)
        db.SetMaxOpenConns(1)
        db.SetMaxIdleConns(1)
  • Removing the defer keyword from db.Close() and moving it right before the return: no success
  • Played with different connection/timeout settings: no success

I will continue to investigate and test just to rule out any configuration issues on the database side and will update here accordingly.
