megaease / easeprobe
A simple, standalone, and lightweight tool that can do health/status checking, written in Go.
License: Apache License 2.0
Background & X Problem
One domain may be served by several different CDN suppliers.
I want to check the SSL certificate of every supplier.
Proposal and Expectation
A clear and concise description of what you want to happen.
Solutions & Y Problems
Something equivalent to cURL's "--resolve" option would be enough:
https://stackoverflow.com/questions/40624248/golang-force-http-request-to-specific-ip-similar-to-curl-resolve
Additional context
Use cURL's "--resolve" option to pin a request to an IP address
https://support-acquia.force.com/s/article/360005257154-Use-cURL-s-resolve-option-to-pin-a-request-to-an-IP-address
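A minimal Go sketch of the "--resolve" idea, under stated assumptions: the helper names pinnedClient and demo are hypothetical, and this is not how EaseProbe implements it. The client overrides DialContext so that every connection goes to a fixed address, while the URL host still provides the Host header (and, for HTTPS, the SNI name used for certificate verification).

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// pinnedClient (hypothetical helper) returns an HTTP client that skips DNS
// and always connects to addr ("ip:port"), whatever host the URL names.
func pinnedClient(addr string) *http.Client {
	dialer := &net.Dialer{Timeout: 5 * time.Second}
	transport := &http.Transport{
		DialContext: func(ctx context.Context, network, _ string) (net.Conn, error) {
			return dialer.DialContext(ctx, network, addr)
		},
	}
	return &http.Client{Transport: transport, Timeout: 10 * time.Second}
}

// demo starts a local test server and requests a hostname that does not
// resolve, pinned to the test server's address. It returns the status code
// and the Host header the server saw.
func demo() (int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Host)
	}))
	defer srv.Close()

	resp, err := pinnedClient(srv.Listener.Addr().String()).Get("http://cdn.example.invalid/")
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, "", err
	}
	return resp.StatusCode, string(body), nil
}

func main() {
	code, host, err := demo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(code, host)
}
```

For an HTTPS URL, the certificate presented by that one supplier can then be read from resp.TLS.PeerCertificates, so looping over one pinned address per CDN supplier would check each certificate in turn.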
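Hello,
When configuring the HTTP Probe Configuration, some of our intranet pages can only be reached through a specific proxy.
Would you have time to add an option for specifying a proxy (for example, an HTTP proxy) to the HTTP probe?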
Environment:
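Suggested fix:
Escape user input on the server side:
[1] | (vertical bar)
[2] & (ampersand)
[3] ; (semicolon)
[4] $ (dollar sign)
[5] % (percent sign)
[6] @ (at sign)
[7] ' (single quote)
[8] " (double quote)
[9] \' (backslash-escaped single quote)
[10] \" (backslash-escaped double quote)
[11] <> (angle brackets)
[12] () (parentheses)
[13] + (plus sign)
[14] CR (carriage return, ASCII 0x0d)
[15] LF (line feed, ASCII 0x0a)
[16] , (comma)
[17] \ (backslash)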
Thank you for your hard work. How do I configure monitoring for a MongoDB replica set?
The documentation only covers the standalone setup.
Background & X Problem
When I follow the README to run the Docker image, it fails to start as follows:
Using config file: /opt/config.yaml
ERRO[0000] error: stat config.yaml: no such file or directory
ERRO[0000] Fatal: Cannot read the YAML configuration file!
I think we can add a default configuration file to Docker Image, and make docker st
Proposal and Expectation
Simplify the startup process when using Docker.
Solutions & Y Problems
Add a default configuration file to the Docker image.
Additional context
N/A
Hi,
I accidentally started a second easeprobe process (from a different binary / location), and after pressing Ctrl+C I had to wait a long time until the process exited.
At first I thought that it had got stuck, but after further testing I found that it exited only after the probes already in flight had completed.
During the tests, the infrastructure that easeprobe was trying to probe was not responsive, so I suspect it was waiting for the connection timeout to kick in.
I believe that the active probes should stop immediately once SIGINT is received, but I may have overlooked something.
Another workaround could be to "release" the signal handler once the signal is received, so that consecutive Ctrl+C / SIGINT invocations can kill the process without waiting if needed.
After investigating a bit more, it seems to need a bit more than just signal.Ignore().
I was wondering if this could support Work WeChat?
I'd like to try adding it if we want to support Work WeChat.
FYR:
WeChat API Doc: https://developer.work.weixin.qq.com/document/path/91770
Currently, notification in EaseProbe is edge-triggered: a notification is sent only once, when the status changes. People may therefore miss notifications in some situations.
Among the notification systems I know of that use level-triggered mode, Aliyun uses a silence duration to control repeated alerts.
So is it necessary for us to introduce a similar mechanism? What is your opinion? @haoel @proditis
I tried two different DingTalk notification robots with signing enabled, one after the other, and both failed to send:
time="2022-08-26T15:57:06+08:00" level=warning msg="[dingtalk / dingtalk alert service / Notification] Retried to send 1/3 - Error response from Dingtalk [%!d(float64=310000)] - [{"errcode":310000,"errmsg":"description:机器人发送签名不匹配;solution:请确认签名和生成签名的时间戳必须都放在调用的网址中,请确认机器人的密钥加密和填写正确;link:请参考本接口对应文档获得具体要求,或者在https://open.dingtalk.com/document/ 搜索相关文档;"}]"
Background & X Problem
Selective installation to reduce the container image size.
Proposal and Expectation
Is it possible to install only a given set of probes and notifiers? For example, I want to monitor a MySQL database and send the notifications to Slack and Teams. Such an installation would reduce the container image size, resulting in more efficient deployment.
Solutions & Y Problems
I am not a Go developer. However, I would expect config.yaml keys for probes and notifiers that are arrays of the chosen probes and notifiers. Or, at least, instructions for a custom build that can pick specific probes and notifiers would be helpful.
Additional context
None.
Background & X Problem
Currently (2022-05-05), easeprobe does not support hot reloading, so the problem is that if I just modify the config.yaml file (which can be a frequent action), then I need to restart the application for it to take effect.
Proposal and Expectation
We can add a reload option to the configuration file to enable it:
reload:
  enabled: true # default is false
  period: 10s # scan interval, default is 10s
Solutions & Y Problems
When reload is turned on, we must not lose the SLA information of probes whose configuration has not changed.
Describe the bug
The helm repository seems to be invalid
> helm repo add easeprobe https://megaease.github.io/easeprobe
Error: looks like "https://megaease.github.io/easeprobe" is not a valid chart repository or cannot be reached: failed to fetch https://megaease.github.io/easeprobe/index.yaml : 404 Not Found
Environment (please complete the following information):
Describe the bug
When looking at the shell notification script, values are provided both in JSON format and as env variables, but the two are misaligned: EASEPROBE_TIME is part of the JSON format but is not among the env variables.
To Reproduce
Expected behavior
The EASEPROBE_TIME env variable should also exist.
The reason I need this is that the Docker image is built on Alpine, so very few utilities are available out of the box, and I cannot run jq without a self-made image. EASEPROBE_TIMESTAMP is among the env variables, but it is a millisecond epoch, so date -d cannot handle it unless I do the math first.
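For reference, the conversion itself is small. The sketch below is written in Go rather than shell, and the formatMillis helper is a hypothetical name; it only assumes what the report states, namely that EASEPROBE_TIMESTAMP holds a millisecond Unix epoch. The fallback value is a sample millisecond timestamp.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// formatMillis converts a millisecond Unix epoch, given as a string,
// into an RFC 3339 timestamp in UTC.
func formatMillis(ms string) (string, error) {
	n, err := strconv.ParseInt(ms, 10, 64)
	if err != nil {
		return "", fmt.Errorf("invalid millisecond epoch %q: %w", ms, err)
	}
	return time.UnixMilli(n).UTC().Format(time.RFC3339), nil
}

func main() {
	ms := os.Getenv("EASEPROBE_TIMESTAMP")
	if ms == "" {
		ms = "1658720409881" // sample value used when the variable is unset
	}
	out, err := formatMillis(ms)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

In a shell-only environment the same math amounts to dividing the value by 1000 before handing it to date.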
Environment (please complete the following information):
https://github.com/megaease/easeprobe/blob/main/docs/Manual.md#76-tls-probe-configuration
The format shown here:
tls:
- name: expired test
host: expired.badssl.com:443
proxy: socks5://proxy.server:1080 # Optional. Only support socks5.
# Also support the `ALL_PROXY` environment.
insecure_skip_verify: true # dont check cert validity
expire_skip_verify: true # dont check cert expire date
alert_expire_before: 168h # alert if cert expire date is before X, the value is a Duration, see https://pkg.go.dev/time#ParseDuration. example: 1h, 1m, 1s. expire_skip_verify must be false to use this feature.
# root_ca_pem_path: /path/to/root/ca.pem # ignore if root_ca_pem is present
# root_ca_pem: |
# -----BEGIN CERTIFICATE-----
- name: untrust test
host: untrusted-root.badssl.com:443
# insecure_skip_verify: true # dont check cert validity
# expire_skip_verify: true # dont check cert expire date
# root_ca_pem_path: /path/to/root/ca.pem # ignore if root_ca_pem is present
# root_ca_pem: |
# -----BEGIN CERTIFICATE-----
Pasting this into any YAML editor shows that it is wrong. The correct version should be:
tls:
  - name: expired test
    host: expired.badssl.com:443
    proxy: socks5://proxy.server:1080 # Optional. Only support socks5.
                                      # Also support the `ALL_PROXY` environment.
    insecure_skip_verify: true # dont check cert validity
    expire_skip_verify: true # dont check cert expire date
    alert_expire_before: 720h # alert if cert expire date is before X, the value is a Duration, see https://pkg.go.dev/time#ParseDuration. example: 1h, 1m, 1s. expire_skip_verify must be false to use this feature.
    # root_ca_pem_path: /path/to/root/ca.pem # ignore if root_ca_pem is present
    # root_ca_pem: |
    #   -----BEGIN CERTIFICATE-----
  - name: untrust test
    host: untrusted-root.badssl.com:443
    # insecure_skip_verify: true # dont check cert validity
    # expire_skip_verify: true # dont check cert expire date
    # root_ca_pem_path: /path/to/root/ca.pem # ignore if root_ca_pem is present
    # root_ca_pem: |
    #   -----BEGIN CERTIFICATE-----
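How do I configure host resource monitoring for the local machine?
Setting 127.0.0.1 has no effect.
Also, if the SSH port is not the default 22, how do I configure the port?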
This registers the pprof debugging handler via the middleware documented in https://pkg.go.dev/github.com/go-chi/chi/middleware#Profiler.
r.Mount("/debug", middleware.Profiler())
There are some known issues with middleware.Profiler(): go-chi/chi#262
Background & X Problem
I wanted to update my systems from v1.4.0 to the latest version from the repo. As part of the process I wanted to confirm the version of the binary, but we currently have no command-line option that simply displays the version.
As it currently stands, those who want to track the version have to compare checksums of the binary, which is not optimal IMHO.
Similarly, I could not determine which version of the binary my config was written for.
Proposal and Expectation
I propose the following:
-v or -V that will just display the version of the current binary
Usage: a config.yaml top key version: that will allow us to determine which version of the binary this config works with
A version key also on the data.yaml, so that future binaries can detect older versions of the data and accommodate any potential changes that need to take place
edit: If anyone wants to take this on, please be my guest.
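The first proposal can be sketched with the standard flag package. This is an illustration under assumptions: parseArgs and the version variable are hypothetical names, the placeholder version string is made up, and only the -f option is taken from how easeprobe is invoked elsewhere on this page.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// version is a placeholder. A build would typically inject the real value,
// for example with: go build -ldflags "-X main.version=<tag>".
var version = "v0.0.0-dev"

// parseArgs parses the command line and reports whether the version was
// requested with -v or -V, plus the configuration file given with -f.
func parseArgs(args []string) (showVersion bool, configFile string, err error) {
	fs := flag.NewFlagSet("easeprobe", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output quiet in this sketch
	fs.BoolVar(&showVersion, "v", false, "print the version and exit")
	fs.BoolVar(&showVersion, "V", false, "print the version and exit")
	fs.StringVar(&configFile, "f", "config.yaml", "configuration file")
	err = fs.Parse(args)
	return showVersion, configFile, err
}

func main() {
	showVersion, configFile, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	if showVersion {
		fmt.Println(version)
		return
	}
	fmt.Println("Using config file:", configFile)
}
```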
Environment:
Describe the bug
I removed some options from the config (expecting the easeprobe defaults to kick in), and this caused easeprobe to panic.
I will investigate further to see if I can spot exactly where the bug lies.
To Reproduce
Steps to reproduce the behavior:
easeprobe -f test.yaml
Expected behavior
I would expect easeprobe to fail gracefully.
test.yaml
settings:
  sla:
    backups: 3
    debug: false
  notify:
    dry: true
    retry: # Global settings for retry
      times: 5
      interval: 10s
  loglevel: "info"
  timeformat: "2006-01-02 15:04:05 UTC"
http:
  - name: test
    url: http://localhost:8181/
notify:
  discord:
    - name: "Server #Alert"
      webhook: "https://discord.com/api/webhooks//"
      avatar: "https://img.icons8.com/ios/72/appointment-reminders--v1.png"
      thumbnail: "https://freeiconshop.com/wp-content/uploads/edd/notification-flat.png"
      dry: true
      retry:
        times: 3
        interval: 10s
The Golang monkey patching library, https://github.com/bouk/monkey, is being utilized directly against its license:
Copyright Bouke van der Bijl
I do not give anyone permissions to use this tool for any purpose. Don't use it.
I’m not interested in changing this license. Please don’t ask.
It is also archived and not maintained.
Environment (please complete the following information):
Describe the bug
A shell script probe is configured and the script fails, but no DingTalk notification is sent, and no log about sending the DingTalk notification is printed either.
To Reproduce
Expected behavior
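I expect a DingTalk message to be sent when the script fails. If the DingTalk configuration is wrong, I would like the error to be logged.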
/opt # easeprobe -f config.yaml
INFO[0000] The data file data/data.yaml, was not found!
INFO[0000] Load the configuration file successfully!
INFO[0000] Successfully created the PID file: /opt/easeprobe.pid
INFO[0000] Application Log File [Stdout] - Self-Rotate
INFO[0000] Web Access Log File [Stdout] - Self-Rotate
INFO[2022-07-05T03:44:22Z] [Web] HTTP server is listening on 0.0.0.0:8181
INFO[2022-07-05T03:44:22Z] Probe [http] - [ElasticSearch] base options are configured!
INFO[2022-07-05T03:44:22Z] [Metric] Counter <EaseProbe_http_total> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_status> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_sla> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Counter <EaseProbe_http_status_code> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_content_len> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_dns_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_connect_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_tls_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_send_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_wait_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_transfer_duration> is created!
INFO[2022-07-05T03:44:22Z] [Metric] Gauge <EaseProbe_http_total_duration> is created!
FATA[2022-07-05T03:44:22Z] No notifies configured, exiting...
config.yaml
http:
  - name: test
    url: http://xxxx.com
When the port is already in use, easeprobe displays the following error but continues to run.
ERRO[2022-05-16T00:56:00Z] [Web] HTTP server error: listen tcp 127.0.0.1:8181: bind: address already in use
I believe it would be better to exit with a non-zero status code when this happens.
When using http probing, it is common to check for the existence or absence of a keyword. I would love to have such a feature in easeprobe.
Implementation suggestion:
http:
  - name: ...
    url: ...
    success_keyword: "Welcome to my site"
    failure_keyword: "Unauthorized"
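The check behind the two proposed keys could look roughly like the sketch below. The checkKeywords function is a hypothetical name, and treating an empty keyword as "check disabled" is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// checkKeywords returns an error when body contains the failure keyword
// or lacks the success keyword. An empty keyword disables that check.
func checkKeywords(body, success, failure string) error {
	if failure != "" && strings.Contains(body, failure) {
		return fmt.Errorf("found failure keyword %q", failure)
	}
	if success != "" && !strings.Contains(body, success) {
		return fmt.Errorf("missing success keyword %q", success)
	}
	return nil
}

func main() {
	body := "<h1>Welcome to my site</h1>"
	fmt.Println(checkKeywords(body, "Welcome to my site", "Unauthorized"))
	fmt.Println(checkKeywords("401 Unauthorized", "Welcome to my site", "Unauthorized"))
}
```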
$CMD/data/data.yaml
I want the uptime and downtime statistics to restart from zero every hour.
How should this be configured?
---
name: EaseProbe
version: v1.7.0
---
"":
  name: ""
  endpoint: http://xxxxx:8000/accounts
  time: 2022-07-25T03:40:09.881474243Z
  timestamp: 1658720409881
  rtt: 354.042µs
  status: down
  prestatus: down
  message: 'Error (http): Error: Get "http://xxxx:8000/accounts": dial tcp xxxxx:8000: connect: connection refused'
  latestdowntime: 2022-07-25T02:44:48.241836769Z
  recoverytime: 0s
  stat:
    since: 2022-07-07T02:49:31.152596481Z
    total: 1454
    status:
      up: 1412
      down: 42
    uptime: 23h32m0s
    downtime: 42m0s
  timeformat: 2006-01-02 15:04:05 UTC
Describe the bug
The code documentation contains a description error:
For SLA data persistence, the PID file, and other options that can be disabled, it is unclear which symbol should be used to disable them:
I want to add new configuration without restarting the service.
Maybe easeprobe could watch etcd and hot-load the config?
Background
I would love to be able to run easeprobe on my OpenBSD server which comes with go 1.17, however easeprobe requires go 1.18+ 😭
Proposal and Expectation
I was able to cross-compile the binary from one of my Linux boxes with GOOS=openbsd GOARCH=amd64, but it would be really awesome if there was a release available to download directly from GitHub instead.
Would it be possible to add an openbsd target to the releases of easeprobe 🥺 🙏
Thank you in advance
If there are many sites, for example thousands of them, putting everything into one config.yaml makes it too long and ugly.
Could configuration files be included, as in nginx?
That is, include the files in a directory, so that they are easier to tell apart.
Environment (please complete the following information):
Describe the bug
If the double quotes in the markdown message sent to WeCom are not escaped, the content is truncated.
To Reproduce
curl 'https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key={your-wecom-key}' \
-H 'Content-Type: application/json' \
-d '
{
"msgtype": "markdown",
"markdown": {
"content": "My ElasticSearch Failure** ❌
http://localhost:9200/ - ⏱ 6ms
Error (http): Error: Get "http://localhost:9200/": dial tcp [::1]:9200: connect: connection refused
> Jordy-probe v1.5.0 @ JordyMacBook-Pro.local at 2022-05-20 13:36:41 UTC"
}
}' | jq .
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 370 100 57 100 313 136 750 --:--:-- --:--:-- --:--:-- 887
{
"errcode": 0,
"errmsg": "ok. Warning: wrong json format. "
}
The actual received message content:
My ElasticSearch Failure** ❌
http://localhost:9200/ - ⏱ 6ms
Error (http): Error: Get
Expected behavior
curl 'https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key={your-wecom-key}' \
-H 'Content-Type: application/json' \
-d '
{
"msgtype": "markdown",
"markdown": {
"content": "My ElasticSearch Failure** ❌
http://localhost:9200/ - ⏱ 6ms
Error (http): Error: Get \"http://localhost:9200/\": dial tcp [::1]:9200: connect: connection refused
> Jordy-probe v1.5.0 @ JordyMacBook-Pro.local at 2022-05-20 13:36:41 UTC"
}
}' | jq .
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 370 100 57 100 313 136 750 --:--:-- --:--:-- --:--:-- 887
{
"errcode": 0,
"errmsg": "ok"
}
The config file looks ugly when I have a lot of targets to probe.
I saw in this issue that this function seems to have been supported, but I didn’t see any relevant introductions in the user documentation or configuration file samples. If it is already supported, can you mention it in the documentation?
Currently, the schedule only supports daily, weekly (Sunday), monthly (Last Day), and none.
If the schedule could also be set to minutely, easeprobe could support real-time monitoring.
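Hello, the two requests went live in less than 24 hours. Thank you again.
I have a suggestion, and I am not sure whether it is a good one: I noticed that many passwords have to be written into the configuration in plaintext. Could we encrypt them, for example with base64,
or first provide a command that encrypts a password with a custom key, such as ./bin/easeprobe -a "encryption key" password > encrypted-password,
then write the encrypted password into the configuration file, and pass the encryption key at startup so that it can be decrypted, for example ./bin/easeprobe -k "encryption key"?
That way, even if someone sees the configuration file, they will not learn the passwords of the production services.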
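Note that base64 is an encoding, not encryption, so it would only hide passwords from a casual glance. A key-based scheme like the one proposed could be sketched with AES-GCM from the standard library. Everything below is an assumption for illustration: the function names are hypothetical, and deriving the key with a single SHA-256 hash keeps the example short where a real tool would use a proper key-derivation function.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM derives a 32-byte AES key from the passphrase (sketch only; a
// real tool should use a dedicated KDF) and returns an AES-GCM cipher.
func newGCM(passphrase string) (cipher.AEAD, error) {
	key := sha256.Sum256([]byte(passphrase))
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns base64(nonce || ciphertext) for use in a config file.
func encrypt(passphrase, plaintext string) (string, error) {
	gcm, err := newGCM(passphrase)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(plaintext), nil)
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// decrypt reverses encrypt and fails if the key or the data is wrong.
func decrypt(passphrase, encoded string) (string, error) {
	gcm, err := newGCM(passphrase)
	if err != nil {
		return "", err
	}
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("ciphertext too short")
	}
	nonce, ct := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ct, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	enc, err := encrypt("encryption key", "db-password")
	if err != nil {
		fmt.Println("encrypt failed:", err)
		return
	}
	dec, err := decrypt("encryption key", enc)
	fmt.Println(enc, dec, err)
}
```

The encryption key still has to reach the process at startup, so this moves the secret out of the configuration file rather than eliminating it.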
I followed the manual on how to configure the native client probe for PostgreSQL (https://github.com/megaease/easeprobe/blob/main/docs/Manual.md#78-native-client-probe-configuration).
Here's my configuration:
client:
  - name: Posgresql
    driver: postgres
    host: postgis:5432
    username: xxxxx
    password: xxxxx
  - name: Pgbouncer
    driver: postgres
    host: pgbouncer:5432
    username: xxxxxx
    password: xxxxxx
and here's report:
Posgresql Failure ❌
postgis:5432 - ⏱ 4ms
Error (client/postgres): pgdriver: SSL is not enabled on the server
> EaseProbe v2.0.0 @ c0d124fc3faa at 2023-02-10 10:49:53 UTC
Pgbouncer Failure ❌
pgbouncer:5432 - ⏱ 2ms
Error (client/postgres): pgdriver: SSL is not enabled on the server
> EaseProbe v2.0.0 @ c0d124fc3faa at 2023-02-10 10:49:54 UTC
I'm using the Docker deployment method.
Servers that are not connected to the Internet cannot update the upstream digital certificates in time, so certificate errors occur. These sites are nevertheless safe to access from a browser, and the certificate is secure. HTTP probes should allow certificate errors to be ignored.
Environment (please complete the following information):
Linux ZRB-Base 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Describe the bug
When a notification message is sent by the WeCom bot, the time at the end of the message is not in the server's time zone. It looks like it is in UTC.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The time format (including the time zone) at the end of the notification message should be configurable in the config file, either in the settings section or per notification channel.
Background & X Problem
I would like to have a deployment option for Kubernetes, so that easeprobe can become an alternative to other synthetic checks on Kubernetes.
Proposal and Expectation
Add a deployment option as a Helm chart or as manifests.
If notifying a Telegram bot cannot be supported, then please support notifying through a shell command.
Then I can use a script like this one for notification: [Send message to Telegram on any SSH login - Konstantin Bogomolov | Web Developer](https://bogomolov.tech/Telegram-notification-on-SSH-login/).
Thanks.
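Hello,
Could you add HTTP proxy support and a custom User-Agent to the TLS Probe Configuration, just like in the HTTP Probe Configuration? I don't know whether this is easy to add. Thank you all for the hard work; we just get to enjoy the benefits.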
Background & X Problem
It's easy to get false positives when looking at CPU usage or latencies. For example, the CPU might peak at 80%+ but only for a few seconds.
Proposal and Expectation
Just like failure-threshold in https://github.com/TwiN/gatus, it's handy if I'm allowed to set the threshold before sending notifications.
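The idea of a failure threshold can be sketched as a counter of consecutive failures. The threshold type below is a hypothetical illustration, not EaseProbe's or gatus's implementation: the probe is only treated as down after the configured number of failures in a row, and any success resets the counter.

```go
package main

import "fmt"

// threshold (hypothetical) suppresses short spikes by requiring several
// consecutive failures before a probe is treated as down.
type threshold struct {
	failures int // consecutive failures required
	count    int // consecutive failures seen so far
}

// observe records one probe result and reports whether the probe should
// now be treated as down.
func (t *threshold) observe(ok bool) bool {
	if ok {
		t.count = 0
		return false
	}
	t.count++
	return t.count >= t.failures
}

func main() {
	th := &threshold{failures: 3}
	for _, ok := range []bool{false, false, true, false, false, false} {
		fmt.Println(ok, th.observe(ok))
	}
}
```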
The container port does not have a name associated with it. The name is used by Prometheus scraping jobs and is traditionally metrics.
Background & X Problem
As above
Proposal and Expectation
@-mention group members, or customize the message contents.
Hi, when I looked at the source code I had a question.
We know that the initial state is init. It is currently handled here:
❌ no notification
✅ send notification
init -> up  Initial Status ❌
init -> down ✅
up -> down ✅
down -> up ✅ // here
up -> down ✅
down -> down ❌
up -> up ❌
So my question is: do we need to send a notification when the status changes from down to up?
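The transitions listed above can be written as a small predicate. This is a sketch that mirrors the list as quoted in this issue; the status type and the shouldNotify function are hypothetical names, and the actual code may differ.

```go
package main

import "fmt"

type status int

const (
	statusInit status = iota
	statusUp
	statusDown
)

// shouldNotify mirrors the list above: no notification when the status is
// unchanged or when the first result is up, a notification otherwise.
func shouldNotify(prev, cur status) bool {
	if prev == cur {
		return false // up -> up, down -> down
	}
	if prev == statusInit && cur == statusUp {
		return false // init -> up
	}
	return true // init -> down, up -> down, down -> up
}

func main() {
	fmt.Println("init -> up:", shouldNotify(statusInit, statusUp))
	fmt.Println("init -> down:", shouldNotify(statusInit, statusDown))
	fmt.Println("down -> up:", shouldNotify(statusDown, statusUp))
}
```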
I'm not sure whether the "1.18+" in the documentation includes 1.19.
With go1.19, running make does not succeed.
Using go tool compile --help, you can see that the -G option has been removed from gcflags.
# go tool compile --help
usage: compile [options] file.go...
-% int
debug non-static initializers
-+ compiling runtime
-B disable bounds checking
-C disable printing of columns in error messages
-D path
set relative path for local imports
-E debug symbol export
-I directory
add directory to import search path
-K debug missing line numbers
-L also show actual source file names in error messages for positions affected by //line directives
-N disable optimizations
-S print assembly listing
-V print version and exit
-W debug parse tree after type checking
.......
commit: a1126c2
os:
ProductName: macOS
ProductVersion: 12.4
BuildVersion: 21F79
make
mkdir -p /private/tmp/easeprobe//build/bin
go mod tidy
CGO_ENABLED=0 go build -a -ldflags '-s -w -extldflags "-static"' -gcflags=-G=3 -o /private/tmp/easeprobe//build/bin/easeprobe /private/tmp/easeprobe/cmd/easeprobe
directory /private/tmp/easeprobe/cmd/easeprobe outside main module or its selected dependencies
make: *** [/private/tmp/easeprobe//build/bin/easeprobe] Error 1
Environment:
Describe the bug
When the MySQL client probe is activated, the connection stays open, which leads to exhaustion of the database server's max connections.
With the config below I get one new connection per probe interval.
To Reproduce
Steps to reproduce the behavior:
Use the mysql client probe on a remote database. I used the following config
settings:
  http:
    ip: 127.0.0.1
    port: 8181
    refresh: 5s
  sla:
    schedule: "daily"
    time: "23:59"
    debug: false
  notify:
    dry: true
    retry: # Global settings for retry
      times: 5
      interval: 10s
  probe:
    timeout: 30s
    interval: 1m # probe every minute for all probes
  logfile: "/var/logs/easyprobe.log"
  loglevel: "info"
  timeformat: "2006-01-02 15:04:05 UTC"
client:
  - name: MariaDB Client
    driver: "mysql"
    host: "10.7.0.253:3306"
    username: "user"
    password: "pass"
Run easeprobe and observe the connections staying open, one for each minute of uptime.
Expected behavior
I expect the connection from each probe to be closed after the probe is done.
Extra details
I have looked online for this kind of behavior, and it seems the go-mysql driver has seen quite a few similar reports of non-closing connections. Based on what I have found so far, this problem seems to be observed mostly in long-running Go applications.
My Go knowledge is limited, and I could only try a few simple "suggestions" I saw online, without success. They included:
Setting the following after sql.Open(): no success
db.SetConnMaxIdleTime(10 * time.Second)
db.SetConnMaxLifetime(30 * time.Second)
db.SetMaxOpenConns(1)
db.SetMaxIdleConns(1)
Removing the defer keyword from db.Close() and moving it right before the return: no success
I will continue to investigate and test, just to rule out any configuration issues on the database side, and will update here accordingly.