
Nginx module for upstream server health checks; supports both stream and http upstreams. This module provides Nginx with active back-end health checking (for both layer-4 and layer-7 back-end servers).


ngx_healthcheck_module's Introduction

ngx-healthcheck-module


(For the Chinese version of this document, see here.)

Health checker for Nginx upstream servers (supports both http and stream upstreams).
This module provides NGINX with active back-end server health checking (it supports health checks for both layer-4 and layer-7 back-end servers).

html status output

Table of Contents

Status

This nginx module is still under development; you can help improve it.

The project is under active development, and you are welcome to contribute code or report bugs. Let's make it better together.

If you have any questions, please contact me:

Description

When you use nginx as a load balancer, nginx natively provides only basic retries to ensure that a request reaches a working back-end server.

In contrast, this third-party nginx module performs proactive health checks of the back-end servers.
It maintains a list of healthy back-end servers and guarantees that new requests are sent directly to a healthy back end.

Key features:

  • Supports health checks for both layer-4 and layer-7 back-end servers
  • Layer-4 (stream) supported check types: tcp / udp / http
  • Layer-7 (http) supported check types: http / fastcgi
  • Provides a unified HTTP status query interface with output formats: html / json / csv / prometheus
  • Supports judging status by HTTP response code or by response body, e.g. check_http_expect_body ~ ".+OK.+"; (a sketch follows this list)
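
As a rough illustration of the body-matching feature above, a check could be configured roughly as follows. This is a sketch only; the upstream name and the /health URI are placeholders, not part of the module's documentation:

upstream api-cluster {
    server 127.0.0.1:9000;

    check interval=3000 rise=2 fall=5 timeout=5000 type=http;
    check_http_send "GET /health HTTP/1.0\r\n\r\n";
    # mark the server up only when the response body matches the regular expression
    check_http_expect_body ~ ".+OK.+";
}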

Installation

git clone https://github.com/nginx/nginx.git
git clone https://github.com/zhouchangxun/ngx_healthcheck_module.git

cd nginx/;
git checkout branches/stable-1.12
git apply ../ngx_healthcheck_module/nginx_healthcheck_for_nginx_1.12+.patch

./auto/configure --with-stream --add-module=../ngx_healthcheck_module/
make && make install

Back to TOC

Usage

nginx.conf example

user  root;
worker_processes  1;
error_log  logs/error.log  info;
#pid        logs/nginx.pid;

events {
    worker_connections  1024;
}

http {
    server {
        listen 80;
        # status interface
        location /status {
            healthcheck_status json;
        }
        # http front
        location / { 
          proxy_pass http://http-cluster;
        }   
    }
    # as a backend server.
    server {
        listen 8080;
        location / {
          root html;
        }
    }
    
    upstream http-cluster {
        # simple round-robin
        server 127.0.0.1:8080;
        server 127.0.0.2:81;

        check interval=3000 rise=2 fall=5 timeout=5000 type=http;
        check_http_send "GET / HTTP/1.0\r\n\r\n";
        check_http_expect_alive http_2xx http_3xx;
    }
}

stream {
    upstream tcp-cluster {
        # simple round-robin
        server 127.0.0.1:22;
        server 192.168.0.2:22;
        check interval=3000 rise=2 fall=5 timeout=5000 default_down=true type=tcp;
    }
    server {
        listen 522;
        proxy_pass tcp-cluster;
    }
    
    upstream udp-cluster {
        # simple round-robin
        server 127.0.0.1:53;
        server 8.8.8.8:53;
        check interval=3000 rise=2 fall=5 timeout=5000 default_down=true type=udp;
    }
    server {
        listen 53 udp;
        proxy_pass udp-cluster;
    }
    
}

status interface

One typical output (json format):

root@changxun-PC:~/nginx-dev/ngx_healthcheck_module# curl localhost/status
{"servers": {
  "total": 6,
  "generation": 3,
  "http": [
    {"index": 0, "upstream": "http-cluster", "name": "127.0.0.1:8080", "status": "up", "rise": 119, "fall": 0, "type": "http", "port": 0},
    {"index": 1, "upstream": "http-cluster", "name": "127.0.0.2:81", "status": "down", "rise": 0, "fall": 120, "type": "http", "port": 0}
  ],
  "stream": [
    {"index": 0, "upstream": "tcp-cluster", "name": "127.0.0.1:22", "status": "up", "rise": 22, "fall": 0, "type": "tcp", "port": 0},
    {"index": 1, "upstream": "tcp-cluster", "name": "192.168.0.2:22", "status": "down", "rise": 0, "fall": 7, "type": "tcp", "port": 0},
    {"index": 2, "upstream": "udp-cluster", "name": "127.0.0.1:53", "status": "down", "rise": 0, "fall": 120, "type": "udp", "port": 0},
    {"index": 3, "upstream": "udp-cluster", "name": "8.8.8.8:53", "status": "up", "rise": 3, "fall": 0, "type": "udp", "port": 0}
  ]
}}
root@changxun-PC:~/nginx-dev/ngx_healthcheck_module# 

or (prometheus format)

root@changxun-PC:~/nginx-dev/ngx_healthcheck_module# curl localhost/status
# HELP nginx_upstream_count_total Nginx total number of servers
# TYPE nginx_upstream_count_total gauge
nginx_upstream_count_total 6
# HELP nginx_upstream_count_up Nginx total number of servers that are UP
# TYPE nginx_upstream_count_up gauge
nginx_upstream_count_up 0
# HELP nginx_upstream_count_down Nginx total number of servers that are DOWN
# TYPE nginx_upstream_count_down gauge
nginx_upstream_count_down 6
# HELP nginx_upstream_count_generation Nginx generation
# TYPE nginx_upstream_count_generation gauge
nginx_upstream_count_generation 1
# HELP nginx_upstream_server_rise Nginx rise counter
# TYPE nginx_upstream_server_rise counter
nginx_upstream_server_rise{index="0",upstream_type="http",upstream="http-cluster",name="127.0.0.1:8082",status="down",type="http",port="0"} 0
nginx_upstream_server_rise{index="1",upstream_type="http",upstream="http-cluster",name="127.0.0.2:8082",status="down",type="http",port="0"} 0
nginx_upstream_server_rise{index="1",upstream_type="stream",upstream="tcp-cluster",name="192.168.0.2:22",status="down",type="tcp",port="0"} 0
nginx_upstream_server_rise{index="2",upstream_type="stream",upstream="udp-cluster",name="127.0.0.1:5432",status="down",type="udp",port="0"} 0
nginx_upstream_server_rise{index="4",upstream_type="stream",upstream="http-cluster2",name="127.0.0.1:8082",status="down",type="http",port="0"} 0
nginx_upstream_server_rise{index="5",upstream_type="stream",upstream="http-cluster2",name="127.0.0.2:8082",status="down",type="http",port="0"} 0
# HELP nginx_upstream_server_fall Nginx fall counter
# TYPE nginx_upstream_server_fall counter
nginx_upstream_server_fall{index="0",upstream_type="http",upstream="http-cluster",name="127.0.0.1:8082",status="down",type="http",port="0"} 41
nginx_upstream_server_fall{index="1",upstream_type="http",upstream="http-cluster",name="127.0.0.2:8082",status="down",type="http",port="0"} 42
nginx_upstream_server_fall{index="1",upstream_type="stream",upstream="tcp-cluster",name="192.168.0.2:22",status="down",type="tcp",port="0"} 14
nginx_upstream_server_fall{index="2",upstream_type="stream",upstream="udp-cluster",name="127.0.0.1:5432",status="down",type="udp",port="0"} 40
nginx_upstream_server_fall{index="4",upstream_type="stream",upstream="http-cluster2",name="127.0.0.1:8082",status="down",type="http",port="0"} 40
nginx_upstream_server_fall{index="5",upstream_type="stream",upstream="http-cluster2",name="127.0.0.2:8082",status="down",type="http",port="0"} 43
# HELP nginx_upstream_server_active Nginx active 1 for UP / 0 for DOWN
# TYPE nginx_upstream_server_active gauge
nginx_upstream_server_active{index="0",upstream_type="http",upstream="http-cluster",name="127.0.0.1:8082",type="http",port="0"} 0
nginx_upstream_server_active{index="1",upstream_type="http",upstream="http-cluster",name="127.0.0.2:8082",type="http",port="0"} 0
nginx_upstream_server_active{index="1",upstream_type="stream",upstream="tcp-cluster",name="192.168.0.2:22",type="tcp",port="0"} 0
nginx_upstream_server_active{index="2",upstream_type="stream",upstream="udp-cluster",name="127.0.0.1:5432",type="udp",port="0"} 0
nginx_upstream_server_active{index="4",upstream_type="stream",upstream="http-cluster2",name="127.0.0.1:8082",type="http",port="0"} 0
nginx_upstream_server_active{index="5",upstream_type="stream",upstream="http-cluster2",name="127.0.0.2:8082",type="http",port="0"} 0
root@changxun-PC:~/nginx-dev/ngx_healthcheck_module# 

Back to TOC

Synopsis

check

Syntax

check interval=milliseconds [fall=count] [rise=count] [timeout=milliseconds] [default_down=true|false] [type=tcp|udp|http] [port=check_port]

Default: interval=30000 fall=5 rise=2 timeout=1000 default_down=true type=tcp

Context: http/upstream || stream/upstream

This directive enables active health checking of the back-end servers in the enclosing upstream block.

Detail

  • interval: the interval, in milliseconds, between health-check probes sent to the backend.
  • fall (fall_count): the server is considered down after fall_count consecutive failures.
  • rise (rise_count): the server is considered up after rise_count consecutive successes.
  • timeout: timeout, in milliseconds, for the health-check request.
  • default_down: the initial state of the server. If true, the server starts as down and is only considered healthy after the check has succeeded rise_count times; if false, it starts as up. The default is true.
  • type: type of health-check probe. The following types are supported:
    • tcp: a plain TCP connection; the backend is considered healthy if the connection succeeds.
    • udp: a UDP packet is sent; the backend is considered unhealthy if an ICMP error (host or port unreachable) is received. (UDP checks are only supported in the stream configuration block.)
    • http: an HTTP request is sent; the backend's health is determined from the status of the reply.

An example:

stream {
    upstream tcp-cluster {
        # simple round-robin
        server 127.0.0.1:22;
        server 192.168.0.2:22;
        check interval=3000 rise=2 fall=5 timeout=5000 default_down=true type=tcp;
    }
    server {
        listen 522;
        proxy_pass tcp-cluster;
    }
    ...
}
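
For a layer-7 style probe on a dedicated health-check port, something along the following lines could be used. This is a sketch only; the port= override, the upstream name and the /healthz URI are illustrative assumptions, not taken from the module's documentation:

http {
    upstream web-cluster {
        server 10.0.0.11:8080;
        server 10.0.0.12:8080;

        # probe a separate management port instead of the traffic port
        check interval=3000 rise=2 fall=5 timeout=5000 type=http port=9000;
        check_http_send "GET /healthz HTTP/1.0\r\n\r\n";
        check_http_expect_alive http_2xx;
    }
    ...
}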

healthcheck

Syntax: healthcheck_status [html|csv|json|prometheus]

Default: healthcheck_status html

Context: http/server/location

An example:

http {
    server {
        listen 80;
        
        # status interface
        location /status {
            healthcheck_status;
        }
     ...
}

You can specify the default display format: html, csv, json or prometheus; the default is html. The format can also be selected per request with the format argument. Assuming your healthcheck_status location is '/status', the format argument changes the format of the returned page. You can do this:

/status?format=html

/status?format=csv

/status?format=json

/status?format=prometheus

You can also fetch only the servers with a given status by using the status argument. For example:

/status?format=json&status=down

/status?format=html&status=down

/status?format=csv&status=up

/status?format=prometheus&status=up
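
A common pattern is a dedicated, access-restricted status location that defaults to the prometheus format for a metrics scraper. A minimal sketch, assuming a separate listen port and a placeholder monitoring network:

server {
    listen 8081;
    location /status {
        healthcheck_status prometheus;
        # restrict the status endpoint to the monitoring network (placeholder range)
        allow 10.0.0.0/8;
        deny all;
    }
}

The format argument described above still overrides the default on a per-request basis.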

Back to TOC

Todo List

  • add test cases.
  • improve code style.
  • feature enhancements.

Back to TOC

Bugs and Patches

Please report bugs

or submit patches by

Back to TOC

Author

Chance Chou (周长勋) [email protected].

Back to TOC

Copyright and License

The health check part is based on Yaoweibin's healthcheck module nginx_upstream_check_module (http://github.com/yaoweibin/nginx_upstream_check_module);

This module is licensed under the BSD license.

Copyright (C) 2017-, by Changxun Zhou [email protected]

Copyright (C) 2014 by Weibin Yao [email protected]

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Back to TOC

See Also

Back to TOC

ngx_healthcheck_module's People

Contributors

denji, glk123, josephmilla, stoffus, taomaree, zhanw15, zhouchangxun


ngx_healthcheck_module's Issues

Log files are generated twice

ngx_healthcheck generates two copies of its log file: one at the log path defined in the configuration file and another at the default path.

Can it be used together with tengine's ngx_http_upstream_check_module?

What is the relationship between this module and tengine's ngx_http_upstream_check_module? Can it replace ngx_http_upstream_check_module?
Since ngx_http_upstream_check_module only provides layer-7 health checks, can the two modules be used together? The directive names are identical; will they conflict?

Integration with dynamic upstream modules

Can you add some functions for integration with dynamic upstream modules?

For adding an upstream peer:
ngx_uint_t ngx_http_upstream_check_add_dynamic_peer(ngx_pool_t *pool, ngx_http_upstream_srv_conf_t *us, ngx_addr_t *peer_addr);

For deleting an upstream peer:
void ngx_http_upstream_check_delete_dynamic_peer(ngx_str_t *name, ngx_addr_t *peer_addr);

For example, I've tried to build ngx_healthcheck_module together with these modules:

https://github.com/xiaokai-wang/nginx-stream-upsync-module
https://github.com/weibocom/nginx-upsync-module

and got an error :-(
A reference health-check project that works with dynamic upstreams is https://github.com/xiaokai-wang/nginx_upstream_check_module, but it can't check TCP upstreams :-(

Support for random load balancing

We recently switched to the "random two" load-balancing method in our nginx setup. Since then we have seen cases where an upstream that is considered down still receives requests. It seems that random load balancing is not supported by this module?

How long is a server marked as unavailable?

Two questions:
1. When a server is marked as unavailable, how long does that state last? Is it the timeout value of the check?

2. When the period for which a server is marked unavailable ends, and a request happens to arrive while a new check is in progress, will the request be forwarded to that server?

With rise>1 and upstream keepalive, the http health check only probes once

After the first successful check, when the backend connection is kept alive, peer->pc.connection != NULL.
The second check then returns early in ngx_http_upstream_check_begin_handler(), so shm->rise_count is never updated.

Below is the patch I made; could you review whether this change is appropriate, or suggest a better fix? Thanks.

--- ngx_healthcheck_module-master/ngx_http_upstream_check_module.c 2020-04-23 10:48:26.000000000 +0800
+++ ngx_http_upstream_check_module.c 2020-04-23 10:50:05.000000000 +0800
@@ -931,6 +931,13 @@

     ngx_add_timer(event, ucscf->check_interval / 2);

+    // wohaiaini
+    if (peer->pc.connection && peer->shm->rise_count < ucscf->rise_count)
+    {
+        ngx_http_upstream_check_connect_handler(event);
+        return;
+    }
+
     /* This process is processing this peer now. */
     if ((peer->shm->owner == ngx_pid ||
         (peer->pc.connection != NULL) ||

@zhouchangxun

Dynamic add/delete of health-checked nodes is not supported, and some bugs were found

Build error on Windows 10 (MSYS 1.0, VC2019 C/C++, Windows 10 SDK)

The build fails on Windows.
Build environment:
msys 1.0
git
perl
Windows 10 SDK
VC2019 C/C++ build tools
nginx 1.16

Steps:
1. Download MSYS-1.0.11:
https://nchc.dl.sourceforge.net/project/mingw/MSYS/Base/msys-core/msys-1.0.11/MSYS-1.0.11.exe
2. Download nginx and ngx_healthcheck_module:
git clone https://github.com/nginx/nginx.git
git clone https://github.com/zhouchangxun/ngx_healthcheck_module.git

  3. Create build and lib directories, and unpack the zlib, PCRE and OpenSSL library sources into the lib directory:
    mkdir objs
    mkdir objs/lib
    cd objs/lib
    tar -xzf ../../pcre-8.44.tar.gz
    tar -xzf ../../zlib-1.2.11.tar.gz
    tar -xzf ../../openssl-1.1.1g.tar.gz

  4. Apply the patch:
    cd nginx/;
    patch -p1 < ../ngx_healthcheck_module/nginx_healthcheck_for_nginx_1.16+.patch

5. Run configure:
auto/configure
--with-cc=cl
--with-debug
--prefix=
--conf-path=conf/nginx.conf
--pid-path=logs/nginx.pid
--http-log-path=logs/access.log
--error-log-path=logs/error.log
--sbin-path=nginx.exe
--http-client-body-temp-path=temp/client_body_temp
--http-proxy-temp-path=temp/proxy_temp
--http-fastcgi-temp-path=temp/fastcgi_temp
--http-scgi-temp-path=temp/scgi_temp
--http-uwsgi-temp-path=temp/uwsgi_temp
--with-cc-opt=-DFD_SETSIZE=1024
--with-pcre=objs/lib/pcre-8.44
--with-zlib=objs/lib/zlib-1.2.11
--with-openssl=objs/lib/openssl-OpenSSL_1_1_1g
--with-openssl-opt=no-asm
--with-http_ssl_module
--with-stream --add-module=../ngx_healthcheck_module/
6. On Windows, run:
nmake

nmake fails with an error (see the screenshot in the original issue).

After installing the module on OpenResty 1.17.8.2, the status page shows 0 servers

OpenResty version 1.17.8.2. Building and installing from source works fine, and the health check itself runs correctly; servers that are down are removed as expected. However, after starting with this configuration, the HTML status page always shows "0 down, 0 total" servers.
After an openresty reload, the error log shows: "2021/12/17 15:21:12 [notice] 14270#14270: [ngx_healthcheck:stream] when init main conf. upstreams num:3"
Configuration:

upstream test{
    server 192.168.10.134:8905;
    server 192.168.10.135:8900;
    server 192.168.10.136:8905;
    check interval=20000 rise=1 fall=3 timeout=5000 default_down=false type=http;
    check_http_expect_alive http_2xx http_3xx;
    check_http_send "POST /getPolicyInfo HTTP/1.0\r\n\r\n";
}
server {
    listen 80;
    server_name 192.168.11.43;
    location /status {
        healthcheck_status;
    }
    location / {
        proxy_set_header X-Forwarded-Host $host:$server_port;
        proxy_set_header X-Forwarded-Server $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
        proxy_cookie_path / "/; httponly; secure; SameSite=Lax";
        proxy_next_upstream error timeout http_502 http_503 http_504 http_500 non_idempotent;
        proxy_connect_timeout 5s;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
        proxy_pass http://test;
    }
}


Build failure

Hi! I've encountered a build issue while building the v1.0 branch.

make[1]: *** Waiting for unfinished jobs....
/root/nginx_build/ngx_healthcheck_module/ngx_stream_upstream_check_module.c: In function ‘ngx_stream_upstream_check_init_shm_zone’:
/root/nginx_build/ngx_healthcheck_module/ngx_stream_upstream_check_module.c:2158:31: error: ‘ngx_upstream_check_peer_shm_t’ undeclared (first use in this function)
     (number ) * sizeof(ngx_upstream_check_peer_shm_t);//last item not use :)
/root/nginx_build/ngx_healthcheck_module/ngx_stream_upstream_check_module.c:2158:31: note: each undeclared identifier is reported only once for each function it appears in
objs/Makefile:1804: recipe for target 'objs/addon/ngx_healthcheck_module/ngx_stream_upstream_check_module.o' failed

I found that in ngx_stream_upstream_check_module.c and ngx_http_upstream_check_module.c there is a piece of code that causes the build failure because ngx_upstream_check_peer_shm_t is undefined:

https://github.com/zhouchangxun/ngx_healthcheck_module/blob/v1.0/ngx_stream_upstream_check_module.c#L2158

size = sizeof(*peers_shm) + (number ) * sizeof(ngx_upstream_check_peer_shm_t);//last item not use :)

Should I just delete these two lines, or use the source from the master branch?
Thanks.

Is nginx 1.15 supported?

I handed this project to the ops colleagues on our team for build testing. It builds successfully against 1.14 but fails against 1.15.

Segmentation fault in certain cases

Known trigger conditions:

  1. Start with no http upstream configured
  2. Start nginx
  3. Add an http upstream configuration (with a check setting)
  4. Reload nginx: segmentation fault

The backtrace is as follows (please take a look when you have time):

(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
__memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:794
794	../sysdeps/x86_64/multiarch/memcmp-sse4.S: 没有那个文件或目录.
(gdb) bt
#0  __memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:794
#1  0x000055b6da068798 in ngx_http_upstream_check_find_shm_peer (addr=0x55b6da44cf30, p=0x55b6da406430)
    at ../ngx_healthcheck_module/ngx_http_upstream_check_module.c:3882
#2  ngx_http_upstream_check_init_shm_zone (shm_zone=0x55b6da40a7d8, data=<optimized out>)
    at ../ngx_healthcheck_module/ngx_http_upstream_check_module.c:3787
#3  0x000055b6d9fe35ec in ngx_init_cycle (old_cycle=old_cycle@entry=0x55b6da406480) at src/core/ngx_cycle.c:484
#4  0x000055b6d9ffae35 in ngx_master_process_cycle (cycle=0x55b6da406480) at src/os/unix/ngx_process_cycle.c:235
#5  0x000055b6d9fd18ca in main (argc=1, argv=<optimized out>) at src/core/nginx.c:382
(gdb) quit

Dynamic add/delete of upstream nodes is not supported, and several bugs were found

nginx-upsync-module

The nginx-upsync-module cannot be used because ngx_http_upstream_check_add_dynamic_peer and ngx_http_upstream_check_delete_dynamic_peer are missing.

git apply fails with errors

As per the title:

PS D:\workspace-mine\nginx> git checkout
Your branch is up-to-date with 'origin/branches/stable-1.14'.
PS D:\workspace-mine\nginx> git apply ..\ngx_healthcheck_module\nginx_healthcheck_for_nginx_1.14+.patch
../ngx_healthcheck_module/nginx_healthcheck_for_nginx_1.14+.patch:9: trailing whitespace.
#if (NGX_HTTP_UPSTREAM_CHECK)
../ngx_healthcheck_module/nginx_healthcheck_for_nginx_1.14+.patch:10: trailing whitespace.
#include "ngx_http_upstream_check_module.h"
../ngx_healthcheck_module/nginx_healthcheck_for_nginx_1.14+.patch:11: trailing whitespace.
#endif
../ngx_healthcheck_module/nginx_healthcheck_for_nginx_1.14+.patch:19: trailing whitespace.
#if (NGX_HTTP_UPSTREAM_CHECK)
../ngx_healthcheck_module/nginx_healthcheck_for_nginx_1.14+.patch:20: trailing whitespace.
        ngx_log_debug1(NGX_LOG_DEBUG_HTTP, pc->log, 0,
error: patch failed: src/http/modules/ngx_http_upstream_hash_module.c:9
error: src/http/modules/ngx_http_upstream_hash_module.c: patch does not apply
error: patch failed: src/http/modules/ngx_http_upstream_ip_hash_module.c:9
error: src/http/modules/ngx_http_upstream_ip_hash_module.c: patch does not apply
error: patch failed: src/http/modules/ngx_http_upstream_least_conn_module.c:9
error: src/http/modules/ngx_http_upstream_least_conn_module.c: patch does not apply
error: patch failed: src/http/ngx_http_upstream_round_robin.c:9
error: src/http/ngx_http_upstream_round_robin.c: patch does not apply
error: patch failed: src/http/ngx_http_upstream_round_robin.h:38
error: src/http/ngx_http_upstream_round_robin.h: patch does not apply
error: patch failed: src/stream/ngx_stream_upstream_hash_module.c:8
error: src/stream/ngx_stream_upstream_hash_module.c: patch does not apply
error: patch failed: src/stream/ngx_stream_upstream_least_conn_module.c:8
error: src/stream/ngx_stream_upstream_least_conn_module.c: patch does not apply
error: patch failed: src/stream/ngx_stream_upstream_round_robin.c:9
error: src/stream/ngx_stream_upstream_round_robin.c: patch does not apply
error: patch failed: src/stream/ngx_stream_upstream_round_robin.h:49
error: src/stream/ngx_stream_upstream_round_robin.h: patch does not apply

UDP health checks cannot correctly detect a "host unreachable" server

For UDP servers on the same subnet (other subnets were not tested), if the server is shut down or the network is disconnected, the health check treats the UDP packet as a normal timeout and considers the server healthy; neither ngx_event_connect_peer() nor ngx_stream_upstream_check_peek_one_byte() catches the error.

Testing shows that un-commenting the code below fixes the problem, but I don't know why it was commented out in the first place; could un-commenting it cause other issues?

/* (changxun): set sock opt "IP_RECVERR" in order to recv icmp error like host/port unreachable. */
/*
note: we have invoked 'ngx_event_connect_peer()' above, so the code we commented out is not required.
int val = 1;
if( setsockopt( c->fd, SOL_IP, IP_RECVERR, &val, sizeof(val) ) == -1 ){
    ngx_log_error(NGX_LOG_ERR, event->log, 0,
                  "setsockopt(IP_RECVERR) failed with peer: %V ",
                  &peer->check_peer_addr->name);
}
*/

Configuration does not take effect

I compiled nginx 1.19.2 with this module added; the build completed without errors.
I added a check directive to an upstream in the stream block; the configuration test passes. The check directive is:
check interval=1000 rise=1 fall=1 timeout=1000 default_down=false type=tcp;

However, healthcheck_status shows 0 nodes under "stream upstream servers"; the json output is:
{"servers": {
"total": 0,
"generation": 4,
"http": [
],
"stream": [
]
}}

The response to /status?format=json&status=down is not valid JSON

Querying the status with /status?format=json&status=down returns output of the form {"http": [ {}, {}, ]}: there is an extra comma before the closing ].

{"servers": { "total": 6, "generation": 3, "http": [ {"index": 0, "upstream": "http-cluster", "name": "127.0.0.1:8080", "status": "up", "rise": 119, "fall": 0, "type": "http", "port": 0}, {"index": 1, "upstream": "http-cluster", "name": "127.0.0.2:81", "status": "down", "rise": 0, "fall": 120, "type": "http", "port": 0}, ], "stream": [ {"index": 0, "upstream": "tcp-cluster", "name": "127.0.0.1:22", "status": "up", "rise": 22, "fall": 0, "type": "tcp", "port": 0}, {"index": 1, "upstream": "tcp-cluster", "name": "192.168.0.2:22", "status": "down", "rise": 0, "fall": 7, "type": "tcp", "port": 0}, {"index": 2, "upstream": "udp-cluster", "name": "127.0.0.1:53", "status": "down", "rise": 0, "fall": 120, "type": "udp", "port": 0}, {"index": 3, "upstream": "udp-cluster", "name": "8.8.8.8:53", "status": "up", "rise": 3, "fall": 0, "type": "udp", "port": 0}, ] }}

UDP check times out in a Docker environment

Inside the same Docker instance:
nc check result:
nc -zvu 172.23.83.193 7660
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 172.23.83.193:7660.
Ncat: UDP packet sent successfully
Ncat: 1 bytes sent, 0 bytes received in 2.01 seconds.

Module check result:
[ngx_healthcheck:stream][timer]udp check time out with peer: 172.23.83.193:7660, we assum it's up :)

Can upstream checks support https?

Because the upstream check sends data when probing the back-end server, some services (for example an api-server) keep logging TLS handshake EOF errors. It would be great if the module supported health checks against private https endpoints (something like curl -k https://{node-ip}:{node-port}/url).
Thank you!

healthcheck server status down not closing active TCP connections

Hi,

Thanks for an amazing module. I have a question about closing existing TCP connections.
Is it the expected behavior that when a health check marks a server as "down", only new incoming connections are rejected while existing (active) TCP connections stay open?

Obviously it works fine for UDP where there is no session mechanism.

Error when running make

objs/addon/ngx_healthcheck_module/ngx_stream_upstream_check_module.o
objs/addon/ngx_healthcheck_module/ngx_healthcheck_common.o
objs/addon/ngx_healthcheck_module/ngx_healthcheck_status.o
objs/ngx_modules.o
-ldl -lpthread -lpthread -lcrypt -lpcre -lssl -lcrypto -ldl -lz -lprofiler
-Wl,-E
objs/addon/ngx_healthcheck_module/ngx_stream_upstream_check_module.o: In function `ngx_stream_upstream_check_init_main_conf':
/app/nginx/../ngx_healthcheck_module//ngx_stream_upstream_check_module.c:1776: undefined reference to `ngx_stream_upstream_module'
objs/addon/ngx_healthcheck_module/ngx_stream_upstream_check_module.o: In function `ngx_stream_upstream_check_init_process':
/app/nginx/../ngx_healthcheck_module//ngx_stream_upstream_check_module.c:2227: undefined reference to `ngx_stream_module'
collect2: error: ld returned 1 exit status
make[1]: *** [objs/nginx] Error 1
make[1]: Leaving directory `/app/nginx'
make: *** [build] Error 2

I followed the steps exactly; why do I keep getting this error? I don't know how to fix it.

Is it because I passed too many configure options?

./auto/configure --prefix=/usr/share/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/tmp/client_body --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --http-scgi-temp-path=/var/lib/nginx/tmp/scgi --pid-path=/run/nginx.pid --lock-path=/run/lock/subsys/nginx --user=nginx --group=nginx --with-compat --with-debug --with-file-aio --with-google_perftools_module --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_degradation_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module=dynamic --with-http_mp4_module --with-http_perl_module=dynamic --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-http_xslt_module=dynamic --with-mail=dynamic --with-mail_ssl_module --with-pcre --with-pcre-jit --with-stream=dynamic --with-stream_ssl_module --with-stream_ssl_preread_module --with-threads --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic' --with-ld-opt='-Wl,-z,relro -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,-E' --add-module=/app/ngx_healthcheck_module

check_http_send doesn't work in stream configuration

Hi! Thank you for a great module!
There seems to be an issue with layer-4 (stream) balancing. I am trying to check the upstream servers of a TCP upstream with an http health check on a custom endpoint, but I cannot set the endpoint with the 'check_http_send' option: requests are always sent to "/" instead of "/my_custom_endpoint". The same configuration works fine when checking servers in a layer-7 (http) upstream.

When a TCP check succeeds, the error log shows (11: Resource temporarily unavailable)

When a TCP check succeeds, the following error is still logged:
[ngx-healthcheck][stream] when recv one byte, recv(): -1, fd: 3 (11: Resource temporarily unavailable)
When a TCP check fails, the error is:
[ngx-healthcheck][stream] when recv one byte, recv(): -1, fd: 3 (111: Connection refused)
For the TCP error case, the nginx error_log level is [info].

Unexpected false negatives

I use http-type checks to track the health of my upstreams. Sometimes the module wrongly marks an upstream as failed even though it received 200 OK. By inspecting a packet dump I noticed the following:

  • if a check is marked as failed (recorded in error.log as check time out with peer), even though 200 OK was received from the remote host, nginx sends an RST immediately after getting the reply
  • if a check is marked as passed, the session is closed normally (FIN/ACK)
  • if I lower the number of nginx workers from auto (40) to 5-10, the false negatives become very rare
  • if I raise the timeout from 2-3 seconds to 20-30 seconds, the false negatives also become very rare

Does each nginx worker run its own checks for the upstream(s), or is there a single process that manages these checks?

TCP checks against MySQL fill the MySQL error log with "Got an error reading communication packets"

It would be great to add a dedicated mysql check in addition to the http and tcp checks. Otherwise, actively health-checking MySQL keeps producing "Got an error reading communication packets" errors in the MySQL error log. Moreover, if host_cache_size = 0 is not set in the MySQL configuration, once the error count reaches max_connect_errors the nginx host's IP is blocked by MySQL, causing application errors.
