
napsty / check_zpools


Monitor the usage and status of ZFS Pools (zpools)

Home Page: https://www.claudiokuenzler.com/monitoring-plugins/check_zpools.php

License: GNU General Public License v2.0

Shell 100.00%
nagios-plugins monitoring-plugins solaris zpool zfs bsd smartos opensolaris

check_zpools's Introduction

check_zpools

A Nagios/Icinga plugin to monitor ZFS Pools (zpools). It is based on "Check Solaris ZFS Pools" but has been completely rewritten.

I run ZFS on several operating systems (Solaris, OpenSolaris, SmartOS, FreeBSD) and needed a Nagios plugin that runs on all of them.

Based on my research (http://www.claudiokuenzler.com/blog/345/monitor-zfs-disk-pools-nagios-plugin-comparison), I decided to take an existing plugin and rewrite it.

Full documentation with examples is available at: http://www.claudiokuenzler.com/monitoring-plugins/check_zpools.php

check_zpools's People

Contributors

kresike, mrdsam, napsty, pv2b

check_zpools's Issues

Output truncated when shown in Nagios' Status Information field

Output of plugin when executed in terminal of FreeBSD system:

./check_zpools.sh -p zroot -w 70 -c 80
ZFS POOL zroot usage is CRITICAL (88%|zroot=88%)

Output of plugin as shown in Nagios' Status Information field:

ZFS POOL zroot usage is CRITICAL (88%
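
In the Nagios plugin output format, everything after the first `|` is treated as performance data, which would explain why the text is cut off at `(88%`: the closing parenthesis sits on the perfdata side of the separator. A minimal sketch of a status line that closes the parenthesis before the `|` follows. The function name `format_usage_line` and its arguments are hypothetical and not taken from the plugin.

```shell
# Hypothetical helper (not part of check_zpools.sh): format a status line
# so that the human-readable text is complete before the "|" separator.
format_usage_line() {
    pool=$1      # pool name, e.g. zroot
    capacity=$2  # used capacity in percent, without the % sign
    state=$3     # OK, WARNING or CRITICAL
    # Text first (closing parenthesis included), then "|" and the perfdata.
    printf 'ZFS POOL %s usage is %s (%s%%)|%s=%s%%\n' \
        "$pool" "$state" "$capacity" "$pool" "$capacity"
}

format_usage_line zroot 88 CRITICAL
# prints: ZFS POOL zroot usage is CRITICAL (88%)|zroot=88%
```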

License

Nice plugin. This is not a bug, just a request to add a license to the plugin.

Inconsistent error messages and only one issue reported per pool

The command output is inconsistent when thresholds are given, depending on which
pool argument is used (a pool name or ALL). For example:

crash@tesla:~/work/check_zpools$ sudo zpool status
  pool: mail
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
	invalid.  Sufficient replicas exist for the pool to continue
	functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 00:00:00 with 0 errors on Wed Feb 22 14:24:16 2023
config:

	NAME        STATE     READ WRITE CKSUM
	mail        DEGRADED     0     0     0
	  mirror-0  DEGRADED     0     0     0
	    nbd0    UNAVAIL      0     0    40  corrupted data
	    nbd1    ONLINE       0     0     0

errors: No known data errors
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p mail -c 45 -w 40 || echo $?
ZFS POOL mail health is DEGRADED|mail=50%
2
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p ALL -c 45 -w 40 || echo $?
ZFS POOL ALARM: POOL mail usage is CRITICAL (50%)|mail=50%
2

Also, the output reports only the first problem it finds. Here there are two
issues: the pool is DEGRADED, and its usage is too high.

After my changes, the output is more consistent and should list all issues:

crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p mail -c 45 -w 40 || echo $?
ZFS POOL ALARM: mail health is DEGRADED mail usage is CRITICAL (50%) |mail=50%
2
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p mail -c 65 -w 60 || echo $?
ZFS POOL ALARM: mail health is DEGRADED |mail=50%
2
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p ALL -c 45 -w 40 || echo $?
ZFS POOL ALARM: mail health is DEGRADED POOL mail usage is CRITICAL (50%) |mail=50%
2
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p ALL -c 65 -w 60 || echo $?
ZFS POOL ALARM: mail health is DEGRADED |mail=50%
2
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p ALL || echo $?
ZFS POOL ALARM: mail health is DEGRADED|mail=50%
2
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p mail || echo $?
ZFS POOL ALARM: mail health is DEGRADED|mail=50%
2

It also works with a healthy ONLINE pool:

crash@tesla:~/work/check_zpools$ sudo zpool status
  pool: mail
 state: ONLINE
  scan: resilvered 244M in 00:00:07 with 0 errors on Wed Feb 22 15:28:24 2023
config:

	NAME        STATE     READ WRITE CKSUM
	mail        ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    nbd0    ONLINE       0     0     0
	    nbd1    ONLINE       0     0     0

errors: No known data errors
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p mail && echo $?
ALL ZFS POOLS OK (mail)|mail=50%
0
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p ALL && echo $?
ALL ZFS POOLS OK (mail)|mail=50%
0
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p mail -w 40 -c 45 || echo $?
ZFS POOL ALARM: mail usage is CRITICAL (50%) |mail=50%
2
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p ALL -w 40 -c 45 || echo $?
ZFS POOL ALARM: POOL mail usage is CRITICAL (50%) |mail=50%
2
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p ALL -w 60 -c 65 && echo $?
ALL ZFS POOLS OK (mail)|mail=50% 
0
crash@tesla:~/work/check_zpools$ ./check_zpools.sh -p mail -w 60 -c 65 && echo $?
ALL ZFS POOLS OK (mail)|mail=50%
0

I will open a PR shortly.
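
The idea of reporting every problem for a pool, rather than only the first, can be sketched roughly as below. This is not the code from the announced PR: the function name `pool_problems`, its argument order, and the comma separator between problems are assumptions made for illustration.

```shell
# Hypothetical helper (not the PR's code): collect every problem found for
# one pool into a single message instead of stopping at the first one.
pool_problems() {
    pool=$1; health=$2; capacity=$3; warn=$4; crit=$5
    problems=""
    state=0
    if [ "$health" != "ONLINE" ]; then
        problems="$pool health is $health"
        state=2
    fi
    # Thresholds are optional; an empty value skips the usage check.
    if [ -n "$crit" ] && [ "$capacity" -ge "$crit" ]; then
        problems="${problems:+$problems, }$pool usage is CRITICAL (${capacity}%)"
        state=2
    elif [ -n "$warn" ] && [ "$capacity" -ge "$warn" ]; then
        problems="${problems:+$problems, }$pool usage is WARNING (${capacity}%)"
        if [ "$state" -lt 1 ]; then state=1; fi
    fi
    if [ -n "$problems" ]; then
        echo "ZFS POOL ALARM: $problems|$pool=${capacity}%"
        return "$state"
    fi
    echo "ALL ZFS POOLS OK ($pool)|$pool=${capacity}%"
    return 0
}

pool_problems mail DEGRADED 50 40 45 || echo "exit code $?"
```

With the values from the degraded example (health DEGRADED, usage 50%, warning 40, critical 45), both problems end up in one line and the exit code is 2.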

NRPE: Unable to read output

Hello!

I'm trying to use this plugin on OpenIndiana (an OpenSolaris fork) and cannot get NRPE to read the output. I can execute the check locally but not remotely; NRPE keeps returning "Unable to read output" errors. I'm already using NRPE to execute a different plugin on this host, so I don't think it's a problem with how I configured NRPE.

LOCAL OS:

OpenIndiana (powered by illumos) SunOS 5.11 oi_151a9 November 2013

LOCAL EXECUTION OUTPUT EXAMPLES:

-bash-4.0$ ./check_zpools.sh -p BigD -w 90 -c 95
ZFS POOL BigD health is DEGRADED|BigD=57%
-bash-4.0$ ./check_zpools.sh -p rpool -w 90 -c 95
ALL ZFS POOLS OK (rpool)|rpool=80%

NRPE COMMAND EXAMPLE:

-bash-4.0$ ./check_nrpe e -H example.host -c check_zpools -a '-p BigD -w 90 -c 95'
NRPE: Unable to read output

Any ideas?

Thank you,
Vince
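
"NRPE: Unable to read output" generally indicates that the plugin printed nothing on stdout when NRPE ran it. One possible cause, not confirmed in this report, is that the NRPE daemon runs the script with a different user or a more restricted environment than the interactive shell. The sketch below is a hypothetical diagnostic: it runs a command with an almost empty environment and reports whether any stdout came back. The function name and the `PATH` value are assumptions.

```shell
# Hypothetical diagnostic (not part of the plugin or of NRPE): run a command
# with an almost empty environment and report whether it wrote to stdout.
check_stdout_minimal_env() {
    rc=0
    # env -i clears the environment; the PATH value here is an assumption.
    out=$(env -i PATH=/usr/bin:/bin "$@" 2>/dev/null) || rc=$?
    if [ -z "$out" ]; then
        echo "no stdout (exit code $rc)"
        return 1
    fi
    echo "stdout ok (exit code $rc): $out"
}

check_stdout_minimal_env sh -c 'echo hello'
# prints: stdout ok (exit code 0): hello
```

To get closer to what NRPE sees, the same call could be made with the plugin path and arguments as the user account the NRPE daemon runs under.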

Bug with Warn/Crit detection

These are the original code lines:

elif [[ $CAPACITY -gt $crit ]]; then echo "ZFS POOL $pool usage is CRITICAL (${CAPACITY}%|$pool=${CAPACITY}%)"; exit ${STATE_CRITICAL}
elif [[ $CAPACITY -gt $warn && $CAPACITY -lt $crit ]]; then echo "ZFS POOL $pool usage is WARNING (${CAPACITY}%)|$pool=${CAPACITY}%"; exit ${STATE_WARNING}

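When the critical threshold is, e.g., 95% and the warning threshold is 90%, a usage of exactly 95% results in an OK return value: 95 is neither greater than the critical threshold nor less than it, so both branches are skipped.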

These are the corrected lines:

elif [[ $CAPACITY -ge $crit ]]; then echo "ZFS POOL $pool usage is CRITICAL (${CAPACITY}%|$pool=${CAPACITY}%)"; exit ${STATE_CRITICAL}
elif [[ $CAPACITY -ge $warn && $CAPACITY -lt $crit ]]; then echo "ZFS POOL $pool usage is WARNING (${CAPACITY}%)|$pool=${CAPACITY}%"; exit ${STATE_WARNING}
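
The boundary behaviour can be checked in isolation with a small sketch like the following. The function name `classify_usage` is hypothetical; the state values 0, 1 and 2 are the usual Nagios exit codes for OK, WARNING and CRITICAL. Because the critical threshold is tested first, the extra `-lt $crit` guard is not needed here.

```shell
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2

# Hypothetical helper (not the plugin's code): map a capacity percentage
# to a Nagios state using inclusive (-ge) threshold comparisons.
classify_usage() {
    capacity=$1; warn=$2; crit=$3
    if [ "$capacity" -ge "$crit" ]; then
        echo CRITICAL; return "$STATE_CRITICAL"
    elif [ "$capacity" -ge "$warn" ]; then
        echo WARNING; return "$STATE_WARNING"
    fi
    echo OK; return "$STATE_OK"
}

# Boundary value from the report: usage 95, warning 90, critical 95.
classify_usage 95 90 95 || echo "exit code $?"
```

With `-gt` in place of `-ge`, the same call would fall through to OK, which is the behaviour described in the report.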

When given nonexistent pool name incorrectly reports OK

root@diskmaskin01:~ # /usr/local/libexec/nagios/check_zpools.sh -p thispooldoesnotexist ; echo $?
cannot open 'thispooldoesnotexist': no such pool
cannot open 'thispooldoesnotexist': no such pool
/usr/local/libexec/nagios/check_zpools.sh: line 125: [: !=: unary operator expected
ALL ZFS POOLS OK (thispooldoesnotexist)|thispooldoesnotexist=%
0
root@diskmaskin01:~ #
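
The `unary operator expected` message is what `[` prints when an unquoted variable expands to nothing, and the empty `thispooldoesnotexist=%` perfdata suggests the capacity lookup returned an empty string. A minimal sketch of a guard that rejects such values before any comparison is shown below. The function name `require_capacity` and the message text are hypothetical; exit code 3 is the usual Nagios UNKNOWN state.

```shell
STATE_UNKNOWN=3

# Hypothetical guard (not the plugin's code): reject empty or non-numeric
# capacity values before they reach any numeric comparison.
require_capacity() {
    pool=$1; capacity=$2
    case $capacity in
        ''|*[!0-9]*)
            echo "ZFS POOL $pool UNKNOWN - could not read pool capacity"
            return "$STATE_UNKNOWN"
            ;;
    esac
    return 0
}

# An empty capacity stands in for what a failed pool lookup leaves behind.
require_capacity thispooldoesnotexist "" || echo "exit code $?"
```

Quoting the variables in the existing tests (for example `[ "$x" != "ONLINE" ]`) would also avoid the syntax error, but it would not by itself turn the result into UNKNOWN.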
