richardelling / zpool_prometheus
A Prometheus-style metrics scraper for ZFS pools
License: MIT License
For anything that's a counter, just take the value mod 2^52, and Prometheus will handle the resulting counter reset gracefully within rate() and similar functions.
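To make the suggestion concrete, here is a minimal sketch in Python (not the project's own code). The helper names `wrap_counter` and `increase` are hypothetical, and `increase` is only a simplified model of reset handling: it assumes any drop between consecutive samples is treated as a reset to zero.

```python
WRAP = 1 << 52  # modulus suggested above; values below it are exact as float64


def wrap_counter(raw: int) -> int:
    """Reduce a raw 64-bit counter into the range [0, 2^52)."""
    return raw % WRAP


def increase(samples):
    """Simplified reset-aware increase over a list of counter samples.

    Assumption: a sample lower than its predecessor means the counter
    restarted from zero, so the new value itself is the increment.
    """
    total = 0
    prev = None
    for s in samples:
        if prev is not None:
            total += (s - prev) if s >= prev else s
        prev = s
    return total


# Raw counter values that cross the 2^52 boundary.
raw = [WRAP - 10, WRAP - 5, WRAP + 3]
wrapped = [wrap_counter(v) for v in raw]
```

Under this model the wrap shows up as an ordinary reset rather than a large negative jump. The increment that straddles the wrap is slightly undercounted, since only the post-wrap value is added.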
We have adopted zpool_prometheus for use on one of our clusters. We are prometheus+grafana users with some existing detailed dashboards.
I would very much like to be able to alert on health state and I cannot find a way to do this with the health state information in a label. I can make some useful graphs with the multistat plugin, grouped by label values, but that panel doesn't seem to support alerting.
Other zfs exporters index the health values like this:
0 ONLINE
1 DEGRADED
2 FAULTED
3 OFFLINE
4 UNAVAIL
5 REMOVED
6 AVAIL
7 INUSE
-1 no data/timeout
Is this something you would consider adding? No existing metrics would change, it would just be one additional metric per vdev.
Perhaps I am missing a way to do this in Grafana?
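As an illustration of the request, the following Python sketch maps a health state string to the numeric index listed above and formats one sample line. The metric name `zpool_vdev_health_state` and the function name are hypothetical, not something zpool_prometheus emits; the `name` and `vdev` labels follow the existing output.

```python
# Index values copied from the list in this issue.
HEALTH_CODES = {
    "ONLINE": 0,
    "DEGRADED": 1,
    "FAULTED": 2,
    "OFFLINE": 3,
    "UNAVAIL": 4,
    "REMOVED": 5,
    "AVAIL": 6,
    "INUSE": 7,
}
NO_DATA = -1  # used for "no data/timeout" or any unrecognized state


def health_metric_line(pool, vdev, state, metric="zpool_vdev_health_state"):
    """Format one sample line; the metric name is a hypothetical placeholder."""
    code = HEALTH_CODES.get(state.upper(), NO_DATA)
    return f'{metric}{{name="{pool}",vdev="{vdev}"}} {code}'


line = health_metric_line("tank", "root", "DEGRADED")
```

With a numeric gauge like this, an alert rule could compare the value against 0 instead of matching on a label.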
When building and running from the alpine:latest Docker image, the following error occurs:
zpool_latency_vdev_scrub_histo_seconds_bucket{name="tank",vdev="root",le="+Inf"} 14168517
zpool_latency_vdev_scrub_histo_seconds_sum{name="tank",vdev="root"} 0
zpool_latency_vdev_scrub_histo_seconds_count{name="tank",vdev="root"} 14168517
error: can't get vdev_trim_histo
I do not have any SSDs attached to this machine, so perhaps that's why it isn't working. When I remove the TRIM-related lines, the application runs to completion.
Currently, zpool_prometheus follows the vdev children, which represent the currently active devices in pools. Auxiliary devices, such as spares and caches, are not shown unless or until the spare is activated in a pool.
Questions:
I'm writing the output of zpool_prometheus to a file and reading that file with node_exporter (version 0.18.1), but parsing of the file fails with the following error:
May 21 08:18:22 pct-hanas-1.mines.edu node_exporter[123282]: time="2020-05-21T08:18:22-06:00" level=error msg="Error parsing "/var/lib/node_exporter/zpool_prometheus.prom": text format parsing error in line 41: second HELP line for metric name "zpool_stats_state"" source="textfile.go:211"
I've done a little research and found a similar bug elsewhere where it was stated that rules on HELP lines may have changed and there should be only one HELP line per metric name. I have manually removed the HELP line in question to confirm that the parsing does move forward, but I have a large setup with many vdevs so I haven't taken the time to manually remove all the extra HELP lines. I am looking at the code to see if I can figure out how to change it to meet the needs of the new node_exporter versions.
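As a stopgap until the output itself is fixed, the repeated lines could be filtered before node_exporter reads the file. The sketch below is one possible approach, not part of zpool_prometheus; the function name is hypothetical, and it addresses only repeated HELP/TYPE lines for the same metric name, not any other format rule.

```python
def drop_repeated_metadata(text: str) -> str:
    """Keep only the first '# HELP' and '# TYPE' line for each metric name."""
    seen = set()
    kept = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        # Metadata lines look like: "# HELP <metric_name> <docstring>"
        if len(parts) >= 3 and parts[0] == "#" and parts[1] in ("HELP", "TYPE"):
            key = (parts[1], parts[2])
            if key in seen:
                continue  # repeated metadata for this metric: skip it
            seen.add(key)
        kept.append(line)
    return "".join(line + "\n" for line in kept)


# Made-up sample input for illustration only.
sample = (
    "# HELP zpool_stats_state vdev state\n"
    "# TYPE zpool_stats_state gauge\n"
    'zpool_stats_state{name="tank",vdev="a"} 7\n'
    "# HELP zpool_stats_state vdev state\n"
    "# TYPE zpool_stats_state gauge\n"
    'zpool_stats_state{name="tank",vdev="b"} 7\n'
)
cleaned = drop_repeated_metadata(sample)
```

In this arrangement the cleaned text would be written to the .prom file in place of the raw zpool_prometheus output.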
Thanks for creating this tool! Are there any plans to upstream zpool_prometheus the same way that zpool_influxdb has been?