mpounsett / nagiosplugin

A Python class library which helps with writing Nagios (Icinga) compatible plugins.

Home Page: https://nagiosplugin.readthedocs.io/

License: Other

Makefile 0.23% Python 99.77%

python python3 python2 icinga icinga2 nagios nagios-plugin icinga-plugin

nagiosplugin's Issues

Documentation lacking around installation of plugin using nagiosplugin

Original report by jonathansd (Bitbucket: jonathansd, GitHub: jonathansd).

If you're installing a plugin that uses nagiosplugin, what installation procedures are needed for the plugin, given that nagiosplugin is not in the standard library?

metric strings should accept more keywords

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

Context.describe should also process context attributes as fmt_metric keywords.

Windows install instructions for documentation

Original report by Mayur Patil (Bitbucket: mayurp7, GitHub: mayurp7).

Installation of nagiosplugin 1.2.4:
OS: Win 7 SP1 64 bits
Python: 2.7.x 32 bits and 3.5.x 64 bits
Installation mode: pip

For python 2.7, C:\Python27\python.exe
For python 3.5, C:\Python35\python3.exe

For nagiosplugin installation, Python 2.7

C:>python -m pip install nagiosplugin

For nagiosplugin installation, Python 3.5

C:>pip install nagiosplugin

Class Summary could be reimplemented as an actual Abstract Base Class

Linting complains about the signatures of several methods in this class (no-self-use), and the issue could be resolved by reimplementing them as @staticmethod methods, but that might be API-breaking if anyone is making use of self references in subclassed implementations. If these can't be @staticmethod then they could be reimplemented as a proper ABC using the @abstractmethod decorator.

Class Context should be reimplemented as an actual Abstract Base Class

Linting complains about the Context.performance() signature, and the issue could be resolved by reimplementing this as a proper ABC, using the @abstractmethod decorator on the performance method.

Logged messages appear after the summary

Original report by Louis Sautier (Bitbucket: sbraz, GitHub: sbraz).

Hello,

I’m using the built-in logger to display additional messages based on verbosity, but I noticed that they are displayed after the probe function is done running, not in real time. This is a bit confusing because, in case of errors, the verbose messages appear after the summary. Could this behaviour be changed?

cannot subclass Performance

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

From: Lofton Newton [email protected]

I have a question about subclassing the nagiosplugin.performance.Performance class. I wanted to override the __str__ method, but ran into a problem (see traceback below). Looking over the documentation for super, it seems the second argument is expected to be an instance or subtype of the first argument. The way super is called in the __new__ method of the nagiosplugin.performance.Performance class, it doesn't seem possible to override class methods like below. If I swap the arguments in the super call in Performance.__new__, I no longer get the TypeError below. Is this implementation by design? My apologies if I'm misinterpreting something here.

#!python
class PluginPerformance(nagiosplugin.Performance):
    def __str__(self):
        # Do some stuff here, then fall back to the default formatting
        return super(PluginPerformance, self).__str__()

Traceback (most recent call last):
  File "./plugins/check_consumer_metrics", line 75, in <module>
    main()
  File "./plugins/check_consumer_metrics", line 71, in main
    check.main()
  File "/usr/local/lib/python2.7/site-packages/nagiosplugin/check.py", line 115, in main
    runtime.execute(self, verbose, timeout)
  File "/usr/local/lib/python2.7/site-packages/nagiosplugin/runtime.py", line 118, in execute
    with_timeout(self.timeout, self.run, check)
  File "/usr/local/lib/python2.7/site-packages/nagiosplugin/platform/posix.py", line 19, in with_timeout
    func(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/nagiosplugin/runtime.py", line 107, in run
    check()
  File "/usr/local/lib/python2.7/site-packages/nagiosplugin/check.py", line 100, in __call__
    self._evaluate_resource(resource)
  File "/usr/local/lib/python2.7/site-packages/nagiosplugin/check.py", line 86, in _evaluate_resource
    self.perfdata.append(str(metric.performance() or ''))
  File "/usr/local/lib/python2.7/site-packages/nagiosplugin/metric.py", line 104, in performance
    return self.contextobj.performance(self, self.resource)
  File "/tmp/kafka-monitoring/monitoring.py", line 171, in performance
    return PluginPerformance(metric.context, metric.value)
  File "/usr/local/lib/python2.7/site-packages/nagiosplugin/performance.py", line 51, in __new__
    return super(cls, Performance).__new__(
TypeError: super(type, obj): obj must be an instance or subtype of type

Wrong argument order for (all ?) super(,) calls

Original report by Vincent Danjean (Bitbucket: vdanjean, GitHub: vdanjean).

Hi,
I tried to subclass the Range class in one of my projects. I'm using the official Debian package (i.e. upstream version 1.2.4).
I see that __new__ in the Range class calls super(cls, Range) instead of super(Range, cls).

I checked out the latest sources and saw that, in the repo, the Range object no longer has the __new__ method. However, a quick look showed me that the remaining calls to super(,) have their parameters reversed. With basic usage this is not a problem (as both parameters refer to the same type), but it leads to problems when subclassing AND overriding the method.

Regards,
Vincent
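
The argument order described in both reports can be reproduced without nagiosplugin. The sketch below uses two hypothetical namedtuple-based classes, `Reversed` and `Fixed` (stand-ins, not the library's actual code), to show why `super(cls, Base)` goes unnoticed until a subclass is instantiated.

```python
import collections


class Reversed(collections.namedtuple("Reversed", "label value")):
    """Stand-in class using the reversed super() arguments from the reports."""

    def __new__(cls, label, value):
        # Works only while cls is Reversed itself.
        return super(cls, Reversed).__new__(cls, label, value)


class Fixed(collections.namedtuple("Fixed", "label value")):
    """Stand-in class using the conventional argument order."""

    def __new__(cls, label, value):
        # First the class that defines the method, then the class being built.
        return super(Fixed, cls).__new__(cls, label, value)


class ReversedChild(Reversed):
    def __str__(self):
        return "{0}={1}".format(self.label, self.value)


class FixedChild(Fixed):
    def __str__(self):
        return "{0}={1}".format(self.label, self.value)


base_ok = Reversed("load", 2)  # cls is Reversed, so both arguments match

try:
    ReversedChild("load", 2)
    subclass_error = None
except TypeError as exc:
    subclass_error = exc  # Reversed is not a subtype of ReversedChild

fixed_text = str(FixedChild("load", 2))
```

With the conventional order, the subclass can override `__str__` and still be constructed normally.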

UTF-8 output broken

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

Using umlauts in various output-related strings causes UnicodeDecodeErrors.

For example, using a ScalarContext like this:

nagiosplugin.ScalarContext(
  'cert_expiration_days',
  fmt_metric="Zeritifikat läuft in {value} Tagen ab."
)

results in

Traceback (most recent call last):
  File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
    self.run()
  File "C:\Python27\lib\threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "C:\Python27\lib\site-packages\nagiosplugin\runtime.py", line 108, in run
    self.output.add(check)
  File "C:\Python27\lib\site-packages\nagiosplugin\output.py", line 27, in add
    self.status = self.format_status(check)
  File "C:\Python27\lib\site-packages\nagiosplugin\output.py", line 41, in format_status
    summary_str = check.summary_str.strip()
  File "C:\Python27\lib\site-packages\nagiosplugin\check.py", line 143, in summary_str
    return self.summary.problem(self.results) or ''
  File "C:\Python27\lib\site-packages\nagiosplugin\summary.py", line 51, in problem
    return str(results.first_significant)
  File "C:\Python27\lib\site-packages\nagiosplugin\result.py", line 57, in __str__
    return '{0} ({1})'.format(desc, self.hint)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 13: ordinal not in range(128)

Same problem on Linux machines.
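
The traceback points at a conversion through the ASCII codec. Python 2 performs this conversion implicitly when a unicode value is formatted into a byte string; the sketch below is written for Python 3, where the same failure has to be triggered explicitly. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
message = u"Zeritifikat läuft in {value} Tagen ab.".format(value=5)

try:
    message.encode("ascii")
    failure = None
except UnicodeEncodeError as exc:
    failure = exc  # exc.start is the index of the first non-ASCII character

# Encoding explicitly as UTF-8 keeps the umlaut intact.
utf8_bytes = message.encode("utf-8")
```

The index reported in `failure.start` matches the "position 13" shown in the traceback above.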

Multiple calls of a guarded() function retain output

Let me first start by saying that this is not an issue in production, as I am fully aware there should be only one call of the actual guarded() code.

However, when unittesting the plugin I'm writing, I wanted to ensure the various verbose levels each gave me the right output. So I have one test class, with each verbose level a separate test function in that class and all of them getting run when I do my test.

Because of this approach, I of course need to mock out sys.exit() to allow the tests to finish.

I quickly noticed that the output of verbose level 3 included the one from verbose level 2, although there was no code path that could allow it to be generated.

Debugging the test run revealed that indeed, at the end of the verbose 2 test, something was not getting cleaned up. Further investigation led to the conclusion that the Runtime class has been designed as a singleton, i.e. the very same instance is always returned when guarded() asks for a Runtime. Furthermore, Runtime accumulates verbose output and then prints it, but does not empty out that storage afterward. Because the entire test run is a single invocation of the interpreter, this global Runtime instance doesn't get reset between tests and the output just accumulates.

I have implemented a pretty simple workaround, where I set nagiosplugin.runtime.Runtime.instance = None at the start of each test to force it to create a new Runtime each time.

Now, I don't really expect you to solve this for me, but I felt it worth noting in case anyone ever runs into a similar issue.

If you do feel like looking into a solution... I just have to ask, why exactly is Runtime a singleton? I can't see what purpose that serves, nor do I even see where else a Runtime instance is created except in guarded()...
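
The accumulation and the workaround can be illustrated with a small stand-in. The `Runtime` class below is a hypothetical singleton written for this sketch, not the library's implementation; only the `instance` attribute name is taken from the report.

```python
class Runtime(object):
    """Stand-in singleton; not the real nagiosplugin.runtime.Runtime."""

    instance = None

    def __new__(cls):
        if cls.instance is None:
            cls.instance = super(Runtime, cls).__new__(cls)
            cls.instance.messages = []
        return cls.instance


def run_check(message):
    """Simulate one guarded() call that buffers verbose output."""
    runtime = Runtime()
    runtime.messages.append(message)
    return list(runtime.messages)


first = run_check("verbose level 2")
second = run_check("verbose level 3")  # still contains the level 2 line

Runtime.instance = None  # the workaround: drop the shared instance
third = run_check("verbose level 3")
```

In a test suite, the reset line would typically live in a per-test setup hook so that every test starts with a fresh instance.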

Accessing results.by_state creates empty list and corrupts results.most_significant_state

If you access your results with by_state, it creates an empty list entry for the accessed value.

Example:
I use len(results.by_state[np.Critical]) > 0 in Summary.problem to print Critical results first, followed by Warnings. However, this creates an empty record for Critical in my tests (where only warning results exist, no critical ones), making them fail:

pprint((check.results.most_significant_state))
pprint((check.results.most_significant))

Result:

Critical(code=2, text='critical')
[]

This results in the whole check returning CRITICAL, despite having no results that are actually critical (only warnings are present). The issue is most likely due to self.by_state = collections.defaultdict(list).

Workaround:
Don't use results.by_state if you are not sure it contains elements; use results.most_significant_state == np.Critical instead. Still, this seems a rather critical problem to me.
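
The side effect comes from how `collections.defaultdict` behaves in general, and it can be shown without the library. In this sketch, plain strings stand in for the state objects and result entries.

```python
import collections

by_state = collections.defaultdict(list)
by_state["warning"].append("disk usage is 91%")
keys_before = sorted(by_state)

# Indexing a missing key runs the default factory and stores the result.
has_critical = len(by_state["critical"]) > 0
keys_after_indexing = sorted(by_state)

# .get() and membership tests do not insert anything.
safe = collections.defaultdict(list)
safe["warning"].append("disk usage is 91%")
has_critical_safe = bool(safe.get("critical"))
keys_after_get = sorted(safe)
```

Any code that later derives the most significant state from the dictionary keys would now see a "critical" key, even though its list is empty.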

Runtime singleton should be refactored out

The nagiosplugin.runtime.Runtime class is implemented as a singleton. This is an anti-pattern, and doesn't seem to serve any purpose, so should be refactored out of the code.

Since it's possible the side-effects this causes are relied upon by existing code, this should be done in a 2.x release.

nagiosplugin.state.Warn vs. nagiosplugin.state.Warning (considering nagiosplugin.state.Critical)

Hi,

I just stumbled across what I think is an inconsistency. Those states exist currently:

Ok
Warn
Critical
Unknown

I think this module should rather support Warning instead of Warn to comply with the exact wording used in the plugin API spec: https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/pluginapi.html

Backwards compatibility with existing checks should be provided by still supporting Warn.

For background, I am using a statement like this getattr(nagiosplugin.state, nagios_state_string, nagiosplugin.state.Unknown) to use an already existing Nagios state (nagios_state_string) in a custom Context class.

What do you think?
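
A minimal sketch of the lookup described above, assuming a stand-in namespace with the four names listed in the report (the string values are placeholders, not the library's state objects). It shows how the `getattr` fallback silently turns "Warning" into Unknown, and how an alias would keep both spellings working.

```python
import types

# Hypothetical stand-in for the state module, with only the names listed above.
state = types.SimpleNamespace(
    Ok="ok", Warn="warning", Critical="critical", Unknown="unknown"
)


def lookup(name):
    """Resolve a Nagios state string, falling back to Unknown."""
    return getattr(state, name, state.Unknown)


critical = lookup("Critical")
warning = lookup("Warning")  # no such attribute, so the default is returned

# Adding an alias keeps the old name working alongside the new one.
state.Warning = state.Warn
warning_with_alias = lookup("Warning")
```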

Duplicated text in docstring in state.py

Lines 5 and 6 of state.py have duplicated text in the docstring.

Support for extra-opts

It would be nice to provide a common/standard solution for plugins to support "Extra-Opts" (https://nagios-plugins.org/doc/extra-opts.html).

That is highly required to "hide" passwords from command-line arguments.
I think this project is a very good place to provide such a solution.

Support for extra-opts

It would be nice to provide a common/standard solution for plugins to support "Extra-Opts", as is done in the C and Perl plugins: https://nagios-plugins.org/doc/extra-opts.html.

That is highly required to hide passwords from command-line arguments, and for other tasks too.
I think this library is a very good place to provide such a solution, as it is used as a base for plugins.

My proposal is to add a common/standard solution for this task to the library, which already has an 'ArgumentParser' part. That part should become smart enough to properly handle the --extra-opts option and support reading options from a given configuration file.
Then any Python plugin that uses this library will be able to handle arguments defined in a file without (or with minor) code changes.

I'm not a Python developer, so I can't suggest an implementation.
Thanks for your project.
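
One way the proposal could be approached is sketched below with only the standard library. The helper `expand_extra_opts` is hypothetical and not part of nagiosplugin; it handles only the explicit `--extra-opts=section@file` form and ignores the default file locations that the linked specification describes.

```python
import argparse
import configparser
import os
import tempfile


def expand_extra_opts(argv, default_section):
    """Replace --extra-opts=[section]@file with the options found in the file."""
    pre = argparse.ArgumentParser(add_help=False)
    pre.add_argument("--extra-opts")
    known, rest = pre.parse_known_args(argv)
    if known.extra_opts is None:
        return rest
    section, _, path = known.extra_opts.partition("@")
    config = configparser.ConfigParser()
    config.read(path)
    extra = []
    for key, value in config.items(section or default_section):
        extra.extend(["--" + key, value])
    # File options go first so explicit command-line options override them.
    return extra + rest


# Usage: the password lives in an ini file instead of the command line.
with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as handle:
    handle.write("[check_example]\npassword = s3cret\n")

parser = argparse.ArgumentParser()
parser.add_argument("--host")
parser.add_argument("--password")
argv = ["--host", "db1", "--extra-opts=check_example@" + handle.name]
args = parser.parse_args(expand_extra_opts(argv, "check_example"))
os.unlink(handle.name)
```

Because the expansion happens before the plugin's own parser runs, the plugin code itself would not need to change.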

FatalContext

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

[email protected]:

I'd like to share one small bit of code which I found myself reusing
again and again, for situations in which you want to signal that
something went wrong, but cannot do that by crossing a (possibly
unknown) threshold or raising an exception (because an exception might
actually mean critical failure instead of unknown).
This is the idiom I use:

#!python
class FatalContext(nagiosplugin.Context):
    def __init__(self, name):
        super(FatalContext, self).__init__(name)

    def evaluate(self, metric, resource):
        return nagiosplugin.Result(nagiosplugin.state.Critical,
                                   "Fatal Error: %r" % metric.value,
                                   metric
            )

...

    archive_fatal_ctx = FatalContext('fatal_archive')

...
            try:
                sess = get_session(session_id)
            except IOError as e:
                yield nagiosplugin.Metric('fatal_%s' % self.name_postfix, repr(e))
                return
...

If you find this a useful scenario as well, go ahead and put it into the
nagiosplugin distribution.

Move example code out of the nagiosplugin package into its own path

This should probably go alongside the LICENSE, CONTRIBUTORS, etc. once #20 is fixed.

Support of pytest capsys

First off, great plugin. We were writing a bunch of our own logic and boilerplate for our Nagios-style tests, and I am considering moving over to this plugin to avoid copy-paste errors.

The one issue I've had is converting our tests. Our current modules use pytest's capsys fixture to compare the expected output of a check with test data. A sample from one of these tests is below.

[.....]
mock_get.return_value.text = test_data
expected_msg = "OK :load average is 2% and max is 10% | " + \
               "load_average=2;75;85;100; load_max=10;75;85;100;\n"
with pytest.raises(SystemExit) as e:
    cpdmain(test_args)
assert capsys.readouterr().out == expected_msg

What's interesting is that none of the pytest fixtures for capturing stdout seem to capture nagiosplugin's output. More interesting is that the output does appear in the test results summary, like this:

test_load.py::test_main FAILED [ 56%]LOAD OK - avg is 2 | avg=2;75;85;0;100 max=10;75;85;0;100

As far as I can tell, the reason appears to be related to the Runtime class's execute method, where it prints with the file keyword argument set to self.stdout, which itself is just sys.stdout.

Since writing to stdout is the default behavior of print, I'm curious whether it is explicitly specified to solve a specific problem, or whether this can be removed to allow test frameworks like pytest to capture the output.
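
A plausible mechanism, offered as an assumption rather than a confirmed diagnosis, is that a stream reference stored earlier keeps pointing at the old `sys.stdout` after a test framework swaps it. The sketch below uses two hypothetical writer classes and `contextlib.redirect_stdout` in place of pytest's capture.

```python
import contextlib
import io
import sys


class EagerWriter:
    """Stores the stream once, at construction time."""

    def __init__(self):
        self.stdout = sys.stdout

    def emit(self, text):
        print(text, file=self.stdout)


class LazyWriter:
    """Looks up sys.stdout every time it prints."""

    def emit(self, text):
        print(text)


original = io.StringIO()
captured = io.StringIO()

with contextlib.redirect_stdout(original):
    eager = EagerWriter()  # binds to `original`
    lazy = LazyWriter()
    with contextlib.redirect_stdout(captured):
        eager.emit("LOAD OK - eager")  # bypasses the inner capture
        lazy.emit("LOAD OK - lazy")  # lands in the inner capture
```

Only the lazy writer's line ends up in `captured`, which mirrors the symptom of output showing up in the test summary but not in `capsys`.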

Control output of scientific number notation

Original report by Thomas Wallrafen (Bitbucket: tom-userlike, GitHub: tom-userlike).

Hello,

is there a way to control whether metrics are printed in scientific notation or not?
I'm using nagiosplugin 1.2.4 under Python 2.7.14
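
For context, this is how plain Python renders a small float by default and with fixed-point format specifiers. It is a sketch of the formatting behaviour only; how such a format would be hooked into the library's output is not covered by the report.

```python
value = 0.00001

default_text = str(value)  # Python switches to exponent notation here
fixed_text = "{0:f}".format(value)  # fixed-point, six decimals by default
trimmed_text = "{0:.10f}".format(value).rstrip("0").rstrip(".")
```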

Range.scale(factor)

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

Provide a Range method that returns a new Range where the thresholds are linearly scaled by factor.

Topic guide: error handling

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

what CheckError is good for
unhandled exceptions
avoid second-order errors in Summary (e.g., by accessing non-existent results)

debian packaging

Original report by Arthur Lutz (Bitbucket: arthurlogilab, GitHub: arthurlogilab).

A debian package would be nice to deploy these plugins.

We're contemplating contributing the debian packaging.

Should be able to add dynamically created Contexts during Probe execution

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

From: "Joseph L. Casale" [email protected]

When a context is unknown until the time of check execution, how can one build a context and metric during the nagiosplugin.Check probe invocation?
It would be too late during execution of the nagiosplugin.Resource as far as I can see, unless I missed some way the resource can gain access to the nagiosplugin.Check instance?

Performance class should be a dataclass

The Performance class inherits from namedtuple and then implements additional methods. This seems like a prime candidate for a dataclass. Care should be taken that re-implementing this doesn't break the API.
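
The API risk mentioned above can be made concrete with a sketch. `PerfTuple` and `PerfData` are hypothetical classes with an invented, reduced field list; they only illustrate which tuple behaviours a dataclass does not provide by default.

```python
import collections
import dataclasses

# Hypothetical namedtuple-style class, standing in for the current design.
PerfTuple = collections.namedtuple("PerfTuple", "label value uom")


@dataclasses.dataclass(frozen=True)
class PerfData:
    """Hypothetical dataclass replacement with the same three fields."""

    label: str
    value: float
    uom: str = ""

    def __iter__(self):
        # Keeps tuple-style unpacking working for existing callers.
        return iter(dataclasses.astuple(self))


old_label, old_value, old_uom = PerfTuple("load", 2.0, "")
new_label, new_value, new_uom = PerfData("load", 2.0)

try:
    PerfData("load", 2.0)[0]
    index_error = None
except TypeError as exc:
    index_error = exc  # indexing is not available without __getitem__
```

Unpacking can be preserved with `__iter__`, but indexing and tuple comparison would need extra methods or would change behaviour.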

Flexible output formatting for Range violations

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

The way output is currently generated from ScalarContexts is hard to understand and customize.

The string "outside range {}" hard-coded in Range.violation is not appropriate for all uses.
ScalarMetric uses Range.violation unconditionally. This should be fixed.

Sample haproxy log, LICENSE, CONTRIBUTORS, etc not installed by package

Verbosity can't be changed for @nagiosplugin.guarded

Original report by Louis Sautier (Bitbucket: sbraz, GitHub: sbraz).

Hi, after looking at the code for the latest revision, it seems there is no way to change the verbosity of the exceptions trapped by @nagiosplugin.guarded without monkey-patching a lot of code.

Until version 1.2.3, I would force it with:

#!python

nagiosplugin.runtime.Runtime._verbose = 0

but since the attribute moved to another object, I don't see a simple way to do it.

Maybe adding arguments to the decorator could be the answer to this problem.
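
The decorator-argument idea could look roughly like the sketch below. This `guarded` is a simplified stand-in written for illustration (it returns a string instead of printing and exiting), and the `verbose` keyword is an assumed name.

```python
import functools


def guarded(original_function=None, verbose=None):
    """Stand-in decorator usable as @guarded or @guarded(verbose=...)."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                name = type(exc).__name__
                if verbose:
                    return "UNKNOWN: {0}: {1}".format(name, exc)
                return "UNKNOWN: {0}".format(name)

        return wrapper

    if original_function is not None:
        return decorate(original_function)  # used without parentheses
    return decorate  # used with keyword arguments


@guarded
def quiet_main():
    raise ValueError("boom")


@guarded(verbose=1)
def verbose_main():
    raise ValueError("boom")


quiet_output = quiet_main()
verbose_output = verbose_main()
```

Supporting both call styles keeps existing `@guarded` usage unchanged while letting callers opt in to a different verbosity.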

test_examples.py:2: DeprecationWarning: pkg_resources is deprecated as an API

Hi,
It looks like setuptools will eventually remove pkg_resources so the following will fail:

nagiosplugin/tests/test_examples.py

Line 19 in 050651a

sys.executable, pkg_resources.resource_filename(

Running the test suite results in:

tests/test_examples.py:2
  /var/tmp/portage/dev-python/nagiosplugin-1.3.3/work/nagiosplugin-1.3.3/tests/test_examples.py:2: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
    import pkg_resources

There is documentation here that explains how to migrate to importlib.resources: https://importlib-resources.readthedocs.io/en/latest/migration.html#pkg-resources-resource-filename
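
A minimal sketch of the migration, assuming Python 3.9 or later for `importlib.resources.files`. The helper name `resource_path` is made up, and the standard-library `json` package is used only so the example runs anywhere; the test suite would pass its own package and file name.

```python
import importlib.resources


def resource_path(package, name):
    """Roughly what pkg_resources.resource_filename(package, name) provided."""
    return importlib.resources.files(package).joinpath(name)


resource = resource_path("json", "__init__.py")

# as_file() yields a real filesystem path, even for zipped packages.
with importlib.resources.as_file(resource) as path:
    found = path.is_file()
    filename = path.name
```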

perfdata should be single line

Original report by Onur Yalazı (Bitbucket: yalazi, ).

Nagios and Icinga 2 use only single-line output to get check results and performance data. When there are multiple metrics, the format_perfdata method in output.py joins the metrics with newlines. This breaks perfdata collection. It should join them with spaces.

Sample output:

ONLINE OK - Online is 2528
| 'www.xxxx.com'=74;;;0 'www.yyyy.com'=3;;;0 'www.zzzz.com'=306;;;0
'www.ttttt.com'=80;;;0 'www.bbbbbb.com'=819;;;0
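
The requested behaviour amounts to joining the perfdata items with spaces. A sketch using values from the sample output above; the variable names are illustrative and this is not the library's `format_perfdata` code.

```python
status = "ONLINE OK - Online is 2528"
perfdata = [
    "'www.xxxx.com'=74;;;0",
    "'www.yyyy.com'=3;;;0",
    "'www.zzzz.com'=306;;;0",
]

# One status line: text, a single pipe, then space-separated metrics.
single_line = "{0} | {1}".format(status, " ".join(perfdata))
```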

result_cls wrapper for Context.evaluate()

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

I want to return just a state singleton from evaluate(). This would avoid hard-to-debug errors like this:

#!python
CEPH UNKNOWN: AttributeError: 'Ok' object has no attribute 'state'
Traceback (most recent call last):
  File "/home/ckauhaus/.local/lib64/python3.2/site-packages/nagiosplugin/runtime.py", line 38, in wrapper
    return func(*args, **kwds)
  File "puppet/modules/sys_cluster/files/check_ceph_health.py", line 135, in main
    check.main(args.verbose, args.timeout)
[...]
  File "/home/ckauhaus/.local/lib64/python3.2/site-packages/nagiosplugin/check.py", line 81, in _evaluate_resource
    self.results.add(metric.evaluate())
  File "/home/ckauhaus/.local/lib64/python3.2/site-packages/nagiosplugin/result.py", line 116, in add
    self.by_state[result.state].append(result)
AttributeError: 'Ok' object has no attribute 'state'

Currently, evaluate() must return a Result object. Change the base class to wrap Status singletons automatically using the result class.
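
The wrapping step could be as small as the sketch below. `State`, `Result`, and `wrap_result` are hypothetical stand-ins defined here for illustration; the real classes have more behaviour.

```python
import collections

# Stand-ins for the library's state and result types.
State = collections.namedtuple("State", "code text")
Result = collections.namedtuple("Result", "state hint metric")

Ok = State(0, "ok")


def wrap_result(value, metric=None):
    """Return value unchanged if it is a Result, else wrap the bare state."""
    if isinstance(value, Result):
        return value
    return Result(state=value, hint=None, metric=metric)


wrapped = wrap_result(Ok, metric="health")
untouched = wrap_result(Result(Ok, "all good", "health"))
```

With such a wrapper in the base class, code that later reads `result.state` would work for both return styles.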

Best practice for "missing" metrics?

Allow me to first explain a bit the work I'm doing. I'm trying to build a nagios check based on the contents of a logfile of a different program I wrote. Now I'm maybe being way too defensive, but I'm trying not to assume anything about the correct format of said log.

This of course means that the Resource I built can yield metrics that represent broken/unknown state. So far I have done this by yielding Metrics whose value is None, but for example, a ScalarContext chokes on that because of course None cannot be compared to the ranges.

So I was pondering writing my own ScalarContext and whether or not Metrics that are None even make sense, when I had the idea that I could just not yield those Metrics.

After all, you can provide as many Contexts as you want to a Check, only those that get matched up to a Metric will become Results.

But as I was continuing along that train of thought, it struck me that this would lead to an issue. Missing Metrics do not lead to Results, so when Check tries to determine its state, there is no way to have a missing Metric lead to a critical state, and there's nothing you can do in a Summary to fix it. Yes I could subclass Check, but at some point I start feeling like I'm rewriting the entire library...

Which to some degree I wouldn't mind doing, but at that point I would probably turn it into a pull request. But of course, if you as the maintainer disagree with the choices made, that would be a lot of wasted effort...

And so we reach the question of whether there should be an "official" stance on the proper practice in this case.

Option 1 - Metrics whose value is None. If this is chosen as best practice, I would recommend changing the ScalarContext to take this into account, so that people have a proper working example of the best practice.

Option 2 - Do not yield missing metrics. This would then require support for this chosen path in Check, so that a missing metric can properly lead to a Critical state. If this is not chosen, perhaps Check should still be adapted so that each Context has to be paired with a Metric (in addition to vice versa), to more clearly dissuade this approach.

Option 3 - Inspired by #3, actually. Yield a Metric with a different name, and have Contexts with both possible names ready to pair up and give a Result. This approach could do with some support, probably in Check.

Like I said, I'm willing to put my money where my mouth is and help code, but we should first find a consensus on what is the best practice in this case.

class Resource should be reimplemented as an actual Abstract Base Class

Linting complains about the Resource.probe() signature, and the issue could be resolved by reimplementing this as a proper ABC, using the @abstractmethod decorator on the probe method.
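
A sketch of what the ABC approach looks like in general. The `Resource` and `Load` classes here are simplified stand-ins, and the tuple returned by `probe` is a placeholder for metric objects.

```python
import abc


class Resource(abc.ABC):
    """Stand-in base class with an abstract probe method."""

    @abc.abstractmethod
    def probe(self):
        """Return or yield the metrics for this resource."""


class Load(Resource):
    def probe(self):
        return [("load1", 0.5)]


try:
    Resource()
    base_error = None
except TypeError as exc:
    base_error = exc  # abstract classes cannot be instantiated

metrics = Load().probe()
```

Note the behaviour change: instantiating the base class directly, or a subclass that does not override `probe`, would now fail at construction time.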

Default output only shows one failed metric

If more than one result is a failure, it currently arbitrarily picks one to display and does not display the others.

I believe it would be more useful for the administrator to see everything that is wrong in a single message, rather than fix one thing and then have to wait for the next check to learn that something else was wrong as well.

On top of that, the program can't decide which is the most important failed result (and indeed doesn't try, in first_significant).

I know I can write my own Summary class but it seems like it would be sensible to default to showing all problems.

I believe it should at least show all of the problems of the highest severity (but it would probably be okay to omit problems of lower severity). So if something is critical, display all the things that are critical.

For example, here you can see that queue_size and panic_size are both critical, but I won't know about the latter until I fix the former:

$ ./check_exim  -qc 0 -pc 0
EXIMQUEUE CRITICAL - queue_size is 2 (outside range 0:0) | oldest_age=600 panic_size=6;;0 queue_size=2;;0 queue_size_bounce=2 queue_size_frozen=2

$ ./check_exim  -pc 0
EXIMQUEUE CRITICAL - panic_size is 6 (outside range 0:0) | oldest_age=600 panic_size=6;;0 queue_size=2 queue_size_bounce=2 queue_size_frozen=2
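
The suggested default, showing every problem of the highest severity, can be sketched as follows. Plain `(code, text)` tuples stand in for result objects, the codes follow the usual Nagios convention (0 OK, 1 warning, 2 critical), and `problem_summary` is a hypothetical helper rather than a Summary method.

```python
results = [
    (2, "queue_size is 2 (outside range 0:0)"),
    (2, "panic_size is 6 (outside range 0:0)"),
    (0, "oldest_age is 600"),
]


def problem_summary(results):
    """Join the descriptions of all results that share the worst state."""
    worst = max(code for code, _ in results)
    return ", ".join(text for code, text in results if code == worst)


summary = problem_summary(results)
```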

output formatting guide

Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).

show how to:

provide custom fmt_metric
subclass Context and override describe()
subclass Summary

Support Nagios passive checks

I have a POC and will open a PR in a few weeks. Some docs will need to be written first.

https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/passivechecks.html

None passed as range defaults to "0:"

Original report by Gabriel Hege (Bitbucket: datensuppe, ).

When passing None as a warning or critical range to the Context or ScalarContext constructor, it uses "0:" for evaluation. This results in a warning or critical status being reported when the value of the Metric is negative. This is very counterintuitive, because one would expect no warning or critical alert to be generated, especially since the range string does not show in the performance data.

Here is a minimal example:

#!python

import nagiosplugin

class Sensors(nagiosplugin.Resource):
    def probe(self):
        return nagiosplugin.Metric('sens1', -3)

check = nagiosplugin.Check(Sensors(), nagiosplugin.ScalarContext('sens1', warning="-10:10"))
check.main()

which generates the following output:
SENSORS CRITICAL - sens1 is -3 (outside range 0:) | sens1=-3;-10:10
