Comments (11)

jimklimov commented on September 26, 2024

Nice find! Just in case: is this with the NUT 2.8.0 release (e.g. a package)? I think one of the first regressions found and fixed after it dealt with misinterpreted states. I wonder whether this issue is different.

Can you please check if the problem is reproducible with current master (or 2.7.4)?

from nut.

Bomorav commented on September 26, 2024

I use the 2.8.0 rpm package from EPEL, patched and recompiled with the cyberpower snmp driver from current master, because the driver in the 2.8.0 release is not functional.

Sorry, I'm not able to check it with current master.

For my needs, I solved this with my own patch, which adds a status flag for the OFF state and adds a condition to the recalc power value code in upsmon.

jimklimov commented on September 26, 2024

Thanks. So the driver is from some recent master and should include those earlier fixes (including your PRs of late), right?

I think the fix I had in mind earlier was https://github.com/networkupstools/nut/pull/1432/files and probably does not impact detection of "OFF" state after all (but could impact the "NULL" state, in case this is what might bite you - IF you run the driver built from 2.8.0 code plus your patches, but not recent-ish master plus your patches).

Bomorav commented on September 26, 2024

Yes, I'm using the latest driver, which works fine and returns the correct UPS states.

https://github.com/networkupstools/ConfigExamples/releases/download/book-3.0-20230319-nut-2.8.0/ConfigExamples.pdf
Pages 22-23

The value of MINSUPPLIES is the key element in determining if a server with multiple power supplies should shut down. When all the UPS units can be contacted, and when their ups.status values are known, then it is the count A of those that are active, that is without [LB], which is determinant.

So if my UPS in sleep mode can be contacted and its ups.status value is known ([OFF]), it is considered as active by upsmon?
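The counting rule quoted above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct demo_ups`, `demo_active_supplies()` and `demo_must_shut_down()` are hypothetical names, not upsmon's own, and the `skip_off` switch exists only to compare the two possible answers to the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one MONITOR entry in upsmon. */
struct demo_ups {
	const char *name;
	unsigned int power_value;  /* number of supplies fed by this UPS */
	bool critical;             /* FSD, or OB together with LB */
	bool off;                  /* ups.status contains OFF */
};

/* Sum the power values of every UPS that still counts as active.
 * With skip_off == false an OFF unit is counted like any other
 * non-critical unit; with skip_off == true it is left out. */
static unsigned int demo_active_supplies(const struct demo_ups *list,
	size_t count, bool skip_off)
{
	unsigned int active = 0;
	size_t i;

	assert(list != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (list[i].critical)
			continue;
		if (skip_off && list[i].off)
			continue;
		active += list[i].power_value;
	}

	return active;
}

/* A shutdown is due once fewer supplies are active than MINSUPPLIES. */
static bool demo_must_shut_down(unsigned int active, unsigned int minsupplies)
{
	return active < minsupplies;
}
```

With two UPSes of power value 1 and MINSUPPLIES 1, one unit critical and the other OFF, the first variant still counts one active supply and does not shut down, while the second counts none and does.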

jimklimov commented on September 26, 2024

Sorry about the delay, finally got a moment to look at this intently. While scrolling through upsmon.c::parse_status() I found that it considers several status key words, but not "OFF":

nut/clients/upsmon.c

Lines 1887 to 1939 in 91b3ee0

/* deal with the contents of STATUS or ups.status for this ups */
static void parse_status(utype_t *ups, char *status)
{
	char	*statword, *ptr;

	clear_alarm();

	upsdebugx(2, "%s: [%s]", __func__, status);

	/* empty response is the same as a dead ups */
	if (status == NULL || status[0] == '\0') {
		ups_is_gone(ups);
		return;
	}

	ups_is_alive(ups);

	/* clear these out early if they disappear */
	if (!strstr(status, "LB"))
		clearflag(&ups->status, ST_LOWBATT);
	if (!strstr(status, "FSD"))
		clearflag(&ups->status, ST_FSD);

	statword = status;

	/* split up the status words and parse each one separately */
	while (statword != NULL) {
		ptr = strchr(statword, ' ');
		if (ptr)
			*ptr++ = '\0';

		upsdebugx(3, "parsing: [%s]", statword);

		if (!strcasecmp(statword, "OL"))
			ups_on_line(ups);
		if (!strcasecmp(statword, "OB"))
			ups_on_batt(ups);
		if (!strcasecmp(statword, "LB"))
			ups_low_batt(ups);
		if (!strcasecmp(statword, "RB"))
			upsreplbatt(ups);
		if (!strcasecmp(statword, "CAL"))
			ups_cal(ups);

		/* do it last to override any possible OL */
		if (!strcasecmp(statword, "FSD"))
			ups_fsd(ups);

		update_crittimer(ups);

		statword = ptr;
	}
}
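To see which words of a given ups.status value fall through the loop above, a small stand-alone check can help. This is a sketch, not part of NUT: `demo_handled`, `demo_is_handled()` and `demo_count_ignored()` are hypothetical names, and it compares with `strcmp()` where the listing uses the case-insensitive `strcasecmp()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The status words that the parse_status() listing reacts to. */
static const char *const demo_handled[] = {
	"OL", "OB", "LB", "RB", "CAL", "FSD"
};

static bool demo_is_handled(const char *word)
{
	size_t i;

	for (i = 0; i < sizeof(demo_handled) / sizeof(demo_handled[0]); i++) {
		/* Case-sensitive here; the listing uses strcasecmp() */
		if (strcmp(word, demo_handled[i]) == 0)
			return true;
	}

	return false;
}

/* Count the space-separated words that the loop would silently skip.
 * The buffer is split in place, the same way the listing does it. */
static size_t demo_count_ignored(char *status)
{
	size_t ignored = 0;
	char *word = status, *next;

	assert(status != NULL);

	while (word != NULL) {
		next = strchr(word, ' ');
		if (next)
			*next++ = '\0';

		if (*word != '\0' && !demo_is_handled(word))
			ignored++;

		word = next;
	}

	return ignored;
}
```

Feeding it "OL OFF" reports one ignored word and "OL BYPASS OFF" reports two, whereas "OB LB" is fully handled.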

The "OFF" value is, however, set via status_set() extensively in the codebase, and has long been defined in our standards as "UPS is offline and is not supplying power to the load":

nut/docs/new-drivers.txt

Lines 210 to 261 in 91b3ee0

Status data
~~~~~~~~~~~

UPS status flags like on line (OL) and on battery (OB) live in
ups.status. Don't manipulate this by hand. There are functions which
will do this for you.

	status_init() -- before doing anything else

	status_set(val) -- add a status word (OB, OL, etc)

	status_commit() -- push out the update

Possible values for status_set:

	OL      -- On line (mains is present)
	OB      -- On battery (mains is not present)
	LB      -- Low battery
	HB      -- High battery
	RB      -- The battery needs to be replaced
	CHRG    -- The battery is charging
	DISCHRG -- The battery is discharging (inverter is providing load power)
	BYPASS  -- UPS bypass circuit is active -- no battery protection is available
	CAL     -- UPS is currently performing runtime calibration (on battery)
	OFF     -- UPS is offline and is not supplying power to the load
	OVER    -- UPS is overloaded
	TRIM    -- UPS is trimming incoming voltage (called "buck" in some hardware)
	BOOST   -- UPS is boosting incoming voltage
	FSD     -- Forced Shutdown (restricted use, see the note below)

Anything else will not be recognized by the usual clients. Coordinate
with the nut-upsdev list before creating something new, since there will be
duplication and ugliness otherwise.

[NOTE]
==============================================================================
- upsd injects `FSD` by itself following that command by a primary upsmon
process. Drivers must not set that value, apart from specific cases (see
below).
- As an exception, drivers may set `FSD` when an imminent shutdown has been
detected. In this case, the "on battery + low battery" condition should not be
met. Otherwise, setting status to `OB LB` should be preferred.
- the `OL` and `OB` flags are an indication of the input line status only.
- the `CHRG` and `DISCHRG` flags are being replaced with
`battery.charger.status`. See the linkdoc:user-manual[NUT command and
variable naming scheme,nut-names] for more information.
==============================================================================

So currently my best guess is that you could benefit from tinkering with the code to add handling for such a situation (UPS manageable, but its load is powered off), and if there's an iteration that works better than now - propose a PR ;)

I think I remember some discussions about it: a very closely related case is support for manageable outlet groups (or individual outlets, especially on ePDUs), where the UPS overall may be "OL" but the electric socket your server knows it is fed from is "OFF". Such situations could need additional MONITOR parameters or some extended syntax for powerdevice@hostname to monitor that outlet group. So maybe that got bogged down in talks and neither case got addressed yet.

On a related note, the BYPASS case does not seem to be handled here either. For practical purposes it could be similar to LB, I suppose: there's maintenance on the UPS, the load is fed from the wall, and power might disappear at any moment without clear notice. At least, if the UPS is no longer manageable, we might want to consider it dead, same as loss of connection during an outage, maybe? (An unplugged comms cable would be considered a power cut, though.) Not sure here, either...

For that matter, the internal upsmon bitmask values also do not currently cater for these states:

nut/clients/upsmon.h

Lines 23 to 34 in 91b3ee0

/* flags for ups->status */
#define ST_ONLINE (1 << 0) /* UPS is on line (OL) */
#define ST_ONBATT (1 << 1) /* UPS is on battery (OB) */
#define ST_LOWBATT (1 << 2) /* UPS has a low battery (LB) */
#define ST_FSD (1 << 3) /* primary has set forced shutdown flag */
#define ST_PRIMARY (1 << 4) /* we are the primary (manager) of this UPS */
#define ST_MASTER ST_PRIMARY /* legacy alias */
#define ST_LOGIN (1 << 5) /* we are logged into this UPS */
#define ST_CLICONNECTED (1 << 6) /* upscli_connect returned OK */
#define ST_CAL (1 << 7) /* UPS calibration in progress (CAL) */

Bomorav commented on September 26, 2024

Thanks.
This OFF state is probably specific to cyberpower snmp UPSes. I modified upsmon.c for my needs; here is my modification against 2.8.0 in case it is useful for someone (sorry, I'm not a programmer :-))

diff -uNrp nut-2.8.0/clients/upsmon.h nut-2.8.0.p/clients/upsmon.h
--- nut-2.8.0/clients/upsmon.h	2022-04-27 00:03:31.000000000 +0200
+++ nut-2.8.0.p/clients/upsmon.h	2023-04-27 15:08:15.044584966 +0200
@@ -31,6 +31,7 @@
 #define ST_LOGIN       (1 << 5)       /* we are logged into this UPS              */
 #define ST_CONNECTED   (1 << 6)       /* upscli_connect returned OK               */
 #define ST_CAL         (1 << 7)       /* UPS calibration in progress (CAL)        */
+#define ST_OFF         (1 << 8)       /* UPS is off or on sleep (OFF)             */
 
 /* required contents of flag file */
 #define SDMAGIC "upsmon-shutdown-file"
diff -uNrp nut-2.8.0/clients/upsmon.c nut-2.8.0.p/clients/upsmon.c
--- nut-2.8.0/clients/upsmon.c	2022-04-23 13:56:06.000000000 +0200
+++ nut-2.8.0.p/clients/upsmon.c	2023-04-27 23:29:00.418401200 +0200
@@ -442,6 +442,7 @@ static void ups_on_batt(utype_t *ups)
 	do_notify(ups, NOTIFY_ONBATT);
 	setflag(&ups->status, ST_ONBATT);
 	clearflag(&ups->status, ST_ONLINE);
+	clearflag(&ups->status, ST_OFF);
 }
 
 static void ups_on_line(utype_t *ups)
@@ -463,6 +464,7 @@ static void ups_on_line(utype_t *ups)
 
 	setflag(&ups->status, ST_ONLINE);
 	clearflag(&ups->status, ST_ONBATT);
+	clearflag(&ups->status, ST_OFF);
 }
 
 /* create the flag file if necessary */
@@ -809,7 +811,7 @@ static void recalc(void)
 		/* crit = (FSD) || (OB & LB) > HOSTSYNC seconds */
 		if (is_ups_critical(ups))
 			upsdebugx(1, "Critical UPS: %s", ups->sys);
-		else
+		else if (!flag_isset(ups->status, ST_OFF))
 			val_ol += ups->pv;
 
 		ups = ups->next;
@@ -1697,7 +1699,11 @@ static void parse_status(utype_t *ups, c
 			upsreplbatt(ups);
 		if (!strcasecmp(statword, "CAL"))
 			ups_cal(ups);
-
+		if (!strcasecmp(statword, "OFF")) {
+			setflag(&ups->status, ST_OFF);
+			clearflag(&ups->status, ST_ONLINE);
+			clearflag(&ups->status, ST_ONBATT);
+		}
 		/* do it last to override any possible OL */
 		if (!strcasecmp(statword, "FSD"))
 			ups_fsd(ups);

jimklimov commented on September 26, 2024

Seems reasonable, after that research :)

Care to post a PR, to log the changeset in your name? :)

As for "CPS SNMP only" - no, there are many (sub)drivers with different techs that have a status_set("OFF") or equivalent in their sources - e.g. many USB UPSes stay connected even if their load is administratively off, it seems. Even if it were just one device type... you've stepped into this problem - chances are, someone else will.

Bomorav commented on September 26, 2024

I'll leave it up to you to post a PR, no need to mention my name.

I have been using this code with dual Cyberpower UPS for several months now and have not experienced a problem.

jimklimov commented on September 26, 2024

Several months? So, you've actually come to similar conclusions, and proved them IRL? Nice to have convergences like this, builds confidence up a bit :)

jimklimov commented on September 26, 2024

Hello @Bomorav, would you have a chance to check how the proposed change in https://github.com/jimklimov/nut/tree/issue-2044 behaves for you?

I hope it would still deduct the power value when one of those UPSes reports itself as OFF, return the counter when that UPS goes back ON, and would report this change (and back) among notifications.
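The behaviour hoped for here can be written down as a tiny state model. It is a sketch under assumptions and does not reflect the code in that branch: `struct demo_feed`, the `DEMO_EV_*` events and both functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical notifications emitted on a change of the OFF state. */
enum demo_event { DEMO_EV_NONE, DEMO_EV_OFF, DEMO_EV_NOTOFF };

/* Hypothetical, simplified record for one monitored UPS. */
struct demo_feed {
	unsigned int power_value;
	bool off;
};

/* Record the latest OFF reading; notify only when it changes. */
static enum demo_event demo_report_off(struct demo_feed *ups, bool off_now)
{
	assert(ups != NULL);

	if (off_now == ups->off)
		return DEMO_EV_NONE;

	ups->off = off_now;
	return off_now ? DEMO_EV_OFF : DEMO_EV_NOTOFF;
}

/* Power value this UPS contributes to the running total. */
static unsigned int demo_counted_power(const struct demo_feed *ups)
{
	assert(ups != NULL);
	return ups->off ? 0 : ups->power_value;
}
```

A UPS with power value 1 contributes 1, then 0 once it reports OFF (with one notification), and 1 again after it comes back (with a second notification); repeated identical readings stay silent.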

jimklimov commented on September 26, 2024

Behaved well in local tests simulated with NIT, PR merged.
