Comments (14)
I often do something like "pssh -h myhosts -t 5 echo hi" for this purpose. I
believe that this would meet the needs that you describe; is there anything
that it's missing? Let me know what you think.
Original comment by [email protected]
on 20 Dec 2010 at 9:18
from parallel-ssh.
That doesn't really address my problem because if the machines are down, pssh
still exits with 0 so the caller can't determine if all the machines are up.
Normally it makes sense for pssh et al to exit(0) even if some commands fail,
but not always.
The more I think about it now, the more I think all the tools need an extra
option; something like "--exit-one-on-failure" that if passed will cause pssh
et al to exit(1) if any of the requests fail.
That would solve my immediate problem by allowing
"pssh -h hosts -t 10 --exit-one-on-failure exit 0" || doFailureCode()
Original comment by mdennis%[email protected]
on 21 Dec 2010 at 7:26
Hmm. Shouldn't pssh always exit with an error if there's a single failure? I
had thought that this was already happening. The current behavior sounds like
a bug to me; can you think of any particular reason that it should exit(0) even
if some commands fail?
Original comment by [email protected]
on 21 Dec 2010 at 8:34
That certainly isn't happening right now.
My argument for it returning 0 is to be able to distinguish between pssh having
a problem and the remote servers and/or ssh having a problem. This could also
be accomplished with using different return codes for each. For example 1 for
pssh failure (couldn't allocate memory, bad args, etc) and 2 for remote/ssh
failure (timeout, key rejected, connection refused, remote command exited with
non-zero return, etc). This is similar to how grep et al. work: if grep
matches anything, it exits 0; if it doesn't match anything, it exits 1.
I'm not against making it exit(somethingNotZero) if a ssh command failed by
default, but I figured that was pretty explicit functionality to have in there
so assumed it was done on purpose.
Original comment by mdennis%[email protected]
on 22 Dec 2010 at 12:47
I like the idea of having different error codes to discern between different
problems. Do you have any suggestions about what the error codes should mean?
One possibility would be to return the number of hosts that failed, perhaps
with a "-1" if it's some fatal early error (such as an invalid hosts file).
Any thoughts?
Original comment by [email protected]
on 9 Jan 2011 at 6:57
- Changed state: Accepted
Negative return values can be somewhat of an issue on most systems, as can
numbers above 255.
As examples try:
python -c 'import sys; sys.exit(-1)'; echo $?
and
python -c 'import sys; sys.exit(256)'; echo $?
http://www.gnu.org/software/libc/manual/html_node/Exit-Status.html may be
helpful to you here.
Since reporting a number of failures above 255 isn't possible, I don't think
that's a workable solution: it would limit the use of pssh to fewer than
256 nodes, which would be a real problem.
I would just do something simple like:
0: OK
1: pssh failure (couldn't execute a subprocess for one or more hosts for some
reason)
2: ssh and/or remote failure of one or more hosts (subprocess was executed but
returned non-zero)
Personally I don't think anything more is all that useful, as in most cases
there is nothing an automated caller could do to fix it and an interactive
caller can read the output.
Original comment by mdennis%[email protected]
on 9 Jan 2011 at 7:46
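The two python one-liners in the comment above can also be checked from a script. The following is a minimal sketch, assuming a POSIX system where the parent only sees the low 8 bits of the child's exit status; the helper name observed_exit_status is made up for illustration and is not part of pssh.

```python
import subprocess
import sys


def observed_exit_status(requested: int) -> int:
    """Start a child Python process that calls sys.exit(requested)
    and return the exit status as seen by the parent process."""
    child_code = f"import sys; sys.exit({requested})"
    completed = subprocess.run([sys.executable, "-c", child_code])
    return completed.returncode


if __name__ == "__main__":
    # Compare the requested status with what the parent actually observes.
    for requested in (0, 1, 255, 256, -1):
        print(requested, "->", observed_exit_status(requested))
```

On Linux, a requested status of 256 should come back as 0 and -1 as 255, which is why a raw count of failed hosts cannot be reported reliably once it reaches 256.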
Doesn't a return code of -1 turn into 255? We could return the number of
failed hosts up to 250 or something, with -1 being a pssh failure.
Or more in line with your proposal, it might make sense to have a different
return code if all ssh commands fail than if only some of the ssh commands fail.
I suppose either of these would be better than what we're doing right now, but
at the moment I don't have a strong preference.
Original comment by [email protected]
on 10 Jan 2011 at 2:24
Hmm. In addition to whether one or more processes failed, there is also the
issue of whether a process returned a non-0 exit status. I need to think about
this a bit more, but I think there are several different values of exit status
that we might want to provide. Here's what I'm thinking right now:
0: all commands successful and returned 0
1: at least one remote command returned a non-0 value (but all commands ran)
2: at least one ssh command returned 255 (connection error, bad password, etc.)
3: at least one ssh process timed out or was killed by a signal
4: internal pssh error
Analogous exit statuses would be used for prsync, pscp, etc. (although some
might not exit with a value of 1). Any thoughts? Is there anything else
missing from this list? I'll send an email to the mailing list to solicit
additional input.
Original comment by [email protected]
on 18 Jan 2011 at 11:59
The errors you mention are not necessarily mutually exclusive. Use a bitfield;
that is, assign powers of two to them and add them up.
Original comment by [email protected]
on 19 Jan 2011 at 3:13
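As a rough illustration of the bitfield suggestion above, here is a minimal sketch. The category names and bit assignments are assumptions chosen for the example, not values pssh uses.

```python
from enum import IntFlag


class Failure(IntFlag):
    """Hypothetical failure categories, one bit each (powers of two)."""
    REMOTE_NONZERO = 1      # a remote command returned a non-0 value
    SSH_ERROR = 2           # ssh itself reported an error
    TIMEOUT_OR_SIGNAL = 4   # an ssh process timed out or was killed
    INTERNAL = 8            # internal error in the calling tool


def combine(failures):
    """Fold the observed failure categories into a single exit status."""
    status = Failure(0)
    for failure in failures:
        status |= failure   # OR, so a repeated category is counted once
    return int(status)


def decode(status):
    """Return the list of categories whose bit is set in an exit status."""
    return [failure for failure in Failure if status & failure]
```

Here combine([Failure.REMOTE_NONZERO, Failure.TIMEOUT_OR_SIGNAL]) gives 5, and decode(5) recovers both categories. Plain addition gives the same result only if each category is added at most once, so the sketch uses bitwise OR.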
Indeed they aren't mutually exclusive--my thought was to return the max (most
severe). The bitfield idea is clever, but I'm not sure if I've come across it
in this context. Is there any precedent for using bitfields for exit status
codes? I know that bash provides an arithmetic operator for bitwise AND, but
overall it seems like there isn't much shell-level support for this. What do
you think?
Original comment by [email protected]
on 19 Jan 2011 at 5:39
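The "return the max (most severe)" idea, applied to the list proposed on 18 Jan, could be sketched roughly as follows. This is an illustration under stated assumptions, not the code that went into pssh: the names Status, classify and overall_status are hypothetical, and each host's result is assumed to be a (returncode, timed_out) pair in which a negative return code means "killed by a signal", as in Python's subprocess module.

```python
from enum import IntEnum


class Status(IntEnum):
    """Exit statuses from the proposal, ordered by severity."""
    OK = 0
    REMOTE_NONZERO = 1
    SSH_ERROR = 2
    TIMEOUT_OR_SIGNAL = 3
    INTERNAL_ERROR = 4


def classify(returncode, timed_out=False):
    """Map one host's ssh result to a status category."""
    if timed_out or returncode < 0:
        return Status.TIMEOUT_OR_SIGNAL
    if returncode == 255:       # ssh reserves 255 for its own errors
        return Status.SSH_ERROR
    if returncode != 0:
        return Status.REMOTE_NONZERO
    return Status.OK


def overall_status(results):
    """Return the most severe status over all (returncode, timed_out) pairs."""
    worst = max((classify(rc, to) for rc, to in results), default=Status.OK)
    return int(worst)
```

With this scheme a caller only learns about the worst category: a run with one remote failure and one timeout reports 3, and the remote failure is visible only in the per-host output.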
I've looked into this, and so far I haven't been able to find any other
programs that use bit fields for exit status. Combined with the fact that the
"test" command doesn't have any bitwise operators, I'm edging towards the
scheme from comment #8, with the plan to make the semantics clear in the man
page.
Original comment by [email protected]
on 19 Jan 2011 at 8:27
Meaningful exit status codes were added to pssh in commit 4ef1fea. The pssh
man page includes documentation on the subject. I still need to fix the other
commands. Please let me know if you see any problems or if you have any
last-minute feedback.
Original comment by [email protected]
on 21 Jan 2011 at 10:30
Okay, this is done for the others as well (although we still need to add man
pages for these). I'm going to mark this as closed, but please reopen it if
you see any concrete or subjective problems with the implementation. Thanks.
Original comment by [email protected]
on 21 Jan 2011 at 10:37
- Changed state: Done
Works in bash:
> bash -c '
> bash -c "exit 5"; xit=$?
> if (( $xit & 1 )); then echo "1 bit set"; fi
> if (( $xit & 2 )); then echo "2 bit set"; fi
> if (( $xit & 4 )); then echo "4 bit set"; fi
> '
1 bit set
4 bit set
Works in tcsh:
> /bin/tcsh -c '
> /bin/tcsh -c "exit 5"
> set xit=$?
> if ( ( $xit & 1 ) != 0 ) then
> echo "1 bit set"
> endif
> if ( ( $xit & 2 ) != 0 ) then
> echo "2 bit set"
> endif
> if ( ( $xit & 4 ) != 0 ) then
> echo "4 bit set"
> endif
> '
1 bit set
4 bit set
Generating the errors themselves does not require bitwise operators, just
addition.
Original comment by [email protected]
on 24 Jan 2011 at 5:18