Comments (29)

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
The traceback looks right to me. :)  Just kidding.  That definitely looks like a
problem.

It looks like it's a problem with Python <2.5 (not x86_64).  There's a feature
that psshlib uses that was introduced in Python 2.5, and the workaround I did
for Python 2.4 seems to be broken.  I think I have access to a machine with
Python 2.4 somewhere, so I should be able to test it out there.

Thanks for the report.  I'll let you know when I have something to test.

Original comment by [email protected] on 26 Feb 2010 at 6:27

  • Changed state: Started

from parallel-ssh.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Okay.  I've made a commit that should fix this crash in Python 2.4.  Would you
mind testing to see if this works for you, too?  If it works, I will release
version 2.1.1.  Let me know if you need instructions for cloning the Git
repository and testing.  Thanks for your help.

Original comment by [email protected] on 26 Feb 2010 at 8:10

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Your fix is full of win. Thank you!

$ pssh -i -H localhost date
[1] 19:02:40 [SUCCESS] localhost
Sat Feb 27 19:02:40 UTC 2010

Pete

Original comment by [email protected] on 27 Feb 2010 at 7:04

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Further issues, probably similar and probably not warranting a separate ticket,
but if you want me to break it out, I will.

When I try to run with more than one host, I see this on my Macbook:

$ pssh -i -H localhost -H localhost date
[1] 11:30:58 [SUCCESS] localhost
Sat Feb 27 11:30:58 PST 2010
[2] 11:30:58 [SUCCESS] localhost
Sat Feb 27 11:30:58 PST 2010

When I run on Python 2.4 (same system as above):

$ pssh -i -H localhost -H localhost date
Traceback (most recent call last):
  File "/usr/bin/pssh", line 5, in ?
    pkg_resources.run_script('pssh==2.1', 'pssh')
  File "/usr/lib/python2.4/site-packages/setuptools-0.6c11-py2.4.egg/pkg_resources.py", line 489, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.4/site-packages/setuptools-0.6c11-py2.4.egg/pkg_resources.py", line 1214, in run_script
    exec script_code in namespace, namespace
  File "/usr/bin/pssh", line 119, in ?

  File "/usr/bin/pssh", line 110, in do_pssh

  File "build/bdist.linux-x86_64/egg/psshlib/manager.py", line 61, in run
  File "build/bdist.linux-x86_64/egg/psshlib/manager.py", line 113, in start_tasks
  File "build/bdist.linux-x86_64/egg/psshlib/task.py", line 84, in start
  File "/usr/lib64/python2.4/subprocess.py", line 550, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.4/subprocess.py", line 988, in _execute_child
    data = os.read(errpipe_read, 1048576) # Exceptions limited to 1 MB
OSError: [Errno 4] Interrupted system call

Pete

Original comment by [email protected] on 27 Feb 2010 at 7:33

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
It looks like this is a bug in Python that was fixed today in Python 3.1 and 
2.6:

http://bugs.python.org/issue1068268

I wonder if there's any way we can work around this.
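
For reference, a common pattern for this class of failure is to retry a call that fails with EINTR. The sketch below uses modern Python syntax rather than the Python 2.4 of this report, the helper name `retry_on_eintr` is hypothetical, and it is not necessarily what the eventual pssh workaround does. The failing `os.read` in the traceback sits inside `subprocess` itself, so this only illustrates the idea.

```python
import errno
import os


def retry_on_eintr(func, *args):
    """Call func(*args), retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args)
        except OSError as e:
            # Any error other than "Interrupted system call" is re-raised.
            if e.errno != errno.EINTR:
                raise


# Usage: read from a pipe, tolerating interruption by a signal handler.
r, w = os.pipe()
os.write(w, b"ok")
data = retry_on_eintr(os.read, r, 1024)
os.close(r)
os.close(w)
```

Since Python 3.5 the interpreter retries most interrupted system calls on its own (PEP 475), so a helper like this mainly matters on older versions.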

Original comment by [email protected] on 1 Mar 2010 at 6:28

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
I think I have a workaround for the problem described in comments 4 and 5.
pemerson, would you please do a git pull again and see if this works for you?
Thanks.

Original comment by [email protected] on 1 Mar 2010 at 8:33

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024

Original comment by [email protected] on 1 Mar 2010 at 8:39

  • Changed title: pssh broken with Python 2.4

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
For me it looks like the first host succeeds, and then the second host just
hangs.  When I control-c it, I get this:

Traceback (most recent call last):
  File "/usr/bin/pssh", line 5, in ?
    pkg_resources.run_script('pssh==2.1', 'pssh')
  File "/usr/lib/python2.4/site-packages/setuptools-0.6c11-py2.4.egg/pkg_resources.py", line 489, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.4/site-packages/setuptools-0.6c11-py2.4.egg/pkg_resources.py", line 1214, in run_script
    exec script_code in namespace, namespace
  File "/usr/bin/pssh", line 119, in ?

  File "/usr/bin/pssh", line 110, in do_pssh

  File "build/bdist.linux-x86_64/egg/psshlib/manager.py", line 73, in run
  File "build/bdist.linux-x86_64/egg/psshlib/manager.py", line 174, in interrupted
  File "build/bdist.linux-x86_64/egg/psshlib/task.py", line 111, in interrupted
  File "build/bdist.linux-x86_64/egg/psshlib/task.py", line 99, in _kill
OSError: [Errno 3] No such process

Original comment by [email protected] on 2 Mar 2010 at 4:57

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Issue 17 has been merged into this issue.

Original comment by [email protected] on 2 Mar 2010 at 6:13

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
pemerson, is this with commit "7c6d668" ("work around
http://bugs.python.org/issue1068268")?  I'll keep on looking at it, but I'm not
getting any errors when I run the command you posted in comment #4.  Is there
anything you can think of that might make it easier for me to reproduce this
error?  Thanks.

Original comment by [email protected] on 2 Mar 2010 at 6:20

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
pemerson, I just pushed a commit that should stop the "OSError: [Errno 3] No
such process" error, but the real problem is that it was hanging to begin with.
I'm still trying to reproduce this hang.
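
The "No such process" error is what `os.kill` raises (errno ESRCH) when the target has already exited. A minimal sketch of tolerating that case follows, in modern Python syntax; the function name `kill_quietly` is hypothetical and this is not the actual pssh commit.

```python
import errno
import os
import signal


def kill_quietly(pid, sig=signal.SIGKILL):
    """Send sig to pid. Return False if the process no longer exists."""
    try:
        os.kill(pid, sig)
    except OSError as e:
        if e.errno == errno.ESRCH:
            # The process is already gone, so there is nothing left to kill.
            return False
        raise
    return True
```

Other errors, such as EPERM, still propagate, so a real permission problem is not silently hidden.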

Original comment by [email protected] on 2 Mar 2010 at 6:29

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
This was a nasty problem, but I think I've finally fixed it.  Please do a "git
pull", which should get you commit fe8306c, and let me know if you still see
problems.  Thanks.

Original comment by [email protected] on 2 Mar 2010 at 9:16

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Looks like it's working for me - thanks!

Can you maybe release this as a v2.1.1 when you get a chance?

Original comment by [email protected] on 2 Mar 2010 at 10:28

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
I would love to release this as version 2.1.1, but I'm a little nervous about
doing it before we hear from pemerson.

Original comment by [email protected] on 2 Mar 2010 at 10:34

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
pemerson, have you had a chance to try out the fix from yesterday?  Thanks.

Original comment by [email protected] on 3 Mar 2010 at 8:47

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
So strange, I replied, but it looks like gmail ate the outbound email.

All good here!

I think 12 seconds is far too long for a parallel ssh to two nodes,
but that's probably for a separate thread.

Here's the output:

$ time pssh -i -H localhost -H localhost whoami
[1] 02:39:44 [SUCCESS] localhost
pete
[2] 02:39:45 [SUCCESS] localhost
pete

real    0m12.921s
user    0m10.676s
sys     0m1.402s

Original comment by [email protected] on 4 Mar 2010 at 6:09

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
pemerson, it might be related, so maybe it should still go in this bug report.
Unfortunately, I'm not having much luck reproducing it.  On my Python 2.4
system, pssh does the parallel ssh to two nodes in 0.33 seconds on average.  Do
you have any other information that would help reproduce it?  If not, I could
whip up a custom commit with a bunch of print statements that might be able to
give more information.

I should probably go ahead and release pssh 2.1.1 now, to at least get it
working for people with Python 2.4, but let's keep on working on your problem
in this issue for now.

Original comment by [email protected] on 4 Mar 2010 at 6:52

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Well, it's definitely in the script, as this works with all due speed:

$ cat mypssh 
#!/usr/bin/python

import os

os.system("ssh -A localhost whoami")
os.system("ssh -A localhost whoami")

$ time ./mypssh 
pete
pete

real    0m1.236s
user    0m0.014s
sys     0m0.021s

Other than that, I'm not sure how I can help, but I'd be glad to run a custom
pssh when you can add in some debugging / timing statements.

Pete

Original comment by [email protected] on 4 Mar 2010 at 7:00

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
I've released PSSH 2.1.1.  At least people with Python 2.4 shouldn't see
crashes anymore.

pemerson, I just pushed a branch called "issue15".  Would you please do a "git
pull; git checkout issue15" and give me the output?  The debugging info is a
little crude, but if it turns out to be helpful, I might leave it in and add a
"--debug" option or something.

Original comment by [email protected] on 4 Mar 2010 at 7:40

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Did you git push issue15?

$ git clone git://aml.cs.byu.edu/pssh.git
Initialized empty Git repository in /home/pete/pssh/.git/
remote: Counting objects: 771, done.
remote: Compressing objects: 100% (423/423), done.
remote: Total 771 (delta 540), reused 452 (delta 323)
Receiving objects: 100% (771/771), 198.62 KiB, done.
Resolving deltas: 100% (540/540), done.
$ cd pssh
$ git checkout issue15
error: pathspec 'issue15' did not match any file(s) known to git.

Original comment by [email protected] on 4 Mar 2010 at 7:50

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Oops.  That should have been "git checkout origin/issue15".  Sorry for the 
mistake.

Original comment by [email protected] on 4 Mar 2010 at 7:55

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Ah, well, I'm still a git newb (but liking what I've seen so far)!

$ time pssh -i -H localhost -H localhost whoami
Thu Mar  4 20:04:32 2010 process starting
Thu Mar  4 20:04:38 2010 process started
Thu Mar  4 20:04:38 2010 process starting
Thu Mar  4 20:04:44 2010 process started
Thu Mar  4 20:04:44 2010 task still running
Thu Mar  4 20:04:44 2010 task still running
Thu Mar  4 20:04:44 2010 starting select
Thu Mar  4 20:04:44 2010 select finished
Thu Mar  4 20:04:44 2010 closing stderr
Thu Mar  4 20:04:44 2010 task still running
Thu Mar  4 20:04:44 2010 task still running
Thu Mar  4 20:04:44 2010 starting select
Thu Mar  4 20:04:44 2010 select finished
Thu Mar  4 20:04:44 2010 closing stdout
Thu Mar  4 20:04:44 2010 task finished
[1] 20:04:44 [SUCCESS] localhost
pete
Thu Mar  4 20:04:44 2010 task still running
Thu Mar  4 20:04:44 2010 task still running
Thu Mar  4 20:04:44 2010 starting select
Thu Mar  4 20:04:45 2010 select finished
Thu Mar  4 20:04:45 2010 task still running
Thu Mar  4 20:04:45 2010 starting select
Thu Mar  4 20:04:45 2010 select finished
Thu Mar  4 20:04:45 2010 closing stdout
Thu Mar  4 20:04:45 2010 task still running
Thu Mar  4 20:04:45 2010 starting select
Thu Mar  4 20:04:45 2010 select finished
Thu Mar  4 20:04:45 2010 closing stderr
Thu Mar  4 20:04:45 2010 task still running
Thu Mar  4 20:04:45 2010 starting select
Thu Mar  4 20:04:45 2010 handling sigchld
Thu Mar  4 20:04:45 2010 select interrupted
Thu Mar  4 20:04:45 2010 task finished
[2] 20:04:45 [SUCCESS] localhost
pete

real    0m13.008s
user    0m10.684s
sys     0m1.394s

Original comment by [email protected] on 4 Mar 2010 at 8:06

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Fascinating.  I put a timestamp just before the Popen and just after the Popen
on a whim.  I really didn't think there was a chance that the Popen would
actually be hanging.  I have no idea why the Popen call would hang for 6
seconds.  Do you have any ideas?
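
For anyone wanting to run the same kind of check, here is a minimal sketch of timestamping around a Popen call. The helper names `log` and `timed_popen` are hypothetical, and this is not the code on the issue15 branch; it only mirrors the "process starting" / "process started" messages seen in the output.

```python
import subprocess
import sys
import time


def log(message):
    # time.ctime() yields timestamps like "Thu Mar  4 20:04:32 2010".
    print("%s %s" % (time.ctime(), message))


def timed_popen(args, **kwargs):
    """Start a subprocess and report how long the Popen call itself took."""
    log("process starting")
    start = time.perf_counter()
    proc = subprocess.Popen(args, **kwargs)
    elapsed = time.perf_counter() - start
    log("process started")
    return proc, elapsed


# Usage: launch a trivial child and show the time spent inside Popen.
proc, elapsed = timed_popen([sys.executable, "-c", "pass"])
proc.wait()
print("Popen took %.3f seconds" % elapsed)
```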

Original comment by [email protected] on 4 Mar 2010 at 8:39

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
This probably isn't relevant, but what do you get if you do this in the Python 
interactive interpreter:

os.sysconf("SC_OPEN_MAX")

Original comment by [email protected] on 4 Mar 2010 at 9:40

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
$ python
Python 2.4.3 (#1, Sep  3 2009, 15:37:37) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.sysconf("SC_OPEN_MAX")
1000000

Original comment by [email protected] on 4 Mar 2010 at 9:44

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
What did you do to your system? :)  On mine, SC_OPEN_MAX is 4096.

It looks like what's happening is it's taking forever to close all open file
descriptors.  In Python 2.6, they added os.closerange to make this more
efficient when the maximum file descriptor is really high.  To improve
performance for older versions of Python, we could set FD_CLOEXEC with fcntl on
all of our file descriptors.  For more information on the problem, see:

http://bugs.python.org/issue1663329

I'll try to see how bad it is to set FD_CLOEXEC as a long-term workaround.
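
A minimal sketch of setting FD_CLOEXEC with fcntl, assuming a Unix system. The helper name `set_cloexec` is hypothetical and this is not necessarily how the pssh commit does it.

```python
import fcntl
import os


def set_cloexec(fd):
    """Mark fd close-on-exec so it is closed automatically in exec'd children."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


# Usage: mark both ends of a pipe so a spawned child does not inherit them.
r, w = os.pipe()
for fd in (r, w):
    set_cloexec(fd)
```

With the flag set on every descriptor the program opens, a child started without close_fds does not inherit them, so nothing has to loop over every possible descriptor number up to SC_OPEN_MAX. In Python 3.4 and later, newly created descriptors are non-inheritable by default (PEP 446), so this is mostly relevant to older versions.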

Original comment by [email protected] on 4 Mar 2010 at 10:11

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Okay, try running the latest master (with the "set FD_CLOEXEC" commit), and see
if that goes more quickly.

Original comment by [email protected] on 4 Mar 2010 at 10:30

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
Oh, HUGE win. Well done!

$ time pssh -i -H localhost -H localhost whoami
[1] 23:41:41 [SUCCESS] localhost
pete
[2] 23:41:41 [SUCCESS] localhost
pete

real    0m0.895s
user    0m0.075s
sys     0m0.031s

Original comment by [email protected] on 4 Mar 2010 at 11:43

GoogleCodeExporter avatar GoogleCodeExporter commented on August 23, 2024
I'm glad I could make you happy. :)  So why does your system have such a high
maximum file descriptor number?

Anyway, this fix will show up in version 2.2, which I'm guessing is about a
month away.  One of the main holdups there is man pages; if you want 2.2 to
happen more quickly, feel free to help with issue #10. :)

Original comment by [email protected] on 5 Mar 2010 at 3:57

  • Changed state: Verified
