Comments (13)
@rhc54 Sure thing. I opened #12626
from ompi.
FWIW, the cmd line parser is fine. The problem is that the shell removes your quotes when passing the cmd line into mpirun
. In version v4.x, we added quotes around every single argv before executing it. We didn't do that in v5.
I'm experimenting with adding the quotes again to see what problems (if any) it creates. IIRC, there were times when the quotes raised issues, but perhaps my memory has faded.
from ompi.
from ompi.
Well, that's an opinion and may be a fair one - but still just one opinion.
My point was only that this has nothing to do with the cmd parser - it is accurately parsing what the shell gave us. The cmd mpirun
is executing is accurate:
argv[0]: bash
argv[1]: -c
argv[2]: $@
argv[3]:
argv[4]: echo
argv[5]: 1 2 3
The problem is that echo
takes that argv and echo's the characters one at a time, ignoring the spaces between them. This is an accurate reflection of the real behavior:
$ echo 1 2 3
1 2 3
So in that sense, mpirun
is in fact generating the same output as the bare code would have given.
To get your behavior, I have to do:
$ echo '1 2 3'
1 2 3
which is not what the shell gave us.
The questions therefore are: should we always assume quotes for all argv strings, and what issues does that create? Or is there another way one could format your cmd line so that the shell actually gives the quotes to mpirun
?
from ompi.
Fiddling around a bit, I can get mpirun
to replicate your desired behavior...but only if I add quotes around the $@
argv element and nothing else. Adding quotes to any other argv results in either an error (e.g., if putting quotes around bash
) or the incorrect output (e.g., if putting quotes around the 1 2 3
element, which causes echo
to include the quotes in its output).
We probably need to look more closely at what v4.1 is doing - could be you are just getting lucky over there. This increasingly feels like you may not have the right quoting around your cmd line elements.
from ompi.
my 0.02US$
no Open MPI involved:
$ strace -f -e execve bash -c '"$@"' '' echo '1 2 3'
execve("/usr/bin/bash", ["bash", "-c", "\"$@\"", "", "echo", "1 2 3"], 0xffffffffdb10 /* 118 vars */) = 0
1 2 3
+++ exited with 0 +++
With Open MPI 4.1.6
$ strace -f -e execve ~/local/openmpi-4.1.6/bin/mpirun -np 1 bash -c '"$@"' '' echo '1 2 3'
execve("/home/rist/r00018/local/openmpi-4.1.6/bin/mpirun", ["/home/rist/r00018/local/openmpi-"..., "-np", "1", "bash", "-c", "\"$@\"", "", "echo", "1 2 3"], 0xffffffffdac8 /* 118 vars */) = 0
[pid 490291] execve("/usr/bin/bash", ["bash", "-c", "\"$@\"", "", "echo", "1 2 3"], 0x62fad0 /* 168 vars */) = 0
1 2 3
With Open MPI main
$ strace -f -e execve ~/local/ompi/bin/mpirun -np 1 bash -c '"$@"' '' echo '1 2 3'
execve("/home/rist/r00018/local/ompi/bin/mpirun", ["/home/rist/r00018/local/ompi/bin"..., "-np", "1", "bash", "-c", "\"$@\"", "", "echo", "1 2 3"], 0xffffffffdad8 /* 118 vars */) = 0
execve("/home/rist/r00018/local/ompi/bin/prterun", ["/home/rist/r00018/local/ompi/bin"..., "-np", "1", "bash", "-c", "\"$@\"", "", "echo", "1 2 3"], 0x4a0520 /* 123 vars */) = 0
[pid 490301] execve("/usr/bin/bash", ["bash", "-c", "$@", "", "echo", "1 2 3"], 0x58e800 /* 156 vars */) = 0
1 2 3
I note that execve("/usr/bin/bash", ...)
is invoked with the exact same parameters if bash
is invoked directly or via Open MPI 4.1.6 mpirun
.
However, with the main branch, \"$@\"
has been replaced (i guess by prrte
) with \"$@\"
At first glance, this suggests the surrounding quotes have been removed, but if I read your comment correctly, they have been removed (as we used to) but they were never added back.
from ompi.
@rhc54 from prte.c
main()
pargv = pmix_argv_copy_strip(argv); // strip any incoming quoted arguments
what is the rationale for stripping quoted arguments?
also, pmix_argv_copy_strip()
assumes the last element of argv
is NULL
.
Are you positive this is the case forargv
here (e.g. the second argument of main
)?
I would think that is not the case and we need to use argc
to know where argv
ends.
from ompi.
what is the rationale for stripping quoted arguments?
We've had problems in the past with quoted values as arguments to params - e.g., MCA params. However, I have no problem with just telling people in those scenarios to fix their cmd line. As I said above, it isn't clear to me that there is a perfect answer here. Whether it is mpirun
or a simple bash script launcher, there is always a problem with quotes passing thru layers of indirection.
also, pmix_argv_copy_strip() assumes the last element of argv is NULL.
Are you positive this is the case forargv here (e.g. the second argument of main)?
Last I checked, we always assume argv
is NULL
terminated across the code base - AFAIK, that's part of the C standard. FWIW, this is what I found when searching for it:
Yes. The non-null pointers in the argv array point to C strings, which are by definition null terminated.
The C Language Standard simply states that the array members "shall contain pointers to strings"
(C99 §5.1.2.2.1/2). A string is "a contiguous sequence of characters terminated by and including
the first null character" (C99 §7.1.1/1), that is, they are null terminated by definition.
Further, the array element at argv[argc] is a null pointer, so the array itself is also, in a sense,
"null terminated."
If that isn't true, then there are a lot of places in PMIx, PRRTE, and OMPI that are impacted...and have been for 20+ years. 😄
from ompi.
See openpmix/openpmix#3353 and openpmix/prrte#1984. As I said, we'll try it this way for now and see what problems we encounter.
from ompi.
That's embarrassing, but this is something I have been ignoring for the past 20+ years, and every single time I wrote a parser I wished I could simply pass argv
to execv()
...
Well, better late than never :-)
Thanks for the pointer!
from ompi.
@rhc54 Thanks for the quick fix. It would be great if we can get new prrte 3.0.x and pmix 5.0.x releases in the 1st week of July - we plan to release ompi 5.0.4 in the week following.
from ompi.
I can do that, but first I'd need to see updates in OMPI so I can check MTT. Otherwise, I can't tell if the release branches are in a "good" state. Can someone "pull" the branches?
from ompi.
This should be fixed in 5.0.4 release scheduled in 7/2024
from ompi.
Related Issues (20)
- Remove script wrappers in v6.0.x HOT 3
- 5.0.3: hostfile help file is misnamed HOT 3
- Warnings in oshmem HOT 3
- lower coll accelerator priority
- Improve documentation of the `coll tuned` component
- mpirun V5.0.3 hangs running hello_world in two nodes HOT 15
- Reduce collectives latency by reusing temporary buffers
- unable to find /usr/lib64/libmunge.la HOT 4
- Bad persistent collective communication performance HOT 2
- Wrong configure check for cuMemFree HOT 1
- v5.0.4rc1 release
- Why does the program still listen to 0.0.0.0 after I set the mca parameter btl_tcp_if_include,the btl_tcp_if_include not effect? HOT 11
- Build fails with LTO
- Build fails with strict-aliasing violations
- Memory leaks when calling mpi_file_write_all HOT 1
- Current status of the communication resources instance optimization HOT 2
- io/ompio: mca parameters are not recognized in some instances
- refactor --with-prrte configure option HOT 2
- v5.0.4 release
- Release 5.0.4 failed to build on GitHub Actions macOS 14.5 arm64 HOT 18
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ompi.