Comments (4)
I'm unaware of any limitation on number of concurrent mpiruns, but I don't really understand what you are trying to do. A far cleaner way of doing this would be to start the PRRTE DVM (just `prte`) and then use `prun` to launch the individual jobs. Avoids all the overhead of starting the RTE over and over again, and loading the file system with creating and removing all the session directories for each of those mpirun instances.
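For anyone unfamiliar with the DVM workflow, here is a rough sketch of what "start `prte` once, then `prun` each job" could look like when driven from a script. It is only an illustration under assumptions: the URI file name `dvm.uri`, the helper names, and the job list are hypothetical, and the flags used (`--daemonize`, `--report-uri`, `--dvm-uri`, `-n`) should be checked against `prte --help` and `prun --help` for your installed version. Depending on the version, the URI file may not exist the instant `prte` returns, so a real script may need to wait for it.

```python
import subprocess

# Hypothetical file where the DVM writes its contact URI.
URI_FILE = "dvm.uri"


def prte_command(uri_file):
    """Command that starts the PRRTE DVM in the background (assumed flags)."""
    return ["prte", "--daemonize", "--report-uri", uri_file]


def prun_command(uri_file, nprocs, program, *args):
    """Command that submits one job to an already running DVM."""
    return ["prun", "--dvm-uri", f"file:{uri_file}",
            "-n", str(nprocs), program, *args]


def pterm_command(uri_file):
    """Command that shuts the DVM down again."""
    return ["pterm", "--dvm-uri", f"file:{uri_file}"]


def run_jobs(jobs, uri_file=URI_FILE, runner=subprocess.run):
    """Start the DVM once, run every job through prun, then stop the DVM.

    `jobs` is an iterable of (nprocs, program, *args) tuples.
    Returns the list of prun exit codes, one per job.
    """
    runner(prte_command(uri_file), check=True)
    codes = []
    try:
        for nprocs, program, *args in jobs:
            result = runner(prun_command(uri_file, nprocs, program, *args))
            codes.append(result.returncode)
    finally:
        # Always try to tear the DVM down, even if a job raised.
        runner(pterm_command(uri_file), check=False)
    return codes


# Example (not executed here; requires PRRTE on PATH):
# exit_codes = run_jobs([(4, "./job_a"), (2, "./job_b", "--input", "x.dat")])
```

The `runner` parameter is there only so the command construction can be exercised without PRRTE installed. Recording the per-job exit codes also makes it easier to spot which job exited abnormally when the failure is intermittent.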
Setting that aside, all the output is telling you is that one of your processes didn't exit properly - likely failed to call `MPI_Finalize` before terminating. You'd get a different error message if it had segfault'd, so I suspect that isn't what happened. Probably just something that triggered an error escape in your job.
from ompi.
> to start the PRRTE DVM (just `prte`) and then use `prun` to launch

I am not experienced in any of this, so it's not something I know much about, but I can look into it.

> all the output is telling you is that one of your processes didn't exit properly

Yes, I can see that. The issue is that, from my point of view, it shouldn't be happening, and it only happens sometimes, with no clear indication of what's happening or why.
Anyway, I haven't seen it since upgrading from Fedora 39 to 40, so hopefully it's transient.
from ompi.
This may be related to #10117?
from ompi.
No - totally unrelated unless you see your procs crashing, which isn't what you report. If upgrading Fedora solves the problem, it sounds to me like the issue is something in your integration with the OS. I very much doubt it is something in OMPI causing you to exit improperly - that would almost always show as a segfault.
from ompi.