Comments (5)
I found this thread (horovod/horovod#133) and checked my PATH and LD_LIBRARY_PATH.
The LD_LIBRARY_PATH does not exist, but I am not an admin on the server, and I do not think that is the whole problem, because I was able to run a different job a few days ago!
from ompi.
The error message is telling you that your application decided to abort for some reason (i.e., it called the MPI_ABORT API function). I'm unfamiliar with CP2K, so I don't know why it would have done that. You might want to look through the output and see if there's other warning/error messages before the abort message.
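As a rough illustration of the suggestion above (looking for other warning/error messages before the abort message), here is a minimal Python sketch. The keyword list, the function name `lines_before_abort`, and the sample log text are all assumptions made up for this example; they are not taken from CP2K or Open MPI output.

```python
import re

# Hypothetical keyword list: words that often mark a problem in job output.
PROBLEM = re.compile(r"\b(error|warning|abort|fatal)\b", re.IGNORECASE)


def lines_before_abort(log_text, marker="MPI_ABORT was invoked"):
    """Return (line_number, line) pairs that match PROBLEM and appear
    before the first line containing the abort marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if marker in line:
            break  # stop at the abort banner; only earlier lines matter
        if PROBLEM.search(line):
            hits.append((number, line.strip()))
    return hits


# Made-up sample output, for illustration only.
sample = """\
 starting run
 WARNING: basis set file not found
 MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
 with errorcode 1.
"""

for number, line in lines_before_abort(sample):
    print(number, line)
```

If nothing is returned, the application printed no recognizable warning before aborting, which matches the situation where the abort banner is the only message in the output.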
Also, Open MPI v3.1.1 is fairly ancient. At a bare minimum, I would suggest upgrading to the latest 3.1.x version (v3.1.6), because it contains bunches of bug fixes beyond 3.1.1.
That being said, 3.1.6 is from March of 2020, and is still pretty ancient. We are unlikely to ever make any more releases in the v3.1.x series.
The most recent version of Open MPI is v5.0.3 -- I'd suggest upgrading to that.
Hello @jsquyres Jeff, Thank you for your response!
That was actually the only message in the output and no error file was created. I understand that it is an ancient version, but this server is unfortunately not managed by me and the CP2K package relies on the 3.1.1 version: this is what comes up when I type
module show cp2k
Unfortunately, the most recent version of openmpi I have access to is 4.1.4.
I also tried running a different simulation and I got another MPI error, albeit a different one:
[[57845,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: c0279

Another transport will be used instead, although this may result in
lower performance.

NOTE: You can disable this warning by setting the MCA parameter
btl_base_warn_component_unused to 0.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
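The NOTE in the output above mentions the MCA parameter btl_base_warn_component_unused. Open MPI reads MCA parameters from environment variables prefixed with `OMPI_MCA_`, so one way to set it from a launcher script might look like the sketch below. The helper name and the launch command (executable, process count, input file) are hypothetical placeholders. This only silences the openib warning; it does not affect the MPI_ABORT.

```python
import os


def env_without_unused_btl_warning(base_env=None):
    """Return a copy of the environment with the MCA parameter
    btl_base_warn_component_unused set to 0."""
    env = dict(os.environ if base_env is None else base_env)
    # Open MPI picks up MCA parameters from OMPI_MCA_<name> variables.
    env["OMPI_MCA_btl_base_warn_component_unused"] = "0"
    return env


# Hypothetical launch command; adjust to the local CP2K installation.
command = ["mpirun", "-np", "4", "cp2k.popt", "-i", "input.inp"]

env = env_without_unused_btl_warning({"PATH": "/usr/bin"})
print(env["OMPI_MCA_btl_base_warn_component_unused"])
```

The resulting `env` could then be passed to `subprocess.run(command, env=env)`; the same parameter can also be given on the command line as `mpirun --mca btl_base_warn_component_unused 0 ...`.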
With Open MPI v4.1.4, it looks like you got an additional warning but the same underlying error (i.e., the application invoked MPI_ABORT). The CP2K application has chosen to abort; you'll have to look at their docs and/or source code for more information on why the application chose to abort.
I'm afraid we can't help you with whatever environment NEU has set up to run CP2K, nor can we help with CP2K itself -- we're not involved in either of those organizations.
Hello Jeff,
I was able to run a few CP2K jobs from a tutorial website. The shell still outputs MPI errors, but no aborts. I am assuming, as you suggested, that the problem is with my input files and not the MPI package. Thank you!