Comments (3)
@jeffhammond @albandil @langou
I think there are two main issues here.
- First, let's ignore MPI. The BLACS `BI_XXX` functions are ScaLAPACK functions and are called from within the Fortran or C routines. If Fortran and C have their `Int` compiled as a 64-bit integer, then the arguments to the `BI_XXX` functions are 64-bit and should be `Int`, not `MpiInt`, to avoid errors and confusion. For example, at line 20 of dgamx2d_.c we have `len[0] = len[1] = N;` where `len` is of type `MpiInt` while `N` is of type `Int`. I believe we should fix this so that both are of type `Int`, and if MPI provides a 64-bit API, it should be handled inside the `BI_XXX` function.
- Related to the above, I am not sure how compiling ScaLAPACK for 64-bit works today, because `Int` is set by the user to a 64-bit integer while `MpiInt` is hardcoded to `int`. I guess the compiler is doing the cast implicitly, but this can be a source of many bugs, in particular if the int64 value passed to MPI overflows. I suggest an immediate fix: explicitly cast every int64 to `int`.
- Now let's talk about the MPI issue. As far as I understand, all standard MPI installations expose a 32-bit API, so we need to cast, and check for overflow, in all calls to MPI. Casting in the C API is easy, but I am not sure about the Fortran calls to MPI; we may need to replace every `Mpval` with a new variable that is cast. The other solution for 64-bit is for users to build their own MPI. OMPI allows compiling it for 64-bit, but I don't know about MPICH. There is also https://github.com/jeffhammond/BigMPI . In any case, I don't think users would want to compile their own MPI, which can result in a bad configuration and is usually not preferred on supercomputers.
@jeffhammond I am not sure how `MPI_Fint` could help here?
from scalapack.
- Yes, BLACS can do whatever it wants as long as the C code safely casts to `int` before calling the MPI C API.
- It is a good idea to have a macro to cast safely from 64 to 32 bits. NWChem was burned by this at one point when we tried to broadcast more than `INT_MAX` elements.
- BigMPI is just a prototype of MPI-4 large-count support, which isn't implemented in Open MPI yet. For this, one could define `MpiInt` to `MPI_Count`, but then all the C API symbols would need the `_c` suffix.
- `MPI_Fint` doesn't matter in this specific situation, but I wanted to mention it anyway because it is used in the C-Fortran interoperability APIs.
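A minimal sketch of the kind of checked 64-to-32-bit narrowing mentioned above, assuming `Int` is a 64-bit type and `MpiInt` is plain `int`. The names `bi_narrow` and `BI_NARROW` are hypothetical and are not part of BLACS:

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

typedef int64_t Int;    /* assumption: ScaLAPACK built with 64-bit integers */
typedef int     MpiInt; /* assumption: MPI counts are plain int */

/* Hypothetical helper: stores v in *out and returns 0 when v fits in an
 * MpiInt; otherwise leaves *out untouched and returns -1. */
static int bi_narrow(Int v, MpiInt *out)
{
    if (v < INT_MIN || v > INT_MAX)
        return -1;
    *out = (MpiInt)v;
    return 0;
}

/* Hypothetical macro: same result as bi_narrow, but also reports the
 * call site on stderr when the value does not fit. */
#define BI_NARROW(v, out)                                          \
    (bi_narrow((v), (out)) == 0                                    \
         ? 0                                                       \
         : (fprintf(stderr, "%s:%d: value does not fit in int\n",  \
                    __FILE__, __LINE__),                           \
            -1))
```

A call such as `BI_NARROW(N, &len[0])` would then replace a bare assignment from an `Int` to an `MpiInt`, so an oversized value is rejected instead of being silently converted.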
Suggestion:
I would suggest replacing every `MpiInt` (originally `int`) with `Int` (similar to when `int` was replaced by `Int`), so that all integer computations and indices use the same typedef (32 or 64 bits). Any call to `MPI_XXX` would then be replaced by a wrapper `WRAPPER_MPI_XX` that takes `Int` arguments, verifies that they are at most `INT_MAX`, casts them, and then calls the corresponding MPI function.
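The wrapper idea could look roughly like the sketch below. To keep it compilable without an MPI installation, the MPI routine is represented by a function pointer with a simplified, hypothetical signature (`count_call`), and `fake_ok` / `fake_fail` stand in for a succeeding and a failing call. A real wrapper would call the actual routine (for example `MPI_Bcast`) and report failures through `BI_BlacsErr`; all names here are assumptions for illustration:

```c
#include <limits.h>
#include <stdint.h>

typedef int64_t Int; /* assumption: ScaLAPACK built with 64-bit integers */

enum { WRAP_OK = 0, WRAP_TOO_LARGE = 1, WRAP_MPI_ERROR = 2 };

/* Hypothetical stand-in for an MPI routine that takes an int count and
 * returns 0 on success. */
typedef int (*count_call)(void *buf, int count);

/* Hypothetical wrapper: accepts an Int count, refuses values that do not
 * fit in int, forwards the call, and checks the returned status. */
static int wrapper_count_call(count_call fn, void *buf, Int count)
{
    if (count < 0 || count > INT_MAX)
        return WRAP_TOO_LARGE;   /* real code might call BI_BlacsErr here */
    if (fn(buf, (int)count) != 0)
        return WRAP_MPI_ERROR;   /* status check on the forwarded call */
    return WRAP_OK;
}

/* Stand-ins used only to exercise the wrapper. */
static int last_count = -1;

static int fake_ok(void *buf, int count)
{
    (void)buf;
    last_count = count;          /* record what the "MPI call" received */
    return 0;
}

static int fake_fail(void *buf, int count)
{
    (void)buf;
    (void)count;
    return 1;                    /* simulate a non-success status */
}
```

One caveat on the status check: with MPI's default error handler, `MPI_ERRORS_ARE_FATAL`, a failing call aborts instead of returning, so checking return codes is only useful if the communicator's error handler is set to `MPI_ERRORS_RETURN`.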
This way, we gain two things.
- First, we will be able to check the return status of MPI calls (which was never done in BLACS) and fail with `BI_BlacsErr` in case of error.
- Second, we can check whether any integer argument is above `INT_MAX`. If so, we can fail as unsupported for now; later, if we find that ScaLAPACK needs communication above `INT_MAX`, we can work out a solution by learning from @jeffhammond's work on BigMPI.

@jeffhammond I am happy to discuss the solution provided in BigMPI to learn more about how you worked around this, so that we could maybe add similar functionality to the wrappers. A naive solution that might work would be to create our own datatype (which will work for point-to-point communication), while for global communication we might split the operation (Reduce, Bcast) into chunks that fit within `INT_MAX`; however, this might be complex for asynchronous communication. In any case, I don't think ScaLAPACK needs communication above `INT_MAX`, since most communication operates on blocks of size NB.
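The chunk-splitting idea for global operations can be sketched as follows. This only shows the splitting arithmetic: the callback type `chunk_call` is a hypothetical stand-in for one MPI call (such as a broadcast of `count` elements starting at element `offset`), and `count_chunk` merely tallies what it is given. It assumes every rank uses the same total so that all ranks split identically:

```c
#include <limits.h>
#include <stdint.h>

typedef int64_t Int; /* assumption: ScaLAPACK built with 64-bit integers */

/* Hypothetical callback standing in for one MPI call on a chunk.
 * offset and count are in elements; returns 0 on success. */
typedef int (*chunk_call)(Int offset, int count, void *ctx);

/* Split total elements into pieces of at most INT_MAX elements and invoke
 * fn once per piece. Stops at, and returns, the first nonzero status. */
static int for_each_chunk(Int total, chunk_call fn, void *ctx)
{
    Int offset = 0;
    if (total < 0)
        return -1;
    while (offset < total) {
        Int left  = total - offset;
        int count = left > INT_MAX ? INT_MAX : (int)left;
        int rc    = fn(offset, count, ctx);
        if (rc != 0)
            return rc;
        offset += count;
    }
    return 0;
}

/* Stand-in callback used only to exercise the splitter. */
struct tally {
    int calls;     /* number of chunks seen */
    Int elements;  /* total elements across all chunks */
};

static int count_chunk(Int offset, int count, void *ctx)
{
    struct tally *t = ctx;
    (void)offset;
    t->calls += 1;
    t->elements += count;
    return 0;
}
```

This kind of splitting fits blocking, element-wise operations; as noted above, non-blocking communication would need extra bookkeeping (one request per chunk), which the sketch does not attempt.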
I compiled a list of the MPI functions that are called from BLACS; it seems there are about 41 functions that need wrappers.
Any suggestions are welcome.
MPI_Barrier
MPI_Error_class
MPI_Finalize
MPI_Abort
MPI_Init
MPI_Initialized
MPI_Allreduce
MPI_Bcast
MPI_Irecv
MPI_Isend
MPI_Recv
MPI_Reduce
MPI_Rsend
MPI_Send
MPI_Sendrecv
MPI_Testall
MPI_Waitall
MPI_Comm_f2c
MPI_Comm_c2f
MPI_Comm_create
MPI_Comm_dup
MPI_Comm_free
MPI_Comm_get_attr
MPI_Comm_group
MPI_Comm_rank
MPI_Comm_size
MPI_Comm_split
MPI_Group_free
MPI_Group_incl
MPI_Unpack
MPI_Pack
MPI_Pack_size
MPI_Get_count
MPI_Op_create
MPI_Op_free
MPI_Type_commit
MPI_Type_create_struct
MPI_Type_free
MPI_Type_indexed
MPI_Type_vector
MPI_Type_match_size