rabauke / mpl

A C++17 message passing library based on MPI

Home Page: https://rabauke.github.io/mpl/html/

License: BSD 3-Clause "New" or "Revised" License

C++ 96.56% CMake 3.44%
c-plus-plus c-plus-plus-17 cluster-computing header-only high-performance-computing hpc library message-passing-interface mpi mpi-standard super-computing wrapper-library

mpl's People

Contributors

jacobmerson, jsharpe, mlund, rabauke, raulppelaez

mpl's Issues

Can API changes be more clearly marked?

I notice that

mpl::communicator::split

has been renamed

mpl::communicator::split_tag

and the identifier split is now an object, rather than a class.

How can I be apprised of such API changes?

v0.2.0 doxygen latex fails

! LaTeX Error: Something's wrong--perhaps a missing \item.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...

l.360 \end{DoxyTemplParams}

Error seems to be in inclusion of classmpl_1_1impl_1_1base__communicator.tex. If I comment out the offending block I get an error on the next DoxyTemplParams.

document use from cmake

I notice that the installation contains two .cmake files, but I'm not sure how to use them. Could you document what a user CMakeLists.txt would look like for a program that uses MPL?

Operations on C-style arrays

It is possible to send a

float x[2][3][4]

with mpl::communicator::send without using the contiguous_layout. However, my attempt to do a reduce on such an array failed.

  float rank2p2p1[2] = { 2*xrank,2*xrank+1 };
  comm_world.allreduce(mpl::plus<float>(), rank2p2p1);

Is there something in the design of operators that prevents this?

need advice: split_shared_memory

  mpl::communicator shared_comm
    ( mpl::communicator::split_shared_memory, world_comm );

gives

/work2/00434/eijkhout/mpl/installation-mpl-0.3.0-clx-gcc12-impi/include/mpl/comm_group.hpp:4109:29: error: static assertion failed: not an enumeration type or underlying enumeration type too large
 4109 |       static_assert(detail::is_valid_tag_v<key_type>,
      |                     ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
/work2/00434/eijkhout/mpl/installation-mpl-0.3.0-clx-gcc12-impi/include/mpl/comm_group.hpp:4109:29: note: ‘mpl::detail::is_valid_tag_v<int>’ evaluates to false

Error with maybe_unused

I find these errors when trying to compile with a Makefile that uses CXX = mpicxx and CXXFLAGS = -std=c++17 -Wall -Wextra -O3. mpicxx --version is: g++ (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20). Are the errors raised by the version of the compiler? Thanks.

In file included from /tcga_data/include/mpl/mpl/mpl.hpp:51,
from vga_paramLayout_job.cpp:13:
/tcga_data/include/mpl/mpl/layout.hpp:383:17: error: expected unqualified-id before ‘[’ token
null_layout([[maybe_unused]] const null_layout &l) noexcept : null_layout() {
^
/tcga_data/include/mpl/mpl/layout.hpp:383:17: error: expected ‘)’ before ‘[’ token
null_layout([[maybe_unused]] const null_layout &l) noexcept : null_layout() {
~^
)
/tcga_data/include/mpl/mpl/layout.hpp:386:17: error: expected unqualified-id before ‘[’ token
null_layout([[maybe_unused]] null_layout &&l) noexcept : null_layout() {
^
/tcga_data/include/mpl/mpl/layout.hpp:386:17: error: expected ‘)’ before ‘[’ token
null_layout([[maybe_unused]] null_layout &&l) noexcept : null_layout() {
~^
)
In file included from /tcga_data/include/mpl/mpl/mpl.hpp:57,
from vga_paramLayout_job.cpp:13:
/tcga_data/include/mpl/mpl/comm_group.hpp:4059:27: error: expected unqualified-id before ‘[’ token
explicit communicator([[maybe_unused]] comm_collective_tag comm_collective,
^
/tcga_data/include/mpl/mpl/comm_group.hpp:4059:27: error: expected ‘)’ before ‘[’ token
explicit communicator([[maybe_unused]] comm_collective_tag comm_collective,
~^
)
/tcga_data/include/mpl/mpl/comm_group.hpp:4072:27: error: expected unqualified-id before ‘[’ token
explicit communicator([[maybe_unused]] group_collective_tag group_collective,
^
/tcga_data/include/mpl/mpl/comm_group.hpp:4072:27: error: expected ‘)’ before ‘[’ token
explicit communicator([[maybe_unused]] group_collective_tag group_collective,
~^
)
/tcga_data/include/mpl/mpl/comm_group.hpp:4073:89: error: expected unqualified-id before ‘)’ token
const communicator &other, const group &gr, tag_t t = tag_t{0}) {
^
/tcga_data/include/mpl/mpl/comm_group.hpp:4088:27: error: expected unqualified-id before ‘[’ token
explicit communicator([[maybe_unused]] split_tag split, const communicator &other,
^
/tcga_data/include/mpl/mpl/comm_group.hpp:4088:27: error: expected ‘)’ before ‘[’ token
explicit communicator([[maybe_unused]] split_tag split, const communicator &other,
~^
)
/tcga_data/include/mpl/mpl/comm_group.hpp:4107:27: error: expected unqualified-id before ‘[’ token
explicit communicator([[maybe_unused]] split_shared_memory_tag split_shared_memory,
^
/tcga_data/include/mpl/mpl/comm_group.hpp:4107:27: error: expected ‘)’ before ‘[’ token
explicit communicator([[maybe_unused]] split_shared_memory_tag split_shared_memory,
~^
)
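
A possible workaround sketch, not confirmed anywhere in this thread: every error above points at an attribute placed on the first constructor parameter, which some older GCC releases appear to reject. Assuming that is the cause, leaving the unused parameter unnamed expresses the same intent without the attribute. `null_layout_sketch` is a hypothetical stand-in, not the MPL class.

```cpp
#include <cassert>

// Hypothetical stand-in for a layout type whose copy and move constructors
// deliberately ignore their argument (not the real mpl::null_layout).
struct null_layout_sketch {
  int id{0};

  null_layout_sketch() = default;

  // The parameter is left unnamed, so no unused-parameter warning is issued
  // and no [[maybe_unused]] attribute is needed.
  null_layout_sketch(const null_layout_sketch &) noexcept : null_layout_sketch() {}
  null_layout_sketch(null_layout_sketch &&) noexcept : null_layout_sketch() {}
};

// Small self-check: a copy does not take over the state of its source.
inline void check_copy_resets_state() {
  null_layout_sketch a;
  a.id = 7;
  null_layout_sketch b(a);
  assert(b.id == 0 && a.id == 7);
}
```

Whether patching the headers this way is acceptable depends on the project; upgrading the compiler is the other obvious option.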

send a std::variant with isend/irecv

Hi,

Very new to your library, it's really nice. For reference I only found out about it due to this thread I'm following:

mpi-cxx-features

I'd like to send a std::variant. At the moment I get:
'class mpl::struct_builder<std::variant<Class1, Class2> >' has no member named 'type'.

Both Class1 and Class2 have been added via the MPL_REFLECTION macro. Before I debug, it occurred to me to check that it's possible to send a std::variant?

Thanks,
Andy

About ialltoallv in IntelMPI 2021.5.1

In the in-place version of ialltoallv, the underlying function "ialltoallv_task" is implemented with the raw MPI call "MPI_Ialltoallw".
With IntelMPI 2021.5.1 as the MPI implementation, the unit test "ialltoallv_in_place_with_displacements_test" fails.
With OpenMPI, however, the equivalent raw MPI code passes, as shown below:

#include <iostream>
#include <vector>
#include "mpi.h"
int main(int argc, char* argv[]) {
    int N_processes, rank;
    MPI_Init(&argc, &argv); 
    MPI_Comm_size(MPI_COMM_WORLD, &N_processes);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    using T = int;
    T val = 1.0;
    T send_val{val};

    std::vector<T> sendrecv_data;
    std::vector<int> counts(3, 1);
    std::vector<MPI_Datatype> datatypes;
    std::vector<int> recv_disps_int;

    for (int i = 0; i <rank; ++i)
        ++send_val;
    int displ{0};
    for (int j = 0; j < N_processes; ++j) {
        const int N_sendrecv{j + rank + 1};  // must be symmetric in j and rank
        for (int i{0}; i < N_sendrecv; ++i) {
            sendrecv_data.push_back(send_val);
        }
        MPI_Datatype new_type;
        MPI_Type_contiguous(N_sendrecv, MPI_INT, &new_type);
        MPI_Type_commit(&new_type);
        datatypes.push_back(new_type);

        recv_disps_int.push_back(sizeof(T) * displ);
        displ += N_sendrecv;
    }

    MPI_Request req1;
    MPI_Ialltoallw(
        MPI_IN_PLACE,
        nullptr,
        nullptr,
        nullptr,
        sendrecv_data.data(),
        counts.data(),
        recv_disps_int.data(),
        datatypes.data(),
        MPI_COMM_WORLD,
        &req1);
    MPI_Wait(&req1, MPI_STATUS_IGNORE);

    if (rank == 0) {
        std::cout << "sendrecv_data.size()= " << sendrecv_data.size() << std::endl;
        for (size_t i{0}; i < sendrecv_data.size(); ++i) {
            std::cout << sendrecv_data[i] << " ";
        }
        std::cout <<  std::endl;
    }

    MPI_Finalize();
    return 0;
}

So, here are my test results. Thanks

Obtain raw MPI_Comm from mpl::communicator

In reference to #22 I am trying to use the raw c-api from MPI:

int MPI_Intercomm_create(MPI_Comm local_comm, int local_leader,
                         MPI_Comm peer_comm, int remote_leader, int tag, MPI_Comm * newintercomm)

How can I obtain the raw MPI communicator for comm_world, and the raw MPI communicator from my group, if the group was constructed from:

mpl::communicator group(mpl::communicator::split, comm_world, index);

where comm_world is an mpl type. I need these raw MPI variables for local_comm and peer_comm. I suppose I'm asking is there a function:

MPI_Comm mpl::communicator::GetRawMPIComm()
{
    return comm_;
}

Within the same code base, how can I use both raw MPI calls such as MPI_Intercomm_create (see #22 for the reasons) and MPL, given that I need access to the underlying MPI "types/variables" held by MPL?

Any help or advice is welcome, including workarounds, as I need to get this working.

Many thanks,

Composing derived datatype

I cannot figure out how to build one layout from another. Both of these attempts fail:

    mpl::contiguous_layout<int> type1(7);
    mpl::contiguous_layout< type1 > type2(8); // WRONG 1
    mpl::contiguous_layout< mpl::contiguous_layout<int> > type2(8); // WRONG 2

bug in sendrecv?

Some overloads in comm_group.hpp (the ones using iterators I think) such as this one:

mpl/mpl/comm_group.hpp

Lines 1028 to 1041 in c4c83cf

template<typename iterT>
status sendrecv_replace(iterT begin, iterT end,
                        int dest, tag sendtag, int source, tag recvtag) const {
  using value_type=typename std::iterator_traits<iterT>::value_type;
  if (detail::is_contiguous_iterator<iterT>::value) {
    vector_layout<value_type> l(std::distance(begin, end));
    return sendrecv_replace(&(*begin), l,
                            dest, sendtag, dest, recvtag);
  } else {
    iterator_layout<value_type> l(begin, end);
    return sendrecv_replace(&(*begin), l,
                            dest, sendtag, dest, recvtag);
  }
}

These overloads ignore the "source" argument and instead always pass dest twice to the underlying version.

Is this by design?
If so, could you give me some insight on why?

Thanks!
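
The effect described above can be reproduced without MPI. The sketch below uses a hypothetical stand-in, `sendrecv_replace_raw`, that only records which ranks it was called with, and an iterator overload that forwards `source` instead of repeating `dest`. Whether this is the intended fix is for the maintainer to confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Records the arguments a low-level call received (hypothetical helper type).
struct call_record {
  std::size_t count;
  int dest;
  int source;
};

// Hypothetical stand-in for the pointer-based overload; performs no communication.
template<typename T>
call_record sendrecv_replace_raw(T *, std::size_t count, int dest, int source) {
  return call_record{count, dest, source};
}

// Iterator-based overload that forwards both ranks unchanged.
template<typename iterT>
call_record sendrecv_replace_sketch(iterT begin, iterT end, int dest, int source) {
  const auto count = static_cast<std::size_t>(std::distance(begin, end));
  return sendrecv_replace_raw(&(*begin), count, dest, source);  // source, not dest
}

// Usage: distinct dest and source must both arrive at the low-level call.
inline void check_forwarding() {
  std::vector<int> data{1, 2, 3};
  const call_record r = sendrecv_replace_sketch(data.begin(), data.end(), 1, 2);
  assert(r.count == 3 && r.dest == 1 && r.source == 2);
}
```

A test that uses different values for dest and source, as `check_forwarding` does, would catch the original behaviour; a ring exchange where both ranks coincide would not.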

Visibility is low due to name conflict with boost-mpl.

Hello,

this is probably the most up-to-date MPI C++ interface, but finding it among hundreds of other libraries was impossible until you mentioned it in the MPI forum. This is because MPL is a very common abbreviation and even has namesakes in C++. Do you have any plans to change this?

While I'm at it; do you have any plans to support the vcpkg build system https://github.com/microsoft/vcpkg which is the de-facto standard package manager for C++?

Support for C++20 ranges

It would be convenient to be able to transmit a std::ranges view (C++20), e.g. a filtered or transformed sequence. I think there's currently a limitation when passing iterators in that begin() and end() must be of the same type. This is usually not the case for ranges. Until C++20 I use range-v3, but I currently need to copy into an STL vector before MPI communication, see here:

https://github.com/mlund/faunus/blob/00ab258465bf09c02593fa66d85a405fc07b2a19/src/move.cpp#L391

(Update: I realise that the example code is not representative as the buffer is here replaced)
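
Until iterator/sentinel pairs of different types are accepted, the copy-into-a-vector workaround mentioned above can be written generically. This is a C++17 sketch that uses no MPL calls; `counter` and `until` are hypothetical types standing in for a view's iterator and sentinel.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical sentinel type, distinct from the iterator type.
struct until {
  int limit;
};

// Hypothetical minimal input iterator that counts upwards.
struct counter {
  int value;
  int operator*() const { return value; }
  counter &operator++() { ++value; return *this; }
  bool operator!=(const until &s) const { return value < s.limit; }
};

// Copies [first, last) into a contiguous buffer that could then be handed to
// any communication routine expecting a std::vector.
template<typename Iter, typename Sentinel>
auto to_buffer(Iter first, Sentinel last) {
  using value_type = std::decay_t<decltype(*first)>;
  std::vector<value_type> buffer;
  for (; first != last; ++first)
    buffer.push_back(*first);
  return buffer;
}

inline void check_to_buffer() {
  const auto buffer = to_buffer(counter{0}, until{3});
  assert(buffer.size() == 3 && buffer[0] == 0 && buffer[2] == 2);
}
```

The extra copy is the cost of this approach; direct support for ranges in the library would avoid it.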

Obtain raw MPI_Comm from mpl::communicator - Version 2

Leading on from our discussion on #23 with the requirements for this given in #23 and #22, would the following be more appealing for inclusion into MPL (implemented against 5264b90)?

diff --git a/mpl/comm_group.hpp b/mpl/comm_group.hpp
index a01fd30..1049c8d 100644
--- a/mpl/comm_group.hpp
+++ b/mpl/comm_group.hpp
@@ -279,7 +279,34 @@ namespace mpl {
   protected:
     MPI_Comm comm_{MPI_COMM_NULL};
 
+    /// \brief Obtain access to underlying mpi communicator
+    /// \return raw MPI_Comm communicator
+    const MPI_Comm& get_mpi_comm() const
+    {
+        return comm_;
+    }
+
   public:
+
+    /// \brief Allow raw MPI commands to be run 
+    /// that need access to the MPI_Comm from this
+    /// \param userCode a functor provided by the user taking a const MPI_Comm&
+    template<typename UserCode>
+    void execute_raw_mpi(const UserCode& userCode) const
+    {
+        userCode(get_mpi_comm());
+    }
+
+    /// \brief Allow raw MPI commands to be run 
+    /// that need access to the MPI_Comm from this, and another MPI_Comm from other
+    /// \param userCode a functor provided by the user taking a const MPI_Comm& (from this as the first argument) and a const MPI_Comm& from o
+    /// \param other another communicator whose MPI_Comm is passed as the second argument
+    template<typename UserCode>
+    void execute_raw_mpi(const UserCode& userCode, const communicator& other) const
+    {
+        userCode(get_mpi_comm(), other.get_mpi_comm());
+    }
+
     /// \brief Equality types for communicator comparison.
     enum class equality_type {
       /// communicators are identical, i.e., communicators represent the same communication

As an example, let's assume you hadn't implemented MPI_Comm_size in your library (I know you have), then the use case for execute_raw_mpi(const UserCode& userCode) would be (assuming comm_world is an mpl::communicator):

        int checkSize = -1;
        const auto CheckSize = [&checkSize, &comm_world](const MPI_Comm &world)
        {
            const auto result = MPI_Comm_size(world, &checkSize);
            if (result != MPI_SUCCESS)
            {
                std::cout << "mpl::communicator::execute_raw_mpi failed with mpi_error " << result << ", global rank: " << comm_world.rank() << std::endl;
                comm_world.abort(EXIT_FAILURE);
            }  
        };
        comm_world.execute_raw_mpi(CheckSize);
        assert(checkSize == comm_world.size());

This gives an example for the single argument version of execute_raw_mpi allowing users to call raw MPI for functions that take one MPI_Comm. For the use case that interests me (see #23 and #22), that is to say when two MPI_Comm's are required, then the two argument version of execute_raw_mpi can be called. Here is an example (where comm_world and groupcomm are mpl::communicators):

        const int intercomm_create_tag = 99;
        const auto Create = [this, &comm_world](const MPI_Comm &world, const MPI_Comm &group)
        {
            const auto result = MPI_Intercomm_create(group, 0, world, remoteleader, intercomm_create_tag, &intercomm);
            if (result != MPI_SUCCESS)
            {
                std::cout << "mpl::communicator::execute_raw_mpi failed with mpi_error " << result << ", global rank: " << comm_world.rank() << std::endl;
                comm_world.abort(EXIT_FAILURE);
            }
        };
        comm_world.execute_raw_mpi(Create, groupcomm);

Here we can see that the stored comm_ is never actually returned to the user which was the case in my previous suggestion (see #23), which looked like this:

MPI_Intercomm_create(groupcomm.get_mpi_comm(), 0, comm_world.get_mpi_comm(), remoteleader, intercomm_create_tag, &intercomm);

I believe the intent is clear; that is to allow execution of arbitrary code that requires access to the underlying MPI_Comm. That being said, it can still be hijacked, see for example (where comm_world is an mpl::communicator):

        MPI_Comm stealComm = MPI_COMM_NULL;
        const auto Steal = [&stealComm](const MPI_Comm &world)
        {
            stealComm = world;
        };
        comm_world.execute_raw_mpi(Steal);
        int checkSizeSteal = -1;
        MPI_Comm_size(stealComm, &checkSizeSteal);
        assert(checkSizeSteal == comm_world.size());

Clearly, one can use these mechanics to obtain the raw MPI_Comm as per the last code-stub, although it does appear more obvious that it's wrong to do so.

What do you think?

spelling of "weigth"

Throughout the definition of dist_graph_communicator change spelling "weigth" to "weight" please.

const problems with native handle

    const mpl::communicator &comm =
      mpl::environment::comm_world();
    MPI_Comm
      world_extract = comm.native_handle(),

gives error: 'this' argument to member function 'native_handle' has type 'const mpl::communicator', but function is not marked const

but without const

    mpl::communicator &comm =
      mpl::environment::comm_world();

gives error: binding reference of type 'mpl::communicator' to value of type 'const mpl::communicator' drops 'const' qualifier
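
The first error says that `native_handle` is not const-qualified. Below is a sketch of what a const-qualified accessor looks like, using a hypothetical wrapper and an `int` in place of `MPI_Comm` so that it compiles without MPI. It illustrates the language rule only and is not the MPL implementation.

```cpp
#include <cassert>

using raw_handle = int;  // stand-in for MPI_Comm (assumption, keeps the sketch MPI-free)

class communicator_sketch {
  raw_handle comm_;

public:
  explicit communicator_sketch(raw_handle h) : comm_{h} {}

  // const-qualified, so it can be called through a const reference;
  // the handle is returned by value and the wrapper stays unchanged.
  raw_handle native_handle() const { return comm_; }
};

// Mirrors the failing snippet: only a const reference is available.
inline raw_handle extract(const communicator_sketch &comm) {
  return comm.native_handle();
}

inline void check_extract() {
  const communicator_sketch world{42};
  assert(extract(world) == 42);
}
```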

Warning under older intel compiler

In file included from /home1/apps/intel19/impi19_0/mpl/0.3.0/include/mpl/mpl.hpp(57),
                 from filewrite.cxx(19):
/home1/apps/intel19/impi19_0/mpl/0.3.0/include/mpl/comm_group.hpp(5172): warning #823: reference is to parameter "i" (declared at line 5154) -- under old for-init scoping rules it would have been variable "i" (declared at line 5165)
        MPI_Comm_spawn(command[0].c_str(), args_pointers.data(), max_procs, i.info_, root_rank,
                                                                            ^

"on ramp" for legacy code and interop

@rabauke thanks for writing a great library. I'd like to make use of it in some of my projects, but I face the issue that I have legacy code that expects MPI_Comm to be passed around and makes use of other third-party C libraries that expect the same. In some sense there is no "on ramp" for me to use to convert my code to mpl.

Based on my understanding of the discussion here and here you have concerns about the user getting access to the raw MPI_Comm object for two reasons:

  1. Ownership may be unclear
  2. You don't want to enable mixing of MPI/MPL code because you would like MPL to be a complete self contained ecosystem

Although I understand the sentiment, I would like to make the following comments on those two points and ask you to reconsider allowing extraction of the raw MPI_Comm handle.

In terms of ownership, an MPI_Comm extracted from MPL has pointer semantics, and in modern C++ a naked pointer is expected to be non-owning, because ownership transfer is expressed with a smart pointer. The communicator handle is not a pointer, so its null value is MPI_COMM_NULL rather than nullptr, but returning a reference to it has clear semantics around object lifetime and scope that most C++ programmers should have no trouble with.

In terms of code mixing I understand that you want to maintain the cleanliness and purity of your library (probably partially why it is so nice). But, this makes mpl usage a bit of a walled garden that prevents me from slowly bringing it onboard to my C++ code, or interfacing with third party libraries over which I have no control.

I understand why you have chosen the current design and respect that. However, I wanted to bring up the issue again as I'm sick of writing my own MPI wrappers...

MPI_Intercomm_create functionality

Hi,

Is MPI_Intercomm_create supported by your library? I couldn't see it. Is it something that could be added?

Is there a recommended way to do it if I've already used:

mpl::communicator group(mpl::communicator::split, comm_world, index);

How would I make the call to MPI_Intercomm_create using group?

Any help really appreciated,
Thanks,
Andy

v0.2.0 install fails on sphinx

-- Installing: /Users/eijkhout/Installation/mpl/installation-0.2.0-macbookair-gcc/lib/cmake/mpl/mplTargets.cmake
CMake Error at doc/cmake_install.cmake:41 (file):
  file INSTALL cannot find
  "/Users/eijkhout/Installation/mpl/build-0.2.0-macbookair-gcc/doc/sphinx/html":
  No such file or directory.
Call Stack (most recent call first):
  cmake_install.cmake:67 (include)

and indeed

ls /Users/eijkhout/Installation/mpl/build-0.2.0-macbookair-gcc/doc/sphinx
conf.py

formatting error in allreduce 2/4 docs

allreduce() [2/4]

void allreduce (F f,
                const T *send_data,
                T *recv_data,
                const contiguous_layout< T > &l
               ) const
inline

Performs a reduction operation over all processes and broadcasts the result.

Template Parameters
    F	type representing the element-wise reduction operation, reduction operation is performed on data of type T
    T	type of input and output data of the reduction operation, must meet the requirements as described in the

    embed:rst:inline :doc:`data_types` 

    section

See that literal embed.

Support std::span

The C++20 std::span should be supported in point-to-point communication operations in a similar way as std::vector is already supported. Note that std::span has no resize, which must be accounted for in the implementation and usage of recv and irecv.

Including mpl/mpl.hpp throwing errors

Hello!

I am trying to write a program with MPL. For now, I have written a simple hello world program that does nothing with MPL except include the header file.

Code:
#include <iostream>
#include <mpl/mpl.hpp>

using namespace std;

int main() {
    cout << "Hello, World!\n";
    return 0;
}

I am using the following command to compile it:
g++ -std=c++17 -I./mpl hello_world.cpp

However, I am getting the following errors:

In file included from ./mpl/mpl/mpl.hpp:49,
                 from hello_world.cpp:3:
./mpl/mpl/layout.hpp:378:17: error: expected unqualified-id before ‘[’ token
     null_layout([[maybe_unused]] const null_layout &l) noexcept : null_layout() {}
                 ^
./mpl/mpl/layout.hpp:378:17: error: expected ‘)’ before ‘[’ token
     null_layout([[maybe_unused]] const null_layout &l) noexcept : null_layout() {}
                ~^
                 )
./mpl/mpl/layout.hpp:380:17: error: expected unqualified-id before ‘[’ token
     null_layout([[maybe_unused]] null_layout &&l) noexcept : null_layout() {}
                 ^
./mpl/mpl/layout.hpp:380:17: error: expected ‘)’ before ‘[’ token
     null_layout([[maybe_unused]] null_layout &&l) noexcept : null_layout() {}
                ~^
                 )
In file included from ./mpl/mpl/mpl.hpp:54,
                 from hello_world.cpp:3:
./mpl/mpl/comm_group.hpp:3939:27: error: expected unqualified-id before ‘[’ token
     explicit communicator([[maybe_unused]] comm_collective_tag comm_collective,
                           ^
./mpl/mpl/comm_group.hpp:3939:27: error: expected ‘)’ before ‘[’ token
     explicit communicator([[maybe_unused]] comm_collective_tag comm_collective,
                          ~^
                           )
./mpl/mpl/comm_group.hpp:3952:27: error: expected unqualified-id before ‘[’ token
     explicit communicator([[maybe_unused]] group_collective_tag group_collective,
                           ^
./mpl/mpl/comm_group.hpp:3952:27: error: expected ‘)’ before ‘[’ token
     explicit communicator([[maybe_unused]] group_collective_tag group_collective,
                          ~^
                           )
./mpl/mpl/comm_group.hpp:3968:27: error: expected unqualified-id before ‘[’ token
     explicit communicator([[maybe_unused]] split_tag split, const communicator &other,
                           ^
./mpl/mpl/comm_group.hpp:3968:27: error: expected ‘)’ before ‘[’ token
     explicit communicator([[maybe_unused]] split_tag split, const communicator &other,
                          ~^
                           )
./mpl/mpl/comm_group.hpp:3987:27: error: expected unqualified-id before ‘[’ token
     explicit communicator([[maybe_unused]] split_shared_memory_tag split_shared_memory,
                           ^
./mpl/mpl/comm_group.hpp:3987:27: error: expected ‘)’ before ‘[’ token
     explicit communicator([[maybe_unused]] split_shared_memory_tag split_shared_memory,
                          ~^
                           )

Can you please tell me how I can resolve this issue?

Thank you for your time!

Need a hint: window / one-sided

I'm failing to find any window / one-sided functions. I would guess that window is under communicator but I'm not finding it.

Cannot figure out waitany

The crucial lines in my code are:

  if (procno==nprocs-1) {
    mpl::irequest_pool recv_requests;
    vector<int> recv_buffer(nprocs-1);
    for (int p=0; p<nprocs-1; p++) {
      recv_requests.push( comm_world.irecv( recv_buffer[p], p ) );
    }
    printf("Outstanding request #=%d\n",recv_requests.size());
    for (int p=0; p<nprocs-1; p++) {
      auto [success,index] = recv_requests.waitany();

This gives on the waitany call:

Assertion failed in file ./src/include/mpir_request.h at line 313: ((req))->ref_count >= 0
0   libpmpi.12.dylib                    0x000000010a7b44de backtrace_libc + 62
1   libpmpi.12.dylib                    0x000000010a7b4495 MPL_backtrace_show + 21
2   libpmpi.12.dylib                    0x000000010a7502f4 MPIR_Assert_fail + 36
3   libmpi.12.dylib                     0x000000010a579445 MPI_Waitany + 2469
4   irecvsource                         0x000000010a4ea4f0 main + 672
5   libdyld.dylib                       0x00007fff7319c3d5 start + 1

Do you immediately see what I'm doing wrong or do I need to supply a fully functioning reproducer?

MPI_ERRORS_RETURN

My program bombs with

Fatal error in MPI_Waitall: See the MPI_ERROR field in MPI_Status for the error code

Undoubtedly a programming error by me. But I can not query that status because the code has exited.
What is the MPL equivalent of MPI_Comm_set_errhandler(comm,MPI_ERRORS_RETURN)?

Receive status

I notice that your recv calls return a status object that gets created with a reinterpret_cast of the MPI_Status. Fine. But I usually specify MPI_STATUS_IGNORE. In MPL, would that amount to dropping the status object and never assigning it? And does that lead to a memory leak?

(This is the first time I'm looking at MPL and its source, so I may be overlooking something.)
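
On the memory-leak question: if the status is returned by value, which is an assumption about MPL here, then a discarded return value is an ordinary temporary that is destroyed at the end of the full expression. The sketch below counts live objects of a hypothetical `status_sketch` type to show this.

```cpp
#include <cassert>

// Hypothetical status type that counts how many instances are alive.
struct status_sketch {
  static int &alive() {
    static int n = 0;
    return n;
  }
  int source{0};
  status_sketch() { ++alive(); }
  status_sketch(const status_sketch &other) : source{other.source} { ++alive(); }
  ~status_sketch() { --alive(); }
};

// Stand-in for a receive call that returns its status by value.
inline status_sketch recv_sketch() {
  status_sketch s;
  s.source = 3;
  return s;
}

inline void check_no_leak() {
  recv_sketch();                        // status dropped, like MPI_STATUS_IGNORE
  assert(status_sketch::alive() == 0);  // the temporary has already been destroyed
  const status_sketch kept = recv_sketch();
  assert(kept.source == 3 && status_sketch::alive() == 1);
}
```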

static constexpr members might not compile without optimization

mpl/mpl/cart_comm.hpp

Lines 18 to 25 in c4c83cf

class cart_communicator : public detail::topo_communicator {
public:
  enum class periodicity {
    periodic, nonperiodic
  };
  static constexpr periodicity periodic=periodicity::periodic;
  static constexpr periodicity nonperiodic=periodicity::nonperiodic;

A source file that uses "periodic" or "nonperiodic" fails to build with -O0 on some compilers.
For example, this code:

#include<mpl/mpl.hpp>

int main(){
  const mpl::communicator & comm_world(mpl::environment::comm_world());
  mpl::cart_communicator::sizes dims({{0, mpl::cart_communicator::periodic}});
  auto comm = mpl::cart_communicator(comm_world, mpl::dims_create(comm_world.size(), dims), true);
  return 0;
}

fails to link with g++ 4.8 -O0 with the error:

/tmp/ccEc4I5T.o: In function `main':
mpl_error.cpp:(.text+0x2c): undefined reference to `mpl::cart_communicator::periodic'

This happens because C++11 does not treat static constexpr members as implicitly inline, although the references usually disappear once optimizations are enabled. In fact, the standard requires an out-of-class definition whenever such a member is odr-used, see https://stackoverflow.com/questions/8016780/undefined-reference-to-static-constexpr-char/ .
This can be solved by compiling with optimizations or by explicitly defining the members outside the class, adding the following line after the definition of cart_communicator:

constexpr cart_communicator::periodicity cart_communicator::periodic;

Of course this is not restricted to this class.
On the other hand g++ 5.5 is more permissive and compiles without issue even with -O0.
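
A minimal reproduction of the rule, using a hypothetical `cart_sketch` class instead of the MPL header. Binding a static constexpr member to a const reference odr-uses it, so before C++17 the out-of-class definitions are needed; from C++17 on such members are implicitly inline, which is why the definitions are guarded by a language-version check here.

```cpp
#include <cassert>

class cart_sketch {
public:
  enum class periodicity { periodic, nonperiodic };
  static constexpr periodicity periodic = periodicity::periodic;
  static constexpr periodicity nonperiodic = periodicity::nonperiodic;
};

#if __cplusplus < 201703L
// Before C++17 an odr-used static constexpr member needs exactly one
// namespace-scope definition (without an initializer).
constexpr cart_sketch::periodicity cart_sketch::periodic;
constexpr cart_sketch::periodicity cart_sketch::nonperiodic;
#endif

// Taking the argument by const reference odr-uses whatever is passed in.
inline bool is_periodic(const cart_sketch::periodicity &p) {
  return p == cart_sketch::periodicity::periodic;
}

inline void check_periodic() {
  assert(is_periodic(cart_sketch::periodic));
  assert(!is_periodic(cart_sketch::nonperiodic));
}
```

In a header-only library the pre-C++17 definitions would have to live in exactly one translation unit, which is one reason requiring C++17 sidesteps the problem.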

Inconsistent integer types in API

For historical reasons MPI uses int, MPI_Aint, MPI_Offset, and MPI_Count, which may have different ranges and signedness. Integer usage should be harmonized for MPL:

  • size_t and ssize_t (when negative values are permissible) should be the only integer types.
  • Integer values must be checked for narrowing when they are passed to underlying MPI library functions.

Edit: MPI 4.0 will introduce large-count support via new functions, see https://eurompi.github.io/assets/papers/2020-09-eurompi2020-mpi4.pdf
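
The narrowing check from the second bullet could look roughly like this. `checked_int` is a hypothetical helper name, and throwing `std::overflow_error` is just one possible policy; the sketch does not reflect how MPL handles this.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Converts a size_t count to int, throwing instead of silently narrowing.
inline int checked_int(std::size_t n) {
  if (n > static_cast<std::size_t>(std::numeric_limits<int>::max()))
    throw std::overflow_error("count does not fit into int");
  return static_cast<int>(n);
}

inline void check_checked_int() {
  assert(checked_int(42) == 42);
  bool threw = false;
  try {
    checked_int(std::numeric_limits<std::size_t>::max());
  } catch (const std::overflow_error &) {
    threw = true;
  }
  assert(threw);
}
```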

Unnecessary template member in communicator

mpl/mpl/comm_group.hpp

Lines 895 to 904 in 3012645

template<typename T>
std::pair<bool, status> iprobe(int source, tag t=tag(0)) const {
  check_source(source);
  check_recv_tag(t);
  int result;
  status s;
  MPI_Iprobe(source, static_cast<int>(t),
             comm, &result, reinterpret_cast<MPI_Status *>(&s));
  return std::make_pair(static_cast<bool>(result), s);
}

The iprobe member is declared as a template, but the template parameter is never used. Because it cannot be deduced, an argument must be provided explicitly for the call to compile, even though the argument itself is irrelevant.
This means that this simple program won't compile (at least with my OpenMPI-4.0.0 + g++-5.5):

#include<mpl/mpl.hpp>
int main(){
   mpl::environment::comm_world().iprobe(mpl::any_source); 
return 0;
}

Unless you put <literally any typename> after iprobe.
Just commenting out the template<typename T> line (line 895 above) solves the issue.
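
The deduction problem can be shown in isolation. In this sketch `comm_sketch` is a hypothetical class with no MPI dependency; it contrasts a member template whose parameter cannot be deduced with the plain member function.

```cpp
#include <cassert>
#include <utility>

struct comm_sketch {
  // T appears in no parameter, so it cannot be deduced and every call
  // has to spell it out, e.g. iprobe_templated<void>(source).
  template<typename T>
  std::pair<bool, int> iprobe_templated(int source) const {
    return std::make_pair(true, source);
  }

  // Without the template header the call compiles as written.
  std::pair<bool, int> iprobe(int source) const {
    return std::make_pair(true, source);
  }
};

inline void check_iprobe() {
  const comm_sketch comm{};
  assert(comm.iprobe(5).second == 5);
  assert(comm.iprobe_templated<void>(5).second == 5);
}
```

Calling `comm.iprobe_templated(5)` without an explicit template argument is rejected by the compiler, which matches the behaviour reported above.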

scatterv problem in comm_group.hpp

image

As shown in the figure, I think line 3066 should read as in the one below:
image

Besides that, the main reason the unit test still passes is that line 3066 is not exercised by the unit test below.
image
