
Comments (7)

gvallee commented on July 29, 2024

FYI, at the moment the PMIx calls are made in the __kmp_register_library_startup() function, which is the closest thing to an init function that I could find. So once we figure out where the places are created, we need to determine whether __kmp_register_library_startup() is called before or after that point. If it is called before, it means that __kmp_register_library_startup() is not really an early initialization function and we will therefore have to revisit the code.

from libomp.

naughtont3 commented on July 29, 2024

Found that we can enable function entry/exit traces (KA_TRACE()) by setting the env var KMP_A_DEBUG=<level> when using a debug build of the clang/openmp compiler. This may be useful for determining whether we have defined the placement info in the proper phase.

  export KMP_A_DEBUG=100
  orterun -n 1 ./mpi_omp_hello.dbg


naughtont3 commented on July 29, 2024

Quick update...

I realized we couldn't set the value of a generic key (e.g., moc.myrank) in multiple places, which makes sense after you think about it. So I needed to be able to pass the rank, but needed a way to get something unique. Then I realized we can get the rank from the results of PMIx_Init(myproc, ....) via myproc.rank. We already call PMIx_Init() from within the startup code for the OMP-RT.

So I modified the libomp kmp_runtime.cpp to do just this and calculate the places for each process based on its MPI rank. Currently I just print out the places with the associated cores from each rank. This is the information needed to set up the OMP place details. I'm currently working on getting that in place, and then we should be able to dynamically define an even split of the cores for each MPI rank on a node.

Here's an example of the output from the test with the relevant pieces highlighted (cleaned up) for easier viewing (full logfile is attached).

beaker:$ orterun -n 4 ./mpi_omp_hello.dbg
 ...<snip>...
[moc.c:MOC_Init:29] PID: 169376 PMIx_rank: 2
[moc.c:MOC_Init:29] PID: 169377 PMIx_rank: 3
[moc.c:MOC_Init:29] PID: 169375 PMIx_rank: 1
[moc.c:MOC_Init:29] PID: 169374 PMIx_rank: 0
[moc.c:MOC_Init:76] numCPUS: 12
[moc.c:MOC_Init:76] numCPUS: 12
[moc.c:MOC_Init:76] numCPUS: 12
[moc.c:MOC_Init:76] numCPUS: 12
 ...<snip>...
[kmp_runtime.cpp:__kmp_register_library_startup:6444] 4 job procs are running on the node
[kmp_runtime.cpp:__kmp_register_library_startup:6444] 4 job procs are running on the node
[kmp_runtime.cpp:__kmp_register_library_startup:6444] 4 job procs are running on the node
[kmp_runtime.cpp:__kmp_register_library_startup:6444] 4 job procs are running on the node
  ...<snip>...
[kmp_runtime.cpp:__kmp_register_library_startup:6636] OMP-RT (pid:169374) Rank: 0  places: {0}, {1}, {2}
[kmp_runtime.cpp:__kmp_register_library_startup:6636] OMP-RT (pid:169376) Rank: 2  places: {6}, {7}, {8}
[kmp_runtime.cpp:__kmp_register_library_startup:6636] OMP-RT (pid:169375) Rank: 1  places: {3}, {4}, {5}
[kmp_runtime.cpp:__kmp_register_library_startup:6636] OMP-RT (pid:169377) Rank: 3  places: {9}, {10}, {11}
 ...<snip>...

Here are the place prints for a 2-rank case on the same machine.

beaker:$ orterun -n 2 ./mpi_omp_hello.dbg
 ...<snip>...
[kmp_runtime.cpp:__kmp_register_library_startup:6636] OMP-RT (pid:169471) Rank: 0  places: {0}, {1}, {2}, {3}, {4}, {5}
[kmp_runtime.cpp:__kmp_register_library_startup:6636] OMP-RT (pid:169472) Rank: 1  places: {6}, {7}, {8}, {9}, {10}, {11}
 ...<snip>...

2017.12.19_test-output.txt
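
The even split shown in the logs above can be sketched roughly as follows. This is only an illustration and not the code in kmp_runtime.cpp: the helper name build_places() and its parameters are hypothetical, it assumes the rank is the process's index among the job procs on the node (as in the single-node runs above), and it simply leaves any remainder cores unused when the core count does not divide evenly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libomp or MOC): writes an
 * OMP_PLACES-style list such as "{6},{7},{8}" for the cores assigned to
 * local_rank when ncpus cores are split evenly across nprocs processes
 * on one node. Returns the number of places written, or -1 on bad
 * input or if the buffer is too small. */
static int build_places(int ncpus, int nprocs, int local_rank,
                        char *buf, size_t buflen)
{
    if (ncpus <= 0 || nprocs <= 0 || local_rank < 0 ||
        local_rank >= nprocs || buf == NULL || buflen == 0)
        return -1;

    int per_rank = ncpus / nprocs;  /* remainder cores are left unused */
    if (per_rank == 0)
        return -1;                  /* more processes than cores */

    int first = local_rank * per_rank;
    size_t used = 0;
    buf[0] = '\0';
    for (int i = 0; i < per_rank; i++) {
        /* One single-core place per assigned core, comma separated. */
        int n = snprintf(buf + used, buflen - used, "%s{%d}",
                         i ? "," : "", first + i);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;              /* output would be truncated */
        used += (size_t)n;
    }
    return per_rank;
}

#ifdef PLACES_SKETCH_MAIN
/* Demo: 12 cores, 4 procs on the node, rank 2 (same shape as the
 * 4-rank run above). */
int main(void)
{
    char buf[128];
    int n = build_places(12, 4, 2, buf, sizeof buf);
    assert(n == 3);
    printf("Rank 2 places: %s\n", buf);
    return 0;
}
#endif
```

Compiling with -DPLACES_SKETCH_MAIN builds the small demo. The list is written without spaces after the commas, so it differs cosmetically from the debug prints above.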


naughtont3 commented on July 29, 2024

I've added initial changes (#3) for libomp to set up an OMP_PLACES that is defined as mentioned above. I also added changes in the MOC repo to help display places info (see MOC repo PR-2 and PR-3).
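
As a rough illustration of the "set OMP_PLACES from startup code" step, here is a minimal sketch using the POSIX setenv() call. It is not the change from #3: the helper name export_omp_places() and the overwrite policy are assumptions made for the example. A variable exported this way can only matter if it is set before the runtime reads its environment settings, which this sketch does not address.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* for setenv()/unsetenv() */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libomp): exports a places list via
 * the OMP_PLACES environment variable. If overwrite is 0, a value
 * already set by the user is kept. Returns 0 on success, -1 on error. */
static int export_omp_places(const char *places, int overwrite)
{
    if (places == NULL || places[0] == '\0')
        return -1;
    return setenv("OMP_PLACES", places, overwrite);
}

#ifdef PLACES_ENV_SKETCH_MAIN
/* Demo: export the rank-0 places from the 4-rank run and read them back. */
int main(void)
{
    unsetenv("OMP_PLACES");
    assert(export_omp_places("{0},{1},{2}", 0) == 0);
    const char *val = getenv("OMP_PLACES");
    assert(val != NULL && strcmp(val, "{0},{1},{2}") == 0);
    printf("OMP_PLACES=%s\n", val);
    return 0;
}
#endif
```

Passing overwrite as 0 leaves a user-provided OMP_PLACES untouched, which may or may not be the desired policy here.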

For the ticket trail, here's the output from the current tests (using a script to pretty-print the output from the logs).

beaker:$ pwd
/home/tjn/projects/ompi-ecp/source/moc-git-br-master/test
beaker:$ make -f Makefile.devel
clang -I/home/tjn/projects/ompi-ecp/source/llvm/release_50/install/debug/include \
	  -I/home/tjn/projects/ompi-ecp/install/include \
      -L/home/tjn/projects/ompi-ecp/source/llvm/release_50/install/debug/lib \
      -L/home/tjn/projects/ompi-ecp/install/lib \
          -I/home/tjn/projects/ompi-ecp/install/include/openmpi -I/home/tjn/projects/ompi-ecp/install/include/openmpi/opal/mca/hwloc/hwloc2a/hwloc/include -I/home/tjn/projects/ompi-ecp/install/include -I/home/tjn/projects/ompi-ecp/install/include -pthread -fopenmp -g -pthread -L/home/tjn/projects/ompi-ecp/install/lib -Wl,-rpath -Wl,/home/tjn/projects/ompi-ecp/install/lib -Wl,-rpath -Wl,/home/tjn/projects/ompi-ecp/install/lib -Wl,--enable-new-dtags -L/home/tjn/projects/ompi-ecp/install/lib -lmpi \
          -o mpi_omp_hello.dbg mpi_omp_hello.c -lpmix -lmoc
beaker:$ orterun -n 2 ./mpi_omp_hello.dbg >& 2rank
beaker:$ orterun -n 4 ./mpi_omp_hello.dbg >& 4rank
beaker:$ orterun -n 1 ./mpi_omp_hello.dbg >& 1rank
beaker:$ orterun -n 6 ./mpi_omp_hello.dbg >& 6rank
beaker:$ cat SHOW-RESULTS.sh
#!/bin/bash

FILE=${1:-}
if [ "x$FILE" = "x" ] ; then
    echo "Usage: $0  LOG_FILENAME"
    exit 1
fi

grep PROC_BIND $FILE
echo "=========================================="
grep "Number of threads" $FILE |sort -k1
echo "=========================================="
grep OMP-RT $FILE | grep -v OVERRIDE | sort -k5
echo "=========================================="
grep "  \[RANK:" $FILE  | sort -k1
beaker:$ ./SHOW-RESULTS.sh 1rank
[kmp_runtime.cpp:__kmp_partition_places:4763: 77472] DBG: __kmp_partition_places() BINDING = PROC_BIND_SPREAD
==========================================
[RANK:0,TID:0] Number of threads = 11
==========================================
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77472) Rank: 0  places: {0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}, {11}
==========================================
  [RANK:0,TID:0,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:0,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:10,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:10,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:1,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:1,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:2,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:2,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:3,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:3,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:4,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:4,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:5,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:5,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:6,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:6,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:7,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:7,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:8,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:8,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:0,TID:9,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:9,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 6
beaker:$ ./SHOW-RESULTS.sh 2rank
[kmp_runtime.cpp:__kmp_partition_places:4763: 77429] DBG: __kmp_partition_places() BINDING = PROC_BIND_SPREAD
[kmp_runtime.cpp:__kmp_partition_places:4763: 77430] DBG: __kmp_partition_places() BINDING = PROC_BIND_SPREAD
==========================================
[RANK:0,TID:0] Number of threads = 5
[RANK:1,TID:0] Number of threads = 5
==========================================
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77429) Rank: 0  places: {0}, {1}, {2}, {3}, {4}, {5}
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77430) Rank: 1  places: {6}, {7}, {8}, {9}, {10}, {11}
==========================================
  [RANK:0,TID:0,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:1,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:2,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:3,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:4,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 0
  [RANK:1,TID:0,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 7
  [RANK:1,TID:1,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 7
  [RANK:1,TID:2,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 7
  [RANK:1,TID:3,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 7
  [RANK:1,TID:4,PLACE:0] num_places = 1, places_num_procs = 1, places_procids = 7
beaker:$ ./SHOW-RESULTS.sh 4rank
[kmp_runtime.cpp:__kmp_partition_places:4763: 77451] DBG: __kmp_partition_places() BINDING = PROC_BIND_SPREAD
[kmp_runtime.cpp:__kmp_partition_places:4763: 77450] DBG: __kmp_partition_places() BINDING = PROC_BIND_SPREAD
[kmp_runtime.cpp:__kmp_partition_places:4763: 77448] DBG: __kmp_partition_places() BINDING = PROC_BIND_SPREAD
[kmp_runtime.cpp:__kmp_partition_places:4763: 77449] DBG: __kmp_partition_places() BINDING = PROC_BIND_SPREAD
==========================================
[RANK:0,TID:0] Number of threads = 2
[RANK:1,TID:0] Number of threads = 2
[RANK:2,TID:0] Number of threads = 2
[RANK:3,TID:0] Number of threads = 2
==========================================
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77448) Rank: 0  places: {0}, {1}, {2}
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77449) Rank: 1  places: {3}, {4}, {5}
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77450) Rank: 2  places: {6}, {7}, {8}
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77451) Rank: 3  places: {9}, {10}, {11}
==========================================
  [RANK:0,TID:0,PLACE:0] num_places = 3, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:0,PLACE:1] num_places = 3, places_num_procs = 1, places_procids = 1
  [RANK:0,TID:0,PLACE:2] num_places = 3, places_num_procs = 1, places_procids = 2
  [RANK:0,TID:1,PLACE:0] num_places = 3, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:1,PLACE:1] num_places = 3, places_num_procs = 1, places_procids = 1
  [RANK:0,TID:1,PLACE:2] num_places = 3, places_num_procs = 1, places_procids = 2
  [RANK:1,TID:0,PLACE:0] num_places = 3, places_num_procs = 1, places_procids = 3
  [RANK:1,TID:0,PLACE:1] num_places = 3, places_num_procs = 1, places_procids = 4
  [RANK:1,TID:0,PLACE:2] num_places = 3, places_num_procs = 1, places_procids = 5
  [RANK:1,TID:1,PLACE:0] num_places = 3, places_num_procs = 1, places_procids = 3
  [RANK:1,TID:1,PLACE:1] num_places = 3, places_num_procs = 1, places_procids = 4
  [RANK:1,TID:1,PLACE:2] num_places = 3, places_num_procs = 1, places_procids = 5
  [RANK:2,TID:0,PLACE:0] num_places = 3, places_num_procs = 1, places_procids = 6
  [RANK:2,TID:0,PLACE:1] num_places = 3, places_num_procs = 1, places_procids = 7
  [RANK:2,TID:0,PLACE:2] num_places = 3, places_num_procs = 1, places_procids = 8
  [RANK:2,TID:1,PLACE:0] num_places = 3, places_num_procs = 1, places_procids = 6
  [RANK:2,TID:1,PLACE:1] num_places = 3, places_num_procs = 1, places_procids = 7
  [RANK:2,TID:1,PLACE:2] num_places = 3, places_num_procs = 1, places_procids = 8
  [RANK:3,TID:0,PLACE:0] num_places = 3, places_num_procs = 1, places_procids = 9
  [RANK:3,TID:0,PLACE:1] num_places = 3, places_num_procs = 1, places_procids = 10
  [RANK:3,TID:0,PLACE:2] num_places = 3, places_num_procs = 1, places_procids = 11
  [RANK:3,TID:1,PLACE:0] num_places = 3, places_num_procs = 1, places_procids = 9
  [RANK:3,TID:1,PLACE:1] num_places = 3, places_num_procs = 1, places_procids = 10
  [RANK:3,TID:1,PLACE:2] num_places = 3, places_num_procs = 1, places_procids = 11
beaker:$ ./SHOW-RESULTS.sh 6rank
==========================================
[RANK:0,TID:0] Number of threads = 1
[RANK:1,TID:0] Number of threads = 1
[RANK:2,TID:0] Number of threads = 1
[RANK:3,TID:0] Number of threads = 1
[RANK:4,TID:0] Number of threads = 1
[RANK:5,TID:0] Number of threads = 1
==========================================
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77490) Rank: 0  places: {0}, {1}
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77491) Rank: 1  places: {2}, {3}
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77492) Rank: 2  places: {4}, {5}
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77493) Rank: 3  places: {6}, {7}
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77494) Rank: 4  places: {8}, {9}
[kmp_runtime.cpp:__kmp_register_library_startup:6638] OMP-RT (pid:77495) Rank: 5  places: {10}, {11}
==========================================
  [RANK:0,TID:0,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 0
  [RANK:0,TID:0,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 1
  [RANK:1,TID:0,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 2
  [RANK:1,TID:0,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 3
  [RANK:2,TID:0,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 4
  [RANK:2,TID:0,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 5
  [RANK:3,TID:0,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 6
  [RANK:3,TID:0,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 7
  [RANK:4,TID:0,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 8
  [RANK:4,TID:0,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 9
  [RANK:5,TID:0,PLACE:0] num_places = 2, places_num_procs = 1, places_procids = 10
  [RANK:5,TID:0,PLACE:1] num_places = 2, places_num_procs = 1, places_procids = 11
beaker:$


naughtont3 commented on July 29, 2024

This needs more testing, and I'm not sure the places/binding are what we want. But it is a start. :-)


gvallee commented on July 29, 2024

So does it mean that __kmp_partition_places() is not a suitable place to set the number of places? Also, is it certain that by setting OMP_PLACES at the proposed location, the team and the master_th structures will be correctly set so that the appropriate work will happen in __kmp_partition_places()? Maybe I am missing something, but at the moment we only know for sure that the OMP_PLACES env variable is set based on what we are trying to do, not that the actual partitioning and place creation are correct.
In other words, what is the reason for not looking at __kmp_partition_places()?


naughtont3 commented on July 29, 2024

This was just a first step: setting OMP_PLACES. I think the info I have in startup will be used in __kmp_partition_places() for a better next step. I didn't fully understand the relationship between binding and the placement pieces in __kmp_partition_places(), so I broke it into two steps.
