ecp-copa / cabana

Performance-portable library for particle-based simulations

License: Other

CMake 2.87% C++ 96.29% Dockerfile 0.21% C 0.04% Python 0.60%
particles exascale-computing exascale kokkos co-design high-performance-computing hpc

cabana's Introduction

Cabana

Cabana is a performance portable library for particle-based simulations. Applications include, but are not limited to, molecular dynamics (MD) with short- and/or long-range atomic interactions; various flavors of particle-in-cell (PIC) methods, including use within fluid/solid mechanics and plasma physics; N-body cosmology simulations; and peridynamics for fracture mechanics.

Cabana provides particle data structures, algorithms, and communication, as well as structured grids, grid algorithms, and particle-grid interpolation to enable simulations on a variety of platforms including many-core CPU and GPU architectures. Cabana is built on Kokkos, with many additional optional library dependencies, including MPI for multi-node simulation.

Cabana is developed as part of the Co-Design Center for Particle Applications (CoPA) within the Exascale Computing Project (ECP) under the U.S. Department of Energy. CoPA is a multi-institutional project with developers from ORNL, LANL, SNL, LLNL, PPPL, and ANL.

Documentation

Instructions for building Cabana on various platforms, an API reference with tutorial links, and links to the Doxygen can be found in our wiki.

For Cabana-related questions you can open a GitHub issue to interact with the developers.

Contributing

We encourage you to contribute to Cabana! Please check the guidelines on how to do so.

Citing

If you use Cabana in your work, please cite the JOSS article. Also consider citing the appropriate release.

License

Cabana is distributed under an open source 3-clause BSD license.

cabana's People

Contributors

abisner, aetx, aprokop, ascheinb, brtnfld, cwsmith, dalg24, davidjoy8, dineshadepu, emedwede, github-actions[bot], guangyechen, juanecopro, junghans, kwitaechong, lebuller, patrickb314, rfbird, rhalver, sfogerty, sslattery, streeve, weinbe2, xzzx, yuxingqiu

cabana's Issues

c++: error: unrecognized command line option '--diag_suppress=esa_on_defaulted_function_ignored'

Cuda fans, any idea about this error:

c++: error: unrecognized command line option '--diag_suppress=esa_on_defaulted_function_ignored'

I tried a different gcc already....

Details:

Running with gitlab-runner  (c364eff5)
  on ascent_copa_setuid 47b77dfd
Using SetUID Shell executor...
Running on login1...
Fetching changes...
Removing kokkos.install/
Removing kokkos/
HEAD is now at 6e6c9eb try newer gcc
Checking out 6e6c9eb9 as ci-cuda...
Skipping Git submodules setup
Downloading artifacts for BuildKokkos Cuda (38102)...
About to register the Shell executor type...
About to register the Batch Runner...
Downloading artifacts from coordinator... ok        id=38102 responseStatus=200 OK token=xheXy1AZ
$ CI_PROJECT_DIR=${PWD}
$ module load cmake
$ module load cuda
$ module load gcc/6.4.0

Lmod is automatically replacing "xl/16.1.1-1" with "gcc/6.4.0".


Due to MODULEPATH changes, the following have been reloaded:
  1) spectrum-mpi/10.2.0.10-20181214

$ for i in ${BACKENDS}; do CMAKE_OPTS+=( -DCabana_ENABLE_${i}=ON ); done
$ j="$(grep -c processor /proc/cpuinfo 2>/dev/null)" || j=0; ((j++))
$ mkdir build && cd build && cmake -DCMAKE_PREFIX_PATH=${CI_PROJECT_DIR}/kokkos.install -DCabana_ENABLE_TESTING=ON -DCabana_ENABLE_Serial=OFF -DCabana_ENABLE_EXAMPLES=OFF ${CMAKE_OPTS[@]} .. && make -k -j${j} VERBOSE=1 && make test CTEST_OUTPUT_ON_FAILURE=1
-- The CXX compiler identification is GNU 6.4.0
-- Check for working CXX compiler: /sw/ascent/gcc/6.4.0/bin/c++
-- Check for working CXX compiler: /sw/ascent/gcc/6.4.0/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found KOKKOS: /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/kokkos.install  
-- The CUDA compiler identification is NVIDIA 9.2.148
-- Check for working CUDA compiler: /sw/ascent/cuda/9.2.148/bin/nvcc
-- Check for working CUDA compiler: /sw/ascent/cuda/9.2.148/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Enable Devices: Cuda
-- Found Doxygen: /bin/doxygen (found version "1.8.5") found components:  doxygen missing components:  dot
-- Performing Test COMPILER_SUPPORTS_MARCH
-- Performing Test COMPILER_SUPPORTS_MARCH - Failed
-- Found Git: /bin/git (found version "1.8.3.1") 
-- Cabana Revision = '6e6c9eb9aa8964c7cb2856f472ce60829eb4aa2d'
-- Configuring done
-- Generating done
-- Build files have been written to: /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build
/autofs/nccsopen-svm1_sw/ascent/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.12.2-vltu2cerbdpbvzwpil2fvw4skvjmabfl/bin/cmake -H/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana -B/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build --check-build-system CMakeFiles/Makefile.cmake 0
/autofs/nccsopen-svm1_sw/ascent/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.12.2-vltu2cerbdpbvzwpil2fvw4skvjmabfl/bin/cmake -E cmake_progress_start /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/CMakeFiles /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
make -f core/src/CMakeFiles/cabanacore.dir/build.make core/src/CMakeFiles/cabanacore.dir/depend
make -f core/unit_test/CMakeFiles/cabana_core_gtest.dir/build.make core/unit_test/CMakeFiles/cabana_core_gtest.dir/depend
make[2]: Entering directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
cd /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build && /autofs/nccsopen-svm1_sw/ascent/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.12.2-vltu2cerbdpbvzwpil2fvw4skvjmabfl/bin/cmake -E cmake_depends "Unix Makefiles" /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/src /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/src /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/src/CMakeFiles/cabanacore.dir/DependInfo.cmake --color=
make[2]: Entering directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
cd /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build && /autofs/nccsopen-svm1_sw/ascent/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.12.2-vltu2cerbdpbvzwpil2fvw4skvjmabfl/bin/cmake -E cmake_depends "Unix Makefiles" /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/unit_test /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test/CMakeFiles/cabana_core_gtest.dir/DependInfo.cmake --color=
Dependee "/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test/CMakeFiles/cabana_core_gtest.dir/DependInfo.cmake" is newer than depender "/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test/CMakeFiles/cabana_core_gtest.dir/depend.internal".
Dependee "/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/src/CMakeFiles/cabanacore.dir/DependInfo.cmake" is newer than depender "/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/src/CMakeFiles/cabanacore.dir/depend.internal".
Dependee "/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test/CMakeFiles/cabana_core_gtest.dir/depend.internal".
Dependee "/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/src/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/src/CMakeFiles/cabanacore.dir/depend.internal".
Scanning dependencies of target cabanacore
make[2]: Leaving directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
make -f core/src/CMakeFiles/cabanacore.dir/build.make core/src/CMakeFiles/cabanacore.dir/build
make[2]: Entering directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
[  2%] Building CXX object core/src/CMakeFiles/cabanacore.dir/impl/Cabana_Version.cpp.o
cd /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/src && /sw/ascent/gcc/6.4.0/bin/c++   -I/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/src -I/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/src -isystem /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/kokkos.install/include  -O3 -DNDEBUG   --std=c++11 -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored -o CMakeFiles/cabanacore.dir/impl/Cabana_Version.cpp.o -c /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/src/impl/Cabana_Version.cpp
Scanning dependencies of target cabana_core_gtest
make[2]: Leaving directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
make -f core/unit_test/CMakeFiles/cabana_core_gtest.dir/build.make core/unit_test/CMakeFiles/cabana_core_gtest.dir/build
make[2]: Entering directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
c++: error: unrecognized command line option '-Xcudafe'
[  5%] Building CXX object core/unit_test/CMakeFiles/cabana_core_gtest.dir/__/__/gtest/gtest/gtest-all.cc.o
cd /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test && /sw/ascent/gcc/6.4.0/bin/c++   -I/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/gtest -I/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test -I/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/unit_test  -DGTEST_HAS_PTHREAD=0 -O3 -DNDEBUG   -std=c++11 -o CMakeFiles/cabana_core_gtest.dir/__/__/gtest/gtest/gtest-all.cc.o -c /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/gtest/gtest/gtest-all.cc
c++: error: unrecognized command line option '--diag_suppress=esa_on_defaulted_function_ignored'
make[2]: *** [core/src/CMakeFiles/cabanacore.dir/impl/Cabana_Version.cpp.o] Error 1
make[2]: Target `core/src/CMakeFiles/cabanacore.dir/build' not remade because of errors.
make[2]: Leaving directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
make[1]: *** [core/src/CMakeFiles/cabanacore.dir/all] Error 2
[  7%] Linking CXX static library libcabana_core_gtest.a
cd /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test && /autofs/nccsopen-svm1_sw/ascent/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.12.2-vltu2cerbdpbvzwpil2fvw4skvjmabfl/bin/cmake -P CMakeFiles/cabana_core_gtest.dir/cmake_clean_target.cmake
cd /root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build/core/unit_test && /autofs/nccsopen-svm1_sw/ascent/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.12.2-vltu2cerbdpbvzwpil2fvw4skvjmabfl/bin/cmake -E cmake_link_script CMakeFiles/cabana_core_gtest.dir/link.txt --verbose=1
/bin/ar qc libcabana_core_gtest.a  CMakeFiles/cabana_core_gtest.dir/__/__/gtest/gtest/gtest-all.cc.o
/bin/ranlib libcabana_core_gtest.a
make[2]: Leaving directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
[  7%] Built target cabana_core_gtest
make[1]: Target `all' not remade because of errors.
make[1]: Leaving directory `/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/build'
make: *** [all] Error 2
make: Target `default_target' not remade because of errors.
ERROR: Job failed: exit status 2

unknown register name '%rdx' in 'asm'

With gcc-4.8.5, I get:

[ 30%] Built target Slice
/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/example/benchmark/Cabana_peakflops.cpp: In function 'void run()':
/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/example/benchmark/Cabana_peakflops.cpp:23:103: error: unknown register name '%rcx' in 'asm'
   asm volatile ("rdtscp;shlq $32,%%rdx;orq %%rdx,%%rax;movq %%rax,%0":"=q"(u)::"%rax", "%rdx", "%rcx");
                                                                                                       ^
/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/example/benchmark/Cabana_peakflops.cpp:23:103: error: unknown register name '%rdx' in 'asm'
/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/example/benchmark/Cabana_peakflops.cpp:23:103: error: unknown register name '%rax' in 'asm'
/root/hpc-gitlab-runner/ecp-copa/cabana/builds/users/junghans/47b77dfd/0/ecp-copa/cabana/core/example/benchmark/Cabana_peakflops.cpp:23:103: error: unknown register name '%rcx' in 'asm'

I guess we should add a check against old compilers!
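
A minimal sketch of one possible guard, assuming the registers are rejected because the build targets a non-x86 machine (the CI paths in the log above mention ppc64le) rather than because the compiler is old. The name read_cycle_counter and the std::chrono fallback are illustrative only and are not what Cabana_peakflops.cpp does.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: returns a monotonically increasing tick count.
// On x86-64 with GCC/Clang it reads the time-stamp counter via rdtscp;
// everywhere else it falls back to a portable clock.
inline std::uint64_t read_cycle_counter()
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    std::uint32_t lo, hi, aux;
    // rdtscp writes the counter to EDX:EAX and a processor id to ECX.
    asm volatile("rdtscp" : "=a"(lo), "=d"(hi), "=c"(aux));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
#else
    // Portable fallback (assumption): ticks of a steady clock, not CPU cycles.
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

With a guard like this the x86 register names never reach a compiler for another architecture, at the cost of the fallback measuring clock ticks instead of cycles.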

Release 0.1

Target 0.1 Release Features:

  1. Core AoSoA-related data structures - Stuart (ORNL)
  2. Sort interface, binning/sort data implementation, and permutation implementation for AoSoA - Stuart (ORNL)
  3. Neighbor list interface and Verlet list implementation - Stuart (ORNL)
  4. Linked cell list implementation - Stuart (ORNL)
  5. Basic examples - Stuart (ORNL)
  6. Documentation - Bob (LANL) and Stuart (ORNL)

To do:

  • Change sizes to std::size_t from int
  • Remove field size and rank functions in Slice
  • Renaming Field -> Member
  • Vector length size. At a minimum change 8 to 16 but try to pull from Kokkos. Add configuration time option in future if necessary
  • Add BSD 3-Clause License
  • Create a version 0.1 branch, remove everything not in release, and tag
  • Doxygen build
  • Remove Kokkos words from API
  • Clean examples
  • Move to GitHub
  • Improve README

Solvers Package

This issue is to track the development of a solvers package within the library. I propose the following strategy, with each of these steps being a pull request:

  • Merge #95 as an example in core/

  • Write a design requirements document that captures the needs of each application in terms of long-range solvers and specifies the core capabilities of the new solver package.

  • Create a new DirectSolver capability in Cajita: https://github.com/ECP-copa/Cajita

Note that this new package will be developed in Cajita. It will also carry its own set of dependencies, including FFT libraries.

Get rid of duplicated code in core/unit_test/CMakeLists.txt

More specifically avoid duplication of code between

foreach(_device ${CABANA_SUPPORTED_DEVICES})
  if(Cabana_ENABLE_${_device})
    set(_dir ${CMAKE_CURRENT_BINARY_DIR}/${_device})
    file(MAKE_DIRECTORY ${_dir})
    foreach(_test AoSoA Slice DeepCopy Tuple Sort LinkedCellList NeighborList Parallel)
      set(_file ${_dir}/tst${_test}_${_device}.cpp)
      file(WRITE ${_file} "#include <Test${_device}_Category.hpp>\n")
      file(APPEND ${_file} "#include <tst${_test}.hpp>\n")
      set(_target ${_test}_test_${_device})
      add_executable(${_target} ${_file} unit_test_main.cpp)
      target_include_directories(${_target} PUBLIC ${_dir})
      target_link_libraries(${_target} cabanacore cabana_core_gtest)
      if(_device STREQUAL Pthread OR _device STREQUAL OpenMP)
        foreach(_thread 1 2)
          add_test(NAME ${_target}_${_thread} COMMAND
            ${_target} --gtest_color=yes --kokkos-threads=${_thread})
        endforeach()
      else()
        add_test(NAME ${_target} COMMAND ${_target} ${gtest_args})
      endif()
    endforeach()
  endif()
endforeach()
and
if(${Cabana_ENABLE_MPI})
  foreach(_device ${CABANA_SUPPORTED_DEVICES})
    if(Cabana_ENABLE_${_device})
      set(_dir ${CMAKE_CURRENT_BINARY_DIR}/${_device})
      file(MAKE_DIRECTORY ${_dir})
      foreach(_test CommunicationPlan Distributor Halo)
        set(_file ${_dir}/tst${_test}_${_device}.cpp)
        file(WRITE ${_file} "#include <Test${_device}_Category.hpp>\n")
        file(APPEND ${_file} "#include <tst${_test}.hpp>\n")
        set(_target ${_test}_test_${_device})
        add_executable(${_target} ${_file} mpi_unit_test_main.cpp)
        target_include_directories(${_target} PUBLIC ${_dir})
        target_link_libraries(${_target} cabanacore cabana_core_gtest)
        set(TEST_MPIEXEC_NUMPROCS "")
        list(APPEND TEST_MPIEXEC_NUMPROCS 1)
        if(MPIEXEC_MAX_NUMPROCS GREATER 1)
          list(APPEND TEST_MPIEXEC_NUMPROCS ${MPIEXEC_MAX_NUMPROCS})
        endif()
        foreach(_np ${TEST_MPIEXEC_NUMPROCS})
          add_test(NAME ${_target}_${_np} COMMAND
            ${MPIEXEC} ${MPIEXEC_NUMPROC_FLAG} ${_np} ${MPIEXEC_PREFLAGS}
            ${_target} ${MPIEXEC_POSTFLAGS} ${gtest_args})
        endforeach()
      endforeach()
    endif()
  endforeach()
endif()

Originally posted by @junghans in #101 (comment)

Add a Label to AoSoA and Slice Objects

The Kokkos::View has a string label. The AoSoA and Slice should have one as well (and it should be applied to the Kokkos::View used as an implementation detail). It would be nice if the slice label included the label of the AoSoA from which it was derived in addition to a suffix indicating its member index. For example, an AoSoA with label "MyData" when sliced over member 1 would have a label "MyData_1". This label should also be propagated to mirrors.
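
The naming rule described above can be sketched as follows. This is only an illustration of the label format; make_slice_label is a hypothetical helper and not part of the Cabana API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: composes "<AoSoA label>_<member index>",
// the format suggested in the issue text.
inline std::string make_slice_label(const std::string& aosoa_label,
                                    std::size_t member_index)
{
    return aosoa_label + "_" + std::to_string(member_index);
}
```

Under this rule, make_slice_label("MyData", 1) yields "MyData_1", and a mirror would simply reuse the label of the object it mirrors.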

Add member `permute`

The implementations of permute currently only work on an entire AoSoA. They should also work on Slice as some algorithms only sort some particle members and not others.

Per further discussion with @streeve we would also like a permute function which applies to a subset of the data structure.
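
A rough sketch of what a permute over a subset could look like, using a plain std::vector in place of an AoSoA or Slice. The function name permute_range and the convention that order[i] names the source position within the range are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: reorders only values[begin, end).
// After the call, values[begin + i] holds the old values[begin + order[i]].
template <class T>
void permute_range(std::vector<T>& values, std::size_t begin, std::size_t end,
                   const std::vector<std::size_t>& order)
{
    assert(begin <= end && end <= values.size());
    assert(order.size() == end - begin);

    // Copy the affected range so reads are not disturbed by writes.
    std::vector<T> scratch(values.begin() + static_cast<std::ptrdiff_t>(begin),
                           values.begin() + static_cast<std::ptrdiff_t>(end));
    for (std::size_t i = 0; i < order.size(); ++i)
        values[begin + i] = scratch[order[i]];
}
```

Elements outside [begin, end) are left untouched, which is the property the subset variant needs.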

Reuse LinkedCellList in VerletList Construction

If a user has already created a LinkedCellList for another purpose they should be able to reuse it when building a VerletList as long as that LinkedCellList was built with the same parameters.

Using --force_uvm flag for Kokkos breaks Cabana compilation

Using the --force_uvm flag (or the Kokkos_ENABLE_Cuda_UVM flag for Kokkos CMake) breaks the compilation of Cabana because of a mismatch in the memory spaces used, e.g. in Cabana_LinkedCellList.cpp.

/home/halver/kokkos/build_test/install/include/Kokkos_View.hpp(2006): error: static assertion failed with "Incompatible View copy construction"
          detected during:
            instantiation of "Kokkos::View<DataType, Properties...>::View(const Kokkos::View<RT, RP...> &, std::enable_if<Kokkos::Impl::ViewMapping<Kokkos::View<DataType, Properties...>::traits, Kokkos::View<RT, RP...>::traits, Kokkos::ViewTraits<DataType, Properties...>::specialize>::is_assignable_data_type, void>::type *) [with DataType=int *, Properties=<Kokkos::LayoutLeft, Kokkos::Cuda>, RT=int *, RP=<Kokkos::CudaSpace::memory_space>]"
/home/halver/kokkos/build_test/install/include/Kokkos_ScatterView.hpp(697): here
            instantiation of "Kokkos::Experimental::ScatterView<DataType, Layout, ExecSpace, Op, 0, contribution>::ScatterView(const Kokkos::View<RT, RP...> &) [with DataType=int *, Op=0, ExecSpace=Kokkos::Cuda, Layout=Kokkos::LayoutLeft, contribution=1, RT=int *, RP=<Kokkos::CudaSpace::memory_space>]"
/home/halver/kokkos/build_test/install/include/Kokkos_ScatterView.hpp(1283): here
            instantiation of "Kokkos::Experimental::ScatterView<RT, Kokkos::ViewTraits<RT, RP...>::array_layout, Kokkos::ViewTraits<RT, RP...>::execution_space, Op, <expression>, <expression>> Kokkos::Experimental::create_scatter_view(const Kokkos::View<RT, RP...> &) [with Op=0, duplication=-1, contribution=-1, RT=int *, RP=<Kokkos::CudaSpace::memory_space>]"
/home/halver/projects/Cabana_SPME/Cabana/core/src/Cabana_LinkedCellList.hpp(228): here
            instantiation of "void Cabana::LinkedCellList<DeviceType>::build(SliceType, std::size_t, std::size_t) [with DeviceType=Kokkos::CudaSpace::memory_space, SliceType=Cabana::Slice<double [3], Kokkos::CudaSpace, Cabana::DefaultAccessMemory, 32, 144>]"
/home/halver/projects/Cabana_SPME/Cabana/core/src/Cabana_LinkedCellList.hpp(101): here
            instantiation of "Cabana::LinkedCellList<DeviceType>::LinkedCellList(SliceType, std::size_t, std::size_t, const SliceType::value_type *, const SliceType::value_type *, const SliceType::value_type *, std::enable_if<Cabana::is_slice<SliceType>::value, int>::type *) [with DeviceType=Kokkos::CudaSpace::memory_space, SliceType=Cabana::Slice<double [3], Kokkos::CudaSpace, Cabana::DefaultAccessMemory, 32, 144>]"
/home/halver/projects/Cabana_SPME/Cabana/core/unit_test/tstLinkedCellList.hpp(76): here

As per Stuart's suggestion, related to: #149

Compiler warnings with Cuda 10.1

I compiled Cabana with gcc and Cuda 10.1 on Summit. It compiled without errors, but I got hundreds of these warnings:

warning: calling a __host__ function("std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string") from a __host__ __device__ function("Cabana::Slice<double [3][2], ::Kokkos::HostSpace, ::Cabana::DefaultAccessMemory, (int)8, (int)172> ::~Slice") is not allowed

(The specific functions in quotes varied.)

Not sure why this is a warning rather than an error, but hopefully it will still run okay.

Check if a header is not included in src/CMakeLists.txt by accident

#43 changed the build so that GLOB is no longer used, which allows specific components to be enabled based on optional TPL availability. We want to know whether a header that a developer accidentally fails to add is caught and, if errors occur, where they occur: when building the tests, or only when someone links against an install and finds the header missing?

Automatically detect vector length from Kokkos arch configuration flag

Currently we have sensible default vector lengths defined in our PerformanceTraits class for different memory spaces (e.g. 16 for CPU and 32 for GPU).

Kokkos configuration allows for the specification of an architecture flag from numerous choices (e.g. Pascal60 for a P100 with compute capability 6.0 or HSW for Intel Haswell). We should be able to grab these flags, create an environment variable, and automatically set the vector length for the user by default. The user should still be able to override, however.
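
A minimal sketch of the selection logic, assuming the architecture name is available as a string. The defaults 16 and 32 and the names Pascal60 and HSW come from the text above; the functions is_gpu_arch and select_vector_length, and the idea of classifying by string comparison, are hypothetical.

```cpp
#include <string>

// Defaults quoted in the issue: 16 for CPU, 32 for GPU.
constexpr int cpu_default_vector_length = 16;
constexpr int gpu_default_vector_length = 32;

// Hypothetical classification; only the GPU name mentioned in the issue
// is recognized here. A real list would cover every Kokkos arch flag.
inline bool is_gpu_arch(const std::string& arch)
{
    return arch == "Pascal60";
}

// Returns the user override when one is given (> 0), else the default
// for the architecture class.
inline int select_vector_length(const std::string& arch, int user_override = 0)
{
    if (user_override > 0)
        return user_override;
    return is_gpu_arch(arch) ? gpu_default_vector_length
                             : cpu_default_vector_length;
}
```

The override parameter reflects the requirement that the user can still pick a vector length explicitly.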

Performance regression benchmarks

Capturing outcome of discussion:

  • A peak flops benchmark (ala GY)
  • BW bound kernel test
  • A demonstration of the performance implications of kernel fusion/fission
  • A demonstration of the performance implications of using multiple lists to track particle properties
  • An MPM like kernel
  • A VPIC like PIC kernel
  • Communication halo exchange/redistribution
  • Neighbor list generation

Add regular CUDA Memory Space

We need to support the regular CUDA memory space in Cabana. The NVIDIA guys have informed me that this will achieve superior performance when using CUDA-aware MPI and GPUDirect implementations for message passing. We already have deepCopy so we just need to expose this in the interface and update the unit tests to copy data explicitly to the executing device.

wiki updates

The docs dir was Stuart's original attempt to generate something that gitlab would render. The GitHub wiki is intended to replace the docs folder at the end of the day - one of the things we should be able to remove for the release!

I have made an attempt to migrate the doc dir to the wiki, and added a preliminary benchmarks page.

Installing Cabana with PGI on Summit

Installation with pgi/18.10 was successful after the following 3 changes:

  • In gtest/gtest/gtest-all.cc, I had to change #include "gtest/gtest.h" to #include "gtest.h"
  • After running cmake, I had to remove an instance of the compiler flag -fast
  • and change instances of --c++11 to --std=c++11

Add a typedef for Slices in the AoSoA

Currently there is no typedef within an AoSoA for a slice of a given member returned by a call to AoSoA::Slice<M>(). This can then force the users to write out the type in certain circumstances. We want an interface that gives the type as AoSoA::slice_type<M> where a templated type alias is used.
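
One way such an alias could be written is sketched below with toy stand-ins. ToyAoSoA and ToySlice are not the Cabana classes; they only show the shape of a member alias template indexed by M.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Toy stand-in for a slice of one member type.
template <class T>
struct ToySlice
{
    using value_type = T;
};

// Toy stand-in for an AoSoA parameterized on its member types.
template <class... MemberTypes>
struct ToyAoSoA
{
    // Type of the M-th member.
    template <std::size_t M>
    using member_type =
        typename std::tuple_element<M, std::tuple<MemberTypes...>>::type;

    // The requested alias: the slice type for member M.
    template <std::size_t M>
    using slice_type = ToySlice<member_type<M>>;

    // Returns a (toy, empty) slice of member M.
    template <std::size_t M>
    slice_type<M> slice() const
    {
        return slice_type<M>{};
    }
};
```

User code could then declare ToyAoSoA<double, int>::slice_type<1> directly; in a dependent context the spelling becomes typename AoSoA_t::template slice_type<M>.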

Buffered Parallel For

The XGC team indicated that they need to run more particles than fit in the available GPU memory. Using CUDA UVM allows for more particles, but the overhead of swapping pages in and out does not allow for good performance. We propose a buffering strategy that overlaps computation and data fetching in a new parallel for construct. Some requirements:

  • User declares the memory space the data will be provided in
  • User declares the execution space in which the computation will be performed. This is compared against the memory space and if they are different (e.g. CPU memory and GPU computation) a buffering strategy is deployed
  • User declares the maximum number of tuples (particles) allowed to be allocated in the execution space. This should be a number that doesn't overflow memory in the compute space
  • User optionally provides the number of buffers used to break up computation and data movement
  • This will be employed in a new buffered_parallel_for/buffered_simd_parallel_for concept which will implement a fetch/compute/write strategy between the buffers
  • This should work for both Kokkos::RangePolicy as well as Cabana::SimdPolicy - we will handle the begin/end loops over partially filled SoAs
  • NOTE: The vector length of the input AoSoA must match that of the AoSoA that is performant in the execution space
  • NOTE: This will require the implementation of an AoSoA/Slice subview to be performant
  • NOTE: The design of this should conceptually be similar to Kokkos::ScatterView - create an object that manages the memory a user will access (i.e. AoSoA, Slice, Kokkos view) and then give users access to the active compute buffer in their functors.
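
The fetch/compute/write cycle in the requirements above can be sketched serially on the host. This is only a shape of the idea under simplifying assumptions: toy_buffered_for is a hypothetical name, a std::vector stands in for the AoSoA, a single buffer is used, and nothing is overlapped.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical serial stand-in for a buffered parallel for.
// 'buffer_capacity' plays the role of the maximum number of tuples
// allowed in the execution space.
template <class T, class Functor>
void toy_buffered_for(std::vector<T>& data, std::size_t buffer_capacity,
                      Functor functor)
{
    if (buffer_capacity == 0)
        return;

    std::vector<T> buffer;
    buffer.reserve(buffer_capacity);

    for (std::size_t begin = 0; begin < data.size(); begin += buffer_capacity)
    {
        const std::size_t end = std::min(begin + buffer_capacity, data.size());
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
        const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);

        // Fetch: copy one chunk into the compute buffer.
        buffer.assign(first, last);

        // Compute: the functor sees the global index and the buffered value.
        for (std::size_t i = 0; i < buffer.size(); ++i)
            functor(begin + i, buffer[i]);

        // Write: copy the results back to the source memory.
        std::copy(buffer.begin(), buffer.end(), first);
    }
}
```

A version meeting the stated requirements would use several buffers so that the fetch of chunk k+1 and the write of chunk k-1 overlap with the compute on chunk k.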

Add MPI support for Periodic Boundaries and Cartesian Decompositions

Many applications in PIC and MD operate on a Cartesian Domain decomposition and often employ periodic boundary conditions. Our communication structures currently do not consider a Cartesian case and are instead more general. Because of this, we may miss opportunities for optimization. In addition, it makes support for periodic boundaries difficult.

FindKokkos `INTERFACE_COMPILE_OPTIONS` propagation

@junghans will know better than me on this, but I think Gy and I just bumped into a problem.

Currently, we set INTERFACE_COMPILE_OPTIONS, and when we start using a Fortran compiler, the flags also propagate to there.

I think we need a way to limit the flags to be CXX only. A quick google suggests something like:

INTERFACE_COMPILE_OPTIONS "$<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS_WITHOUT_INCLUDES}>"

There could definitely be a better CMake way to avoid this problem, too.

Edit: In case I'm way off base with this, the symptom we're seeing is that the Fortran compiler is getting passed flags which include CXX specific things, and as far as I can tell it seems to propagate up from KOKKOS_CXX_FLAGS_WITHOUT_INCLUDES

Unmanaged AoSoA Tutorial

At a minimum we need a C-struct based unmanaged AoSoA tutorial to demonstrate using the capability merged in #114. This should likely be a separate tutorial and include calling algorithms and/or parallel_for.

Use Kokkos::ScatterView where Possible

A number of algorithms, including LinkedCellList and VerletList, make extensive use of atomics. In many places it should be possible to replace those atomic operations on a regular Kokkos::View with standard operations on a Kokkos::ScatterView.
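
To illustrate why a scatter-style container can avoid atomics, here is a small host-only sketch of the data-duplication idea: each worker increments its own private copy, and the copies are summed once at the end. It uses no Kokkos; toy_scatter_count and the strided work split are assumptions, and the workers are emulated by a loop rather than real threads.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts particles per bin using one private
// histogram per (emulated) worker followed by a single reduction.
inline std::vector<int> toy_scatter_count(
    const std::vector<std::size_t>& bin_of_particle, std::size_t num_bins,
    std::size_t num_workers)
{
    if (num_workers == 0)
        num_workers = 1;

    // One private copy of the output per worker.
    std::vector<std::vector<int>> copies(num_workers,
                                         std::vector<int>(num_bins, 0));

    // Each worker handles a strided subset and uses plain increments,
    // since no other worker touches its copy.
    for (std::size_t w = 0; w < num_workers; ++w)
        for (std::size_t p = w; p < bin_of_particle.size(); p += num_workers)
            ++copies[w][bin_of_particle[p]];

    // Contribute: combine the private copies into the final result.
    std::vector<int> total(num_bins, 0);
    for (const auto& copy : copies)
        for (std::size_t b = 0; b < num_bins; ++b)
            total[b] += copy[b];
    return total;
}
```

The trade-off is extra memory for the copies and a reduction pass in exchange for contention-free updates.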

Release 0.2

Feature Set

  • Particle redistribution (implies MPI) - Stuart (ORNL)
  • Halo exchange (implies MPI) - Ensure we capture how N-body and MD are different - Stuart (ORNL)
  • Parallel for loops for AoSoA loops - Bob and Guangye (LANL)
  • Parallel for loops for neighbor list traversal - Stan (SNL)
  • Basic distributed MD code example - Sam (LLNL)
  • Automated performance regression benchmarks - Bob (LANL)
  • Continuous integration with GitHub - Stuart (ORNL)
  • Documentation for new features - Everyone

Target Date: 12/21/18

Release 0.3

Feature Set

  • Fortran support in parallel for operations - Guangye (LANL)
  • Fortran interoperability PIC example - Aaron (PPPL)
  • C++ VPIC push example - Bob (LANL)
  • Spack build - Bob (LANL)

Target Release Date: 3/15/19

Cabana build errors

It looks like building Cabana recently started to require some additional settings:

  Could NOT find KOKKOS (missing: KOKKOS_SETTINGS_DIR KOKKOS_INCLUDE_DIR
  KOKKOS_LIBRARY)

The error showed up after following the wiki instructions:

cmake \
  -D CMAKE_BUILD_TYPE="Debug" \
  -D CMAKE_PREFIX_PATH=$KOKKOS_INSTALL_DIR \
  -D CMAKE_INSTALL_PREFIX=$CABANA_INSTALL_DIR \
  -D CMAKE_CXX_COMPILER=$KOKKOS_SRC_DIR/bin/nvcc_wrapper \
  -D Cabana_ENABLE_TESTING=ON \
  -D Cabana_ENABLE_EXAMPLES=ON \
  -D Cabana_ENABLE_Serial=ON \
  -D Cabana_ENABLE_OpenMP=ON \
  -D Cabana_ENABLE_Cuda:BOOL=ON \
  .. ;

Improve FindKOKKOS.cmake

Currently we add -ldl and -fopenmp manually, which isn't great.

However, KOKKOS installs kokkos_generated_settings.cmake, which we could include in our find macro and then generate an imported target for kokkos.

/cc @rfbird

NeighborListMDPerfTest fails to build with OpenMP-only kokkos

If I build Kokkos with OpenMP only (no Serial backend):

../generate_makefile.bash --with-openmp --prefix=$HOME/kokkos
make
make install

and then build Cabana on top:

cmake .. -DCabana_ENABLE_Serial=OFF -DCMAKE_PREFIX_PATH=$HOME/kokkos -DCabana_ENABLE_EXAMPLES=ON
make

I get the following error:

/home/junghans/computing/Cabana/core/example/md_neighbor_perf_test.cpp: In function ‘void perfTest(double, std::size_t, double)’:
/home/junghans/computing/Cabana/core/example/md_neighbor_perf_test.cpp:44:36: error: ‘Serial’ in namespace ‘Kokkos’ does not name a type
     using ExecutionSpace = Kokkos::Serial;
                                    ^~~~~~
/home/junghans/computing/Cabana/core/example/md_neighbor_perf_test.cpp:179:29: error: ‘ExecutionSpace’ was not declared in this scope
         Kokkos::RangePolicy<ExecutionSpace>(0,num_data),
                             ^~~~~~~~~~~~~~
/home/junghans/computing/Cabana/core/example/md_neighbor_perf_test.cpp:179:43: error: template argument 1 is invalid
         Kokkos::RangePolicy<ExecutionSpace>(0,num_data),
                                           ^
make[2]: *** [core/example/CMakeFiles/NeighborListMDPerfTest.dir/build.make:63: core/example/CMakeFiles/NeighborListMDPerfTest.dir/md_neighbor_perf_test.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:222: core/example/CMakeFiles/NeighborListMDPerfTest.dir/all] Error 2
make: *** [Makefile:130: all] Error 2

Add mirror view support for AoSoA

For more efficient heterogeneous computing, AoSoA needs support similar to Kokkos::create_mirror_view and Kokkos::create_mirror_view_and_copy.

Recover peakflops benchmark

The Cabana_peakflops benchmark appears to have lost performance, which should be recovered.

Indefinite Blocking in Migrate

In some cases the communication plan is created incorrectly, leading to an indefinite wait. This happens when a receive anticipates a message from the wrong MPI rank. The problem seems to lie in the indexing that specifies the number of imports from a neighbor. The line should be changed to derive the index from iterators:
_num_import[found_neighbor-_neighbors.begin()] = import_sizes[i];
https://github.com/ECP-copa/Cabana/blob/master/core/src/Cabana_CommunicationPlan.hpp#L687
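The indexing fix can be illustrated outside of Cabana with a small standalone sketch. It assumes, hypothetically, that each incoming message `i` comes from rank `import_ranks[i]` with `import_sizes[i]` elements, and that the per-neighbor import count must be stored at the position of that rank in the neighbor list rather than at position `i`. The function name `importsPerNeighbor` and the parameter `import_ranks` are illustrative and not taken from the Cabana source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Returns, for each entry of `neighbors`, the number of elements imported
// from that rank. Ranks that are not in `neighbors` are ignored in this sketch.
std::vector<std::size_t>
importsPerNeighbor(const std::vector<int>& neighbors,
                   const std::vector<int>& import_ranks,
                   const std::vector<std::size_t>& import_sizes)
{
    assert(import_ranks.size() == import_sizes.size());
    std::vector<std::size_t> num_import(neighbors.size(), 0);
    for (std::size_t i = 0; i < import_ranks.size(); ++i)
    {
        // Locate the sending rank in the neighbor list.
        auto found_neighbor =
            std::find(neighbors.begin(), neighbors.end(), import_ranks[i]);
        if (found_neighbor == neighbors.end())
            continue;
        // Index by the neighbor's position, not by the message index i.
        num_import[std::distance(neighbors.begin(), found_neighbor)] =
            import_sizes[i];
    }
    return num_import;
}
```

With neighbors `{3, 7, 1}` and messages from ranks `{1, 3}`, indexing by `i` would attribute the sizes to ranks 3 and 7; indexing by iterator distance attributes them to ranks 1 and 3 as intended.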

slicing is broken for non-UVM CUDA

There is a bad view(0).template ptr<M>() pattern being used to get pointers for slices. view(0) is an illegal access on the CPU. There should be something like SoA::static_ptr that can be used as soa_type::template static_ptr<M>(view.data()).
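A rough sketch of what such a static pointer helper could look like, assuming a fixed, standard-layout SoA: the member address is computed from the base pointer with byte offsets only, so nothing is dereferenced on the host. `ExampleSoA`, `MemberTraits`, and `staticPtr` are hypothetical names for illustration; the real Cabana SoA is templated on its member types and would need a generic version of this.

```cpp
#include <cassert>
#include <cstddef>

constexpr int vector_length = 4;

// Hypothetical fixed-layout SoA with three members.
struct ExampleSoA
{
    double x[vector_length];
    float v[vector_length];
    int id[vector_length];
};

// Maps a member index M to its element type and byte offset in ExampleSoA.
template <std::size_t M>
struct MemberTraits;

template <>
struct MemberTraits<0>
{
    using type = double;
    static constexpr std::size_t offset = offsetof(ExampleSoA, x);
};

template <>
struct MemberTraits<1>
{
    using type = float;
    static constexpr std::size_t offset = offsetof(ExampleSoA, v);
};

template <>
struct MemberTraits<2>
{
    using type = int;
    static constexpr std::size_t offset = offsetof(ExampleSoA, id);
};

// Computes the address of member M of the first SoA using pointer arithmetic
// only. The base pointer is never dereferenced, so it may refer to memory
// that is not accessible from the calling side.
template <std::size_t M>
typename MemberTraits<M>::type* staticPtr(ExampleSoA* base)
{
    assert(base != nullptr);
    auto* bytes = reinterpret_cast<unsigned char*>(base);
    return reinterpret_cast<typename MemberTraits<M>::type*>(
        bytes + MemberTraits<M>::offset);
}
```

A slice could then be built from `staticPtr<M>(view.data())` plus a stride, instead of touching `view(0)`.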

support for unstructured mesh with non-uniform particle distribution

Hello,
We are working on plasma physics codes that use an unstructured mesh and are looking for particle data structures that support:

  1. grouping particles by the elements they are positioned within,
  2. a range of particle distributions; e.g., uniform, Gaussian, scale-free/exponential,
  3. coalesced memory access for SIMD/SIMT particle 'push' operations,
  4. sufficient parallelism for particle push operations,
  5. changing the element a particle is associated with without having to rebuild the entire particle structure, and
  6. adding particles to the domain without having to rebuild the entire particle structure.

We want to group the particles by mesh element to minimize the amount of mesh field and topology data replicated and accessed during 'push' operations (new particle positions, scatter, and gather). For example, with particles grouped by element, all of the particles within an element could access the same, relatively small, fast memory that stores the field data associated with the mesh vertices of the element.

From what I can tell, Cabana supports items 3 through 5. Particles could be added to the domain, item 6, by calling the AoSoA constructor with a larger number of tuples beyond what is initially needed; the amount of reserve capacity created would be application dependent.

To support grouping by element, item 1, and non-uniform particle distributions, item 2, I can think of a few approaches:

A. We could define a tuple that includes the id of the parent mesh element and then use an AoSoA sort by element id. Relative to the particle operations that compute new particle positions for element fields, scatter, and gather, the sort seems to be a non-trivial expense.

B. We could avoid sorting, and the memory use of each tuple including a mesh element id, by storing the first and last particle/tuple that belongs to each element. Given an initial distribution of particles, we could allocate a larger than necessary set of tuples, and reserve spare tuples at the end of each element's range of active tuples. When a particle changes element, we mark the source tuple as spare and write the data to the first spare tuple in the destination element's range. When there are no spare tuples for a given element, the structure could either be rebuilt with additional capacity, or the spare tuples redistributed (via sorting) to place them in the ranges of the elements that require them.

C. Define variable length SoAs for each element in the AoSoAs. This would incorporate the ideas of approach B, but without the need to track the first and last tuple for each element. I suspect this approach is not readily possible with Cabana.

Any feedback is appreciated.

Thank you,
Cameron
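For reference, approach A above can be sketched without any Cabana API as a counting sort on the parent element id. The sketch produces a permutation that groups particles by element and a per-element offset array, so element `e` owns `permutation[offsets[e]]` through `permutation[offsets[e + 1] - 1]`. The names `ElementGrouping` and `groupByElement` are hypothetical; applying the permutation to the actual particle data is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ElementGrouping
{
    std::vector<std::size_t> offsets;     // size num_elements + 1
    std::vector<std::size_t> permutation; // particle indices, grouped by element
};

ElementGrouping groupByElement(const std::vector<int>& element_of_particle,
                               int num_elements)
{
    assert(num_elements >= 0);
    ElementGrouping g;
    g.offsets.assign(static_cast<std::size_t>(num_elements) + 1,
                     std::size_t{0});

    // Count particles per element, shifted by one for the prefix sum.
    for (int e : element_of_particle)
    {
        assert(e >= 0 && e < num_elements);
        ++g.offsets[e + 1];
    }

    // Exclusive prefix sum: offsets[e] is where element e's range starts.
    for (int e = 0; e < num_elements; ++e)
        g.offsets[e + 1] += g.offsets[e];

    // Scatter particle indices into their element's range, preserving order.
    std::vector<std::size_t> cursor(g.offsets.begin(), g.offsets.end() - 1);
    g.permutation.resize(element_of_particle.size());
    for (std::size_t p = 0; p < element_of_particle.size(); ++p)
        g.permutation[cursor[element_of_particle[p]]++] = p;

    return g;
}
```

The cost is linear in the number of particles plus elements, which gives a baseline for judging whether the sort is too expensive relative to the push; the per-element ranges it produces are also the starting point that approach B would pad with spare tuples.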

Add Unmanaged AoSoA

In our Fortran work it seems that the ability to use Fortran-allocated memory with Cabana would greatly simplify the programming model. We can do this by adding an unmanaged constructor to the AoSoA where the user provides a pointer to an array of SoAs, the number of SoAs, and the size.

In the implementation we will need to do 2 things:

  1. Add state to the AoSoA to indicate whether or not it is unmanaged, and prevent unmanaged containers from performing memory operations
  2. Update the container such that the Kokkos View containing the AoSoA data knows that this data is unmanaged. Because this requires a template change to the Kokkos View it may be that an additional memory traits template parameter is needed to trigger this.
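The first point can be sketched independently of Kokkos. The class below is a hypothetical stand-in (`AoSoASketch`, with `FourDoubles` as an example SoA type), not the Cabana container: it either owns its storage or wraps a user-provided pointer, and it refuses to resize when unmanaged. The Kokkos View memory-traits aspect from the second point is not modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>

// Example SoA type used only for illustration.
struct FourDoubles
{
    double x[4];
};

template <class SoA>
class AoSoASketch
{
  public:
    // Managed: the container allocates and owns the SoA array.
    AoSoASketch(std::size_t num_soa, std::size_t size)
        : _owned(new SoA[num_soa]()), _data(_owned.get()),
          _num_soa(num_soa), _size(size), _managed(true)
    {
    }

    // Unmanaged: wrap memory allocated elsewhere (e.g. by Fortran).
    AoSoASketch(SoA* ptr, std::size_t num_soa, std::size_t size)
        : _owned(), _data(ptr), _num_soa(num_soa), _size(size),
          _managed(false)
    {
        assert(ptr != nullptr || num_soa == 0);
    }

    // Memory operations are only allowed on managed containers.
    void resize(std::size_t num_soa, std::size_t size)
    {
        if (!_managed)
            throw std::logic_error("cannot resize an unmanaged AoSoA");
        std::unique_ptr<SoA[]> fresh(new SoA[num_soa]());
        const std::size_t keep = std::min(num_soa, _num_soa);
        for (std::size_t i = 0; i < keep; ++i)
            fresh[i] = _data[i];
        _owned = std::move(fresh);
        _data = _owned.get();
        _num_soa = num_soa;
        _size = size;
    }

    bool managed() const { return _managed; }
    SoA* data() const { return _data; }
    std::size_t numSoA() const { return _num_soa; }
    std::size_t size() const { return _size; }

  private:
    std::unique_ptr<SoA[]> _owned; // empty when unmanaged
    SoA* _data;
    std::size_t _num_soa;
    std::size_t _size;
    bool _managed;
};
```

Throwing on `resize` is one possible policy; a compile-time distinction (for example a separate template parameter, as suggested in the second point) would catch misuse earlier at the cost of a more complex type.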

Enhance Fortran Interface

Through discussions, greatly informed by #103, we have identified many improvements that can be made to the Fortran interface. Some good next steps include, but are not limited to:

  • Remove the dependency on static data pointers, likely instead passing pointers
  • Abstract some of the "portability"-type macros Aaron had to introduce in his XCG port
  • Implement an unmanaged AoSoA such that it can wrap Fortran allocated memory (#114)

Intel build fails with ambiguous call to abs()

home/bird/Cabana/core/src/impl/Cabana_CartesianGrid.hpp(109): error: more than one instance of overloaded function "abs" matches the argument list:
            function "abs(int)"
            function "std::abs(long long)"
            function "std::abs(long)"
            argument types are: (double)
          Scalar rz = abs(zp-zc) - 0.5*_dz;

Perhaps we need fabs as we're working with floats (and std::abs might not be CUDA friendly?)
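As a host-side illustration of the ambiguity, the sketch below qualifies the call as `std::abs` and includes `<cmath>`, which provides the floating-point overloads, so a `double` argument no longer has to choose between integer overloads. The function name `distanceOutsideCell` is hypothetical; `zp`, `zc`, and `dz` follow the names in the error message. Whether `std::abs` is usable in device code for a given CUDA toolchain is not checked here; `fabs` would be the other candidate.

```cpp
#include <cassert>
#include <cmath>

// Distance of zp from the cell centre zc, minus half the cell width dz.
// std::abs from <cmath> resolves to the floating-point overload for Scalar.
template <class Scalar>
Scalar distanceOutsideCell(Scalar zp, Scalar zc, Scalar dz)
{
    assert(dz > Scalar(0));
    return std::abs(zp - zc) - Scalar(0.5) * dz;
}
```

Writing the expression with `Scalar(0.5)` also keeps the arithmetic in the template's precision when `Scalar` is `float`.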

Change Default Compiler Flags to favor aggressive optimization

As per some previous offline discussion, I think it might be good to do some combination of the following:

  1. Make the default build type release (which likely gives -O3)
  2. Consider adding compiler flags that generate platform-specific code (-march=native, -xHost, etc.)

@junghans has already done this for another project, so it should be fairly straightforward to port over.
