ecp-copa / examinimd

Molecular dynamics proxy application based on Kokkos

License: Other

Shell 0.15% Makefile 0.55% C++ 95.07% C 3.32% CMake 0.77% CWeb 0.14%
kokkos lammps molecular-dynamics proxy-application

examinimd's Introduction

ExaMiniMD

ExaMiniMD is a proxy application and research vehicle for particle codes, in particular Molecular Dynamics (MD). Compared to previous MD proxy apps (MiniMD, COMD), its design is significantly more modular, so that different aspects can be investigated independently. To achieve that, the main components (force calculation, communication, neighbor list construction, and binning) are derived classes whose main functionality is accessed via virtual functions. This allows a developer to write a new derived class and drop it into the code without touching much of the rest of the application.
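
As a rough illustration of the derived-class design described above, the following sketch shows a driver that only talks to a base class through virtual functions. The names Force, ForceLJ, ForceZero and pair_energy are hypothetical and are not taken from the ExaMiniMD sources; the real interfaces may look quite different.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class: the driver only ever sees this interface.
class Force {
public:
  virtual ~Force() = default;
  // Pair energy for a squared distance r2 (reduced units).
  virtual double compute(double r2) const = 0;
  virtual std::string name() const = 0;
};

// Lennard-Jones with epsilon = sigma = 1.
class ForceLJ : public Force {
public:
  double compute(double r2) const override {
    assert(r2 > 0.0);
    const double inv_r6 = 1.0 / (r2 * r2 * r2);
    return 4.0 * (inv_r6 * inv_r6 - inv_r6);
  }
  std::string name() const override { return "ForceLJ"; }
};

// A new variant can be added without changing the driver below.
class ForceZero : public Force {
public:
  double compute(double) const override { return 0.0; }
  std::string name() const override { return "ForceZero"; }
};

// Driver code: dispatches through the virtual function only.
double pair_energy(const Force& f, double r2) { return f.compute(r2); }
```

Adding another force variant in this sketch means writing one more derived class; pair_energy stays untouched.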

These modules are included via a module header file. Those header files are also used to inject the input parameter logic and instantiation logic into the main code. As an example, look at modules_comm.h in conjunction with comm_serial.h and comm_mpi.h.
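
One way to picture how a single list of modules can feed both the input parsing and the instantiation logic is an X-macro. The sketch below is a single-file stand-in under that assumption; it is not the mechanism used in modules_comm.h, and the names COMM_MODULES, CommType, parse_comm_type and make_comm are made up for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <memory>

struct Comm {
  virtual ~Comm() = default;
  virtual const char* name() const = 0;
};
struct CommSerial : Comm {
  const char* name() const override { return "CommSerial"; }
};
struct CommMPI : Comm {
  const char* name() const override { return "CommMPI"; }
};

// Stand-in for a module header: one entry per available module.
#define COMM_MODULES(X) \
  X(SERIAL, CommSerial) \
  X(MPI, CommMPI)

// Expansion 1: an enum value per module.
enum CommType {
#define X(tag, cls) COMM_##tag,
  COMM_MODULES(X)
#undef X
  COMM_NONE
};

// Expansion 2: input parameter logic (command-line string to enum).
CommType parse_comm_type(const char* arg) {
#define X(tag, cls) if (std::strcmp(arg, #tag) == 0) return COMM_##tag;
  COMM_MODULES(X)
#undef X
  return COMM_NONE;
}

// Expansion 3: instantiation logic (enum to object).
std::unique_ptr<Comm> make_comm(CommType t) {
#define X(tag, cls) if (t == COMM_##tag) return std::unique_ptr<Comm>(new cls());
  COMM_MODULES(X)
#undef X
  return std::unique_ptr<Comm>();
}
```

In this sketch, registering a new communication module is a one-line change to COMM_MODULES; the parsing and factory code pick it up automatically.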

In the future the plan is to provide focused miniApps with a subset of the available functionality for specific research purposes.

This implementation uses the Kokkos programming model, which you can clone from GitHub via:

git clone https://github.com/kokkos/kokkos ~/kokkos

Current Capabilities

Force Fields:

  • Lennard-Jones Cell List
  • Lennard-Jones Neighbor List
  • SNAP Full Neighbor List

Neighbor List:

  • 2D NeighborList creation
  • CSR NeighborList creation

Integrator:

  • NVE (constant energy velocity-Verlet)

Communication:

  • Serial
  • MPI

Binning:

  • Kokkos Sort Binning

Input:

  • Restricted LAMMPS input files
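
The NVE integrator listed above uses the velocity-Verlet scheme: a half-step velocity update, a full-step position update, a force recomputation, and a second half-step velocity update. The following is a minimal sketch for a single particle in one dimension, assuming unit mass and a hypothetical harmonic force; it only illustrates the update order and is not ExaMiniMD's integrator code.

```cpp
#include <cassert>
#include <cmath>

struct State {
  double x;  // position
  double v;  // velocity
};

// Hypothetical force for illustration: harmonic spring with k = 1, m = 1.
inline double force(double x) { return -x; }

// Advance the state by nsteps velocity-Verlet steps of size dt.
State velocity_verlet(State s, double dt, int nsteps) {
  assert(dt > 0.0 && nsteps >= 0);
  double f = force(s.x);
  for (int i = 0; i < nsteps; ++i) {
    s.v += 0.5 * dt * f;  // initial half-step velocity update
    s.x += dt * s.v;      // full-step position update
    f = force(s.x);       // forces at the new position
    s.v += 0.5 * dt * f;  // final half-step velocity update
  }
  return s;
}

// Kinetic plus potential energy for the harmonic toy system.
double total_energy(const State& s) {
  return 0.5 * s.v * s.v + 0.5 * s.x * s.x;
}
```

Starting from x = 1, v = 0 with dt = 0.01, the total energy in this toy system stays close to its initial value of 0.5 over many steps, which is the behavior one expects from a constant-energy integrator.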

Compilation

ExaMiniMD utilizes the standard GNU Make build system of Kokkos. For detailed information about the Kokkos build process, please refer to the Kokkos documentation at github.com/kokkos/kokkos. ExaMiniMD requires Kokkos version 2.6.00 (March 2018) or later and a C++11 compiler. Here is some quickstart information, which assumes that Kokkos was cloned into ${HOME}/kokkos (see above) and that you are in the "src" directory:

Intel Sandy-Bridge CPU / Serial / MPI:

  make -j KOKKOS_ARCH=SNB KOKKOS_DEVICES=Serial CXX=mpicxx MPI=1

Intel Haswell CPU / Pthread / No MPI:

  make -j KOKKOS_ARCH=HSW KOKKOS_DEVICES=Pthread CXX=clang MPI=0

IBM Power8 CPU / OpenMP / MPI:

  make -j KOKKOS_ARCH=Power8 KOKKOS_DEVICES=OpenMP CXX=mpicxx

IBM Power8 CPU + NVIDIA P100 / CUDA / MPI (OpenMPI):

  export OMPI_CXX=[KOKKOS_PATH]/bin/nvcc_wrapper
  make -j KOKKOS_ARCH=Power8,Pascal60 KOKKOS_DEVICES=Cuda CXX=mpicxx

Running

Currently, ExaMiniMD can only read input from LAMMPS input files that use a restricted set of LAMMPS commands. An example input file is provided in the input directory. The commands below assume you built in the src directory.

To run 2 MPI tasks, with 12 threads per task:

mpirun -np 2 -bind-to socket ./ExaMiniMD -il ../input/in.lj --comm-type MPI --kokkos-threads=12

To run 2 MPI tasks, with 1 GPU per task:

mpirun -np 2 -bind-to socket ./ExaMiniMD -il ../input/in.lj --comm-type MPI --kokkos-ndevices=2

To run in serial, writing binary output every timestep to ReferenceDir:

./ExaMiniMD -il ../input/in.lj --kokkos-threads=1 --binarydump 1 ReferenceDir 

To run in serial with 2 threads, checking correctness every timestep against ReferenceDir:

./ExaMiniMD -il ../input/in.lj --kokkos-threads=2 --correctness 1 ReferenceDir correctness.dat 

examinimd's Issues

Add LJ variant for HalfNeighbor list which uses data replication

The OpenMP package in LAMMPS utilizes data replication to avoid write conflicts on the force array. This works very well for small numbers of threads. It may be worthwhile to explore a combination of data replication and atomics for higher thread counts (like on GPUs) to reduce the conflict rate while keeping the amount of data replication limited.
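
To make the data-replication idea concrete: with a half neighbor list each pair (i, j) is visited once and updates the forces on both atoms, so two threads can collide on the same force entry. Giving each worker a private copy of the force array and summing the copies afterwards avoids that. The sketch below shows only the data layout and the reduction; the loop over workers is written serially to keep it dependency-free, and the 1D pair_force function and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy 1D pair force on atom i due to atom j (hypothetical, for illustration).
inline double pair_force(double xi, double xj) { return xi - xj; }

std::vector<double> forces_replicated(
    const std::vector<double>& x,
    const std::vector<std::pair<int, int>>& half_pairs,
    int nworkers) {
  assert(nworkers > 0);
  const std::size_t n = x.size();

  // One private force array per worker: memory grows as nworkers * n.
  std::vector<std::vector<double>> replica(nworkers,
                                           std::vector<double>(n, 0.0));

  // Worker t handles pairs t, t + nworkers, ... and writes only to
  // replica[t], so the iterations of the outer loop are independent.
  for (int t = 0; t < nworkers; ++t) {
    for (std::size_t p = t; p < half_pairs.size(); p += nworkers) {
      const int i = half_pairs[p].first;
      const int j = half_pairs[p].second;
      const double f = pair_force(x[i], x[j]);
      replica[t][i] += f;  // force on i
      replica[t][j] -= f;  // equal and opposite force on j
    }
  }

  // Reduction: sum the private copies into the final force array.
  std::vector<double> force(n, 0.0);
  for (int t = 0; t < nworkers; ++t)
    for (std::size_t a = 0; a < n; ++a) force[a] += replica[t][a];
  return force;
}
```

The replica storage scales with the number of workers, which is why the approach becomes expensive at GPU-scale thread counts and why mixing in atomics is suggested there.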

Add Correctness Checks

We need to add some intrinsic correctness checks, as well as statistical gold file comparisons.

Problems with CUDA device

I tried to run the provided LJ example on a GPU:

./cbnMD -il in.lj --device-type CUDA

The job crashed with the following error message:

Kokkos::Cuda::initialize WARNING: Cuda is allocating into UVMSpace by default
                                  without setting CUDA_LAUNCH_BLOCKING=1.
                                  The code must call Cuda().fence() after each kernel
                                  or will likely crash when accessing data on the host.
terminate called after throwing an instance of 'std::runtime_error'
  what():  cudaDeviceSynchronize() error( cudaErrorIllegalAddress): an illegal memory access was encountered /.../kokkos/core/src/Cuda/Kokkos_Cuda_Instance.cpp:144
Traceback functionality not available

What did I do wrong?

View Bounds Error with Multiple GPUs and LJ

I'm seeing a crash with LJ on more than 1 GPU. It is failing with a bounds error when I use a debug executable:

Using: ForceLJNeighFull Neighbor2D CommMPI BinningKKSort
Atoms: 256000 128000

#Timestep Temperature PotE ETot Time Atomsteps/s
0 1.400000 -6.332812 -4.232820 0.000000 0.000000e+00
10 417750470098862761654662257421581806554862139952389909363017383936.000000 -5.135827 626623257391633461030728237666787479383714598672317509428821622784.000000 0.014616 1.751505e+08
:0: : block: [192,0,0], thread: [0,36,0] Assertion `View bounds error of view Kokkos::SortImpl::BinSortFunctor::bin_count` failed.
:0: : block: [192,0,0], thread: [0,38,0] Assertion `View bounds error of view Kokkos::SortImpl::BinSortFunctor::bin_count` failed.
:0: : block: [1173,0,0], thread: [0,37,0] Assertion `View bounds error of view Kokkos::SortImpl::BinSortFunctor::bin_count` failed.

Add Rectangular NeighborList

Both MiniMD and LAMMPS use rectangular neighbor lists in threaded mode instead of CSR. That means the neighbor list is a simple 2D array. This allows for a simple single pass neighborlist construction at the cost of increased memory consumption.
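
The trade-off between the two layouts can be sketched as follows, assuming a brute-force 1D neighbor search and hypothetical struct and function names (Neigh2D, NeighCSR, build_2d, build_csr). The rectangular list is filled in a single pass but reserves max_neighs slots per atom; the CSR list needs a counting pass and a prefix sum before it can be filled, but stores only the neighbors that exist.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Neigh2D {
  int max_neighs = 0;
  std::vector<int> num_neighs;  // neighbor count per atom
  std::vector<int> neighs;      // n_atoms * max_neighs entries, row-major
};

struct NeighCSR {
  std::vector<int> row_offsets;  // n_atoms + 1 entries
  std::vector<int> neighs;       // exactly one entry per neighbor
};

inline bool within_cutoff(const std::vector<double>& x, int i, int j,
                          double cut) {
  return i != j && std::fabs(x[i] - x[j]) < cut;
}

// Rectangular list: a single pass, fixed capacity per row.
Neigh2D build_2d(const std::vector<double>& x, double cut, int max_neighs) {
  const int n = static_cast<int>(x.size());
  Neigh2D nl;
  nl.max_neighs = max_neighs;
  nl.num_neighs = std::vector<int>(x.size(), 0);
  nl.neighs = std::vector<int>(x.size() * max_neighs, -1);
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j)
      if (within_cutoff(x, i, j, cut)) {
        assert(nl.num_neighs[i] < max_neighs);  // row capacity must suffice
        nl.neighs[static_cast<std::size_t>(i) * max_neighs +
                  nl.num_neighs[i]++] = j;
      }
  return nl;
}

// CSR list: count, prefix-sum, then fill.
NeighCSR build_csr(const std::vector<double>& x, double cut) {
  const int n = static_cast<int>(x.size());
  NeighCSR nl;
  nl.row_offsets = std::vector<int>(x.size() + 1, 0);
  for (int i = 0; i < n; ++i)  // pass 1: count neighbors per atom
    for (int j = 0; j < n; ++j)
      if (within_cutoff(x, i, j, cut)) ++nl.row_offsets[i + 1];
  for (int i = 0; i < n; ++i)  // prefix sum turns counts into offsets
    nl.row_offsets[i + 1] += nl.row_offsets[i];
  nl.neighs =
      std::vector<int>(static_cast<std::size_t>(nl.row_offsets[n]), -1);
  for (int i = 0; i < n; ++i) {  // pass 2: fill
    int pos = nl.row_offsets[i];
    for (int j = 0; j < n; ++j)
      if (within_cutoff(x, i, j, cut)) nl.neighs[pos++] = j;
  }
  return nl;
}
```

For four atoms at x = 0, 1, 2, 10 with a cutoff of 1.5 and max_neighs = 4, the rectangular list in this sketch allocates 16 slots while the CSR list stores only the 4 actual neighbor entries.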

Add HalfNeighbor List support

Currently only the full neighbor list approach is supported. We need to add half neighbor list support, including forward communication of forces on halo particles.

Add Bonded Interaction

We need a representation of bond and angle forces as they are commonplace in bio simulations.

Update ExaMiniMD for Kokkos v3.0 promotion

I'm a research computer scientist at the University of Utah, doing contract work for Sandia ABQ. Christian Trott has asked that I update the Kokkos miniApps repository for the Kokkos v3.0 promotion. As I'm not a direct contributor to ExaMiniMD, I cannot self-assign this issue.

I've updated ExaMiniMD and will submit a pull request.

Build error with Kokkos master

With kokkos/kokkos@120d9ce, I get:

mpicxx -I./ -I/home/junghans/kokkos/core/src -I/home/junghans/kokkos/containers/src -I/home/junghans/kokkos/algorithms/src  --std=c++11 -mavx -O3 -g -DEXAMINIMD_ENABLE_MPI  -c binning_kksort.cpp
binning_kksort.cpp: In constructor ‘BinningKKSort::BinningKKSort(System*)’:
binning_kksort.cpp:3:51: error: no matching function for call to ‘Kokkos::BinSort<Kokkos::View<const double* [3], Kokkos::LayoutRight>, Kokkos::BinOp3D<Kokkos::View<const double* [3], Kokkos::LayoutRight> >, Kokkos::Serial, int>::BinSort()’
 BinningKKSort::BinningKKSort(System* s): Binning(s) {}
                                                   ^
In file included from ./binning_kksort.h:4:0,
                 from binning_kksort.cpp:1:
/home/junghans/kokkos/algorithms/src/Kokkos_Sort.hpp:164:3: note: candidate: Kokkos::BinSort<KeyViewType, BinSortOp, ExecutionSpace, SizeType>::BinSort(Kokkos::BinSort<KeyViewType, BinSortOp, ExecutionSpace, SizeType>::const_key_view_type, BinSortOp, bool) [with KeyViewType = Kokkos::View<const double* [3], Kokkos::LayoutRight>; BinSortOp = Kokkos::BinOp3D<Kokkos::View<const double* [3], Kokkos::LayoutRight> >; ExecutionSpace = Kokkos::Serial; SizeType = int; Kokkos::BinSort<KeyViewType, BinSortOp, ExecutionSpace, SizeType>::const_key_view_type = Kokkos::View<const double* [3], Kokkos::LayoutRight, Kokkos::HostSpace>; typename KeyViewType::array_layout = Kokkos::LayoutRight; typename KeyViewType::memory_space = Kokkos::HostSpace; typename KeyViewType::const_data_type = const double* [3]]
   BinSort(const_key_view_type keys_, BinSortOp bin_op_,
   ^~~~~~~
/home/junghans/kokkos/algorithms/src/Kokkos_Sort.hpp:164:3: note:   candidate expects 3 arguments, 0 provided
/home/junghans/kokkos/algorithms/src/Kokkos_Sort.hpp:95:7: note: candidate: Kokkos::BinSort<Kokkos::View<const double* [3], Kokkos::LayoutRight>, Kokkos::BinOp3D<Kokkos::View<const double* [3], Kokkos::LayoutRight> >, Kokkos::Serial, int>::BinSort(const Kokkos::BinSort<Kokkos::View<const double* [3], Kokkos::LayoutRight>, Kokkos::BinOp3D<Kokkos::View<const double* [3], Kokkos::LayoutRight> >, Kokkos::Serial, int>&)
 class BinSort {
       ^~~~~~~
/home/junghans/kokkos/algorithms/src/Kokkos_Sort.hpp:95:7: note:   candidate expects 1 argument, 0 provided
/home/junghans/kokkos/algorithms/src/Kokkos_Sort.hpp:95:7: note: candidate: Kokkos::BinSort<Kokkos::View<const double* [3], Kokkos::LayoutRight>, Kokkos::BinOp3D<Kokkos::View<const double* [3], Kokkos::LayoutRight> >, Kokkos::Serial, int>::BinSort(Kokkos::BinSort<Kokkos::View<const double* [3], Kokkos::LayoutRight>, Kokkos::BinOp3D<Kokkos::View<const double* [3], Kokkos::LayoutRight> >, Kokkos::Serial, int>&&)
/home/junghans/kokkos/algorithms/src/Kokkos_Sort.hpp:95:7: note:   candidate expects 1 argument, 0 provided
make: *** [Makefile:56: binning_kksort.o] Error 1

Add LJ variant with Intensity Dial

This is about adding a variant of LJ where the computational intensity of the force calculation can be chosen, for example by doing the actual pair interaction (i.e. the code inside the cutoff check) multiple times. The goal is to be able to change the ratio of force compute to communication, and/or the amount of work done for each atomic add when using a half neighbor list.
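
A minimal sketch of such a dial, assuming reduced LJ units (epsilon = sigma = 1) and a hypothetical function name lj_force_over_r: the pair kernel is evaluated `intensity` times and the results are averaged, so the returned value matches the single evaluation up to rounding while the arithmetic cost per pair scales with the dial. An optimizing compiler may collapse such a loop, so a real implementation would need to guard against that.

```cpp
#include <cassert>

// Returns F/r for the LJ pair interaction at squared distance r2,
// repeating the kernel `intensity` times to scale the compute cost.
double lj_force_over_r(double r2, int intensity) {
  assert(r2 > 0.0 && intensity >= 1);
  double fpair = 0.0;
  for (int k = 0; k < intensity; ++k) {
    const double inv_r2 = 1.0 / r2;
    const double inv_r6 = inv_r2 * inv_r2 * inv_r2;
    // F/r = 24 * r^-6 * (2 * r^-6 - 1) / r^2
    fpair += 24.0 * inv_r6 * (2.0 * inv_r6 - 1.0) * inv_r2;
  }
  return fpair / intensity;  // average of identical evaluations
}
```

With a half neighbor list, each call would be followed by the same two force updates regardless of `intensity`, which is what changes the ratio of compute to atomic adds.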

View Bounds Error

When I enable view bounds checking, I get an abort, see below. This only happens when running on many processors. This issue was already present before the fix for atom array size in fbe8b07. It doesn't seem to affect the numerics and it doesn't segfault when bounds checking is off.

Using: ForceLJNeighHalf Neighbor2D CommMPI BinningKKSort
Atoms: 2048000 32000

#Timestep Temperature PotE ETot Time Atomsteps/s
0 1.400000 -6.332812 -4.232813 0.000000 0.000000e+00
terminate called after throwing an instance of 'std::runtime_error'
  what():  View bounds error of view  ( 4032 < 4032 , 0 < 3 )
Traceback functionality not available

terminate called after throwing an instance of 'std::runtime_error'
  what():  View bounds error of view  ( 4032 < 4032 , 0 < 3 )
Traceback functionality not available

MPI Issues

I'm seeing some MPI issues. On Cray, I get an error unless I set MPICH_NO_BUFFER_ALIAS_CHECK=1

PMPI_Scan(695): MPI_Scan(sbuf=0x7fffffff3878, rbuf=0x7fffffff3878, count=1, MPI_INT, MPI_SUM, MPI_COMM_WORLD) failed
PMPI_Scan(672): Buffers must not be aliased. Consider using MPI_IN_PLACE or setting MPICH_NO_BUFFER_ALIAS_CHECK

On my Linux box with MPICH I get a similar error:

Fatal error in PMPI_Scan: Internal MPI error!, error stack:
PMPI_Scan(639)........: MPI_Scan(sbuf=0x7fffffffdaa0, rbuf=0x7fffffffdaa0, count=1, MPI_INT, MPI_SUM, MPI_COMM_WORLD) failed
MPIR_Scan_impl(506)...:
MPIR_Scan_generic(155):
MPIR_Localcopy(357)...: memcpy arguments alias each other, dst=0x7fffffffdaa0 src=0x7fffffffdaa0 len=4
