3d-density-estimation

OpenMP and MPI parallel estimation of the 3d density of a point cloud

License: MIT License



What is this?

This code computes the 3D density of a set of points in a fast, parallel way. We implemented two parallel algorithms, one using OpenMP and the other using MPI; both are provided in the src/ folder.
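To give an idea of the approach, here is a minimal single-file sketch of an OpenMP density kernel. This is only an illustration of the technique, not the repository's actual implementation; the unit-cube domain, the cell-centre convention, and all names are assumptions made for the example.

```cpp
// Illustrative OpenMP density kernel (not the repository's code):
// for each point, only the grid cells within radius R are visited.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Count, for every cell of an N x N x N grid over [0,1]^3, how many
// points lie within radius R of the cell centre.
std::vector<long> density_grid(const std::vector<double>& x,
                               const std::vector<double>& y,
                               const std::vector<double>& z,
                               int N, double R) {
    std::vector<long> counts(static_cast<std::size_t>(N) * N * N, 0);
    const double h = 1.0 / N;                          // cell size
    const int w = static_cast<int>(std::ceil(R / h));  // cells spanned by R
    #pragma omp parallel for
    for (long p = 0; p < static_cast<long>(x.size()); ++p) {
        const int ci = static_cast<int>(x[p] / h);     // cell holding the point
        const int cj = static_cast<int>(y[p] / h);
        const int ck = static_cast<int>(z[p] / h);
        // Visit only the (2w+1)^3 neighbourhood instead of the whole grid.
        for (int i = std::max(ci - w, 0); i <= std::min(ci + w, N - 1); ++i)
        for (int j = std::max(cj - w, 0); j <= std::min(cj + w, N - 1); ++j)
        for (int k = std::max(ck - w, 0); k <= std::min(ck + w, N - 1); ++k) {
            const double dx = (i + 0.5) * h - x[p];
            const double dy = (j + 0.5) * h - y[p];
            const double dz = (k + 0.5) * h - z[p];
            if (dx * dx + dy * dy + dz * dz <= R * R) {
                #pragma omp atomic
                ++counts[(static_cast<std::size_t>(i) * N + j) * N + k];
            }
        }
    }
    return counts;
}
```

Each point touches roughly $(NR)^3$ cells, which is what the complexity analysis in the performance section below builds on.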

How to use

Clone the repository

git clone https://github.com/LetteraUnica/3D-density-estimation.git

cd into the src folder

cd src

Compile the code

sh compile

The script produces 3 executables:

  1. density_omp.x: the OpenMP version of the code
  2. density_mpi.x: the MPI version of the code
  3. generate_points.x: a utility that generates random points, used for testing

When run correctly, these executables produce a file named density.bin, which contains the 3D density matrix of the point distribution in binary format.
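As a quick way to inspect the output, here is a minimal sketch that loads density.bin into memory. It assumes the matrix is stored as $N^3$ contiguous single-precision values in row-major order (consistent with the roughly 4 GB file reported below for $N=1024$); check the sources in src/ for the exact element type and layout.

```cpp
// Minimal sketch for loading density.bin (assumptions: row-major
// order, 4-byte floats; verify against the code in src/).
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    const std::size_t N = 512;  // must match the grid number used at run time
    std::vector<float> density(N * N * N);
    std::FILE* f = std::fopen("density.bin", "rb");
    if (!f) return 1;
    const std::size_t got = std::fread(density.data(), sizeof(float),
                                       density.size(), f);
    std::fclose(f);
    if (got != density.size()) return 1;  // wrong N or element type
    // density[(i * N + j) * N + k] is the value of cell (i, j, k)
    return 0;
}
```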

Note: The compile script will probably trigger some warnings; these can be safely ignored, as they are caused by compiling code that contains OpenMP pragmas without OpenMP enabled.

Performance evaluation

Introduction

The OpenMP code was compiled with GCC version 10.3.0, while the MPI code was compiled with MPICH version 3.3.2.

I tested the code using 4 MPI processes for the MPI version and 4 OpenMP threads for the OpenMP version; this matches the number of physical cores of my CPU, an Intel(R) Core(TM) i5-4690 @ 3.50GHz.

Elapsed time vs. number of points

I fixed the grid number to $N=512$ and the radius to $R=1/n^{1/3}$, and varied the number of points from $10^6$ to $2\times 10^7$.

We can see that the elapsed time scales linearly with the number of points. This is expected: instead of a brute-force approach, we only check the grid points that are close to a given point. The overall complexity of this algorithm, not counting communication and I/O, is $\Theta(N^3 R^3 n)$; since $R=1/n^{1/3}$, this reduces to $\Theta(N^3)$, which is constant in $n$. However, that figure does not account for the time spent reading the input file and communicating the points, operations that both scale as $\Theta(n)$. So the algorithm is much faster than brute force, but still linear in the number of points.
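Making the cancellation explicit with the definitions above:

$$\Theta\left(N^3 R^3 n\right) = \Theta\left(N^3 \cdot \frac{1}{n} \cdot n\right) = \Theta\left(N^3\right) \quad \text{for } R = n^{-1/3},$$

so the kernel cost is independent of $n$, and the measured linear trend comes from the $\Theta(n)$ input and communication steps.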

Elapsed time vs. grid number

I fixed the radius to $R=0.05$ and the number of points to $n=2^{20} \approx 10^{6}$, and varied the grid number from $16$ to $512$.

We can see that the elapsed time scales with the cube of the grid number, as expected; the green line represents a fitted polynomial of degree 3.
To better see the scaling at high values of $N$, we also plotted the same lines in log-log scale and increased the grid number up to 1024.

Comments on MPI vs OpenMP

In the first benchmark MPI is faster than OpenMP. This could be because the OpenMP program does not parallelize the I/O operations while the MPI program does; furthermore, the MPI program sorts the data before distributing it to the processes. These two optimizations are very noticeable for high $N$, low $R$, and high $n$, which is the case here.
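For reference, parallel output in MPI is typically done with MPI-IO collectives along these lines. This is a generic sketch, not the repository's code; the slab decomposition, the slab size, and the float element type are assumptions.

```cpp
// Sketch of parallel output with MPI-IO (illustrative, not the
// repository's code): each rank writes its own slab of the density
// matrix at a distinct offset using a collective call.
#include <mpi.h>
#include <cstddef>
#include <vector>

void write_slab(const std::vector<float>& slab, MPI_Offset offset_elems) {
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "density.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    // Collective write: all ranks participate, each at its own offset.
    MPI_File_write_at_all(fh,
                          offset_elems * static_cast<MPI_Offset>(sizeof(float)),
                          slab.data(), static_cast<int>(slab.size()),
                          MPI_FLOAT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    const std::size_t elems_per_rank = 1 << 20;  // hypothetical slab size
    std::vector<float> slab(elems_per_rank, 0.0f);
    write_slab(slab, static_cast<MPI_Offset>(rank) * elems_per_rank);
    MPI_Finalize();
    return 0;
}
```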

In the second benchmark the results are very interesting:

  • For $N<100$ the MPI program is slightly better than OpenMP. In this regime the number of cell updates is similar to the number of points, so the pre-sorting of points that the MPI program performs is very beneficial.
  • For $100 < N < 500$ the OpenMP version is faster. From this point on the number of cell updates is much larger than the number of points, so the OpenMP code does not suffer from skipping the pre-sorting; moreover, it achieves better work-sharing among the threads thanks to tasks (see the sketch after this list), so it outperforms the MPI version.
  • For $N>500$ the MPI program is again the fastest. In this regime the time spent on I/O becomes relevant (at $N=1024$, for instance, the density matrix is about $4$ GB), and MPI parallelizes this part while OpenMP does not.
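The task-based work-sharing mentioned above typically looks like the following. This is a generic sketch of the OpenMP task pattern, with a hypothetical process_chunk worker standing in for the per-chunk density update; it is not the repository's code.

```cpp
// Generic OpenMP task pattern for dynamic work-sharing (illustrative):
// one thread creates tasks over chunks of points, and idle threads
// pick them up as they finish, which balances uneven workloads.
#include <cstddef>
#include <cstdio>

// Hypothetical worker standing in for the per-chunk density update.
static void process_chunk(std::size_t begin, std::size_t end) {
    std::printf("processing points [%zu, %zu)\n", begin, end);
}

void run_in_tasks(std::size_t n_points, std::size_t chunk) {
    #pragma omp parallel
    #pragma omp single
    {
        for (std::size_t b = 0; b < n_points; b += chunk) {
            const std::size_t e = (b + chunk < n_points) ? b + chunk : n_points;
            #pragma omp task firstprivate(b, e)
            process_chunk(b, e);  // executed by whichever thread is free
        }
    }  // tasks finish by the implicit barrier ending the parallel region
}
```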
