Benchmark N-Body system

Here are some elapsed times (in s) for 5 implementations (of course, these numbers do not characterize the languages, only particular implementations in these languages).

# particles	Py	C++	Fortran	Julia	Rust
1024	30	55	41	45	34
2048	124	231	166	173	137
16384	7220	14640	10914	11100	?
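
Since the notes below use the C++ implementation as the reference, it can help to read the table as ratios. The following is a small sketch that divides each time by the C++ time of the same row. The numbers are copied from the table; the helper name relative_to is only illustrative, and the missing Rust value is kept as None.

```python
# Elapsed times in seconds, copied from the table above (None = not measured).
times = {
    1024: {"Py": 30, "C++": 55, "Fortran": 41, "Julia": 45, "Rust": 34},
    2048: {"Py": 124, "C++": 231, "Fortran": 166, "Julia": 173, "Rust": 137},
    16384: {"Py": 7220, "C++": 14640, "Fortran": 10914, "Julia": 11100, "Rust": None},
}


def relative_to(reference, row):
    """Return each time of a row divided by the reference implementation's time."""
    ref = row[reference]
    return {name: (t / ref if t is not None else None) for name, t in row.items()}


for nb_particles, row in times.items():
    ratios = relative_to("C++", row)
    cells = [
        f"{name}: {ratio:.2f}" if ratio is not None else f"{name}: ?"
        for name, ratio in ratios.items()
    ]
    print(nb_particles, "  ".join(cells))
```

A ratio below 1 means the implementation took less time than the C++ one for that number of particles.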

The implementations in C++, Fortran and Julia come from https://www.nbabel.org/ and have recently been used in an article published in Nature Astronomy (Zwart, 2020). The implementation in Python-Numpy is very simple, but uses Transonic and Pythran (>=0.9.8).

To run these benchmarks, go into the different directories and run make bench1k or make bench2k.

[Figure: these results compared with the figure published in Nature Astronomy (Zwart, 2020).]

Note: these benchmarks are run sequentially on an Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz.

Note 2: With Numba, the elapsed times are 44 s, 153 s and 11490 s, respectively. This is roughly 20% to 35% less time than the C++ implementation, depending on the number of particles.

Note 3: With PyPy, a pure Python implementation (bench_pypy_Point.py) runs for 1024 particles in 151 s, i.e. only 3 times slower than the C++ implementation (compared to ~50 times slower as shown in the figure taken from Zwart, 2020).

Note 4: The directory "julia" contains some more advanced and faster implementations. The sequential optimized Julia implementation runs on my PC in 27.2 s, 104.4 s and 8900 s, respectively (i.e. within about ±20% of our fast and simple Python implementation).

Note 5: From the high level Numpy implementation (bench_numpy_highlevel.py), if one (i) adds an import from transonic import jit and (ii) decorates the function loop with @jit, the case for 1024 particles runs in 136 s (2.5 times slower than the C++ implementation).
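
To make the two changes from Note 5 concrete, here is a minimal sketch showing where they would go. Only the function name loop, the import and the decorator come from the text above. The signature of loop, its body, and the helper compute_accelerations are hypothetical stand-ins and are not the contents of bench_numpy_highlevel.py. The two Transonic lines are left commented out so that the sketch runs with NumPy alone.

```python
import numpy as np

# Change (i): uncomment when Transonic and Pythran are installed.
# from transonic import jit


def compute_accelerations(accelerations, masses, positions):
    """Hypothetical helper: add pairwise gravitational accelerations (G = 1)."""
    nb_particles = masses.size
    for i in range(nb_particles - 1):
        for j in range(i + 1, nb_particles):
            delta = positions[i] - positions[j]
            distance_cube = np.sum(delta**2) ** 1.5
            accelerations[i] -= masses[j] / distance_cube * delta
            accelerations[j] += masses[i] / distance_cube * delta


# Change (ii): uncomment together with the import above.
# @jit
def loop(time_step, nb_steps, masses, positions, velocities):
    """Hypothetical time loop (velocity Verlet); updates the arrays in place."""
    accelerations = np.zeros_like(positions)
    compute_accelerations(accelerations, masses, positions)
    for _ in range(nb_steps):
        positions += time_step * velocities + 0.5 * time_step**2 * accelerations
        accelerations_old = accelerations
        accelerations = np.zeros_like(positions)
        compute_accelerations(accelerations, masses, positions)
        velocities += 0.5 * time_step * (accelerations + accelerations_old)
    return positions, velocities
```

The point of the sketch is that the numerical code itself stays unchanged: the import and the decorator are the only additions.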

Smaller benchmarks between different Python solutions

We can also compare different solutions in Python. Since some solutions are very slow, we need to compare on a much smaller problem (only 128 particles). Here are the elapsed times (in s):

Transonic-Pythran	Numba	High-level Numpy	PyPy OOP	PyPy lists
0.48	0.87	686	2.6	4.3

For comparison, the elapsed times for this case are 0.85 s for C++, 0.62 s for Fortran and 2.57 s for Julia.

Note that by just adding from transonic import jit to the simple high-level Numpy code and then decorating the function loop with @jit, the elapsed time decreases to 2.1 s (a ~300x speedup, with Pythran 0.9.8).
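
As a quick check of the numbers for the 128-particle case, the following sketch ranks all the times quoted in this section and recomputes the speedup obtained with @jit. The values are copied from the text; the dictionary keys, including the label "High-level Numpy + @jit", are only illustrative.

```python
# Elapsed times in seconds for 128 particles, copied from this section.
small = {
    "Transonic-Pythran": 0.48,
    "Numba": 0.87,
    "High-level Numpy": 686,
    "High-level Numpy + @jit": 2.1,
    "PyPy OOP": 2.6,
    "PyPy lists": 4.3,
    "C++": 0.85,
    "Fortran": 0.62,
    "Julia": 2.57,
}

fastest = min(small.values())

# Print the implementations from fastest to slowest, with the slowdown
# factor relative to the fastest one.
for name, elapsed in sorted(small.items(), key=lambda item: item[1]):
    print(f"{name:26s} {elapsed:8.2f} s   x{elapsed / fastest:.1f}")

# Speedup from adding @jit to the high-level Numpy code.
jit_speedup = small["High-level Numpy"] / small["High-level Numpy + @jit"]
print(f"@jit speedup: x{jit_speedup:.0f}")
```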
