CUDA adaptation of the TopCoder Division I problem:
http://community.topcoder.com/stat?c=problem_statement&pm=6412&rd=9825&rm=&cr=2058177
Share your knowledge, and do not take credit for other people's work!
Hello from San Francisco to Versailles and Vélizy-Villacoublay!
Yes, another CUDA implementation of a 64-bit double-precision probability dynamic programming problem. The GPU version is not yet optimized, but it still runs 19x to 32x faster than an optimized serial implementation on a 3.9 GHz CPU. So far, tests of the two implementations have yielded exactly the same results, but they have not been tested enough to rule out pathological cases.
If the error checking in the CUDA code were removed and the reduction step optimized, an estimated 2-4 ms would be shaved off the GPU running time. For GPUs with a compute capability below 3.5, use 32-bit floating-point numbers for faster performance.
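The linked problem statement is not reproduced here, so the following is only a generic sketch of this kind of probability DP, not the kernel used in this project. It is plain host-side C++ (which also compiles as CUDA host code) that builds the distribution of the sum of `nDice` fair dice with faces 1..`maxSide`. The name `sumDistribution` and the `Real` template parameter are hypothetical; the template parameter illustrates one way to switch between 64-bit and 32-bit floating point as suggested above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original code): distribution of the sum
// of nDice fair dice, each with faces 1..maxSide (assumes maxSide >= 1).
// Real selects the precision: double (64-bit) or float (32-bit).
template <typename Real>
std::vector<Real> sumDistribution(int nDice, int maxSide) {
    const Real faceProb = Real(1) / Real(maxSide);
    const std::size_t size = static_cast<std::size_t>(nDice) * maxSide + 1;
    std::vector<Real> prev(size, Real(0));
    std::vector<Real> next(size, Real(0));
    prev[0] = Real(1);  // zero dice: the sum is 0 with probability 1

    for (int d = 1; d <= nDice; ++d) {
        for (auto& p : next) p = Real(0);
        // Sums reachable with d-1 dice lie in [d-1, (d-1)*maxSide].
        for (int s = d - 1; s <= (d - 1) * maxSide; ++s) {
            for (int face = 1; face <= maxSide; ++face) {
                next[s + face] += prev[s] * faceProb;
            }
        }
        prev.swap(next);
    }
    return prev;  // prev[s] == P(sum == s)
}
```

Calling `sumDistribution<double>(nDice, maxSide)` and `sumDistribution<float>(nDice, maxSide)` on the same inputs gives a quick way to see how much accuracy the 32-bit variant gives up for a given problem size.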
Approximate iterations = (nDice+1) x ((nDice+1) x (maxSide+1)) x (maxSide+1) x 2 + ((nDice+1) x (maxSide+1))
nDice | maxSide | Approx. iterations | CPU time | GPU time | CUDA speedup |
---|---|---|---|---|---|
50 | 50 | 13,533,003 | 96 ms | 5 ms | 19.2x |
100 | 100 | 208,131,003 | 1469 ms | 47 ms | 32.26x |
NOTE: All CUDA GPU times include all device memsets, host-device memory copies and device-host memory copies.
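As a quick check of the iteration-count formula above, here is a minimal sketch that transcribes it directly. The function name `approxIterations` is hypothetical and is not taken from the project's source.

```cpp
#include <cstdint>

// Hypothetical helper: the approximate iteration count, transcribed from
// (nDice+1) x ((nDice+1) x (maxSide+1)) x (maxSide+1) x 2
//   + ((nDice+1) x (maxSide+1)).
// 64-bit integers are used so that larger inputs do not overflow.
std::int64_t approxIterations(std::int64_t nDice, std::int64_t maxSide) {
    const std::int64_t d = nDice + 1;
    const std::int64_t m = maxSide + 1;
    return d * (d * m) * m * 2 + d * m;
}
```

For nDice = maxSide = 50 this evaluates to 13,533,003, and for nDice = maxSide = 100 it evaluates to 208,131,003, matching the table.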
CPU = Intel i7-3770K 3.5 GHz with 3.9 GHz target
GPU = Tesla K20c 5 GB
Windows 7 Ultimate x64
Visual Studio 2010 x64
Would love to see a faster Python version, since that is the best language these days. Please contact me with the running time for the same sample sizes!
Python and Ruby are languages for the lazy and slow!