cuda_speedtest's Introduction

Overview

  • Speed comparison between different ways of writing CUDA code (sgemm)
  • Speed comparison of CUDA vs. CPU (matmul, partly using gemm)
  • Running mandelbrot with CUDA
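
For orientation, the operation being benchmarked can be sketched on the CPU. The following is a minimal sketch and not the repository's code: the names `sgemm_naive_ref`, `sgemm_blocked_ref`, and `elapsed_ms` are hypothetical, and row-major storage is an assumption. It computes C = alpha * A * B + beta * C with a plain triple loop and with a tiled loop order, which is the same idea a shared-memory blocked CUDA kernel exploits, plus a small timing helper.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical reference helpers (names are NOT from the repository).
// All matrices are assumed row-major: A is M x K, B is K x N, C is M x N.

// Naive version: one dot product per output element.
void sgemm_naive_ref(std::size_t M, std::size_t N, std::size_t K, float alpha,
                     const std::vector<float>& A, const std::vector<float>& B,
                     float beta, std::vector<float>& C) {
    assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[k * N + j];
            }
            C[i * N + j] = alpha * acc + beta * C[i * N + j];
        }
    }
}

// Blocked version: same result, but works on tile x tile sub-blocks so that
// the data touched by the inner loops stays small enough to be reused.
void sgemm_blocked_ref(std::size_t M, std::size_t N, std::size_t K, float alpha,
                       const std::vector<float>& A, const std::vector<float>& B,
                       float beta, std::vector<float>& C, std::size_t tile = 32) {
    assert(tile > 0);
    assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
    // Scale C by beta once, then accumulate alpha * A * B tile by tile.
    for (float& c : C) c *= beta;
    for (std::size_t i0 = 0; i0 < M; i0 += tile) {
        for (std::size_t k0 = 0; k0 < K; k0 += tile) {
            for (std::size_t j0 = 0; j0 < N; j0 += tile) {
                const std::size_t i_end = std::min(i0 + tile, M);
                const std::size_t k_end = std::min(k0 + tile, K);
                const std::size_t j_end = std::min(j0 + tile, N);
                for (std::size_t i = i0; i < i_end; ++i) {
                    for (std::size_t k = k0; k < k_end; ++k) {
                        const float a = alpha * A[i * K + k];
                        for (std::size_t j = j0; j < j_end; ++j) {
                            C[i * N + j] += a * B[k * N + j];
                        }
                    }
                }
            }
        }
    }
}

// Wall-clock time of a callable, in milliseconds.
template <typename F>
double elapsed_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A comparison would wrap each call in `elapsed_ms`, for example `elapsed_ms([&] { sgemm_naive_ref(M, N, K, 1.0f, A, B, 0.0f, C); })`, and check that both variants agree within a floating-point tolerance before comparing the timings.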

Installation

Only sgemm can be built with cmake.
The other two are built in their own directories using nvcc (via the Makefile in each directory).

sgemm

$ mkdir build && cd build
$ cmake ..
$ make

If needed, you may run make install after make.

matmul

$ cd src/matmul
$ make

mandelbrot

$ cd src/mandelbrot
$ make

Running

sgemm

matmul

mandelbrot

Directory structure

The planned layout is as follows.

.
├── build
├── install
├── CMakeLists.txt
├── Makefile
├── python
│   ├── matmul_numpy_mkl.ipynb
│   ├── matmul_on_cpu.ipynb
│   ├── matmul_on_gpu.ipynb
│   └── mean_shift.ipynb
├── README.md
└── src
    ├── libsgemm
    │   ├── CMakeLists.txt
    │   ├── include
    │   │   ├── utils
    │   │   │   ├── matrix_generator.h
    │   │   │   └── time_utils.h
    │   │   ├── cpu
    │   │   │   └── sgemm_cpu.h
    │   │   └── cuda
    │   │       ├── sgemm_coalescing.h
    │   │       ├── sgemm_naive.h
    │   │       └── sgemm_smem_block.h
    │   ├── src
    │   │   ├── include # private headers
    │   │   ├── utils
    │   │   │   ├── matrix_generator.cpp
    │   │   │   └── time_utils.cpp
    │   │   ├── cpu # sgemm implementation on the CPU
    │   │   └── cuda
    │   │       ├── sgemm_coalescing.cu
    │   │       ├── sgemm_naive.cu
    │   │       └── sgemm_smem_block.cu
    │   └── tools
    ├── mandelbrot 
    │   ├── Makefile
    │   └── mandelbrot_cuda.cu
    ├── matmul
    │   ├── include
    │   │   ├── matrix_generator.h
    │   │   └── time_utils.h
    │   ├── Makefile
    │   ├── matmul_cublas.cu
    │   ├── matmul_cuda.cu
    │   ├── matmul_openacc.cpp
    │   ├── matmul_blas.cpp
    │   ├── matmul_cpu.cpp
    │   ├── matmul_mkl.cpp
    │   ├── matmul_mkl_double.cpp
    │   └── time_utils.cpp
    └── sgemm
        └── runner.cpp # TODO

TODO

  • Clean up runner.cpp

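(Low priority) Things to sort out

Below, "include style" also covers the corresponding changes to CMakeLists.txt.

  • Include style for header files inside the sgemm functions (#include "../include/helper.h" vs "helper.h")

  • For time_utils.cpp, how to include the interface header file: make it public, or use target_include~ with the single file specified?

  • How source code in the project but outside libsgemm (runner.cpp, etc.) should include libsgemm's header files

  • Split up matrix_generator.h

  • Whether the install destination should separate header files by directory (cpu, cuda, utils) and also provide a general include file

  • VS Code reports errors on CUDA syntax; work out how to handle this

  • Configure the CMake extension in VS Code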

Contributors

  • jankenshow
