cuda-tutorial

Environment Setup

The setup uses a Tencent Cloud GPU server equipped with a Tesla P40, running CentOS 8 and billed on a pay-as-you-go basis.

  • Confirm that the system detects the GPU
# lspci  | grep NVIDIA
00:08.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
  • Install the CUDA local installer repository package
$ wget https://developer.download.nvidia.com/compute/cuda/11.1.0/local_installers/cuda-repo-rhel8-11-1-local-11.1.0_455.23.05-1.x86_64.rpm
$ sudo rpm -i cuda-repo-rhel8-11-1-local-11.1.0_455.23.05-1.x86_64.rpm
$ sudo dnf clean all
  • Install the GPU driver
$ sudo dnf -y module install nvidia-driver:latest-dkms
  • Install the CUDA Toolkit
$ sudo dnf -y install cuda
  • Configure environment variables: add the following lines to .bashrc, then run source .bashrc
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
  • Output after a successful installation
[root@VM-0-11-centos ~]# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
[root@VM-0-11-centos ~]# nvidia-smi
Sun Sep 27 14:39:07 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P40           On   | 00000000:00:08.0 Off |                    0 |
| N/A   25C    P8     9W / 250W |      0MiB / 22919MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

For more installation details, see the official CUDA installation guide.
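Beyond nvcc --version and nvidia-smi, it can be useful to confirm that a compiled program can reach the GPU through the CUDA runtime. The following is a minimal sketch, not part of the original tutorial; the file name device_query.cu and the helper count_cuda_devices are hypothetical names chosen for illustration. It only uses the runtime calls cudaGetDeviceCount and cudaGetDeviceProperties.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: returns the number of CUDA devices visible to the
// runtime, or -1 if the query itself fails (for example, no driver loaded).
int count_cuda_devices()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return -1;
    }
    return count;
}

int main()
{
    int count = count_cuda_devices();
    if (count < 0) {
        return 1;
    }
    printf("CUDA devices found: %d\n", count);

    // Print a few basic properties for each device.
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) {
            continue;
        }
        printf("Device %d: %s, compute capability %d.%d, %.0f MiB global memory\n",
               i, prop.name, prop.major, prop.minor,
               prop.totalGlobalMem / (1024.0 * 1024.0));
    }
    return 0;
}
```

Assuming the file is saved as device_query.cu, it can be built with nvcc device_query.cu -o device_query. If the driver and toolkit are installed correctly, the device name reported here should match the one shown by nvidia-smi.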

Hello World

Let's start by writing a CUDA C program named hello.cu that prints the string Hello World.

#include <stdio.h>

__global__ void hello_from_gpu()
{
    printf( "\"Hello, world!\", says the GPU.\n" );
}

void hello_from_cpu()
{
    printf( "\"Hello, world!\", says the CPU.\n" );
}

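// host code entry point
int main( int argc, char **argv )
{
    hello_from_cpu();
    // launch the kernel on the GPU: 2 blocks of 4 threads each
    hello_from_gpu <<< 2, 4>>>();
    // reset the device; this also flushes the GPU printf buffer before exit
    cudaDeviceReset();
    return 0;
}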

In a Linux terminal, compile hello.cu with the following command, then run the program to get:

$ nvcc hello.cu -o hello
$ ./hello
"Hello, world!", says the CPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.

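In the code above, cudaDeviceReset destroys all resources associated with the current device in the current process; because it tears down the device context, it also flushes the GPU printf buffer, so the kernel's output appears before the program exits. The qualifier __global__ tells the compiler that this is a kernel function: it is called from the CPU and executed on the GPU. The kernel is launched from the CPU with the following code: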

hello_from_gpu <<< 2, 4>>>();

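The triple angle brackets mark a call from the host thread to device code. The values 2 and 4 specify a grid of 2 blocks with 4 threads in each block, so the kernel runs on 2 × 4 = 8 threads, which is why the GPU line is printed eight times. Blocks and threads are covered in more detail later.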
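To make the launch configuration visible, each thread can print its own block and thread index. The sketch below is an illustration rather than the tutorial's own code: the kernel hello_with_index, the helper count_launched_threads, and the device counter are hypothetical names. It also checks for launch errors with cudaGetLastError and waits for the kernel with cudaDeviceSynchronize instead of relying on cudaDeviceReset.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread reports its block and thread index,
// then atomically increments a counter in device memory.
__global__ void hello_with_index(int *counter)
{
    printf("Hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
    atomicAdd(counter, 1);
}

// Hypothetical helper: launches the kernel with the given configuration and
// returns how many threads ran, or -1 if any CUDA call failed.
int count_launched_threads(int blocks, int threads_per_block)
{
    int *d_counter = NULL;
    int h_counter = 0;

    if (cudaMalloc((void **)&d_counter, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    if (cudaMemset(d_counter, 0, sizeof(int)) != cudaSuccess) {
        cudaFree(d_counter);
        return -1;
    }

    hello_with_index<<<blocks, threads_per_block>>>(d_counter);

    // cudaGetLastError reports an invalid launch configuration;
    // cudaDeviceSynchronize waits for the kernel and flushes device printf.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err == cudaSuccess) {
        err = cudaMemcpy(&h_counter, d_counter, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_counter);

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return -1;
    }
    return h_counter;
}

int main()
{
    int n = count_launched_threads(2, 4);
    if (n < 0) {
        return 1;
    }
    printf("Threads that ran: %d\n", n);
    return 0;
}
```

With the configuration <<<2, 4>>>, block indices range over 0 and 1 and thread indices over 0 to 3, and the final counter is 8. The order in which the two blocks print is not guaranteed.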

CUDA Programming Model

For more on the CUDA programming model and related topics, see my blog post summarizing them.

References
