cuda-tutorial

环境配置

系统选用的是腾讯云GPU服务器，安装有Tesla P40，系统为Ubuntu CentOS 8，按量计费。

确认系统识别到GPU

# lspci  | grep NVIDIA
00:08.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)

安装installer

$ wget https://developer.download.nvidia.com/compute/cuda/11.1.0/local_installers/cuda-repo-rhel8-11-1-local-11.1.0_455.23.05-1.x86_64.rpm
$ sudo rpm -i cuda-repo-rhel8-11-1-local-11.1.0_455.23.05-1.x86_64.rpm
$ sudo dnf clean all

安装显卡驱动

sudo dnf -y module install nvidia-driver:latest-dkms

安装CUDA Toolkit

sudo dnf -y install cuda

配置环境变量在 .bashrc 中添加下列变量，然后执行 source .bashrc

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda

安装成功效果

[root@VM-0-11-centos ~]# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
[root@VM-0-11-centos ~]# nvidia-smi
Sun Sep 27 14:39:07 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P40           On   | 00000000:00:08.0 Off |                    0 |
| N/A   25C    P8     9W / 250W |      0MiB / 22919MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

更多安装细节，可以参考CUDA官方教程

Hello World

不妨先写一个cuda C 程序，命名为hello.cu，用它来输出字符串 Hello World

#include <stdio.h>

__global__ void hello_from_gpu()
{
    printf( "\"Hello, world!\", says the GPU.\n" );
}

void hello_from_cpu()
{
    printf( "\"Hello, world!\", says the CPU.\n" );
}

// host code entrance
int main( int argc, char **argv )
{
    hello_from_cpu();
    hello_from_gpu <<< 2, 4>>>();
    cudaDeviceReset();
    return 0;
}

在linux终端下使用以下命令进行编译hello.cu，然后执行程序得到

$ nvcc hello.cu -o hello
$./hello
"Hello, world!", says the CPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.
"Hello, world!", says the GPU.

在上面的代码中，cudaDeviceReset表示重置当前线程所关联过的当前设备的所有资源；修饰符__global__告诉编译器这是一个内核函数，它将从CPU中调用，然后在GPU上执行，在CPU上通过下面的代码启动内核函数

hello_from_gpu <<< 2, 4>>>();

三重尖号意味着从主线程到端代码的调用。2 和4分别表示有2个块区域和4个线程，后续会作相关介绍。

CUDA 编程模型

关于 CUDA 编程模型等更多相关内容，可以参考我总结的博客

参考资料

https://github.com/puttsk/cuda-tutorial

houminz / cuda-tutorial Goto Github PK

cuda-tutorial's Introduction

cuda-tutorial

环境配置

Hello World

CUDA 编程模型

参考资料

cuda-tutorial's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent