Numba for CUDA Programmers

Author: Graham Markall, NVIDIA.

What is this course?

This is an adapted version of a course delivered internally at NVIDIA. Its primary audience is those who are familiar with CUDA C/C++ programming, but perhaps less so with Python and its ecosystem. That said, it should also be useful to those who are already familiar with the Python and PyData ecosystem.

It focuses on using CUDA concepts in Python, rather than going over basic CUDA concepts. Those unfamiliar with CUDA may want to build a base understanding by working through Mark Harris's An Even Easier Introduction to CUDA blog post, and briefly reading through Chapters 1 and 2 of the CUDA Programming Guide (Introduction and Programming Model). Other concepts used in the course (such as shared memory) are covered in later chapters of the Programming Guide. For expediency, it is recommended to look up concepts in those chapters when necessary, rather than reading all the reference material in detail.

What is in this course?

The course is broken into 5 sessions. Each session is designed to be presented first, with participants then working through its examples and exercises before moving on to the next session. It could be presented at a cadence of one session per week, with an hour of presentation time, so that the course fits around other tasks. Alternatively, it could be delivered as a tutorial over the course of 2-3 days.

Session 1: An introduction to Numba and CUDA Python

Session 1 files are in the session-1 folder. Contents:

  • Presentation: The presentation for this session, along with notes.
  • Mandelbrot example: See the README for exercises.
  • CUDA Kernels notebook: In the exercises folder. Open the notebook using Jupyter.
  • UFuncs notebooks: In the exercises folder. Open the notebooks using Jupyter. There are two notebooks on vectorize and guvectorize on the CPU (as it's a little easier to experiment with them on the CPU target) and one notebook on CUDA ufuncs and memory management.

Session 2: Typing

Session 2 files are in the session-2 folder. Contents:

  • Presentation: The presentation for this session, along with notes.
  • Exercises: In the exercises folder. Open the notebook using Jupyter.

Session 3: Porting strategies, performance, interoperability, debugging

Session 3 files are in the session-3 folder. Contents:

  • Presentation: The presentation for this session, along with notes.
  • Exercises: In the exercises folder. Open the notebook using Jupyter.
  • Examples: In the examples folder. These are mostly executable versions of the examples given in the slides.

Session 4: Extending Numba

Session 4 files are in the session-4 folder. Contents:

  • Presentation: The presentation for this session, along with notes.
  • Exercises: In the exercises folder. Open the notebook using Jupyter. A solution to the exercise is also provided.
  • Examples: In the examples folder. This contains a notebook working through the Interval example presented in the slides.

Session 5: Memory Management

Session 5 files are in the session-5 folder. Contents:

  • Presentation: The presentation for this session, along with notes.
  • Exercises: In the exercises folder. Open the notebook using Jupyter.
  • Examples: In the examples folder. This contains examples of a simple EMM (External Memory Management) Plugin wrapping cudaMalloc, and an EMM Plugin for using the CuPy pool allocator with Numba.

Sources

Some of the material in this course is derived from various sources. These sources are:

References

The following references can be useful for studying CUDA programming in general, and the intermediate languages used in the implementation of Numba:
