mryab / efficient-dl-systems

Efficient Deep Learning Systems course materials (HSE, YSDA)

License: MIT License

Languages: Jupyter Notebook 96.85%, Python 3.12%, Shell 0.01%, Dockerfile 0.02%
Topics: deep-learning, efficient-deep-learning, pytorch, cuda, distributed-training, machine-learning, ml-infrastructure, mlops

efficient-dl-systems's Introduction

Efficient Deep Learning Systems

This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Science of HSE University and Yandex School of Data Analysis.

This branch corresponds to the ongoing 2024 course. For the full materials from past years, see the "Past versions" section.

Syllabus

  • Week 1: Introduction
    • Lecture: Course overview and organizational details. Core concepts of the GPU architecture and CUDA API.
    • Seminar: CUDA operations in PyTorch. Introduction to benchmarking.
  • Week 2: Experiment tracking, model and data versioning, testing DL code in Python
    • Lecture: Experiment management basics and pipeline versioning. Configuring Python applications. Intro to regular and property-based testing.
    • Seminar: Example DVC+Weights & Biases project walkthrough. Intro to testing with pytest.
  • Week 3: Training optimizations, profiling DL code
    • Lecture: Mixed-precision training. Data storage and loading optimizations. Tools for profiling deep learning workloads.
    • Seminar: Automatic Mixed Precision in PyTorch. Dynamic padding for sequence data and JPEG decoding benchmarks. Basics of profiling with py-spy, PyTorch Profiler, PyTorch TensorBoard Profiler, nvprof and Nsight Systems.
  • Week 4: Basics of distributed ML
    • Lecture: Introduction to distributed training. Process-based communication. Parameter Server architecture.
    • Seminar: Multiprocessing basics. Parallel GloVe training.
  • Week 5: Data-parallel training and All-Reduce
    • Lecture: Data-parallel training of neural networks. All-Reduce and its efficient implementations.
    • Seminar: Introduction to PyTorch Distributed. Data-parallel training primitives.
  • Week 6: Training large models
    • Lecture: Model parallelism, gradient checkpointing, offloading, sharding.
    • Seminar: Gradient checkpointing and tensor parallelism in practice.
  • Week 7: Python web application deployment
    • Lecture/Seminar: Building and deploying production-ready web services. App & web servers, Docker, Prometheus, API via HTTP and gRPC.
  • Week 8: LLM inference optimizations and software
    • Lecture: Inference speed metrics. KV caching, batch inference, continuous batching. FlashAttention with its modifications and PagedAttention. Overview of popular LLM serving frameworks.
    • Seminar: Basics of the Triton language. Layer fusion in PyTorch and Triton. Implementation of KV caching, FlashAttention in practice.
  • Week 9: Efficient model inference
    • Lecture: Hardware utilization metrics for deep learning. Knowledge distillation, quantization, LLM.int8(), SmoothQuant, GPTQ. Efficient model architectures. Speculative decoding.
    • Seminar: Measuring Memory Bandwidth Utilization in practice. Data-free quantization, GPTQ, and SmoothQuant in PyTorch.
  • Week 10: Guest lecture

Grading

There will be several home assignments (spread over multiple weeks) on the following topics:

  • Training pipelines and code profiling
  • Distributed and memory-efficient training
  • Deploying and optimizing models for production

The final grade is a weighted sum of per-assignment grades. Please refer to the course page of your institution for details.

Staff

Past versions

efficient-dl-systems's People

Contributors

adkosm, alexeyhorkin, artyomrabosh, dont-care-didnt-ask, ihatereptiloids, justheuristic, markovka17, mryab, msaidow, newokaerinasai, poedator, topshik, vladpyzh, vladrub1

efficient-dl-systems's Issues

About homework solutions

Thank you so much for such a great course. I have been taking it for about a month and am currently stuck on week 06. Could you provide a solution for the week 06 homework? Looking forward to your reply, thanks.

Videos of the lectures

Good afternoon!
Starting from week 5, there are no links to the lecture and seminar videos.
Is there any way to get access to them?
Thank you!

Lecture videos

Are the lecture videos available anywhere?

Apologies for raising an issue. If a discussion board were available, I would have asked there.

HW3: Requirements and typing update

When creating an environment from requirements.txt for the week 3 assignment, a dependency conflict occurs:

The conflict is caused by:
    The user requested torch==2.2.0
    torchvision 0.16.2 depends on torch==2.1.2

Possible solution: bump torchvision from 0.16.2 to 0.17.0.
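
To catch this kind of mismatch before running pip, here is a minimal sketch that checks the two pins against each other. The function names and the lookup table are hypothetical and not part of the course code. The table lists only the two torchvision releases mentioned in this issue, so any other version is reported as "nothing to check".

```python
from typing import Dict, Optional

# torchvision release -> torch release it is pinned to.
# Only the two releases discussed in this issue are listed (assumption:
# extend the table from the torchvision release notes for other versions).
TORCHVISION_REQUIRES_TORCH = {
    "0.16.2": "2.1.2",
    "0.17.0": "2.2.0",
}


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Return {package: version} for every `name==version` line."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def torch_pin_conflict(requirements_text: str) -> Optional[str]:
    """Describe the torch/torchvision conflict, or return None if the pins agree."""
    pins = parse_pins(requirements_text)
    torch_version = pins.get("torch")
    vision_version = pins.get("torchvision")
    required = TORCHVISION_REQUIRES_TORCH.get(vision_version)
    if torch_version is None or required is None:
        return None  # nothing to check, or a release that is not in the table
    if torch_version != required:
        return (
            f"torchvision {vision_version} depends on torch=={required}, "
            f"but torch=={torch_version} is pinned"
        )
    return None
```

Calling `torch_pin_conflict("torch==2.2.0\ntorchvision==0.16.2")` returns a message describing the mismatch, while the same call with `torchvision==0.17.0` returns `None`.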

Also, the type annotations in train.py need updating. The following errors now appear:
AttributeError: module 'torch.optim' has no attribute 'optimizer'. Did you mean: 'Optimizer'?
AttributeError: module 'torch.nn.modules' has no attribute '_Loss'. Did you mean: 'loss'?
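
A possible fix, sketched under assumptions: the two classes are importable as `torch.optim.Optimizer` and `torch.nn.modules.loss._Loss`, so the annotations can refer to those names instead. The `train_step` helper below is hypothetical and only illustrates the annotations; the actual signatures in train.py may differ. The torch imports sit under `TYPE_CHECKING` and the annotations are quoted, so the snippet itself does not need torch at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported only for type checkers; the annotations below are strings.
    from torch import Tensor, nn
    from torch.nn.modules.loss import _Loss  # instead of torch.nn.modules._Loss
    from torch.optim import Optimizer  # instead of torch.optim.optimizer


def train_step(
    model: "nn.Module",
    criterion: "_Loss",
    optimizer: "Optimizer",
    inputs: "Tensor",
    targets: "Tensor",
) -> float:
    """Run one optimization step and return the loss value (hypothetical helper)."""
    optimizer.zero_grad()  # clear gradients from the previous step
    loss = criterion(model(inputs), targets)
    loss.backward()  # compute gradients
    optimizer.step()  # update parameters
    return loss.item()
```

Since `_Loss` is a private base class, annotating the criterion as `nn.Module` is a reasonable alternative if the exact loss type does not matter.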
