dl-audio-course's Introduction

Deep Learning for Audio Course, Fall 2023

Description

Topics covered in the course:

  • Digital Signal Processing
  • Automatic Speech Recognition (ASR)
  • Key-word spotting (KWS)
  • Text-to-Speech (TTS)
  • Voice Conversion
  • Unsupervised learning in Audio
  • Music Generation with NNs

Course materials

Materials

# Date Description Slides Video
1 September, 14 Lecture 1: Introduction and Digital Signal Processing slides video
2 September, 21 Lecture 2: Automatic Speech Recognition 1: WER, CTC, LAS, Beam Search slides video
3 September, 28 Seminar 1: Introduction, Spectrograms and Griffin-Lim notebook video
4 October, 5 Seminar 2: Levenshtein distance, WER, CER notebook video
5 October, 12 Lecture 3: Automatic Speech Recognition 2: RNN-T, Conformer, Whisper, Language models in ASR, BPE slides video
6 October, 19 Seminar 3: CTC, Beam Search notebook video
7 October, 26 Lecture 4: Key-word spotting (KWS) slides video
8 November, 2 Lecture 5: Text-to-speech: Tacotron, FastSpeech, Guided Attention slides video
9 November, 9 Seminar 4: Key-word spotting notebook video
10 November, 16 Seminar 5: Text-to-speech: Tacotron2 notebook video
11 November, 23 Lecture 6: Text-to-speech: Neural Vocoders (WaveNet, PWGAN, DiffWave) slides video
12 November, 30 Lecture 7: Voice Conversion: AutoVC, CycleGAN-VC, StarGAN-VC slides video
13 December, 7 Lecture 8: Self-supervised learning in Audio slides video

Homeworks

Homework Date Deadline Description Link
1 October, 8 October, 22 Audio classification; Audio preprocessing Open In Github
2 November, 3 November, 18 ASR-1: CTC Open In Github
3 November, 3 December, 3 ASR-2: RNN-T Open In Github
[Additional] Text-to-speech: FastPitch Open In Github

Game rules

  • 4 homeworks, 2 points each = 8 points
  • final test = 2 points
  • maximum points: 8 + 2 = 10 points

Authors

Pavel Severilov

Daniel Knyazev
