Topics discussed in the course:
- Digital Signal Processing
- Automatic Speech Recognition (ASR)
- Keyword spotting (KWS)
- Text-to-Speech (TTS)
- Voice Conversion
- Unsupervised learning in Audio
- Music Generation with NNs

| # | Date | Description | Slides | Video |
|---|---|---|---|---|
| 1 | September 14 | Lecture 1: Introduction and Digital Signal Processing | slides | video |
| 2 | September 21 | Lecture 2: Automatic Speech Recognition 1: WER, CTC, LAS, Beam Search | slides | video |
| 3 | September 28 | Seminar 1: Introduction, Spectrograms and Griffin-Lim | notebook | video |
| 4 | October 5 | Seminar 2: Levenshtein distance, WER, CER | notebook | video |
| 5 | October 12 | Lecture 3: Automatic Speech Recognition 2: RNN-T, Conformer, Whisper, Language models in ASR, BPE | slides | video |
| 6 | October 19 | Seminar 3: CTC, Beam Search | notebook | video |
| 7 | October 26 | Lecture 4: Keyword spotting (KWS) | slides | video |
| 8 | November 2 | Lecture 5: Text-to-speech: Tacotron, FastSpeech, Guided Attention | slides | video |
| 9 | November 9 | Seminar 4: Keyword spotting | notebook | video |
| 10 | November 16 | Seminar 5: Text-to-speech: Tacotron2 | notebook | video |
| 11 | November 23 | Lecture 6: Text-to-speech: Neural Vocoders (WaveNet, PWGAN, DiffWave) | slides | video |
| 12 | November 30 | Lecture 7: Voice Conversion: AutoVC, CycleGAN-VC, StarGAN-VC | slides | video |
| 13 | December 7 | Lecture 8: Self-supervised learning in Audio | slides | video |

Grading:
- 4 homework assignments, 2 points each = 8 points
- final test = 2 points
- maximum points: 8 + 2 = 10 points

Pavel Severilov
- telegram: @severilov
- e-mail: [email protected]

Daniel Knyazev
- telegram: @Oorgien
- e-mail: [email protected]