ishine's Projects
Dual-Stage Attention-Based Recurrent Neural Net for Time Series Prediction
**Unofficial** PyTorch Implementation of DA-RNN (arXiv:1704.02971)
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech
Multiple DOA estimation & delay-and-sum beamforming
The implementation of "A Recursive Network with Dynamic Attention for Monaural Speech Enhancement"
Demo of training a YOLOv3-tiny model with Darknet
Deep Audio Segmenter
PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language"
PyTorch implementation of Data2Vec self-supervised approach for vision use cases.
The codebase for Data-driven general-purpose voice activity detection.
Streamlit app to visualize and edit TTS datasets
A dual-branch attention-in-attention transformer (dubbed DB-AIAT) to focus on both coarse and fine-grained regions of spectrum in parallel, i.e., spectral magnitude and lost complex spectral details. The source code will be released soon.
A scoring neural backend for x-vector based speaker verification.
Code for DCASE 2020 task 1a and task 1b.
Author's repository for reproducing DcaseNet, an integrated pre-trained DNN that performs acoustic scene classification, audio tagging, and sound event detection. Implemented using PyTorch.
PyTorch implementation of "DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement"
DCCRN with various loss functions
Official implementation of "DCT-Net: Domain-Calibrated Translation for Portrait Stylization", SIGGRAPH 2022 (TOG); Multi-style cartoonization
Baidu's open-source dependency parsing system
Official implementation of SawSing (ISMIR'22)
DECA: Detailed Expression Capture and Animation (SIGGRAPH 2021)
Audio Source Separation Without Any Training Data.
Disfluency Detection using Auto-Correlational Neural Networks
Conversion of deep learning models between different deep learning frameworks and software.
Deep Speaker: an End-to-End Neural Speaker Embedding System.