markyouyuren,github

audio-to-midi

An application of vocal melody extraction.

autovc

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

avocodo

Avocodo: Generative Adversarial Network for Artifact-free Vocoder

Implementation of BeatNet, an AI-based joint beat, downbeat, tempo, and meter tracking system using a CRNN and particle filtering. Presented as 2021's state-of-the-art online model (ISMIR 2021).

bert-vits2

VITS2 backbone with multilingual BERT

cross-lingual-voice-cloning

Tacotron 2 - PyTorch implementation with faster-than-realtime inference, modified to enable cross-lingual voice cloning.

crystal

Crystal - C++ implementation of a unified framework for a multilingual TTS synthesis engine, with the SSML specification as its interface.

deepforcedaligner

deeplearningexamples

Deep Learning Examples

diffgan-tts

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

diffsinger

PyTorch Implementation of DiffSinger: Diffusion Acoustic Model for Singing Voice Synthesis (TTS Extension)

diffsinger-1

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

durian

Implementation of the paper "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf).

emotional-speech-data

This is the GitHub page for publicly available emotional speech data.

generspeech

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

gpt-sovits

As little as 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning).

gpt2-chinese

Chinese version of the GPT-2 training code, using the BERT tokenizer.

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

languagecodec

Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models

linly-talker

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬
