positivewon,Suwon Yang,github

annotated_deep_learning_paper_implementations

🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

anytext

apisr

APISR: Anime Production Inspired Real-World Anime Super-Resolution (CVPR 2024)

audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

audiogpt

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

audioldm2

Text-to-Audio/Music Generation

audioseal

Localized watermarking for AI-generated speech audio, with SOTA robustness and a very fast detector

audiosep

Official implementation of "Separate Anything You Describe"

auffusion

Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"

autogen

Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ

awesome-llm-for-recsys

Survey: A collection of AWESOME papers and resources on the large language model (LLM) related recommender system topics.

awesome-llm4ad

A curated list of awesome LLM for Autonomous Driving resources (continually updated)

awesome-video-diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

awesome-video-diffusion-models

[Arxiv] A Survey on Video Diffusion Models

awesomekorean_data

Links to Korean-language datasets

bark-with-voice-clone

🔊 Text-prompted Generative Audio Model - With the ability to clone voices

bert-vits2

VITS2 backbone with BERT

break-a-scene

Official implementation for "Break-A-Scene: Extracting Multiple Concepts from a Single Image" [SIGGRAPH Asia 2023]

chatglm2-6b

ChatGLM2-6B: An Open Bilingual Chat LLM

chronos-forecasting

Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting

comospeech

One-step diffusion-based speech synthesis

cross-speaker-emotion-transfer

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

dc-comix-tts

Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer

ddsp

DDSP: Differentiable Digital Signal Processing

ddsp-svc

Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)

deepfilternet

Noise suppression using deep filtering

deeplabv3-tensorflow

Reimplementation of DeepLabV3

diff2lip

distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.


Suwon Yang's Projects
