maxmax2016,MaxMax,github

only-noisy-training

A self-supervised speech denoising strategy named Only-Noisy Training (ONT), which solves the speech denoising problem with only noisy audio signals in audio space for the first time.

open-llms

Classic: 📋 A list of open LLMs available for commercial use.

openit

Dedicated to building a free, seamless environment for bypassing the firewall.

openmmd

MMD dance: OpenMMD is an OpenPose-based application that converts real-person videos to motion files (.vmd), which directly drive animated movies of 3D models (e.g. Miku, Anmicius).

opensvip

Singing voice synthesis project conversion: An open framework and intermediary model for converters among project files of various singing voice synthesizers

opentts

Speech synthesis server development: Open Text to Speech Server

openutau

OpenUTAU renderer for diffsinger. Usage instructions (in Chinese): https://github.com/xunmengshe/OpenUtau/wiki/%E4%BD%BF%E7%94%A8%E6%96%B9%E6%B3%95%EF%BC%88%E4%B8%AD%E6%96%87%EF%BC%89

openvoice

Instant voice cloning by MyShell

overflow

Probabilistic speech synthesis by mixing neural HMM TTS with normalising flows

paddlespeech_tts_cpp

Unfinished: PaddleSpeech TTS cpp

pafx

Audio effects: Python Audio Effects

palm-rlhf-pytorch

Dialogue model: Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

paper2gui

Utility: Convert AI papers to GUI. Make it easy and convenient for everyone to use cutting-edge artificial intelligence technology.

paralip

Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022; Official code

parallel-tacotron2

Differentiable duration model: PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

parselmouth

Praat audio analysis in Python, the Pythonic way

pasd

Pixel-Aware Stable Diffusion

pats

Digital human gesture generation: PATS Dataset. Aligned Pose-Audio-Transcripts and Style for co-speech gesture research

penn

Pitch estimation: Pitch Estimating Neural Networks (PENN)

percepnet

Upgraded version of RNNoise, second place in the competition's real-time track. (Work In Progress) Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech