
ishine

followers: 109 following: 111 repos: 3.2K gists: 1

Type: User

Company: gerzz.inc

Bio: speech asr/speech-recognition tts/text-to-speech vc/voice-conversion

Location: Shanghai

Blog: dubbing-ai.com

Hi 👋, I'm ishine.

🔭 I’m currently working on TTS, VC, SVS, ASR.
voice conversion/changer @ dubbing-ai.com

ishine's Projects

wespeaker

wesubtitle

Extract hard-coded video subtitles using OCR

wetts

Production First and Production Ready End-to-End Text-to-Speech Toolkit

wfst-lm-decoder

wfst-based language model decoder

whisper

whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

whisper-finetune

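Fine-tune the Whisper speech recognition model, with support for training on data without timestamps, data with timestamps, and data without speech. Accelerated inference, with support for web deployment, Windows desktop deployment, and Android deployment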

whisper-finetuning

[WIP] Scripts for fine-tuning a Whisper model

whisper-punctuator

Zero-shot Punctuation Insertion using Whisper

whisper-subtitles

Apple PodCast Transcription with OpenAI's Whisper

whisper.cpp

Port of OpenAI's Whisper model in C/C++

whisperbiasing

whispering-llama

EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction

whisperx

WhisperX: Automatic Speech Recognition with Accurate Word-level Timestamps.

white-box-cartoonization

Official tensorflow implementation for CVPR2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”

wikipron

Massively multilingual pronunciation mining

wmseg

woo-music-mixed-speech-recognition

word-discovery

Faster and more effective Chinese new-word discovery

word-discovery-1

Word Discovery in Visually Grounded, Self-Supervised Speech Models

word2bits

Quantized word vectors that take 8x-16x less space than regular word vectors

wordrepo

Generate a Chinese lexicon from internet data

world-vocoder

worlds-best-audiobook

HTML player for W3C Audiobooks

wpd-plus-plus

wrapper-filter-speech-emotion-recognition

Implementation of our paper "A Hybrid Deep Feature Selection Framework for Emotion Recognition from Human Speeches" [Multimedia Tools and Applications, Springer]

write-a-speaker

Mocap Dataset of “Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation”

wsrglow

wtrans

Web based transcription tool

wu-manber-algorithm-for-chinese

