ishine's Projects
A PyTorch implementation of "Neural Speech Synthesis with Transformer Network"
A TensorFlow implementation similar to "Neural Speech Synthesis with Transformer Network", ported from OpenSeq2Seq
🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
Transcribing Speech with Multinomial Diffusion, training code and models.
Online translation as a Python module & command line tool. No key, no authentication needed.
phone tokenizer and grapheme-to-phoneme model for 8k languages
A toy-like Text-to-Speech system for Chinese/Mandarin synthesis, inspired by Tacotron, FastSpeech2, and RefineGAN.
Transformer for abstractive summarization on CNN/Daily Mail and Gigaword
TripleNet: Triple Attention Network for Multi-Turn Response Selection in Retrieval-based Chatbots (CoNLL2019)
A complete project for voiceprint (speaker) recognition. Previously I uploaded a classic VGG-based model for speaker recognition, trained simply with a softmax loss. During testing, however, we found that the model is not very reliable: for example, it easily distinguishes man-man and man-woman pairs, but has difficulty with woman-woman pairs. So we tried another method, called triplet-group, to retrain the model, using triplet loss as the loss for backpropagation. Here I upload our core code and the training curves for the two training stages. Why do I refer to "two training stages"? To answer that, you need to understand the triplet-group method. Questions are very welcome at my mailbox: [email protected]
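回声工坊 is a one-stop video production toolkit for quickly and easily making galgame-style replay videos.
An open-source, easy-to-use offline Chinese OCR tool with recognition accuracy comparable to that of major vendors. It also provides an easy-to-use web page and web API, convenient for everyday human use or for calls from other programs.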
Translating Synthetic RIRs to Real RIRs
Transformer-based neural network for speech enhancement in the time domain
🤖💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Microsoft Azure text-to-speech audio download
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A Mandarin Chinese implementation of voice cloning based on Real-Time-Voice-Cloning
End-to-end speech synthesis with recurrent neural networks
TTS front end with BERT and CRF/LSTM (for Tacotron)
Middleware module for our speech synthesis systems
Objective metrics used in several text-to-speech (TTS) papers.
Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models