Comments (4)
Hi @creotiv, thanks for your suggestion. Unfortunately, I had already tried several methods, including DGA, to improve attention as described here, but the output quality degrades even with the improved (strong) attention. This might be due to the model architecture or the loss design, so I'm trying to change them now.
from portaspeech.
Hey keonlee, we would like to collaborate to help find a solution. What other methods have you tried? We tried clustering and embedding the training data and the output, but we aren't able to clearly identify the problem: the words that echo and the regular output words give similar results.
Hi @michaellin99999, thanks for your attention! I just updated the repo with the fix (there was a bug in the code, see this commit). I've also applied several techniques to improve the alignment, including DGA, so please check the note section in the README for details. Also, try the new pre-trained model and demo :) Any suggestions are always welcome!
Closed due to inactivity.