Comments (9)
I went through the code again and I figured out the reason. Problem solved, closing this issue now.
from fftnet.
I have trained this model for over 100k iterations and training is surprisingly fast, but when I try to synthesize a wav file, inference is not as fast as I expected. A 17 s wav takes ~20 min to synthesize. Has anyone gotten better performance?
Sorry for the late reply. I also found that the buffer does not speed up generation; maybe we need to write a CUDA op? I also tested other repos, and they are very slow too.
@azraelkuan Thank you very much.
I tried some other repos too and got the same low speed, so I guess we are missing the trick for implementing fast generation. By the way, could you please tell me why your repo uses lws to preprocess the waveform? What is the difference between lws.stft and librosa.stft? I tried to train your model on mels extracted with librosa, but I did not get good results (nothing but noise), and I suspect it has something to do with the data preprocessing.
Thanks~
max
There is not much difference between the lws and librosa STFT; lws is just a fast way to compute the STFT. Maybe you should check the frame length and hop length?
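A quick way to act on that suggestion is to check that the number of feature frames matches the audio length for the hop length the model expects. The sketch below is only an illustration: `num_frames` and `check_alignment` are hypothetical helpers, the sample counts and hop sizes are made-up values, and the centered frame count follows the convention librosa uses with `center=True`. It makes no claim about how lws counts frames.

```python
def num_frames(num_samples, hop_size, fft_size, center=True):
    """Number of STFT frames for a signal of num_samples samples.

    center=True assumes the signal is padded by fft_size // 2 on both
    sides (the librosa convention); center=False assumes no padding.
    """
    if center:
        return 1 + num_samples // hop_size
    return 1 + (num_samples - fft_size) // hop_size


def check_alignment(num_samples, num_mel_frames, hop_size):
    """True if each mel frame corresponds to roughly hop_size samples."""
    return abs(num_samples - num_mel_frames * hop_size) <= hop_size


# Hypothetical values: features extracted with hop 160, model expects hop 256.
n_samples = 16000
frames_ok = num_frames(n_samples, hop_size=256, fft_size=1024)
frames_bad = num_frames(n_samples, hop_size=160, fft_size=1024)

print(check_alignment(n_samples, frames_ok, hop_size=256))   # matching hop
print(check_alignment(n_samples, frames_bad, hop_size=256))  # mismatched hop
```

If the second kind of check fails on your data, the conditioning features and the waveform are misaligned, which is one plausible way to end up with noise-only output.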
Okay, I will check the hyperparameters. Closing this issue now; thanks for the quick reply.
@azraelkuan Hello again, I am a little confused about the following code in cmu_arctic.py:
    if hparams.use_injected_noise:
        noise = np.random.normal(0.0, 1.0 / hparams.quantize_channels, wav.shape)
        wav += noise
    ...
    if hparams.rescaling:
        wav = wav / np.abs(wav).max() * hparams.rescaling_max
As I understand it, the first part injects noise into the raw wav, and the second part is a normalization that makes the values of the wav fall in [-1, 1].
However, if I am getting it right, np.abs(wav).max() varies, since two different wav clips are very likely to have different maximum values. So if we add noise first and then normalize, the noise is rescaled by a clip-dependent factor, and its distribution may change from N(0, 1/256) to something else.
I think the right order is to normalize the wav first and then inject the noise, so that the noise distribution is not changed.
What is your opinion?
Thanks in advance~
max
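The effect described above can be sketched with NumPy. This is a minimal illustration, not the repo's code: `quantize_channels = 256` is inferred from the N(0, 1/256) mentioned in the comment, `rescaling_max = 0.999` is a hypothetical value, and the sine-wave clips and both helper functions are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
quantize_channels = 256            # assumed; implied by N(0, 1/256) above
rescaling_max = 0.999              # hypothetical value for hparams.rescaling_max
noise_std = 1.0 / quantize_channels


def noise_then_rescale(wav):
    """Original order: the noise is scaled together with the signal."""
    noisy = wav + rng.normal(0.0, noise_std, wav.shape)
    scale = rescaling_max / np.abs(noisy).max()
    return noisy * scale, noise_std * scale  # effective noise std after rescaling


def rescale_then_noise(wav):
    """Proposed order: the noise std stays the same for every clip."""
    scaled = wav / np.abs(wav).max() * rescaling_max
    return scaled + rng.normal(0.0, noise_std, wav.shape), noise_std


t = np.linspace(0.0, 100.0, 16000)
quiet, loud = 0.1 * np.sin(t), 0.9 * np.sin(t)  # two synthetic clips

for name, clip in [("quiet", quiet), ("loud", loud)]:
    _, std_a = noise_then_rescale(clip)
    _, std_b = rescale_then_noise(clip)
    print(f"{name}: noise->rescale std={std_a:.5f}, rescale->noise std={std_b:.5f}")
```

With noise added first, the quiet clip ends up with a much larger effective noise level than the loud one. With normalization first, the noise level is the same for both, but the noisy signal can slightly exceed `rescaling_max`, so it may be worth checking the value range before quantization.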
Yes, I think you are right. Thanks.
Hey, quick update here: I tried changing the order of processing, but embarrassingly the loss went to NaN immediately. I am focusing on something else right now, so I do not have time to figure out why. If you have any idea, please let me know. Thanks!