About the inference speed (fftnet, closed)

azraelkuan avatar azraelkuan commented on August 23, 2024
About the inference speed

from fftnet.

Comments (9)

Maxxiey avatar Maxxiey commented on August 23, 2024

I went through the code again and I figured out the reason. Problem solved, closing this issue now.

Maxxiey avatar Maxxiey commented on August 23, 2024

I have trained this model for over 100k iterations. Training is surprisingly fast, but when I try to synthesize a wav file, inference is not as fast as I expected: a 17 s wav takes ~20 min to synthesize. Has anyone gotten better performance?
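To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. The 16 kHz sample rate is an assumption (it is not stated in the thread), and `generation_stats` is a hypothetical helper name used only for illustration.

```python
def generation_stats(audio_seconds, wall_seconds, sample_rate=16000):
    """Convert a synthesis timing into per-sample figures.

    sample_rate=16000 is an assumption; adjust it to the data actually used.
    """
    n_samples = int(audio_seconds * sample_rate)
    samples_per_second = n_samples / wall_seconds
    ms_per_sample = 1000.0 * wall_seconds / n_samples
    real_time_factor = wall_seconds / audio_seconds  # > 1 means slower than real time
    return n_samples, samples_per_second, ms_per_sample, real_time_factor


# 17 s of audio synthesized in about 20 minutes, as reported above.
n_samples, sps, ms, rtf = generation_stats(17, 20 * 60)
print(f"{n_samples} samples, {sps:.0f} samples/s, {ms:.2f} ms/sample, {rtf:.1f}x real time")
```

Under that assumption, this works out to roughly 227 samples per second, or about 4.4 ms per generated sample, which is around 70 times slower than real time.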

azraelkuan avatar azraelkuan commented on August 23, 2024

Sorry for the late reply. I also found that the buffer does not accelerate generation; maybe we need to write a CUDA op? I also tested other repos, and their speed is very slow too.

Maxxiey avatar Maxxiey commented on August 23, 2024

@azraelkuan Thank you very much.

I tried some other repos too and got the same low speed, so I guess we are missing the trick for implementing fast generation. By the way, could you please tell me why your repo uses lws to preprocess the waveform? What is the difference between lws.stft and librosa.stft? I tried to train your model on mels extracted with librosa, but I did not get good results (nothing but noise), and I suspect it has something to do with the data preprocessing.

Thanks~
max

azraelkuan avatar azraelkuan commented on August 23, 2024

There is not much difference between the lws and librosa STFT; lws is just a fast way to compute the STFT. Maybe you should check the frame length and hop length?
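One quick way to check the hop length suggestion above is to compare the number of feature frames against the number of audio samples. The sketch below assumes the frame count convention of `librosa.stft` with `center=True` and an even `n_fft`, which is `1 + n_samples // hop_length`; lws may frame the signal differently, so the tolerance is a guess. The function names and the hop lengths 256 and 200 are hypothetical values chosen for illustration, not taken from the repo.

```python
def centered_frame_count(n_samples, hop_length):
    """Frames produced by a centered STFT (librosa center=True convention)."""
    return 1 + n_samples // hop_length


def frames_match_hop(n_samples, n_frames, hop_length, tolerance=1):
    """True if n_frames is plausible for this hop length.

    tolerance allows for small padding differences between libraries (assumed).
    """
    return abs(n_frames - centered_frame_count(n_samples, hop_length)) <= tolerance


n_samples = 17 * 16000  # a 17 s clip at an assumed 16 kHz sample rate

frames_hop_256 = centered_frame_count(n_samples, 256)
frames_hop_200 = centered_frame_count(n_samples, 200)
print(frames_hop_256, frames_hop_200)

# Features extracted with hop 200 do not line up with a model configured for hop 256.
print(frames_match_hop(n_samples, frames_hop_200, 256))
```

If the features were extracted with a different hop length than the one the model uses to upsample its conditioning input, the frame count will be off by far more than one frame, and the mels will not align with the waveform.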

Maxxiey avatar Maxxiey commented on August 23, 2024

Okay, I will check the hyperparameters. Closing this issue now, thanks for the quick reply.

Maxxiey avatar Maxxiey commented on August 23, 2024

@azraelkuan Hello again, I am a little confused about the following code in cmu_arctic.py:

    if hparams.use_injected_noise:
        noise = np.random.normal(0.0, 1.0 / hparams.quantize_channels, wav.shape)
        wav += noise
    ...
    if hparams.rescaling:
        wav = wav / np.abs(wav).max() * hparams.rescaling_max

As I understand it, the first part injects noise into the raw wav, and the second part is a peak normalization that makes the values of the wav fall within [-1, 1].

However, if I am getting this right, np.abs(wav).max() varies from clip to clip, since two different wav clips will very likely have different maximum values. So if we add the noise first and then normalize, the noise is rescaled by a clip-dependent factor, and its distribution may change from N(0, 1/256) to something else.

I think the right order is to normalize the wav first and then apply noise injection, so that the distribution of the noise is not changed.

What is your opinion?

Thanks in advance~
max
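The ordering concern in the comment above can be sketched with NumPy on a synthetic clip. This is a minimal illustration, not the repo's code: the constants `QUANTIZE_CHANNELS = 256` and `RESCALING_MAX = 0.999` are assumed values standing in for the `hparams` fields, and both function names are hypothetical.

```python
import numpy as np

QUANTIZE_CHANNELS = 256  # assumed, matching the 1/256 mentioned in the comment
RESCALING_MAX = 0.999    # assumed value for hparams.rescaling_max


def noise_then_rescale(wav, rng):
    """Order in the quoted snippet: inject noise, then peak-normalize."""
    noise = rng.normal(0.0, 1.0 / QUANTIZE_CHANNELS, wav.shape)
    noisy = wav + noise
    gain = RESCALING_MAX / np.abs(noisy).max()
    # The noise is multiplied by the same clip-dependent gain as the signal.
    return noisy * gain, noise * gain


def rescale_then_noise(wav, rng):
    """Proposed order: peak-normalize, then inject noise."""
    scaled = wav / np.abs(wav).max() * RESCALING_MAX
    noise = rng.normal(0.0, 1.0 / QUANTIZE_CHANNELS, wav.shape)
    # The noise keeps its nominal scale, but the sum can exceed RESCALING_MAX.
    return scaled + noise, noise


rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
quiet_clip = 0.1 * np.sin(2 * np.pi * 220.0 * t)  # synthetic clip with peak 0.1

_, noise_a = noise_then_rescale(quiet_clip, rng)
out_b, noise_b = rescale_then_noise(quiet_clip, rng)

print("noise std, noise then rescale:", noise_a.std())
print("noise std, rescale then noise:", noise_b.std())
print("peak, rescale then noise:", np.abs(out_b).max())
```

For a quiet clip, the first order amplifies the noise along with the signal, so its effective standard deviation ends up several times larger than 1/256. The second order keeps the noise at its nominal scale, but samples near the peak can then land slightly above `RESCALING_MAX`, so clipping after the noise step may be worth considering if the later quantization expects values in [-1, 1].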

azraelkuan avatar azraelkuan commented on August 23, 2024

Yes, I think you are right. Thanks.

Maxxiey avatar Maxxiey commented on August 23, 2024

Hey, quick update here. I tried changing the order of processing, but, embarrassingly, the loss went to nan immediately. Since I am focusing on something else right now, I do not have time to figure out why. If you have any ideas, please let me know~ thanks
