
Comments (8)

yannier912 commented on September 3, 2024

Also, I am training wavernn concurrently on the same machine:
torch.cuda.device_count() 0
Using device: cpu
Does wavernn train on the CPU by default? Can it be switched to the GPU?
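A device_count() of 0 usually means either a CPU-only PyTorch build or that no CUDA driver/runtime is visible to the process. A quick diagnostic sketch (illustrative, not from the repo):

import torch

print(torch.__version__)          # a '+cpu' suffix on pip wheels indicates a CPU-only build
print(torch.version.cuda)         # None for a CPU-only build
print(torch.cuda.is_available())  # False when no usable GPU/driver is found
print(torch.cuda.device_count())  # 0 matches the output above

If torch.version.cuda is None, the likely fix is reinstalling a CUDA-enabled torch wheel rather than changing any code.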


xuexidi commented on September 3, 2024

@yannier912
When training tacotronv2, have you tried restoring the pretrained model and continuing training directly? Restoring fails for me with err, "a mismatch between the current graph and the graph"..... Have you run into this?

Also, I tried training from scratch, and it really is slow: a single Tesla V100 only manages 3.5 sec/step..... By my estimate, 250k steps would take 10 days....
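One common workaround for that graph-mismatch error is to restore only the variables whose names and shapes match the checkpoint. A minimal sketch, assuming the TF1-style checkpoint API that Tacotron-2 forks use; the checkpoint path is a placeholder:

import tensorflow as tf

# Build the model graph first (not shown), then:
checkpoint_path = tf.train.latest_checkpoint('logs-tacotron/pretrained')  # placeholder path

reader = tf.train.NewCheckpointReader(checkpoint_path)
ckpt_shapes = reader.get_variable_to_shape_map()

# Keep only variables that exist in the checkpoint with identical shapes.
restorable = [v for v in tf.global_variables()
              if v.op.name in ckpt_shapes
              and v.shape.as_list() == ckpt_shapes[v.op.name]]

saver = tf.train.Saver(var_list=restorable)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize everything, then overwrite the matches
    saver.restore(sess, checkpoint_path)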


yannier912 commented on September 3, 2024

@xuexidi I haven't finetuned yet; you can refer to #11.
I'm also training from scratch right now; my speed is about the same as yours, if anything a bit slower.


xuexidi commented on September 3, 2024

> @xuexidi I haven't finetuned yet; you can refer to #11.
> I'm also training from scratch right now; my speed is about the same as yours, if anything a bit slower.

@yannier912
I switched to training on 4 V100 GPUs; speed is about 1.7 sec/step.....


xuexidi commented on September 3, 2024

> @xuexidi I haven't finetuned yet; you can refer to #11.
> I'm also training from scratch right now; my speed is about the same as yours, if anything a bit slower.

@yannier912
Thank you.
"If you want to finetune on your own data, see here: change tacotron_fine_tuning to True in tacotron_hparams.py"

But I had already set tacotron_fine_tuning to True this morning, following those instructions, and then got that error.... totally demoralizing... :(
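For reference, the change described in the quoted instructions is a one-line hparams edit (sketch; only the flag name comes from this thread):

# tacotron_hparams.py
tacotron_fine_tuning = True  # resume from the pretrained checkpoint for finetuning

One plausible cause of the mismatch error (an assumption, not confirmed in this thread) is that the finetuning graph no longer matches the checkpoint, e.g. after changing the symbol set or model size; the variable-filtering restore sketch earlier in the thread is one way around that.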


yannier912 commented on September 3, 2024

@xuexidi How did you set up multi-GPU training? Please share~


xuexidi commented on September 3, 2024

> @xuexidi How did you set up multi-GPU training? Please share~

@yannier912
CUDA_VISIBLE_DEVICES=0,1 python tacotron_train.py

If you then run nvidia-smi in a terminal, you will see two GPUs being used (GPUs 0 and 1).
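Note that CUDA_VISIBLE_DEVICES only controls which GPUs the process can see; it does not by itself split the batch across them unless the training code does so. For four cards, the same pattern with more device IDs would be:

CUDA_VISIBLE_DEVICES=0,1,2,3 python tacotron_train.py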


lturing commented on September 3, 2024

First of all, this repo does not support multi-GPU. I adapted it from Tacotron-2, which originally did.

I trained on my laptop (i5 7300HQ, GTX 1060 6 GB) with the model scaled down (it is the current model). Training 200k steps took a bit over 2 days at batch_size 32 and 1.05 sec/step (it seemed to have converged by around 100k).

To speed up training, you can try:
1. Based on the number of CPU threads on your machine (check with htop), modify this setting, e.g. raise the 8 to an integer multiple of your thread count.
2. Likewise, adjust this setting according to your CPU.
3. My GPU had 6 GB with batch_size 32; adjust batch_size to fit your GPU memory.
4. Find the maximum input length in your dataset (or pick a suitable cap and filter out sentences longer than it), and load the data as tfrecords (see the sketch after this list).
5. Replace forward attention with GMM attention (requires code changes, which I won't provide); with 3 Gaussian components and batch_size 32, you can reach about 0.6 sec/step.
6. Add multi-GPU code following Tacotron-2.
7. Try a non-autoregressive model (FastSpeech).
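A minimal sketch of point 4 using TF 1.x tf.data (the feature key 'text_ids', the file name, and the length cap are assumptions; a real pipeline would also carry the mel targets):

import tensorflow as tf

MAX_TEXT_LEN = 100  # cap chosen for your dataset

def parse_example(serialized):
    feats = tf.parse_single_example(
        serialized, {'text_ids': tf.VarLenFeature(tf.int64)})
    return tf.sparse.to_dense(feats['text_ids'])

dataset = (tf.data.TFRecordDataset('train.tfrecords')
           .map(parse_example, num_parallel_calls=8)          # tune to your CPU threads (points 1-2)
           .filter(lambda text: tf.size(text) <= MAX_TEXT_LEN)  # drop over-long sentences (point 4)
           .shuffle(1024)
           .padded_batch(32, padded_shapes=[None])             # pad to the longest in each batch
           .prefetch(1))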

> Does wavernn train on the CPU by default? Can it be switched to the GPU?
Change this line to:
device = torch.device('cuda')
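A slightly more defensive version of the same fix (a sketch, not the repo's exact code), so the script still runs on machines without a GPU:

import torch

# Prefer the GPU when one is usable; otherwise keep the CPU fallback.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
# Both the model and every input batch must be moved: model.to(device), x.to(device).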

