Trained and sampled results are very strange (the results of my own training reproduction look very odd) about latte CLOSED

huangjch526 commented on July 28, 2024

Trained and sampled results are very strange (the results of my own training reproduction look very odd)

from latte.

Comments (21)

maxin-cn commented on July 28, 2024

> Regarding convert_videos_to_frames.py, is there a significant performance/speed increase associated with that approach over extracting frames via opencv in python?

There should be no noticeable speed or performance gains.

huangjch526 commented on July 28, 2024

I printed the training loss, and it drops to around 0.03. Is that normal?

maxin-cn commented on July 28, 2024

> Trained and sampled results are very strange (the results of my own training reproduction look very odd: I trained XL/2 from scratch on the UCF dataset for 100,000 steps, then sampled some videos and found them very ugly, with no recognizable structure at all).
>
> sample.mp4

This is not a normal result. Did you notice a sudden increase in the gradient during training? How many GPUs did you train on? Can you provide a detailed training configuration? Thanks~

huangjch526 commented on July 28, 2024

Thank you so much, you're so nice. My training configuration is as follows:

I changed the batch size to 1 and trained with DDP on eight V100 32 GB GPUs. All other parameters are completely unchanged, and sampling videos with the checkpoint you provided works fine. So I suspect the problem is that I changed the batch size?

maxin-cn commented on July 28, 2024

> Thank you so much, you're so nice. My training configuration is as follows:
>
> I changed the batch size to 1 and trained with DDP on eight V100 32 GB GPUs. All other parameters are completely unchanged, and sampling videos with the checkpoint you provided works fine. So I suspect the problem is that I changed the batch size?

Check your training log for any sudden gradient increases. I suspect there may be something wrong with the training process.

huangjch526 commented on July 28, 2024

At step 100, the gradient norm is 1.1843.
It decreases gradually, with no sudden jumps.
At step 50,000, the gradient norm is 0.03.
Is that normal?

maxin-cn commented on July 28, 2024

> At step 100, the gradient norm is 1.1843. It decreases gradually, with no sudden jumps. At step 50,000, the gradient norm is 0.03. Is that normal?

It is normal. How long have you been training?
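
The check discussed above, looking for a sudden increase in the logged gradient norm, can be sketched roughly as follows. This is only an illustration: the function name, the spike ratio, the window size, and the intermediate log values are assumptions, not part of Latte's training code. Only the norms at step 100 and step 50,000 come from this thread.

```python
from typing import List, Tuple


def find_gradient_spikes(
    norms: List[Tuple[int, float]], ratio: float = 5.0, window: int = 3
) -> List[Tuple[int, float]]:
    """Return (step, norm) entries whose norm exceeds `ratio` times the
    mean of the previous `window` logged norms."""
    spikes = []
    for i in range(window, len(norms)):
        recent = [value for _, value in norms[i - window:i]]
        baseline = sum(recent) / window
        step, value = norms[i]
        if baseline > 0 and value > ratio * baseline:
            spikes.append((step, value))
    return spikes


# Hypothetical log: the values at steps 10000, 20000 and 30000 are made up
# for illustration; only the first and last entries were reported in the thread.
logged = [(100, 1.1843), (10000, 0.40), (20000, 0.15), (30000, 0.08), (50000, 0.03)]
print(find_gradient_spikes(logged))  # prints []
```

A smoothly decreasing log like the one above yields an empty list; an entry that jumps well above its recent average would be reported with its step number.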

huangjch526 commented on July 28, 2024

50,000 steps, about 17 hours

maxin-cn commented on July 28, 2024

> 50,000 steps, about 17 hours

I think it has not converged yet. Please train for a day or two.
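
As a rough back-of-the-envelope estimate, the figures reported above (50,000 steps in about 17 hours) can be converted into a step count for "a day or two". This sketch assumes the throughput stays constant and the hardware does not change; the helper names are illustrative.

```python
# Figures from the thread: 50,000 steps took about 17 hours.
steps_done = 50_000
hours_elapsed = 17.0

steps_per_hour = steps_done / hours_elapsed


def steps_in(hours: float) -> float:
    """Steps completed in `hours` at the observed rate."""
    return hours * steps_per_hour


def hours_for(steps: int) -> float:
    """Hours needed for `steps` at the observed rate."""
    return steps / steps_per_hour


print(f"{steps_per_hour:.0f} steps/hour")      # prints 2941 steps/hour
print(f"1 day  ~ {steps_in(24):.0f} steps")    # prints 1 day  ~ 70588 steps
print(f"2 days ~ {steps_in(48):.0f} steps")    # prints 2 days ~ 141176 steps
```

At this rate, `hours_for(100_000)` gives 34 hours, so one to two days of training corresponds to roughly 70k to 140k steps on that setup.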

huangjch526 commented on July 28, 2024

How many steps did you train before the sampled video was normal?

maxin-cn commented on July 28, 2024

> How many steps did you train before the sampled video was normal?

Because we use different training equipment, I can't give you an exact number. It takes about 2 days on the training equipment I use. You can also refer to Fig. 8 in our paper.

huangjch526 commented on July 28, 2024

May I ask which GPUs you were using? I'm now training with 8 A100 80 GB GPUs; roughly how many steps do I need to train? I see that the model in your paper converged at 150k.

maxin-cn commented on July 28, 2024

> May I ask which GPUs you were using? I'm now training with 8 A100 80 GB GPUs; roughly how many steps do I need to train? I see that the model in your paper converged at 150k.

I checked with someone who recently reproduced Latte on UCF101 using the same training equipment as you, and it takes about 100k (10w) iterations to get normal videos.

huangjch526 commented on July 28, 2024

Thank you very much, I found the cause. My dataset folder layout differs from the default layout that your dataset code expects, so I ended up training an unconditional model but then used class-conditional sampling at inference.
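
The mismatch described above, a folder layout the dataset code does not expect so that no class labels are picked up, could be caught before training with a quick sanity check. The following is a minimal sketch under a hypothetical layout `root/<class_name>/<video file>`; it is not Latte's dataset code, and the function names and suffix list are assumptions for illustration.

```python
import tempfile
from collections import Counter
from pathlib import Path
from typing import Dict

VIDEO_SUFFIXES = {".avi", ".mp4"}  # assumption: adjust to your data


def count_videos_per_class(root: str) -> Dict[str, int]:
    """Count video files per class, using the parent folder name as the label.

    Assumes the hypothetical layout root/<class_name>/<video file>.
    """
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in VIDEO_SUFFIXES:
            counts[path.parent.name] += 1
    return dict(counts)


def check_class_layout(root: str, expected_classes: int) -> None:
    """Raise if the number of class folders differs from what is expected."""
    counts = count_videos_per_class(root)
    if len(counts) != expected_classes:
        raise ValueError(
            f"found {len(counts)} class folders under {root}, "
            f"expected {expected_classes}"
        )


# Small self-contained demo on a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for cls, name in [("ApplyEyeMakeup", "v1.avi"),
                      ("ApplyEyeMakeup", "v2.avi"),
                      ("Archery", "v1.avi")]:
        folder = Path(tmp) / cls
        folder.mkdir(exist_ok=True)
        (folder / name).touch()
    demo_counts = count_videos_per_class(tmp)
    check_class_layout(tmp, expected_classes=2)  # passes silently
print(demo_counts)
```

If every video sits directly in the root folder, this check reports a single "class", which is a strong hint that training would effectively be unconditional.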

huangjch526 commented on July 28, 2024

By the way, where did you download the Taichi dataset from? Everything I downloaded is mp4 files, but I see that your dataset code reads image frames?

maxin-cn commented on July 28, 2024

> By the way, where did you download the Taichi dataset from? Everything I downloaded is mp4 files, but I see that your dataset code reads image frames?

I used the Taichi dataset after converting the videos into images.

huangjch526 commented on July 28, 2024

Could you provide your code to convert the videos into images?

huangjch526 commented on July 28, 2024

https://github.com/universome/stylegan-v/blob/master/src/scripts/convert_videos_to_frames.py

Are you using this code?

maxin-cn commented on July 28, 2024

> https://github.com/universome/stylegan-v/blob/master/src/scripts/convert_videos_to_frames.py
>
> Are you using this code?

You can use it.
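
For readers who would rather extract frames directly with OpenCV in Python, a minimal sketch follows. It is not the linked stylegan-v script and not necessarily what the maintainer used; the function names, the zero-padded JPEG naming, and the one-folder-per-video output are assumptions, so the layout the Latte dataset loader expects should be checked against the repository.

```python
from pathlib import Path


def frame_path(out_dir: str, index: int) -> Path:
    """Build a zero-padded frame filename such as 000012.jpg (naming is an assumption)."""
    return Path(out_dir) / f"{index:06d}.jpg"


def extract_frames(video_path: str, out_dir: str) -> int:
    """Write every frame of `video_path` into `out_dir` as JPEG and return the frame count."""
    # Requires the opencv-python package; imported here so that frame_path
    # stays usable without it.
    import cv2

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    capture = cv2.VideoCapture(video_path)
    if not capture.isOpened():
        raise IOError(f"cannot open {video_path}")
    count = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of stream or decode failure
                break
            cv2.imwrite(str(frame_path(out_dir, count)), frame)
            count += 1
    finally:
        capture.release()
    return count
```

A call such as `extract_frames("clip.mp4", "frames/clip")` (both paths hypothetical) would write `frames/clip/000000.jpg`, `frames/clip/000001.jpg`, and so on.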

MHRosenberg commented on July 28, 2024

Regarding convert_videos_to_frames.py, is there a significant performance/speed increase associated with that approach over extracting frames via opencv in python?

ivylilili commented on July 28, 2024

> > Trained and sampled results are very strange (the results of my own training reproduction look very odd: I trained XL/2 from scratch on the UCF dataset for 100,000 steps, then sampled some videos and found them very ugly, with no recognizable structure at all).
> > sample.mp4
>
> This is not a normal result. Did you notice a sudden increase in the gradient during training? How many GPUs did you train on? Can you provide a detailed training configuration? Thanks~

Hi maxin~ I noticed that you mentioned "the sudden increase in gradient". I've run into the same problem. Do you know why the gradient explosion happens? Would you be kind enough to tell me how you solved it? Thanks very much!
