about training (neuricam, open issue)

vb000 commented on June 19, 2024
about training

from neuricam.

Comments (12)

vb000 commented on June 19, 2024

A full training run took about 48 hours on a machine with 4 V100 GPUs. I think you could train for about 1 to 1.5 days to get good results, though not quite the best.

newtreeaa commented on June 19, 2024

@vb000 I use a machine with 4 3090 GPUs, but one training epoch takes about 2 hours, so a full training run would take about 200 hours. That is much slower than yours. Do you know the reason? In addition, my training set contains 83876 videos. Does your training set have that many videos?
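
For reference, the gap between the two reports can be put into numbers. This is only a back-of-the-envelope sketch using the figures quoted in this thread; the epoch count is inferred from "2 hours per epoch, 200 hours in total" and nothing here is measured.

```python
# Figures quoted in the thread; none of these values are measurements.
reported_total_hours = 48.0       # author's estimate on 4 V100 GPUs
observed_hours_per_epoch = 2.0    # observed on 4 3090 GPUs
observed_total_hours = 200.0      # projected full run on 4 3090 GPUs

# Number of epochs implied by the observed figures.
epochs = observed_total_hours / observed_hours_per_epoch

# Per-epoch time the author's estimate would imply for the same epoch count.
implied_hours_per_epoch = reported_total_hours / epochs

# How much slower the observed run is than the author's estimate.
slowdown = observed_hours_per_epoch / implied_hours_per_epoch

print(f"epochs implied: {epochs:.0f}")
print(f"author's estimate implies {implied_hours_per_epoch * 60:.1f} min per epoch")
print(f"observed run is about {slowdown:.1f}x slower")
```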

newtreeaa commented on June 19, 2024

@vb000 In your params.json, the batch size is 8. I use a machine with 4 3090 GPUs; should I change the batch size to 32?

vb000 commented on June 19, 2024

The training set had somewhere close to 64k videos. The exact link to the dataset is this.

No, we used batch size 8 with 4 GPUs. You could try batch size 32 for faster training, probably at a small cost in accuracy.
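
To see why a larger batch size shortens an epoch, it helps to count optimizer steps. The sketch below assumes the batch size in params.json is the global batch size across all GPUs (the thread does not confirm this) and uses the rounded "close to 64k" figure; `iterations_per_epoch` is a hypothetical helper, not part of the repository.

```python
import math


def iterations_per_epoch(num_samples: int, global_batch_size: int) -> int:
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(num_samples / global_batch_size)


# "Close to 64k" per the thread; an approximate value for illustration.
NUM_TRAIN_SEQUENCES = 64_000

for batch_size in (8, 32):
    steps = iterations_per_epoch(NUM_TRAIN_SEQUENCES, batch_size)
    print(f"batch size {batch_size}: {steps} steps per epoch")
```

Four times fewer steps does not necessarily mean a four times shorter epoch, since each step processes more data and the learning rate may need retuning.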

newtreeaa commented on June 19, 2024

@vb000 Hi, is your code mixed precision or single precision?

vb000 commented on June 19, 2024

It's single precision (float32).

newtreeaa commented on June 19, 2024

@vb000
I have some questions:

  1. The septuplet dataset consists of 91701 7-frame sequences with a fixed resolution of 448 x 256, extracted from 39k selected video clips from Vimeo-90k. Its test set consists of 7823 7-frame sequences. Why does your training set have close to 64k videos?
  2. Do you use the Vimeo-90k test set as the validation set during training?
  3. The number of epochs is 80 in your paper but 100 in your code. Should I use 100 or 80?

Thank you in advance for your answer.

vb000 commented on June 19, 2024
  1. The septuplet dataset consists of 91701 7-frame sequences with a fixed resolution of 448 x 256, extracted from 39k selected video clips from Vimeo-90k. Its test set consists of 7823 7-frame sequences. Why does your training set have close to 64k videos?

The Vimeo-90k train list has 64612 sequences; please use this link to get the dataset we used.

  2. Do you use the Vimeo-90k test set as the validation set during training?

No, because it only has 7-frame sequences. We used the validation set from the REDS dataset.

  3. The number of epochs is 80 in your paper but 100 in your code. Should I use 100 or 80?

The 100 in the script is the maximum number of epochs. The 80th epoch was the best-performing epoch based on validation metrics.
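
The "train up to a maximum, keep the best epoch" pattern described here can be sketched as follows. This is not the repository's code: `best_epoch` is a hypothetical helper, and the scores are toy values chosen only to show the selection logic.

```python
from typing import Dict, Tuple


def best_epoch(val_scores: Dict[int, float],
               higher_is_better: bool = True) -> Tuple[int, float]:
    """Return (epoch, score) for the best validation score seen so far."""
    pick = max if higher_is_better else min
    epoch = pick(val_scores, key=val_scores.get)
    return epoch, val_scores[epoch]


# Toy validation scores keyed by epoch (made up, not results from the paper).
toy_scores = {1: 0.5, 2: 0.9, 3: 0.7}
print(best_epoch(toy_scores))
```

In practice one would save a checkpoint whenever the validation metric improves, so the best epoch can be restored after the run reaches the maximum epoch count.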

newtreeaa commented on June 19, 2024

@vb000 Does "REDS dataset" mean REDS4? REDS4 is a set of four 1280x720 videos, each containing 100 frames.

vb000 commented on June 19, 2024

Hi,

Refer to the footnote on page 6 of the following paper for the REDS train and val sets: https://openaccess.thecvf.com/content/CVPR2021/papers/Chan_BasicVSR_The_Search_for_Essential_Components_in_Video_Super-Resolution_and_CVPR_2021_paper.pdf

newtreeaa commented on June 19, 2024

@vb000 Thank you very much for your reply. I still have some questions:

  1. I want to confirm again: is the validation set REDSval4?
  2. In addition, could you provide a training log? I want to confirm whether my training process is normal.
  3. Is the low-resolution input video converted to grayscale in advance, or is the color low-resolution video read in grayscale mode in dataset.py?
  4. I also used 4 V100 GPUs for training, and the training set has 64k videos. But training one epoch takes 2 hours, which is much slower than yours. What is the reason? Do you have any suggestions?
  5. Lastly, could you check your PCIe link? Use the command lspci -vv and check LnkSta, as described at https://unix.stackexchange.com/questions/393/how-to-check-how-many-lanes-are-used-by-the-pcie-card
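
For anyone repeating the PCIe check, the LnkSta lines can be pulled out of the lspci -vv output with a small script. This is a sketch: `parse_lnksta` is a hypothetical helper, and the sample text is hand-written to resemble typical lspci output rather than captured from a real machine.

```python
import re
from typing import List, Tuple

# Matches lines such as "LnkSta: Speed 8GT/s (ok), Width x16 (ok)".
# The parenthesised status after the speed is optional.
LNKSTA_RE = re.compile(
    r"LnkSta:\s*Speed\s+([\d.]+GT/s)(?:\s*\([^)]*\))?,\s*Width\s+(x\d+)"
)


def parse_lnksta(lspci_output: str) -> List[Tuple[str, str]]:
    """Return a (speed, width) pair for every LnkSta line found."""
    return LNKSTA_RE.findall(lspci_output)


# Hand-written sample resembling `lspci -vv` output (illustrative only).
sample = """
    LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1
    LnkSta: Speed 8GT/s (ok), Width x16 (ok)
    LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train-
"""

for speed, width in parse_lnksta(sample):
    print(speed, width)
```

On a real machine the input would be the text printed by lspci -vv, which may need elevated privileges to show the link capability fields.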

vb000 commented on June 19, 2024

Hi,

Sorry for the late reply. Responses inline.

  1. I want to confirm again: is the validation set REDSval4?

Yes.

  3. Is the low-resolution input video converted to grayscale in advance, or is the color low-resolution video read in grayscale mode in dataset.py?

Both modes work. We trained it using the latter approach: we convert the color LR frame to the Lab color space and provide only the L channel to the model.
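
As an illustration of what "provide only the L channel" means, here is a minimal sketch of the standard sRGB to CIE L* conversion for a single pixel, assuming 8-bit sRGB input and a D65 white point. The repository's dataset.py may well use an image library instead; `srgb_to_lab_l` is a hypothetical name.

```python
def srgb_to_lab_l(r: int, g: int, b: int) -> float:
    """Return CIE L* (0 to 100) for an 8-bit sRGB pixel, D65 white point."""

    def to_linear(c8: int) -> float:
        # Undo the sRGB transfer curve to get linear light.
        c = c8 / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    # Relative luminance Y; the white point has Y = 1.
    y = 0.2126 * to_linear(r) + 0.7152 * to_linear(g) + 0.0722 * to_linear(b)

    # CIE Lab nonlinearity, with a linear segment near black.
    delta = 6.0 / 29.0
    f = y ** (1.0 / 3.0) if y > delta ** 3 else y / (3.0 * delta ** 2) + 4.0 / 29.0
    return 116.0 * f - 16.0


for pixel in [(0, 0, 0), (128, 128, 128), (255, 255, 255)]:
    print(pixel, round(srgb_to_lab_l(*pixel), 2))
```

Library implementations may scale L to a different range (for example to fit 8-bit images), so the value range fed to the model should be checked against the repository's own loader.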

  2. In addition, could you provide a training log? I want to confirm whether my training process is normal.
  5. Lastly, could you check your PCIe link? Use the command lspci -vv and check LnkSta, as described at https://unix.stackexchange.com/questions/393/how-to-check-how-many-lanes-are-used-by-the-pcie-card

We trained it on a cluster, where various machines are used based on availability, so we currently do not have access to this data.

  4. I also used 4 V100 GPUs for training, and the training set has 64k videos. But training one epoch takes 2 hours, which is much slower than yours. What is the reason? Do you have any suggestions?

I think that might be normal. I might have misquoted the runtimes, as I may have remembered them wrong. One suggestion: make sure data loading is not the bottleneck.
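
One simple way to check whether data loading is the bottleneck is to time how long the loop waits for each batch versus how long the training step takes. The sketch below is framework-agnostic; `profile_loop`, `slow_loader`, and `fast_step` are hypothetical stand-ins, with sleeps in place of real disk reads and GPU work.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def profile_loop(batches: Iterable,
                 train_step: Callable[[object], None]) -> Tuple[float, float]:
    """Return (seconds waiting for data, seconds spent in train_step)."""
    data_time = 0.0
    step_time = 0.0
    iterator = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)   # time spent here is data loading
        except StopIteration:
            break
        t1 = time.perf_counter()
        train_step(batch)            # time spent here is compute
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
    return data_time, step_time


def slow_loader(num_batches: int) -> Iterator[int]:
    """Stand-in for a loader that is slow to produce each batch."""
    for i in range(num_batches):
        time.sleep(0.01)
        yield i


def fast_step(batch: object) -> None:
    """Stand-in for a training step that is quicker than the loader."""
    time.sleep(0.001)


data_s, step_s = profile_loop(slow_loader(5), fast_step)
print(f"data: {data_s:.3f}s, compute: {step_s:.3f}s")
```

If the data time dominates, more loader workers or faster storage are the usual things to try. With GPU training the step is often asynchronous, so a synchronization call (for example torch.cuda.synchronize() in PyTorch) before reading the clock is needed for the compute time to be meaningful.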
