The input size during test (swinir) [CLOSED, 13 comments]

IceClear commented on May 3, 2024
The input size during test

Comments (13)

JingyunLiang commented on May 3, 2024

I'm sorry for the misunderstanding. In fact, SwinIR does not need it either. It can deal with images of different sizes because we pad them to a multiple of window_size.

We pass image_size=48 or 64 to main_test_swinir.py only to differentiate between the two pre-trained models that we provide:

  1. patch_size=48, dataset=DIV2K
  2. patch_size=64, dataset=DIV2K+Flickr2K

As shown in Table 2 of the paper, we train SwinIR under two different settings for classical image SR, to enable a fair comparison with two different kinds of models.

IceClear commented on May 3, 2024

Thanks for the reply. However, I found that if I test with the default settings in this repo, I get a reshape error. I wonder if there is anything wrong with my code.
(screenshot of the reshape error)

JingyunLiang commented on May 3, 2024

May I ask which dataset and which task you are using? It works fine for me. Could you print the network input shape before this line?

output = model(img_lq)

If the network input size is a multiple of window_size, there should be no problem.

IceClear commented on May 3, 2024

Well, I am just testing it on REDS as a baseline, so my input is (1, 3, 720//4, 1280//4). Do I need to change the window_size to make it work?
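
A quick check of why that input shape trips the default configuration (a minimal sketch; window_size=8 is the default for the classical-SR models and is an assumption here):

    import torch

    window_size = 8                                  # default for the classical-SR configs (assumption)
    img_lq = torch.rand(1, 3, 720 // 4, 1280 // 4)   # (1, 3, 180, 320), a REDS-sized input

    _, _, h, w = img_lq.size()
    print(h % window_size, w % window_size)          # prints "4 0": the height 180 is not a multiple of 8
    # Either pad the height up to 184 before the forward pass, or use a window_size
    # that divides both 180 and 320 (e.g. 4), as discussed below.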

IceClear commented on May 3, 2024

Got it! I changed window_size to 4 and it works. So if I just want to train a new network from scratch without finetuning, the setting of 'img_size' does not make a difference, right?

JingyunLiang commented on May 3, 2024
  • I think you need to add the padding code as follows (lines 57 to 62; the first line is shown below, and a fuller sketch follows this list) before feeding your test images into the model. The principle is that the network input size has to be a multiple of window_size.

_, _, h_old, w_old = img_lq.size()

  • I don't think you can change the window_size once the model is trained; doing so may lead to bad results.

  • If you want to train a new model, you can set your own img_size. (The naming is a bit confusing: here, img_size is just the size of the whole image we feed into the network during training. It equals the patch_size that we often refer to in image SR.)
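
A sketch of that padding step (the variable names and the mirror-padding approach follow the test script, but treat this as a sketch rather than the exact code): the low-quality input is mirror-padded on the bottom and right so that both spatial dimensions become multiples of window_size, and the padded border is cropped away from the output.

    import torch

    window_size = 8   # must match the window_size the model was trained with
    scale = 4         # upscaling factor of the SR model (1 for denoising/deblocking)

    img_lq = torch.rand(1, 3, 180, 320)            # stand-in low-quality input
    model = torch.nn.Upsample(scale_factor=scale)  # stand-in for the loaded SwinIR network

    # Mirror-pad H and W up to the next multiple of window_size.
    _, _, h_old, w_old = img_lq.size()
    h_pad = (h_old // window_size + 1) * window_size - h_old
    w_pad = (w_old // window_size + 1) * window_size - w_old
    img_lq = torch.cat([img_lq, torch.flip(img_lq, [2])], 2)[:, :, :h_old + h_pad, :]
    img_lq = torch.cat([img_lq, torch.flip(img_lq, [3])], 3)[:, :, :w_old + w_pad, :]

    output = model(img_lq)
    # Crop the padded border away from the upscaled output.
    output = output[..., :h_old * scale, :w_old * scale]
    print(output.shape)  # torch.Size([1, 3, 720, 1280])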

IceClear commented on May 3, 2024

Well, in my setting I crop the images into 64×64 patches during training, and for validation the input is the whole 180×320 image. I define the network once, before training and validation, so I wonder whether I have to change the img_size setting before validation.

JingyunLiang commented on May 3, 2024

No, you don't need to. This is exactly what I did in my experiments.
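
For illustration, a minimal sketch of that: the network is built once with the training img_size and then run unchanged on a differently-sized (window-aligned) validation input. The import path and the constructor arguments below follow the lightweight-SR setup in main_test_swinir.py, but treat them as assumptions rather than the exact configuration used here.

    import torch
    from models.network_swinir import SwinIR  # import path as in the SwinIR repo (assumption)

    # Build the network once with the training patch size (img_size=64). img_size does not
    # restrict the inference resolution, as long as inputs are multiples of window_size.
    model = SwinIR(upscale=4, img_size=64, window_size=8, img_range=1.,
                   depths=[6, 6, 6, 6], embed_dim=60, num_heads=[6, 6, 6, 6],
                   mlp_ratio=2, upsampler='pixelshuffledirect', resi_connection='1conv')
    model.eval()

    with torch.no_grad():
        out_train = model(torch.rand(1, 3, 64, 64))    # a training-sized crop
        out_valid = model(torch.rand(1, 3, 184, 320))  # a 180x320 validation frame padded to 184x320
    print(out_train.shape, out_valid.shape)            # (1, 3, 256, 256) and (1, 3, 736, 1280)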

IceClear commented on May 3, 2024

Great! It was nice talking with you, and thanks for your patient replies :)

yzcv commented on May 3, 2024

Hi, @JingyunLiang

Can I ask why we use cropped patches like $128 \times 128 \times 3$ (the so-called patch size) for training, instead of using the same size as in the validation phase?

Thank you.

yzcv commented on May 3, 2024

@JingyunLiang Besides, the window size (namely the patch_size in your network_swinir.py) for denoising is set to 1. What does that mean? Can I understand it this way: there is actually no window as such, and instead the denoising is based on pixel-wise attention?

JingyunLiang commented on May 3, 2024
  • In training, you need to use the same image size within a batch, whereas in validation different test images may have different sizes. The common practice is to randomly crop 64x64 or 128x128 image patches for training.
  • Sorry for the abuse of notation. Forget all the notations/definitions and I will give you a concrete example. Given a 433x532 training image, we crop a 128x128 patch for training. The 128x128 patch is divided into non-overlapping 8x8 windows. Inside each 8x8 window, we compute the attention matrix between every pair of pixels (1x1 patches); see the small sketch below.
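
A small numerical sketch of that window partition (shapes only; the real blocks also project queries/keys/values, add a relative position bias, and shift the windows in alternating layers):

    import torch

    # A 128x128 training crop with C channels per pixel token (C=60 is illustrative).
    B, C, H, W = 1, 60, 128, 128
    window_size = 8

    x = torch.rand(B, H, W, C)
    # Partition into non-overlapping 8x8 windows:
    # (B, H/8, 8, W/8, 8, C) -> (num_windows * B, 8*8, C)
    windows = (x.view(B, H // window_size, window_size, W // window_size, window_size, C)
                .permute(0, 1, 3, 2, 4, 5)
                .reshape(-1, window_size * window_size, C))
    print(windows.shape)  # torch.Size([256, 64, 60]): 16x16 = 256 windows of 64 pixel tokens each

    # Inside each window, attention is computed among its 64 pixel (1x1) tokens,
    # so each attention map is 64x64.
    attn = windows @ windows.transpose(1, 2)
    print(attn.shape)     # torch.Size([256, 64, 64])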

yzcv commented on May 3, 2024

Thank you very much for the example. I totally understand your setting now. ^_^ @JingyunLiang
