The input size during test (swinir) [CLOSED, 13 comments]

IceClear commented on May 3, 2024
The input size during test

Comments (13)

JingyunLiang commented on May 3, 2024

I'm sorry for the misunderstanding. In fact, SwinIR does not need it either. It can deal with images of different sizes because we pad them to a multiple of window_size.

We pass image_size=48 or 64 to main_test_swinir.py only to differentiate between the two pre-trained models that we provide:

  1. patch_size=48, dataset=DIV2K
  2. patch_size=64, dataset=DIV2K+Flickr2K

As shown in Table 2 of the paper, we train SwinIR under two different settings for classical image SR, to enable a fair comparison with two different kinds of models.

IceClear commented on May 3, 2024

Thanks for the reply. However, I found that if I test with the default settings in this repo, I get a reshape error. I wonder if there is anything wrong with my code.
(screenshot of the reshape error)

JingyunLiang commented on May 3, 2024

May I ask which dataset and which task you are using? It works fine for me. Could you print the network input shape before this line?

output = model(img_lq)

If the network input size is a multiple of window_size, there should be no problem.

IceClear commented on May 3, 2024

Well, I am just testing it on REDS as a baseline, so my input is (1, 3, 720//4, 1280//4). Do I need to change the window_size to make it work?
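
A quick check of why that input shape trips the default configuration (a minimal sketch; window_size=8 is the default for the classical-SR models and is an assumption here):

    import torch

    window_size = 8                                  # default for the classical-SR configs (assumption)
    img_lq = torch.rand(1, 3, 720 // 4, 1280 // 4)   # (1, 3, 180, 320), a REDS-sized input

    _, _, h, w = img_lq.size()
    print(h % window_size, w % window_size)          # prints "4 0": the height 180 is not a multiple of 8
    # Either pad the height up to 184 before the forward pass, or use a window_size
    # that divides both 180 and 320 (e.g. 4), as discussed below.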

IceClear commented on May 3, 2024

Got it! I changed window_size to 4 and it works. So if I just want to train a new network from scratch without finetuning, the setting of 'img_size' does not make a difference, right?

JingyunLiang commented on May 3, 2024
  • I think you need to add the padding code as follows (lines 57 to 62; the first line is shown below, and a fuller sketch follows this list) before feeding your test images into the model. The principle is that the network input size has to be a multiple of window_size.

_, _, h_old, w_old = img_lq.size()

  • I don't think you can change the window_size once the model is trained; doing so may lead to bad results.

  • If you want to train a new model, you can set your own img_size. (The naming is a bit confusing: here, img_size is just the size of the whole image we feed into the network during training. It equals the patch_size that we often refer to in image SR.)
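
A sketch of that padding step (the variable names and the mirror-padding approach follow the test script, but treat this as a sketch rather than the exact code): the low-quality input is mirror-padded on the bottom and right so that both spatial dimensions become multiples of window_size, and the padded border is cropped away from the output.

    import torch

    window_size = 8   # must match the window_size the model was trained with
    scale = 4         # upscaling factor of the SR model (1 for denoising/deblocking)

    img_lq = torch.rand(1, 3, 180, 320)            # stand-in low-quality input
    model = torch.nn.Upsample(scale_factor=scale)  # stand-in for the loaded SwinIR network

    # Mirror-pad H and W up to the next multiple of window_size.
    _, _, h_old, w_old = img_lq.size()
    h_pad = (h_old // window_size + 1) * window_size - h_old
    w_pad = (w_old // window_size + 1) * window_size - w_old
    img_lq = torch.cat([img_lq, torch.flip(img_lq, [2])], 2)[:, :, :h_old + h_pad, :]
    img_lq = torch.cat([img_lq, torch.flip(img_lq, [3])], 3)[:, :, :w_old + w_pad, :]

    output = model(img_lq)
    # Crop the padded border away from the upscaled output.
    output = output[..., :h_old * scale, :w_old * scale]
    print(output.shape)  # torch.Size([1, 3, 720, 1280])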

IceClear commented on May 3, 2024

Well, in my setting I crop the images into 64×64 patches during training, and for validation the input is the whole 180×320 image. I define the network once, before training and validation, so I wonder whether I have to change the img_size setting before validation.

JingyunLiang commented on May 3, 2024

No, you don't need to. This is exactly what I did in my experiments.
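
For illustration, a minimal sketch of that: the network is built once with the training img_size and then run unchanged on a differently-sized (window-aligned) validation input. The import path and the constructor arguments below follow the lightweight-SR setup in main_test_swinir.py, but treat them as assumptions rather than the exact configuration used here.

    import torch
    from models.network_swinir import SwinIR  # import path as in the SwinIR repo (assumption)

    # Build the network once with the training patch size (img_size=64). img_size does not
    # restrict the inference resolution, as long as inputs are multiples of window_size.
    model = SwinIR(upscale=4, img_size=64, window_size=8, img_range=1.,
                   depths=[6, 6, 6, 6], embed_dim=60, num_heads=[6, 6, 6, 6],
                   mlp_ratio=2, upsampler='pixelshuffledirect', resi_connection='1conv')
    model.eval()

    with torch.no_grad():
        out_train = model(torch.rand(1, 3, 64, 64))    # a training-sized crop
        out_valid = model(torch.rand(1, 3, 184, 320))  # a 180x320 validation frame padded to 184x320
    print(out_train.shape, out_valid.shape)            # (1, 3, 256, 256) and (1, 3, 736, 1280)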

IceClear commented on May 3, 2024

Great! It was nice talking with you, and thanks for your patient replies :)

yzcv commented on May 3, 2024

Hi, @JingyunLiang

Can I ask why we use cropped patches like $128 \times 128 \times 3$ (the so-called patch size) for training, instead of using the same size as in the validation phase?

Thank you.

yzcv commented on May 3, 2024

@JingyunLiang Besides, the window size (namely the patch_size in your network_swinir.py) for denoising is set to 1. What does that mean? Can I understand it this way: there is actually no window as such, and instead the denoising is based on pixel-wise attention?

JingyunLiang commented on May 3, 2024
  • In training, you need to use the same image size within a batch, whereas in validation different test images may have different sizes. The common practice is to randomly crop 64x64 or 128x128 image patches for training.
  • Sorry for the abuse of notation. Forget all the notations/definitions and I will give you a concrete example. Given a 433x532 training image, we crop a 128x128 patch for training. The 128x128 patch is divided into non-overlapping 8x8 windows. Inside each 8x8 window, we compute the attention matrix between every pair of pixels (1x1 patches); see the small sketch below.
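
A small numerical sketch of that window partition (shapes only; the real blocks also project queries/keys/values, add a relative position bias, and shift the windows in alternating layers):

    import torch

    # A 128x128 training crop with C channels per pixel token (C=60 is illustrative).
    B, C, H, W = 1, 60, 128, 128
    window_size = 8

    x = torch.rand(B, H, W, C)
    # Partition into non-overlapping 8x8 windows:
    # (B, H/8, 8, W/8, 8, C) -> (num_windows * B, 8*8, C)
    windows = (x.view(B, H // window_size, window_size, W // window_size, window_size, C)
                .permute(0, 1, 3, 2, 4, 5)
                .reshape(-1, window_size * window_size, C))
    print(windows.shape)  # torch.Size([256, 64, 60]): 16x16 = 256 windows of 64 pixel tokens each

    # Inside each window, attention is computed among its 64 pixel (1x1) tokens,
    # so each attention map is 64x64.
    attn = windows @ windows.transpose(1, 2)
    print(attn.shape)     # torch.Size([256, 64, 64])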

yzcv commented on May 3, 2024

Thank you very much for the example. I totally understand your setting now. ^_^ @JingyunLiang
