yulunzhang / rcan
PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"
Hi,
While running your code, I hit a problem when saving the model: an error is raised in the test section of the trainer, at the point where the model is saved.
Could you please look into it?
Regards,
Saeed
Hi,
What is the difference between your proposed CA Module and SE Module?
Is there a modification on SE Module for low-level visual tasks?
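Hello, this code is highly integrated, so it is not clear to me where the DataParallel declaration should be added for multi-GPU training.
Also, after adding the declaration, do any parameters in option need to be added or modified?
I am a student and new to deep learning; I would appreciate your guidance!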
Wonderful work! I have a question. If I want to train on my own dataset, say with 3000 images in the training set in total, how should I set the number of training images and validation images? For example, in your provided code for the DIV2K dataset, --n_train = 800 and --n_val = 5. Is there any underlying reason for choosing those two numbers? Thanks.
Good afternoon.
Thank you for sharing your excellent work and codes.
But when I run the code, I cannot find the normalization function for the input, and the values fed into the network are not in [-0.5, 0.5]. I am confused and wonder whether my configuration is wrong.
Looking forward to your reply.
Hi, I want to train a BD model from scratch. How can I get the LR data?
I found the BD operation in Prepare_TestData_HR_LR.m. Is this the same code you used to obtain the LR data? And if I generate the LR data with BD, it seems I should rename the folder to "DIV2K_LR_bicubic" for training to work.
Hello, I cut the images into 48*48 patches, but I found that the mean is on the order of 10^-2. The dataset I made is not the same as yours. I can't find the details of the input processing; can you point me to them? Thanks very much.
Hello, thanks for sharing the source code and experimental results.
I want to train a model with my own dataset, but I ran into a problem during training. I hope you can give some suggestions; I would really appreciate it!
The problem is described below:
Preparing loss function:
1.000 * L1
[Epoch 1] Learning rate: 1.00e-4
Traceback (most recent call last):
File "main.py", line 19, in <module>
t.train()
File "/home/weihq/superresolution1/RCAN-master/RCAN-master/RCAN_TrainCode/code/trainer.py", line 45, in train
for batch, (lr, hr, _, idx_scale) in enumerate(self.loader_train):
File "/home/weihq/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 286, in __next__
return self._process_next_batch(batch)
File "/home/weihq/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 307, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
IndexError: Traceback (most recent call last):
File "/home/weihq/superresolution1/RCAN-master/RCAN-master/RCAN_TrainCode/code/dataloader.py", line 47, in _ms_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/weihq/superresolution1/RCAN-master/RCAN-master/RCAN_TrainCode/code/dataloader.py", line 47, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/weihq/superresolution1/RCAN-master/RCAN-master/RCAN_TrainCode/code/data/srdata.py", line 90, in __getitem__
lr, hr = self._get_patch(lr, hr)
File "/home/weihq/superresolution1/RCAN-master/RCAN-master/RCAN_TrainCode/code/data/srdata.py", line 126, in _get_patch
lr, hr, patch_size, scale, multi_scale=multi_scale
File "/home/weihq/superresolution1/RCAN-master/RCAN-master/RCAN_TrainCode/code/data/common.py", line 22, in get_patch
img_in = img_in[iy:iy + ip, ix:ix + ip, :]
IndexError: too many indices for array
Hello, how can I get x8 LR bicubic images, since the DIV2K dataset only contains x2, x3 and x4 LR images?
Thank you for your wonderful work!
I recently trained a model with my own dataset, but when I test it, something goes wrong that I can't understand. The error is described below:
(base) luomeilu@Ubuntu:~/CNN/RCAN-master/RCAN-master/RCAN_TestCode/code$ python main.py --data_test MyImage --scale 4 --model RCAN --n_resgroups 10 --n_resblocks 20 --n_feats 64 --pre_train ../model/RCAN_BIX4.pt --test_only --save_results --chop --save 'RCAN' --testpath ../LR/LRBI --testset Set5
Traceback (most recent call last):
File "main.py", line 7, in <module>
from option import args
File "/home/luomeilu/CNN/RCAN-master/RCAN-master/RCAN_TestCode/code/option.py", line 19, in <module>
help='random seed')
File "/home/luomeilu/anaconda3/lib/python3.6/argparse.py", line 1338, in add_argument
action = action_class(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'defaut'
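The message points at a misspelled keyword on line 19 of option.py: argparse accepts `default`, not `defaut`. Below is a minimal sketch of a corrected declaration. The option name `--seed` is inferred from the help text 'random seed' in the traceback, and the value 1 is a placeholder chosen for this sketch, not necessarily the repository's value.

```python
import argparse

parser = argparse.ArgumentParser(description="option sketch")

# Correct spelling: `default`. The value 1 is a placeholder for this sketch.
parser.add_argument("--seed", type=int, default=1, help="random seed")

# Passing an explicit list keeps the sketch independent of the real command line.
args = parser.parse_args([])
args_custom = parser.parse_args(["--seed", "7"])

# The misspelled keyword is rejected with a TypeError, as in the traceback.
try:
    argparse.ArgumentParser().add_argument("--x", type=int, defaut=1)
    typo_rejected = False
except TypeError:
    typo_rejected = True
```

Searching option.py for the string defaut should locate the line to fix.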
Hope you can give me some suggestions, thanks a lot!
Hello, I am very interested in your research, but when I run the main.py script I get a "no attribute 'module'" error. How can I solve this problem?
Traceback (most recent call last):
File "", line 1, in <module>
runfile('/home/renxue/RCAN/RCAN_TrainCode/code/main.py', wdir='/home/renxue/RCAN/RCAN_TrainCode/code')
File "/home/renxue/anaconda3/envs/3d-AAE/lib/python3.6/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "/home/renxue/anaconda3/envs/3d-AAE/lib/python3.6/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/home/renxue/RCAN/RCAN_TrainCode/code/main.py", line 17, in <module>
model = model.Model(args, checkpoint)
File "/home/renxue/RCAN/RCAN_TrainCode/code/model/__init__.py", line 35, in __init__
cpu=args.cpu
File "/home/renxue/RCAN/RCAN_TrainCode/code/model/__init__.py", line 102, in load
self.get_model().load_state_dict(
File "/home/renxue/RCAN/RCAN_TrainCode/code/model/__init__.py", line 61, in get_model
return self.model.module
File "/home/renxue/anaconda3/envs/3d-AAE/lib/python3.6/site-packages/torch/nn/modules/module.py", line 535, in __getattr__
type(self).__name__, name))
AttributeError: 'RCAN' object has no attribute 'module'
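The traceback ends in get_model returning self.model.module. A `.module` attribute exists only when the network has been wrapped, for example by torch.nn.DataParallel, so a bare RCAN instance fails here. The sketch below shows one defensive way to unwrap. It uses plain stand-in classes instead of PyTorch, and the helper name `unwrap` is hypothetical rather than taken from the repository.

```python
class Net:
    """Stand-in for a bare network such as RCAN (hypothetical)."""


class Wrapped:
    """Stand-in for a wrapper like torch.nn.DataParallel, which exposes `.module`."""

    def __init__(self, module):
        self.module = module


def unwrap(model):
    # Return the inner network if `model` is a wrapper, otherwise `model` itself.
    return getattr(model, "module", model)


bare = Net()
wrapped = Wrapped(bare)
inner_from_bare = unwrap(bare)
inner_from_wrapped = unwrap(wrapped)
```

Both calls return the same underlying object, so code that loads a state dict through such a helper would not depend on whether the model was wrapped.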
look forward to your reply!
Hi, I recently read your paper, and want to ask you a question.
The paper does not state the total number of training epochs, so I looked in your code. option.py has '--epochs 1000'. Does that mean all the models were trained for 1000 epochs?
When I try to run main.py, an error occurs: No module named 'data.Database'.
What should I do to fix it?
Thank you.
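You add a convolutional layer after the stacked RCABs and before the residual operation, both in each small group and in the large group. What is the role of this convolutional layer, and can it be removed?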
For anyone who hits a problem with PyTorch 1.0 saying that _worker_manager_loop is not found: the same issue occurs in the proSR repository, see https://github.com/fperazzi/proSR/issues/31 . That issue says to change _worker_manager_loop to _pin_memory_loop to fix the problem.
Some additional steps are needed after changing _worker_manager_loop to _pin_memory_loop.
Go to code/dataloader.py and change:
self.worker_result_queue = multiprocessing.SimpleQueue()
to
self.worker_result_queue = multiprocessing.Queue()
then change:
self.worker_manager_thread = threading.Thread(
    target=_worker_manager_loop,
    args=(self.worker_result_queue, self.data_queue, self.done_event,
          self.pin_memory, maybe_device_id))
self.worker_manager_thread.daemon = True
self.worker_manager_thread.start()
to
self.pin_memory_thread = threading.Thread(
    target=_pin_memory_loop,
    args=(self.worker_result_queue, self.data_queue, maybe_device_id,
          self.done_event))
self.pin_memory_thread.daemon = True
self.pin_memory_thread.start()
Hello, are you using the DIV2K dataset for training? Is there a pretrained model for inference?
I just want to check the results.
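Hello, I would like to ask how you handle image inputs with more than the three RGB channels. When using VGG I tried six channels, but it failed with "ValueError: 'arr' does not have a suitable array shape for any mode." How did you get around the channel limitation of scipy.misc.imsave?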
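Hello, from your code it looks like the x3 model is trained starting from the x2 model. But then the parameters of the Upsample layer differ, and loading the model raises an error. How did you solve this?
Thank you for sharing.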
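One common workaround when only the upsampling layers differ, not necessarily what the author did, is to copy just the pretrained parameters whose name and shape both match the new model. The sketch below illustrates the filtering with plain dictionaries that map parameter names to shapes, as stand-ins for real state dicts. The key names and the x3 shape (576 = 9 * 64 output channels before a PixelShuffle of factor 3) are illustrative assumptions.

```python
def matching_entries(pretrained, target):
    """Keep pretrained entries whose key exists in `target` with an identical shape."""
    return {k: v for k, v in pretrained.items() if k in target and target[k] == v}


# Name -> shape stand-ins; real code would compare tensor.shape instead.
x2_shapes = {
    "head.0.weight": (64, 3, 3, 3),
    "tail.0.0.weight": (256, 64, 3, 3),   # 4 * 64 channels for x2 PixelShuffle
}
x3_shapes = {
    "head.0.weight": (64, 3, 3, 3),
    "tail.0.0.weight": (576, 64, 3, 3),   # assumed 9 * 64 channels for x3
}

reusable = matching_entries(x2_shapes, x3_shapes)
skipped = sorted(set(x2_shapes) - set(reusable))
```

With real tensors, the filtered dictionary could be passed to load_state_dict with strict=False, so that the mismatched upsampling layers keep their fresh initialization.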
hi @yulunzhang
just want to thank you, for your stunning work, amazing Super-Resolution result, from my heart thank you man.
I have tried many combinations of learning rate and decay, but I still get the same error.
)
Preparing loss function:
1.000 * L1
[Epoch 1] Learning rate: 1.00e-6
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "main.py", line 19, in <module>
t.train()
File "/home/administrator/Desktop/Projects/RCAN/RCAN_TrainCode/code/trainer.py", line 51, in train
sr = self.model(lr, idx_scale)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/Desktop/Projects/RCAN/RCAN_TrainCode/code/model/__init__.py", line 54, in forward
return self.model(x)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/Desktop/Projects/RCAN/RCAN_TrainCode/code/model/rcan.py", line 110, in forward
res = self.body(x)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/Desktop/Projects/RCAN/RCAN_TrainCode/code/model/rcan.py", line 62, in forward
res = self.body(x)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/Desktop/Projects/RCAN/RCAN_TrainCode/code/model/rcan.py", line 44, in forward
res = self.body(x)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/anaconda2/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generic/THCStorage.cu:58
Hi,
I had a technical problem during training and it got stuck after 800 epochs.
Is there a way to restart with the exact same parameters and have it continue plotting the log?
I tried '--resume -1' which loads the model but not the parameters and doesn't continue the log.
Is there an automatic way to do this?
Thank you
Thank you for your impressive work and for sharing the code so soon. I have read your RDN and RCAN papers, and I'm a little confused about your choice of activation function: why do you always choose ReLU rather than RReLU or PReLU? ReLU may produce many dead neurons, right?
Thanks.
Thank you for your great work!
As I don't have enough GPUs to train the model for 1000 epochs, much fewer epochs are used for my training. But I really want to make sure if my training procedure has come to the plateau state, so it would be beneficial if I could compare your learning curves.
Would you kindly provide the learning curves(loss vs epochs, and test psnr (on DIV2K validation) vs epochs, which are generated automatically by the released code) for training your RCAN model? If not all, x2 BI is enough for me. Thank you very much! :)
Hi, I am trying to train on DIV2K with the Y channel using the following script, but I get an error. First I convert DIV2K_train_HR and DIV2K_train_LR_bicubic from RGB to the Y channel, rename DIV2K to DIV2K_y, and correspondingly delete + '/DIV2K' in div2k.py to keep the training path correct.
CUDA_VISIBLE_DEVICES=1 python3 main.py --n_GPUs 1
--dir_data /root/dataset/super-resolution/DIV2K_y
--model RCAN --save RCAN_BIX2_G10R20P48
--scale 2 --n_resgroups 10 --n_resblocks 20 --n_feats 64
--reset --chop --save_results --print_model --patch_size 96
--n_colors=1 --batch_size=48 --n_threads=8 >&1 | tee $LOG
Then I set --n_colors=1, but the following error occurs:
Traceback (most recent call last):
File "main.py", line 19, in <module>
t.train()
File "/root/kindlehe/project/pytorch/RCAN-master/RCAN_TrainCode/code/trainer.py", line 45, in train
for batch, (lr, hr, _, idx_scale) in enumerate(self.loader_train):
File "/usr/local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 336, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
IndexError: Traceback (most recent call last):
File "/root/kindlehe/project/pytorch/RCAN-master/RCAN_TrainCode/code/dataloader.py", line 47, in _ms_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/root/kindlehe/project/pytorch/RCAN-master/RCAN_TrainCode/code/dataloader.py", line 47, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/root/kindlehe/project/pytorch/RCAN-master/RCAN_TrainCode/code/data/srdata.py", line 90, in __getitem__
lr, hr = self._get_patch(lr, hr)
File "/root/kindlehe/project/pytorch/RCAN-master/RCAN_TrainCode/code/data/srdata.py", line 126, in _get_patch
lr, hr, patch_size, scale, multi_scale=multi_scale
File "/root/kindlehe/project/pytorch/RCAN-master/RCAN_TrainCode/code/data/common.py", line 23, in get_patch
img_tar = img_tar[ty:ty + tp, tx:tx + tp, :]
IndexError: too many indices for array
Could you please give some advice about y channel training?
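The failing line indexes three axes (img_tar[ty:ty + tp, tx:tx + tp, :]). An image saved as a single Y channel is typically loaded as a 2-D array of shape (H, W), which would produce exactly this IndexError. The sketch below reproduces the error and shows one way around it, assuming NumPy arrays. `ensure_hwc` is a hypothetical helper, not a function from the repository.

```python
import numpy as np


def ensure_hwc(img):
    """Return `img` with an explicit channel axis: (H, W) becomes (H, W, 1)."""
    return img[:, :, np.newaxis] if img.ndim == 2 else img


y = np.zeros((96, 96), dtype=np.uint8)   # a Y-channel image loaded as 2-D

try:
    y[0:48, 0:48, :]                     # same indexing pattern as get_patch
    raised = False
except IndexError:
    raised = True                        # "too many indices for array"

patch = ensure_hwc(y)[0:48, 0:48, :]     # works once the channel axis exists
```

Applying such a step right after the images are read, before any patch cropping, keeps the rest of the pipeline unchanged.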
[Epoch 3] Learning rate: 1.00e-4
[1200/12000] [L1: 7.5750] 141.7+3.7s
[2400/12000] [L1: 7.6471] 142.1+0.0s
[3600/12000] [L1: 7.6028] 142.2+0.0s
[4800/12000] [L1: 7.6049] 145.2+0.0s
[6000/12000] [L1: 616.0927] 143.3+0.0s
Skip this batch 510! (Loss: 11261144.0)
THCudaCheck FAIL file=c:\programdata\miniconda3\conda-bld\pytorch_1524543037166\work\aten\src\thc\generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "main.py", line 20, in <module>
t.train()
File "C:\Users\motor\RCAN-master\RCAN_TrainCode\code\trainer.py", line 51, in train
sr = self.model(lr, idx_scale)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\motor\RCAN-master\RCAN_TrainCode\code\model\__init__.py", line 54, in forward
return self.model(x)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\motor\RCAN-master\RCAN_TrainCode\code\model\rcan.py", line 110, in forward
res = self.body(x)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
input = module(input)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\motor\RCAN-master\RCAN_TrainCode\code\model\rcan.py", line 62, in forward
res = self.body(x)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
input = module(input)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\motor\RCAN-master\RCAN_TrainCode\code\model\rcan.py", line 44, in forward
res = self.body(x)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
input = module(input)
File "C:\Users\motor\Anaconda3\envs\TENSORFLOW\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\motor\RCAN-master\RCAN_TrainCode\code\model\rcan.py", line 25, in forward
return x * y
RuntimeError: cuda runtime error (2) : out of memory at c:\programdata\miniconda3\conda-bld\pytorch_1524543037166\work\aten\src\thc\generic/THCStorage.cu:58
command: python main.py --model RCAN --save RCAN_BDX2_G10R20P48 --scale 2 --n_resgroups 10 --n_resblocks 20 --n_feats 64 --reset --chop --save_results --print_model --patch_size 96 --dir_data (my dir)--pre_train ../experiment/RCAN_BDX2_G10R20P48/model/model_latest.pt --ext sep --batch_size 12
Environment: Windows 10, GTX 1060 6 GB, Anaconda3, Python 3.6, PyTorch 0.4; LR *.npy files generated from JPG for JPEG artifact reduction training.
Issues: Because of limited VRAM, I reduced --batch_size to 12 (even 8). However, after some training (3 to 10 epochs), I get an out-of-memory error. Does this problem also occur in a Linux environment?
Thank you!
Hi, @yulunzhang
First of all, thank you for your open source code; the reconstruction results are impressive. I read the source code of both the EDSR project and your project. I use the command CUDA_VISIBLE_DEVICES=0,1,2 python main.py --model RCAN --save RCAN_BIX2_G10R20P48 --scale 2 --n_resgroups 10 --n_resblocks 20 --n_feats 64 --reset --chop --save_results --print_model --patch_size 96 --ext sep_reset --n_GPUs 3
to train on multiple GPUs, and the code runs fine. But when I use the command watch -n 0.1 nvidia-smi
to monitor GPU and memory usage, I see something unexpected. In PyTorch, with multiple GPUs the model is copied to each GPU and the data is distributed equally across GPUs according to the batch size, so the memory usage should be the same on every GPU. In practice, however, the memory usage decreases from one GPU to the next. Could you explain why this happens? Perhaps I am overlooking a detail. Thank you.
I ran this command to train on 2 GPUs:
CUDA_VISIBLE_DEVICES=0,1 python3 main.py --n_GPUs 2 --dir_data /root/dataset/super-resolution --model RCAN --save RCAN_BIX2_G10R20P48 --scale 2 --n_resgroups 10 --n_resblocks 20 --n_feats 64 --reset --chop --save_results --print_model --patch_size 96 2>&1 | tee $LOG
but got the following error:
Unexpected end of /proc/mounts line `overlay / overlay rw,relatime,lowerdir=/data2/docker/overlay2/l/76VGRCNKB4276UVDYJIQ4K44VI:/data2/docker/overlay2/l/MM3UKJSDI6OMZYJEQHBG5K5EBU:/data2/docker/overlay2/l/3TUQTOAGEKBLNX7DPFOKXKXUD5:/data2/docker/overlay2/l/5ZHVFRGKBYJ5MGWORLVPCB67H4:/data2/docker/overlay2/l/MGTNS2XZPIFDXQLJDPBWMZHSFF:/data2/docker/overlay2/l/NBUTJL2W2ZFDXG2JAE3Y6V4M3Z:/data2/docker/overlay2/l/WZ4AKFUGVNF4YJNSHH5XQEZVAV:/data2/docker/overlay2/l/W5VI2B4IEWSZLIUN7VC2PP3LD4:/data2/docker/overlay2/l/JBVVURDZXDPD7SAEKMXLQGX2YS:/dat'
Unexpected end of /proc/mounts line `a2/docker/overlay2/l/2ISST5GDKCNKQHI3D6LITRSPPC:/data2/docker/overlay2/l/QA7MQGMCVTSS4DQ4SS7QOEGADY:/data2/docker/overlay2/l/24BA5LASJSQBJYYNQONNE7DFOA:/data2/docker/overlay2/l/RHLGBBVVMXFSFDL666UIIDLCU6:/data2/docker/overlay2/l/ZJYKOHO5XHWZVLIG3OOX4SMJMW:/data2/docker/overlay2/l/X3VORDWXFDU2Q4IZGWZE24GOF7,upperdir=/data2/docker/overlay2/581f3545fee5eef1ebdd17aea4f9e4d4b922a18a608972a6115f2bbeec32b019/diff,workdir=/data2/docker/overlay2/581f3545fee5eef1ebdd17aea4f9e4d4b922a18a608972a6115f2bbeec32b019/work '
Unexpected end of /proc/mounts line `0 0
Could you please help me to find the reason?
Will the source code be available soon?
Hi, sorry to bother you. Which file did you use to get the final result, the model_best.pt or model_latest ? I find the quantitative result of model_best is better than the other one, when I test the images. Look forward to your reply, thanks.
This is the output I get after running the test script:
(deeplearning) administrator@administrator-System-Product-Name:~/Desktop/Projects/RCAN/RCAN_TestCode/code$ python main.py --data_test MyImage --scale 3 --model RCAN --n_resgroups 10 --n_resblocks 20 --n_feats 64 --pre_train ../model/RCAN_BIX3.pt --test_only --save_results --chop --save 'RCAN' --testpath /home/administrator/Desktop/Projects/RCAN/RCAN_TestCode/LR/LRBI --degradation BD --testset Set5
Making model...
Use DIV2K mean (0.4488, 0.4371, 0.4040)
Loading model from ../model/RCAN_BIX3.pt
Evaluation:
100%|█████████████████████████████████████████████| 5/5 [00:02<00:00, 1.72it/s]
[MyImage x3] PSNR: 0.000 (Best: 0.000 @epoch 1)
Total time: 2.90s, ave time: 0.58s
I have downloaded the pre-trained models. I have used Set5 data which is present in LR folder.
Hi,
Thank you for your research!
Which script can I use to apply a pretrained model to a particular image, instead of the full Set5 test set?
Hi Yulun,
Thank you for sharing your excellent work.
While answering this issue, You mentioned that your model takes 70s on titan xp for training 100 iterations. I am using batch size 80 and 8 gpus to perform training using exactly your code and other parameters, but it is taking much more time. Could you please tell me what could be the problem?
You use nn.AdaptiveAvgPool2d when defining CALayer in rcan.py, but as AdaptiveAvgPool2d(1). As far as I know, when the pool size is 1, this pooling is no different from AvgPool2d(1) and changes nothing in the input. What is its purpose in your network? Thank you!
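The two layers differ in what the argument means. In PyTorch, nn.AdaptiveAvgPool2d(1) takes the output size, so each channel is averaged down to a single 1x1 value (global average pooling), while nn.AvgPool2d(1) takes the kernel size, so a 1x1 window leaves the input unchanged. The sketch below illustrates the difference on one channel in plain Python; the function names are hypothetical.

```python
def adaptive_avg_pool_1(channel):
    """Output size 1: average all H*W values of one channel into a single number."""
    values = [v for row in channel for v in row]
    return sum(values) / len(values)


def avg_pool_kernel_1(channel):
    """Kernel size 1, stride 1: each window holds one value, so output equals input."""
    return [[float(v) for v in row] for row in channel]


channel = [[1, 2], [3, 6]]
global_mean = adaptive_avg_pool_1(channel)   # one statistic for the whole channel
unchanged = avg_pool_kernel_1(channel)       # same 2x2 grid as the input
```

In the CALayer printed elsewhere in this thread, that per-channel statistic passes through two 1x1 convolutions and a sigmoid, which yields one scaling weight per channel.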
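Hello, thank you for providing the source code.
I am trying to retrain the x4 model, but I find that even a single epoch takes a long time on one 1080 Ti. Roughly how long does retraining for 1000 epochs take? Thanks.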
Hi, Thank you for sharing the code. Great work!
How can I train a model with grayscale training images? Thanks.
The image I drew with Visio has very low quality.
Hi,
I want to know whether the PSNR will decrease if I only divide the RGB channels by 255.0, rather than subtracting the dataset mean and then adding it back to the output.
Look forward to your kind reply!
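For reference, the model printed elsewhere in this thread contains sub_mean and add_mean layers, and the test log reports "Use DIV2K mean (0.4488, 0.4371, 0.4040)". The sketch below shows what such a mean shift amounts to per pixel. It assumes pixel values in [0, 255] and that the mean is scaled by that range. The function names are hypothetical, and the sketch does not settle how PSNR would change with plain division by 255.

```python
RGB_RANGE = 255.0                       # assumed pixel range
DIV2K_MEAN = (0.4488, 0.4371, 0.4040)   # values printed in the test log


def sub_mean(pixel):
    """Subtract the scaled per-channel mean from an (R, G, B) pixel."""
    return tuple(p - RGB_RANGE * m for p, m in zip(pixel, DIV2K_MEAN))


def add_mean(pixel):
    """Inverse of sub_mean, applied to the network output."""
    return tuple(p + RGB_RANGE * m for p, m in zip(pixel, DIV2K_MEAN))


pixel = (128.0, 64.0, 255.0)
shifted = sub_mean(pixel)      # roughly zero-centred values fed to the body
restored = add_mean(shifted)   # back in the original range
```

Under these assumptions the two shifts cancel exactly, so the network body only ever sees the centred values.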
Thanks for sharing your great work :)
I wanted to try to run it with the pre-trained model, but the dropbox link on RCAN/RCAN_TrainCode/experiment/model/Readme.md is returning a 404 status code
Hi,
I have a folder (e.g. MyImages) of images, and I want to train the network on them.
The input images have the same size as the output images, because the inputs were custom scaled.
Could you give, as an example, the necessary steps to train your network on my set of images?
Kind regards,
Ion
Hi! I recently read your paper and code, and want to ask you a question.
Is there an argument to continue training after 300 epochs if I feel that is not enough? If not, how can I do that? Thanks! The --load option seems to do the job, but are the checkpoints retained, with the correct numbering?
Hi, sorry to bother you. I have some problems with resuming training. First, my training stopped unexpectedly with the message 'EOFError: Ran out of input', and I don't know why. Second, I want to resume training. I tried the setting '--load RCAN_BIX2_G10R20P48 --resume -1 --n_GPUs 2' and got the following error:
Preparing loss function:
1.000 * L1
Traceback (most recent call last):
File "/home/img/Desktop/sxd/RCAN/RCAN_TrainCode/code/main.py", line 16, in <module>
loss = loss.Loss(args, checkpoint) if not args.test_only else None
File "/home/img/Desktop/sxd/RCAN/RCAN_TrainCode/code/loss/__init__.py", line 67, in __init__
if args.load != '.': self.load(ckp.dir, cpu=args.cpu)
File "/home/img/Desktop/sxd/RCAN/RCAN_TrainCode/code/loss/__init__.py", line 140, in load
for l in self.loss_module:
TypeError: 'DataParallel' object is not iterable
Process finished with exit code 1
Hello, the paper contains the sentence "The initial learning rate is set to 10^-4 and then decreases to half every 2 × 10^5 iterations of back-propagation." But I do not see this change. During training, do I need to decrease the learning rate myself? Thanks.
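The schedule quoted from the paper is a step decay, and it can be written down directly. The sketch below computes the learning rate for a given iteration count. It follows only the quoted sentence, not the repository's option names, and the function name and parameter names are hypothetical.

```python
def learning_rate(iteration, base_lr=1e-4, halve_every=200_000):
    """Step decay: start at `base_lr` and halve it every `halve_every` iterations."""
    return base_lr * 0.5 ** (iteration // halve_every)


lr_start = learning_rate(0)                # initial rate
lr_before_step = learning_rate(199_999)    # still the initial rate
lr_first_step = learning_rate(200_000)     # halved once
lr_second_step = learning_rate(400_000)    # halved twice
```

Because the step is counted in iterations, whether the training log (which prints the rate once per epoch) ever shows a change depends on how many iterations one epoch contains.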
When I ran the script
python main.py --data_test MyImage --scale 4 --model RCAN --n_resgroups 10 --n_resblocks 20 --n_feats 64 --pre_train ../model/RCAN_BIX4.pt --test_only --save_results --chop --save 'RCAN' --testpath ../LR/LRBI --testset Set5
the results are very good as expected.
However, when I try my own input frame (80*45):
I wonder if this is a normal output image for a small input (80*45)? Is this because the model is trained with input kind of > 128 * 128 or some other reason?
FYI, I put the input frames in the ../LR/LRBI folder and ran the same script, changing only --testset to the new folder.
CUDA_VISIBLE_DEVICES=0 python main.py --data_test MyImage --scale 4 --model RCAN --n_resgroups 10 --n_resblocks 20 --n_feats 64 --pre_train ../model/RCAN_BIX4.pt --test_only --save_results --chop --save 'RCAN' --testpath ~/RCAN/RCAN_TestCode/datasets --testset Urban100
I ran this command to test the Urban100 dataset, but PSNR came out as 0 because hr is always -1, and lr is the original image from Urban100. Can anyone explain?
myimage.py
def __getitem__(self, idx):
    filename = os.path.split(self.filelist[idx])[-1]
    filename, _ = os.path.splitext(filename)
    lr = misc.imread(self.filelist[idx])
    lr = common.set_channel([lr], self.args.n_colors)[0]
    return common.np2Tensor([lr], self.args.rgb_range)[0], -1, filename
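The quoted __getitem__ returns -1 in place of an HR image, so the test loop has no ground truth to compare against, which is consistent with the reported PSNR of 0. To score saved results against HR images separately, PSNR can be computed as 10 * log10(MAX^2 / MSE). The sketch below does this in plain Python on flat pixel lists. It is an illustration only and omits the border cropping and Y-channel conversion that super-resolution evaluations usually apply.

```python
import math


def psnr(sr, hr, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(sr) != len(hr):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(sr, hr)) / len(sr)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


hr_pixels = [0, 128, 255, 64]
sr_pixels = [1, 127, 254, 65]   # every pixel off by one, so MSE = 1
value = psnr(sr_pixels, hr_pixels)
```

With an MSE of 1 and a maximum of 255, the result equals 20 * log10(255), about 48.13 dB.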
In Fig. 5, the last two images contain errors, and the other images are clearly flipped upside down.
Good afternoon!
Thanks for sharing the source code and experimental results.
What changes to the source code are needed so that the output image has the same size as the input image?
In that case, could I evaluate (train/test) your network for inpainting?
Thank you for your work,
I have a problem: while training on my data, I always get the error "RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 96 and 68 in dimension 2 at /pytorch/aten/src/TH/generic/THTensorMath.c:3586", with patch_size=96. To simplify debugging, n_resblocks and n_resgroups are decreased to 5 and 2; the rest of the network structure is unchanged.
The HR data is from the Middlebury dataset, and the LR data is bicubic-downsampled (60 images in total).
This problem has puzzled me for a long time. I would appreciate a reply.
Thanks.
<Making model...
RCAN(
(sub_mean): MeanShift(3, 3, kernel_size=(1, 1), stride=(1, 1))
(add_mean): MeanShift(3, 3, kernel_size=(1, 1), stride=(1, 1))
(head): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
(body): Sequential(
(0): ResidualGroup(
(body): Sequential(
(0): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(1): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(2): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(3): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(4): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(5): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(1): ResidualGroup(
(body): Sequential(
(0): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(1): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(2): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(3): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(4): RCAB(
(body): Sequential(
(0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): CALayer(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(conv_du): Sequential(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
(1): ReLU(inplace)
(2): Conv2d(4, 64, kernel_size=(1, 1), stride=(1, 1))
(3): Sigmoid()
)
)
)
)
(5): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
(tail): Sequential(
(0): Upsampler(
(0): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): PixelShuffle(upscale_factor=2)
)
(1): Conv2d(64, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
Preparing loss function:
1.000 * L1
[Epoch 1] Learning rate: 1.00e-4
[16/1980] [L1: 56.1840] 0.1+1.1s
[32/1980] [L1: 58.0140] 0.1+0.1s
[48/1980] [L1: 58.6069] 0.1+0.1s
[64/1980] [L1: 57.1261] 0.1+0.5s
[80/1980] [L1: 55.2015] 0.1+0.1s
Traceback (most recent call last):
File "main.py", line 20, in <module>
t.train()
File "/media/ybl/0A9AD66165F33762/CODE/RCAN-master/RCAN_TrainCode/code/trainer.py", line 47, in train
for batch, (lr, hr, _, idx_scale) in enumerate(self.loader_train):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 286, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 307, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
File "/media/ybl/0A9AD66165F33762/CODE/RCAN-master/RCAN_TrainCode/code/dataloader.py", line 47, in _ms_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 138, in default_collate
return [default_collate(samples) for samples in transposed]
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 115, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 96 and 68 in dimension 2 at /pytorch/aten/src/TH/generic/THTensorMath.c:3586>
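"Got 96 and 68" says that one sample in the batch measures 68 along an axis where the others measure 96. One plausible cause, an assumption rather than a confirmed diagnosis, is an image that is smaller than the requested patch in one dimension: slicing past the end of a sequence does not raise, it quietly returns a shorter result. The sketch below shows that behaviour and a simple pre-check. The file names, the sizes and the helper `fits_patch` are hypothetical.

```python
def fits_patch(height, width, patch_size):
    """True if an image of this size can supply a full patch_size x patch_size crop."""
    return height >= patch_size and width >= patch_size


row = list(range(68))   # one image row that is only 68 pixels wide
crop = row[0:96]        # slicing clamps to the available length instead of failing

# Hypothetical (height, width) pairs for a small dataset.
sizes = {"a.png": (480, 640), "b.png": (68, 300)}
too_small = [name for name, (h, w) in sizes.items() if not fits_patch(h, w, 96)]
```

Running such a check over the HR folder (and over the LR folder with the LR patch size) before training would show whether any image is too small for patch_size 96.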