tengshaofeng / residualattentionnetwork-pytorch

residualattentionnetwork-pytorch's Introduction

ResidualAttentionNetwork-pytorch

A PyTorch implementation of the Residual Attention Network.

This code is based on two projects:

https://github.com/liudaizong/Residual-Attention-Network and https://github.com/fwang91/residual-attention-network/blob/master/imagenet_model/Attention-92-deploy.prototxt

The first project is PyTorch code, but I think some of its network details are not right, so I modified it according to the architecture in Attention-92-deploy.prototxt.

I also added ResidualAttentionModel_92 for training on ImageNet, ResidualAttentionModel_448input for larger image inputs, and ResidualAttentionModel_92_32input_update for training on CIFAR-10.

paper referenced

Residual Attention Network for Image Classification (CVPR-2017 Spotlight) By Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Chen Li, Honggang Zhang, Xiaogang Wang, Xiaoou Tang

how to train?

First, download the data from http://www.cs.toronto.edu/~kriz/cifar.html. Then make sure the variable

is_train = True

CUDA_VISIBLE_DEVICES=0 python train.py

CUDA_VISIBLE_DEVICES=0 python train_mixup.py  (with mixup)

You can also train ResidualAttentionModel_56 or ResidualAttentionModel_448input. Only the import in train.py needs to change, from "from model.residual_attention_network import ResidualAttentionModel_92 as ResidualAttentionModel" to, for example, "from model.residual_attention_network import ResidualAttentionModel_56 as ResidualAttentionModel".

how to test?

Make sure the variable

is_train = False

CUDA_VISIBLE_DEVICES=0 python train.py

CUDA_VISIBLE_DEVICES=0 python train_mixup.py  (with mixup)

result

cifar-10: Acc 95.4 (top-1 err 4.6) with ResidualAttentionModel_92_32input_update (better than the paper's top-1 err of 4.99).
cifar-10: Acc 96.65 (top-1 err 3.35) with ResidualAttentionModel_92_32input_update (with mixup).
cifar-10: Acc 96.84 (top-1 err 3.16) with ResidualAttentionModel_92_32input_update (with mixup, with simpler attention module).

Thanks to @PistonY, who gave me the advice to use mixup. For more details on mixup, see the project https://github.com/facebookresearch/mixup-cifar10
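For orientation, the core of mixup can be sketched in a few lines. This is a minimal sketch modelled on the general approach of the project linked above, not a copy of train_mixup.py; the names mixup_data and mixup_criterion and the default alpha=1.0 are assumptions made for illustration.

```python
import torch
import torch.nn as nn


def mixup_data(x, y, alpha=1.0):
    """Blend each sample with a randomly chosen partner from the same batch."""
    # lam ~ Beta(alpha, alpha); alpha <= 0 disables mixing (assumed convention).
    if alpha > 0:
        lam = torch.distributions.Beta(alpha, alpha).sample().item()
    else:
        lam = 1.0
    index = torch.randperm(x.size(0), device=x.device)
    mixed_x = lam * x + (1.0 - lam) * x[index]
    # Return both label sets so the loss can be mixed with the same weight.
    return mixed_x, y, y[index], lam


def mixup_criterion(criterion, pred, y_a, y_b, lam):
    """Weight the loss for both label sets by the mixing coefficient."""
    return lam * criterion(pred, y_a) + (1.0 - lam) * criterion(pred, y_b)
```

In a training loop, the mixed batch would replace the raw images in the forward pass, and mixup_criterion would replace the plain call to the loss function.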

The paper only gives the architecture details of attention_92 for ImageNet with 224 input, not for CIFAR-10, so I built the network following my own understanding. I have not spent much effort optimizing the code, so you may be able to do better starting from it.

model file:

model_92_sgd.pkl is the trained model file, with an accuracy of 0.954

residualattentionnetwork-pytorch's Issues

Error : Data must be sequence , got float

I am trying to implement a new dataset on this code. I changed the class name and also included data class which gives an image as an item of size 448*448 through each iteration. And there is a list of labels matching the class name list. And I am using from model.residual_attention_network import ResidualAttentionModel_448input as.....

And I am getting this error:
Traceback (most recent call last):
  File "train.py", line 83, in <module>
    model = ResidualAttentionModel().cuda()
  File "/home/jayant/Documents/Marsh_Ann/ResidualAttentionNetwork-pytorch-master/model/residual_attention_network.py", line 24, in __init__
    self.residual_block0 = ResidualBlock(64, 128)
  File "/home/jayant/Documents/Marsh_Ann/ResidualAttentionNetwork-pytorch-master/model/basic_layers.py", line 20, in __init__
    self.bn3 = nn.BatchNorm2d(output_channels/4)
  File "/home/jayant/anaconda3/envs/saltmarsh/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 21, in __init__
    self.weight = Parameter(torch.Tensor(num_features))
TypeError: new(): data must be a sequence (got float)
@tengshaofeng Do you have an intuition about what I am doing wrong? I can also share my dataset rendering class. It has a __getitem__ method which returns a 448*448 image.
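The traceback points at output_channels/4: under Python 3 the / operator always returns a float, while nn.BatchNorm2d and nn.Conv2d expect integer channel counts. A likely fix is to use integer division (//) wherever a channel count is computed. The sketch below shows the idea on a pre-activation bottleneck block; the class name BottleneckSketch and its exact layout are assumptions for illustration, not the repository's ResidualBlock.

```python
import torch
import torch.nn as nn


class BottleneckSketch(nn.Module):
    """Illustrative stand-in for a bottleneck residual block (layout assumed)."""

    def __init__(self, input_channels, output_channels, stride=1):
        super().__init__()
        # `//` keeps the channel count an int under Python 3;
        # `output_channels / 4` would be a float and raise the TypeError.
        mid_channels = output_channels // 4
        self.bn1 = nn.BatchNorm2d(input_channels)
        self.conv1 = nn.Conv2d(input_channels, mid_channels, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_channels)
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, 3, stride, padding=1, bias=False)
        self.bn3 = nn.BatchNorm2d(mid_channels)
        self.conv3 = nn.Conv2d(mid_channels, output_channels, 1, 1, bias=False)
        self.relu = nn.ReLU()
        # Projection shortcut, needed only when the output shape changes.
        self.shortcut = None
        if input_channels != output_channels or stride != 1:
            self.shortcut = nn.Conv2d(input_channels, output_channels, 1, stride, bias=False)

    def forward(self, x):
        pre = self.relu(self.bn1(x))
        out = self.conv1(pre)
        out = self.conv2(self.relu(self.bn2(out)))
        out = self.conv3(self.relu(self.bn3(out)))
        residual = x if self.shortcut is None else self.shortcut(pre)
        return out + residual
```

The same change would apply to the other "got (float, int, int, int)" reports in this list, since they come from the same kind of expression.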

How can the low per-class accuracy be fixed when the overall accuracy is 95%?

During the test, cifar10, the output data structure is incorrect.

test code:

print('%s :Accuracy of the model on the test images: %d %%' % (datetime.now(),100 * correct / total))

# print('Accuracy of the model on the test images:', correct.item()/total)
# print(correct.item())
# print(total)
# for i in range(10):
#     print('%s :Accuracy of %5s : %2d %%' % (
#         datetime.now(),classes[i],  class_correct[i].item() / class_total[i]))
#     print(class_correct[i].item())
#     print(class_total[i])
# return correct / total

out:
D:\Microsoft Visual Studio\Shared\Anaconda3_64\envs\xk\lib\site-packages\torch\nn\modules\upsampling.py:129: UserWarning: nn.UpsamplingBilinear2d is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
2020-03-31 15:32:25.001979 :Accuracy of the model on the test images: 95 %
Accuracy of the model on the test images: 0.954
9540
10000
2020-03-31 15:32:25.002979 :Accuracy of plane : 0 %
194
1000.0
2020-03-31 15:32:25.002979 :Accuracy of car : 0 %
206
1000.0
2020-03-31 15:32:25.002979 :Accuracy of bird : 0 %
169
1000.0
2020-03-31 15:32:25.002979 :Accuracy of cat : 0 %
136
1000.0
2020-03-31 15:32:25.002979 :Accuracy of deer : 0 %
187
1000.0
2020-03-31 15:32:25.002979 :Accuracy of dog : 0 %
159
1000.0
2020-03-31 15:32:25.003980 :Accuracy of frog : 0 %
204
1000.0
2020-03-31 15:32:25.003980 :Accuracy of horse : 0 %
197
1000.0
2020-03-31 15:32:25.003980 :Accuracy of ship : 0 %
205
1000.0
2020-03-31 15:32:25.003980 :Accuracy of truck : 0 %
203
1000.0

If I don't add '.item()', the output becomes:
D:\Microsoft Visual Studio\Shared\Anaconda3_64\envs\xk\lib\site-packages\torch\nn\modules\upsampling.py:129: UserWarning: nn.UpsamplingBilinear2d is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
2020-03-31 15:38:02.784257 :Accuracy of the model on the test images: 95 %
Accuracy of the model on the test images: tensor(0, device='cuda:0')
9540
10000
2020-03-31 15:38:02.785258 :Accuracy of plane : 0 %
tensor(194, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.786259 :Accuracy of car : 0 %
tensor(206, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.786259 :Accuracy of bird : 0 %
tensor(169, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.787259 :Accuracy of cat : 0 %
tensor(136, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.787259 :Accuracy of deer : 0 %
tensor(187, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.788261 :Accuracy of dog : 0 %
tensor(159, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.789261 :Accuracy of frog : 0 %
tensor(204, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.789261 :Accuracy of horse : 0 %
tensor(197, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.789261 :Accuracy of ship : 0 %
tensor(205, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.790264 :Accuracy of truck : 0 %
tensor(203, device='cuda:0', dtype=torch.uint8)
1000.0

I hope to get your help. thanks
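Two things in the output above look like the likely cause. First, the per-class print formats correct/total with %d and without the factor 100, so any ratio below 1 prints as 0. Second, the per-class counters are uint8 tensors, which wrap around at 256. A minimal sketch of a counting routine that avoids both, using plain Python integers (the function name accuracy_report is made up for this example):

```python
import torch


def accuracy_report(predicted, labels, num_classes):
    """Return overall and per-class accuracy in percent, counted with Python ints."""
    class_correct = [0] * num_classes
    class_total = [0] * num_classes
    hits = (predicted == labels).tolist()   # list of Python bools
    for label, hit in zip(labels.tolist(), hits):
        class_total[label] += 1
        class_correct[label] += int(hit)    # Python ints cannot wrap at 255
    overall = 100.0 * sum(class_correct) / max(sum(class_total), 1)
    per_class = [100.0 * c / t if t else 0.0
                 for c, t in zip(class_correct, class_total)]
    return overall, per_class


# Consistency check on the log above: if each printed uint8 count wrapped
# three times (3 * 256 = 768), the recovered counts add up to the reported total.
printed = [194, 206, 169, 136, 187, 159, 204, 197, 205, 203]
recovered = [c + 768 for c in printed]
print(sum(recovered))  # 9540
```

The recovered sum matching the reported 9540 correct predictions supports the wrap-around explanation, although it remains an inference from the log rather than something confirmed in the code.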

Below is the result of one of my epochs. I would like to ask: why is the per-class test accuracy so low?

Epoch [32/300], Iter [100/254] Loss: 0.2530
Epoch [32/300], Iter [200/254] Loss: 0.1421
the epoch takes time: 40.39500594139099
evaluate test set:
Accuracy of the model on the test images: 87 %
Accuracy of the model on the test images: 0.8785185185185185
Accuracy of plane : 0 %
Accuracy of car : 0 %
Accuracy of bird : 1 %
Accuracy of cat : 0 %
Accuracy of deer : 0 %
Accuracy of dog : 3 %
Accuracy of frog : 0 %
Accuracy of horse : 0 %
Accuracy of ship : 0 %
Accuracy of truck : 1 %

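The per-class test accuracy is this low, and yet several classes still show some accuracy? Is there a problem with the software versions I am running? I use Python 3.5 and PyTorch 1.1. Also, why is the best accuracy not printed? Thanks!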

Hi, is there any implementation for visualizing the mask? I'm interested in the masks shown in the paper; they look very good.

A Inputsize Question

Hi @tengshaofeng, thanks. But I have a question: in attention_module.py, the input size of the class AttentionModule_stage0 is 112*112, but in the class AttentionModule_stage1 the input size is 56*56. Is a max-pool layer used in between? I think it's not mentioned in the paper.

attention map

Hello, how can I output the attention map?

Errors when I try to run train.py

I followed the instructions and ran: CUDA_VISIBLE_DEVICES=0 python train.py
but I get
TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=NoneType), but expected one of:

(tuple of ints size, *, tuple of names names, torch.memory_format memory_format, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
(tuple of ints size, *, torch.memory_format memory_format, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)

what's wrong with this code?

Shouldn't you stop recording grad when testing?

ResidualAttentionNetwork-pytorch/Residual-Attention-Network/train_mixup.py

Line 56 in 88ed90f

images = Variable(images.cuda())

When testing, the model does not need gradients.
This line caused an out-of-memory error for me.
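A common way to address this is to wrap the evaluation loop in torch.no_grad(), so that no autograd graph is kept for the forward passes. The following is a minimal sketch under that assumption; the function name evaluate and its arguments are illustrative and do not come from train_mixup.py.

```python
import torch
import torch.nn as nn


def evaluate(model, loader, device="cpu"):
    """Return top-1 accuracy in [0, 1] without building autograd graphs."""
    model.eval()                      # use running BatchNorm stats, disable dropout
    correct = 0
    total = 0
    with torch.no_grad():             # activations are not stored for backward
        for images, labels in loader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            predicted = outputs.argmax(dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return correct / total
```

Remember to call model.train() again before the next training epoch if evaluation happens inside the training loop.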

How to generate the masks given in the paper?

I have a trained residual attention model, and I want to visualize the masks given in Figure 1. Any idea how do the authors do that? @tengshaofeng If u have already done it, can u share the code to actually visualize the attention masks?

Test Accuracy Stagnates

Can you tell me if your training and testing accuracies always followed each other? I am implementing a smaller and modified version of the network you coded, and my test accuracy seems to have stagnated at 81%.
Also, I think you have coded a different architecture because you are adding output of pool layer as well as the output of pool+conv layer to the upsampled input, while the actual architecture only adds the pool+conv output to the upsampled layer. Is that making all the difference?

transfer learning

I want to use this code for another dataset, which parameter makes sure that my new data will be used for the model trained on CIFAR?

And do you have any advice, if input data dimensions are higher than CIFAR, e.g 100*100?

TypeError

When I run this code in python3.6
I met an error
'File "/home/user/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 33, in __init__
out_channels, in_channels // groups, *kernel_size))
TypeError: torch.FloatTensor constructor received an invalid combination of arguments - got (float, int, int, int), but expected one of:

no arguments
(int ...)
didn't match because some of the arguments have invalid types: (float, int, int, int)
(torch.FloatTensor viewed_tensor)
(torch.Size size)
(torch.FloatStorage data)
(Sequence data)
'
Do you know how to fix it
Thank you

The error about if __name__ == '__main__': freeze_support()

Excuse me, your code has been a big help for my research, but when I run train.py the following errors appear. Do you know how to fix them? Thank you!
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...

The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.

ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
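The message describes the usual remedy: put the code that creates the DataLoader and runs training inside a function, and call it only under the __main__ guard. A minimal sketch of that structure follows; the placeholder dataset and the helper names build_loader and main are assumptions for illustration, not the layout of train.py.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def build_loader(num_workers):
    # Placeholder data standing in for the CIFAR-10 dataset used in training.
    dataset = TensorDataset(torch.randn(8, 3, 32, 32),
                            torch.randint(0, 10, (8,)))
    return DataLoader(dataset, batch_size=4, shuffle=True,
                      num_workers=num_workers)


def main():
    # With num_workers > 0 the loader starts child processes. On Windows they
    # are spawned by re-importing this file, so nothing at module level may
    # create the loader or start training.
    loader = build_loader(num_workers=2)
    for images, labels in loader:
        print(images.shape, labels.shape)


if __name__ == '__main__':
    main()
```

Setting num_workers=0 is another way to sidestep the problem, at the cost of loading data in the main process.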

In the AttentionModule_stage1_cifar function, the architecture in the original paper does not add out_trunk after upsampling, does it? As in the following line:

out_interp2 = self.interpolation2(out_up_residual_blocks1) + out_trunk

Mixed attention, Channel attention and Spatial attention

Hello, I studied your code carefully, and then I found that there are different formulas for Mixed Attention, Channel Attention and Spatial Attention in the paper. But I don't see a formal representation of F (xi, c) in your code. I just started to learn about Deep Networks. How do I modify the network if I want to express different attentions? Thank you!

extending it to 3d data

Hi,
Can we implement the same network for 3D data by replacing the 2D layers with their 3D counterparts? What do you advise?

model = ResidualAttentionModel() error with python3

TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:

(torch.device device)
(torch.Storage storage)
(Tensor other)
(tuple of ints size, torch.device device)
(object data, torch.device device)

Focus of the attention mask

I have a question about the soft attention mask. I have implemented residual attention blocks for a specific domain (faces). How does the attention mask focus on specific regions of the face, such as the forehead?

What are the differences between the classes in residual_attention_network.py?

It seems that the results reproduced with this code cannot match the results in the original paper?

stage 0

The code has a stage 0, which doesn't exist in the paper.

i think the num of params for cifar10 residual network is incorrect

i think the num of params for cifar10 residual network is incorrect, i find that it is much bigger than the num in paper

model_92_sgd.pkl is pre_trained for cifar10?

Hi,
Is model_92_sgd.pkl pretrained on CIFAR-10? Is there a pretrained model for ImageNet? Thanks

Expression of mix attention

Thanks for your work! I have a question about the expression of mixed attention: is conv->relu->conv->sigmoid able to represent it?

Errors when I run train.py

Traceback (most recent call last):
  File "train.py", line 93, in <module>
    model = ResidualAttentionModel().cuda()
  File "/home/ResidualAttentionNetwork-pytorch-master/Residual-Attention-Network/model/residual_attention_network.py", line 136, in __init__
    self.residual_block1 = ResidualBlock(64, 256)
  File "/home/ResidualAttentionNetwork-pytorch-master/Residual-Attention-Network/model/basic_layers.py", line 16, in __init__
    self.conv1 = nn.Conv2d(input_channels, output_channels/4, 1, 1, bias = False)
  File "/home//.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 412, in __init__
    False, _pair(0), groups, bias, padding_mode)
  File "/home//.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 78, in __init__
    out_channels, in_channels // groups, *kernel_size))
TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:

(*, torch.device device)
(torch.Storage storage)
(Tensor other)
(tuple of ints size, *, torch.device device)
(object data, *, torch.device device)

Traceback (most recent call last):
  File "train.py", line 20, in <module>
    from model.residual_attention_network import ResidualAttentionModel_92_32input_update as ResidualAttentionModel
ImportError: No module named model.residual_attention_network

what's the version of torch, torchvision and python?

what's the version of torch, torchvision and python? can anyone explain it?

Questions about the performance on ImageNet

Has anyone trained the Residual Attention Network on ImageNet?

The paper doesn't give the batch size for ImageNet training, so I set batchsize=256/lr=0.1, which is a common setting, but the training result (top1-acc: 77.64) is much lower than the paper reports (top1-acc: 78.24)! More details about the hyperparameters are listed below. The epoch settings are converted from the iteration counts mentioned in the paper. With a batch size of 256, there are about 5k iterations in 1 epoch. According to the paper, we should decay the learning rate at epochs 200k/5k=40, 400k/5k=80 and 500k/5k=100, and terminate training at epoch 530k/5k=106.

The learning rate is divided by 10 at 200k, 400k, 500k iterations. We terminate training at 530k iterations.
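The conversion from iterations to epochs can be checked with a few lines of arithmetic. This sketch assumes the ILSVRC-2012 training set size of 1,281,167 images and takes the poster's batch size of 256 as given; if the paper's iterations were counted at a different batch size, the epoch numbers would differ.

```python
import math

IMAGENET_TRAIN_IMAGES = 1281167  # ILSVRC-2012 training set size (assumed dataset)


def iterations_to_epoch(iterations, batch_size, dataset_size=IMAGENET_TRAIN_IMAGES):
    """Convert an iteration count into the nearest epoch index."""
    iters_per_epoch = math.ceil(dataset_size / batch_size)  # 5005 for batch 256
    return round(iterations / iters_per_epoch)


milestones = [iterations_to_epoch(n, 256) for n in (200000, 400000, 500000)]
total_epochs = iterations_to_epoch(530000, 256)
print(milestones, total_epochs)  # [40, 80, 100] 106
```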

Hyperparameter settings

args.epochs = 106
args.batch_size = 256
### data transform: RandomResizeCrop(224)/HorizontalFlip(0.5)/ChangeLight(AlexNet color augmentation)/Normalize() are used in training
args.autoaugment = False
args.colorjitter = False
args.change_light = True # standard color augmentation from AlexNet
### optimizer
args.optimizer = 'SGD'
args.lr = 0.1
args.momentum = 0.9
args.weigh_decay_apply_on_all = True  # TODO: weight decay apply on which params
args.weight_decay = 1e-4
args.nesterov = True
### criterion
args.labelsmooth = 0
### lr scheduler
args.scheduler = 'uneven_multistep'
args.lr_decay_rate = 0.1
args.lr_milestone = [40, 80, 100]
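For readers who want to reproduce the optimizer and schedule part of these settings with stock PyTorch, a rough equivalent is sketched below. It assumes that the poster's 'uneven_multistep' scheduler behaves like torch.optim.lr_scheduler.MultiStepLR, which is not confirmed, and it uses a placeholder model; the function name simulate_lr is made up for the example.

```python
import torch
import torch.nn as nn


def simulate_lr(epochs=106, milestones=(40, 80, 100), gamma=0.1):
    """Return the learning rate used in each epoch under the listed settings."""
    model = nn.Linear(8, 2)  # placeholder; the real network would go here
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                                weight_decay=1e-4, nesterov=True)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=list(milestones), gamma=gamma)
    rates = []
    for _ in range(epochs):
        rates.append(optimizer.param_groups[0]["lr"])
        optimizer.step()    # stands in for one full epoch of training
        scheduler.step()    # the scheduler is stepped once per epoch
    return rates
```

Calling simulate_lr() returns one learning rate per epoch, which makes it easy to confirm that the drops happen at epochs 40, 80 and 100.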

What is the meaning of `softmax` in attention_module.py?

Hi, I am confused about the term softmax_blocks. Should the term be "soft mask blocks", as in the paper? I checked the ResidualBlock class, and it does not contain any normalization layers.

about the code "out_interp = self.interpolation1(out_middle_2r_blocks) + out_down_residual_blocks1"

Hello, thank you for your code!
But I have a question about it. The line below does not seem to correspond to any operation in the paper; in the soft mask branch, only the skip connections involve an addition. Could you help me with this question?
out_interp = self.interpolation1(out_middle_2r_blocks) + out_down_residual_blocks1

pretrained network

Hello, if I want to use my own dataset, is there a model pretrained on ImageNet?

have you ever tested the num of the params

i find that the params is different from

ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).

I ran your code and meet the error as below:

ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Traceback (most recent call last):
  File "train_pre.py", line 52, in <module>
    for i, (images, labels) in enumerate(train_loader):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 275, in __next__
    idx, batch = self._get_batch()
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 254, in _get_batch
    return self.data_queue.get()
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 343, in get
    res = self._reader.recv_bytes()
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 175, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 50) is killed by signal: Bus error.

The run environment is Python 3.5, TensorFlow 1.0.1 and PyTorch 0.3.1. I have searched for solutions, and I think this may be caused by version conflicts.

Can you tell us the run environment, and any other suggestion? thx

About multi-label

Hi @tengshaofeng,
Do you know if this model can process multi-label datasets, like NUSWIDE? Any idea how to do it? Thank you.

Topic closed

new() received an invalid combination of arguments

E TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:
E * (torch.device device)
E * (torch.Storage storage)
E * (Tensor other)
E * (tuple of ints size, torch.device device)
E * (object data, torch.device device)

how to fix it? Thanks

can you provide the pretrained model

Thank you for sharing your code!

can you provide the best pretrained model?