residualattentionnetwork-pytorch's Introduction

ResidualAttentionNetwork-pytorch

A PyTorch implementation of the Residual Attention Network.

This code is based on two projects:

https://github.com/liudaizong/Residual-Attention-Network and https://github.com/fwang91/residual-attention-network/blob/master/imagenet_model/Attention-92-deploy.prototxt

The first project is a PyTorch implementation, but I think some of its network details are not right, so I modified it to follow the architecture in Attention-92-deploy.prototxt.

I also added ResidualAttentionModel_92 for training on ImageNet, ResidualAttentionModel_448input for larger input images, and ResidualAttentionModel_92_32input_update for training on CIFAR-10.

paper referenced

Residual Attention Network for Image Classification (CVPR-2017 Spotlight) By Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Chen Li, Honggang Zhang, Xiaogang Wang, Xiaoou Tang

how to train?

First, download the data from http://www.cs.toronto.edu/~kriz/cifar.html and make sure the variable is set as follows:

is_train = True

CUDA_VISIBLE_DEVICES=0 python train.py

CUDA_VISIBLE_DEVICES=0 python train_mixup.py (with mixup)

You can also train ResidualAttentionModel_56 or ResidualAttentionModel_448input. Only the import in train.py needs to change, for example from "from model.residual_attention_network import ResidualAttentionModel_92 as ResidualAttentionModel" to "from model.residual_attention_network import ResidualAttentionModel_56 as ResidualAttentionModel".

how to test?

Make sure the variable is set as follows:

is_train = False

CUDA_VISIBLE_DEVICES=0 python train.py

CUDA_VISIBLE_DEVICES=0 python train_mixup.py (with mixup)

result

  1. CIFAR-10: accuracy 95.4 (top-1 error 4.6) with ResidualAttentionModel_92_32input_update (better than the paper's top-1 error of 4.99).

  2. CIFAR-10: accuracy 96.65 (top-1 error 3.35) with ResidualAttentionModel_92_32input_update (with mixup).

  3. CIFAR-10: accuracy 96.84 (top-1 error 3.16) with ResidualAttentionModel_92_32input_update (with mixup and a simpler attention module).

Thanks to @PistonY, who suggested using mixup. For more details on mixup, see the project https://github.com/facebookresearch/mixup-cifar10
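For readers unfamiliar with mixup, the idea can be sketched in a few lines. This is a framework-free illustration of the general technique, not the code in train_mixup.py; the function names and the default alpha = 1.0 are assumptions made for the example.

```python
import random


def sample_lambda(alpha=1.0, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha); alpha=1.0 is an assumed default."""
    return rng.betavariate(alpha, alpha)


def mix_inputs(x_a, x_b, lam):
    """Blend two inputs element-wise: lam * x_a + (1 - lam) * x_b."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def mix_losses(loss_a, loss_b, lam):
    """Blend the losses computed against each of the two original labels."""
    return lam * loss_a + (1.0 - lam) * loss_b


# Two toy "images" flattened to lists of pixel values (made-up data).
x_a, x_b = [0.0, 2.0], [4.0, 6.0]
mixed = mix_inputs(x_a, x_b, lam=0.25)  # lam weights x_a, so this is mostly x_b
print(mixed)
```

In a training loop, the mixed input would be fed to the network once, and the loss would be the same weighted blend of the losses against the two original labels.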

The paper only gives the architecture details of Attention-92 for ImageNet with 224 input, not for CIFAR-10, so I built the network according to my own understanding. I have not spent much effort optimizing the code, so you may be able to do better starting from it.

model file:

model_92_sgd.pkl is the trained model file, with an accuracy of 0.954.

residualattentionnetwork-pytorch's People

Contributors

tengshaofeng

residualattentionnetwork-pytorch's Issues

Error: data must be a sequence (got float)

I am trying to use a new dataset with this code. I changed the class names and added a dataset class that returns one 448*448 image per item on each iteration, along with a list of labels matching the class name list. I am using from model.residual_attention_network import ResidualAttentionModel_448input as.....

I am getting this error:
Traceback (most recent call last):
File "train.py", line 83, in <module>
model = ResidualAttentionModel().cuda()
File "/home/jayant/Documents/Marsh_Ann/ResidualAttentionNetwork-pytorch-master/model/residual_attention_network.py", line 24, in __init__
self.residual_block0 = ResidualBlock(64, 128)
File "/home/jayant/Documents/Marsh_Ann/ResidualAttentionNetwork-pytorch-master/model/basic_layers.py", line 20, in __init__
self.bn3 = nn.BatchNorm2d(output_channels/4)
File "/home/jayant/anaconda3/envs/saltmarsh/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 21, in __init__
self.weight = Parameter(torch.Tensor(num_features))
TypeError: new(): data must be a sequence (got float)
@tengshaofeng Do you have an intuition about what I am doing wrong? I can also share my dataset class. It has a __getitem__ method which returns a 448*448 image.
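
The traceback points at nn.BatchNorm2d(output_channels/4). A likely cause, assuming the script runs under Python 3, is that / always returns a float there, whereas the layer constructors expect an integer channel count; floor division (//) keeps the value an int. The sketch below shows only the arithmetic and does not import PyTorch; bottleneck_channels is a hypothetical helper.

```python
def bottleneck_channels(output_channels):
    """Return the reduced channel count as an int (hypothetical helper)."""
    return output_channels // 4


output_channels = 128
as_float = output_channels / 4                 # true division: float in Python 3
as_int = bottleneck_channels(output_channels)  # floor division: stays an int

print(type(as_float).__name__, as_float)  # float 32.0
print(type(as_int).__name__, as_int)      # int 32
```

Under this assumption, replacing output_channels/4 with output_channels//4 in basic_layers.py would be the corresponding change.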

When testing on CIFAR-10, the output is incorrect.

test code:

print('%s :Accuracy of the model on the test images: %d %%' % (datetime.now(),100 * correct / total))

# print('Accuracy of the model on the test images:', correct.item()/total)
# print(correct.item())
# print(total)
# for i in range(10):
#     print('%s :Accuracy of %5s : %2d %%' % (
#         datetime.now(),classes[i],  class_correct[i].item() / class_total[i]))
#     print(class_correct[i].item())
#     print(class_total[i])
# return correct / total

out:
D:\Microsoft Visual Studio\Shared\Anaconda3_64\envs\xk\lib\site-packages\torch\nn\modules\upsampling.py:129: UserWarning: nn.UpsamplingBilinear2d is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
2020-03-31 15:32:25.001979 :Accuracy of the model on the test images: 95 %
Accuracy of the model on the test images: 0.954
9540
10000
2020-03-31 15:32:25.002979 :Accuracy of plane : 0 %
194
1000.0
2020-03-31 15:32:25.002979 :Accuracy of car : 0 %
206
1000.0
2020-03-31 15:32:25.002979 :Accuracy of bird : 0 %
169
1000.0
2020-03-31 15:32:25.002979 :Accuracy of cat : 0 %
136
1000.0
2020-03-31 15:32:25.002979 :Accuracy of deer : 0 %
187
1000.0
2020-03-31 15:32:25.002979 :Accuracy of dog : 0 %
159
1000.0
2020-03-31 15:32:25.003980 :Accuracy of frog : 0 %
204
1000.0
2020-03-31 15:32:25.003980 :Accuracy of horse : 0 %
197
1000.0
2020-03-31 15:32:25.003980 :Accuracy of ship : 0 %
205
1000.0
2020-03-31 15:32:25.003980 :Accuracy of truck : 0 %
203
1000.0

If I don't add '.item()', the output becomes:

D:\Microsoft Visual Studio\Shared\Anaconda3_64\envs\xk\lib\site-packages\torch\nn\modules\upsampling.py:129: UserWarning: nn.UpsamplingBilinear2d is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
2020-03-31 15:38:02.784257 :Accuracy of the model on the test images: 95 %
Accuracy of the model on the test images: tensor(0, device='cuda:0')
9540
10000
2020-03-31 15:38:02.785258 :Accuracy of plane : 0 %
tensor(194, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.786259 :Accuracy of car : 0 %
tensor(206, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.786259 :Accuracy of bird : 0 %
tensor(169, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.787259 :Accuracy of cat : 0 %
tensor(136, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.787259 :Accuracy of deer : 0 %
tensor(187, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.788261 :Accuracy of dog : 0 %
tensor(159, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.789261 :Accuracy of frog : 0 %
tensor(204, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.789261 :Accuracy of horse : 0 %
tensor(197, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.789261 :Accuracy of ship : 0 %
tensor(205, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.790264 :Accuracy of truck : 0 %
tensor(203, device='cuda:0', dtype=torch.uint8)
1000.0

I hope you can help. Thanks.
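
Two things in the output above may explain the 0 % lines; both are guesses from the log rather than confirmed causes. First, the per-class ratio is printed with %2d without being multiplied by 100, so any value below 1 is truncated to 0. Second, the counters are reported as dtype=torch.uint8, which can only hold 0 to 255 and would wrap around for larger counts. The sketch below imitates both effects in plain Python; the count 962 is a made-up example.

```python
def wrap_uint8(value):
    """Imitate an 8-bit unsigned counter, which wraps around at 256."""
    return value % 256


def percent(correct, total):
    """Per-class accuracy as a percentage, computed with plain Python numbers."""
    return 100.0 * correct / total


true_correct, total = 962, 1000      # hypothetical count for one class
wrapped = wrap_uint8(true_correct)   # what an 8-bit counter would hold: 194

print('%2d %%' % (wrapped / total))             # ratio below 1 is truncated by %d
print('%2d %%' % percent(true_correct, total))  # ratio scaled to a percentage
```

Accumulating the counts as Python ints (for example by calling .item() on each comparison result before adding it) and multiplying the ratio by 100 would avoid both effects, if they are indeed the cause.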

Below is the result from one epoch of my run. I would like to ask: why is the per-class test accuracy so low?

Epoch [32/300], Iter [100/254] Loss: 0.2530
Epoch [32/300], Iter [200/254] Loss: 0.1421
the epoch takes time: 40.39500594139099
evaluate test set:
Accuracy of the model on the test images: 87 %
Accuracy of the model on the test images: 0.8785185185185185
Accuracy of plane : 0 %
Accuracy of car : 0 %
Accuracy of bird : 1 %
Accuracy of cat : 0 %
Accuracy of deer : 0 %
Accuracy of dog : 3 %
Accuracy of frog : 0 %
Accuracy of horse : 0 %
Accuracy of ship : 0 %
Accuracy of truck : 1 %

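The per-class test accuracy is this low, and yet several classes still show an accuracy value? Is there a problem with the software versions I am running? I am using Python 3.5 and PyTorch 1.1. Also, why is the best accuracy not printed? Thanks!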

An input size question

Hi @tengshaofeng, thanks. I have a question: in attention_module.py, the input size of the class AttentionModule_stage0 is 112*112, but the input size of the class AttentionModule_stage1 is 56*56. Is a max-pool layer used in between? I don't think it is mentioned in the paper.

Errors when I try to run train.py

I followed the instructions and ran: CUDA_VISIBLE_DEVICES=0 python train.py
but I get
TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=NoneType), but expected one of:

  • (tuple of ints size, *, tuple of names names, torch.memory_format memory_format, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
  • (tuple of ints size, *, torch.memory_format memory_format, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)

what's wrong with this code?

Test Accuracy Stagnates

Can you tell me if your training and testing accuracies always followed each other? I am implementing a smaller and modified version of the network you coded, and my test accuracy seems to have stagnated at 81%.
Also, I think you have coded a different architecture because you are adding output of pool layer as well as the output of pool+conv layer to the upsampled input, while the actual architecture only adds the pool+conv output to the upsampled layer. Is that making all the difference?

transfer learning

I want to use this code for another dataset, which parameter makes sure that my new data will be used for the model trained on CIFAR?

And do you have any advice, if input data dimensions are higher than CIFAR, e.g 100*100?

TypeError

When I run this code in Python 3.6, I get this error:
File "/home/user/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 33, in __init__
out_channels, in_channels // groups, *kernel_size))
TypeError: torch.FloatTensor constructor received an invalid combination of arguments - got (float, int, int, int), but expected one of:

  • no arguments
  • (int ...)
    didn't match because some of the arguments have invalid types: (float, int, int, int)
  • (torch.FloatTensor viewed_tensor)
  • (torch.Size size)
  • (torch.FloatStorage data)
  • (Sequence data)

Do you know how to fix it?
Thank you

The error about if __name__ == '__main__': freeze_support()

Excuse me, your code has been a big help for my research, but when I run train.py the following errors appear. Do you know how to fix them? Thank you!
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...

The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.

ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
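
The idiom that the error message refers to can be sketched as follows. On platforms that start child processes by spawning a fresh interpreter (Windows, for example), each child re-imports the main script, so any code that starts worker processes has to sit behind the __name__ guard. The example uses a multiprocessing.Pool as a stand-in for a data loader with worker processes; run_workers and main are hypothetical names, not functions from train.py.

```python
import math
import multiprocessing


def run_workers(values, num_workers=2):
    """Stand-in for a data loader with worker processes: starts child processes."""
    with multiprocessing.Pool(num_workers) as pool:
        return pool.map(math.sqrt, values)


def main():
    """Everything that starts child processes lives here, not at module level."""
    results = run_workers([1.0, 4.0, 9.0])
    print(results)


if __name__ == "__main__":
    # freeze_support() only matters for frozen Windows executables; it is harmless otherwise.
    multiprocessing.freeze_support()
    main()
```

Applied to a training script, this would mean moving the data loader creation and the training loop into a function that is called only under the guard.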

Mixed attention, Channel attention and Spatial attention

Hello, I studied your code carefully, and then I found that there are different formulas for Mixed Attention, Channel Attention and Spatial Attention in the paper. But I don't see a formal representation of F (xi, c) in your code. I just started to learn about Deep Networks. How do I modify the network if I want to express different attentions? Thank you!

extending it to 3d data

Hi,
Can we implement the same network for 3D data by replacing the 2D layers with their 3D counterparts? What do you advise?

model = ResidualAttentionModel() error with python3

TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:

  • (torch.device device)
  • (torch.Storage storage)
  • (Tensor other)
  • (tuple of ints size, torch.device device)
  • (object data, torch.device device)

Focus of the attention mask

I have a question about the soft attention mask. I have implemented residual attention blocks for a specific domain (faces). How does the attention mask focus on specific regions of the face, such as the forehead and so on?

stage 0

The code has a stage 0, which doesn't exist in the paper.

Expression of mix attention

Thanks for your work! I have a question about the expression of mixed attention: is conv->relu->conv->sigmoid able to represent it?

Errors when I run train.py

Traceback (most recent call last):
File "train.py", line 93, in <module>
model = ResidualAttentionModel().cuda()
File "/home/ResidualAttentionNetwork-pytorch-master/Residual-Attention-Network/model/residual_attention_network.py", line 136, in __init__
self.residual_block1 = ResidualBlock(64, 256)
File "/home/ResidualAttentionNetwork-pytorch-master/Residual-Attention-Network/model/basic_layers.py", line 16, in __init__
self.conv1 = nn.Conv2d(input_channels, output_channels/4, 1, 1, bias = False)
File "/home//.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 412, in __init__
False, _pair(0), groups, bias, padding_mode)
File "/home//.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 78, in __init__
out_channels, in_channels // groups, *kernel_size))
TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:

  • (*, torch.device device)
  • (torch.Storage storage)
  • (Tensor other)
  • (tuple of ints size, *, torch.device device)
  • (object data, *, torch.device device)

Questions about the performance on ImageNet

Has anyone trained the Residual Attention Network on ImageNet?

The paper didn't provide the batch size for ImageNet training, so I set batchsize=256 and lr=0.1, which is a common setting, but the training result (top-1 acc: 77.64) is much lower than the paper reports (top-1 acc: 78.24)! More details about the hyperparameters are listed below. The epoch settings are converted from the iteration counts given in the paper. With a batch size of 256, there are about 5k iterations in 1 epoch. According to the paper, we should decay the learning rate at epochs 200k/5k=40, 400k/5k=80 and 500k/5k=100, and terminate training at epoch 530k/5k=106.

The learning rate is divided by 10 at 200k, 400k, 500k iterations. We terminate training at 530k iterations.

Hyperparameter settings

args.epochs = 106
args.batch_size = 256
### data transform: RandomResizeCrop(224)/HorizontalFlip(0.5)/ChangeLight(AlexNet color augmentation)/Normalize() are used in training
args.autoaugment = False
args.colorjitter = False
args.change_light = True # standard color augmentation from AlexNet
### optimizer
args.optimizer = 'SGD'
args.lr = 0.1
args.momentum = 0.9
args.weigh_decay_apply_on_all = True  # TODO: weight decay apply on which params
args.weight_decay = 1e-4
args.nesterov = True
### criterion
args.labelsmooth = 0
### lr scheduler
args.scheduler = 'uneven_multistep'
args.lr_decay_rate = 0.1
args.lr_milestone = [40, 80, 100]
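
The conversion from the paper's iteration counts to the epoch milestones above can be checked with a few lines of arithmetic. The sketch assumes the ImageNet-1k training set of 1,281,167 images and rounds to 5,000 iterations per epoch, as the issue does; the variable names are illustrative.

```python
import math

train_images = 1_281_167   # ImageNet-1k training set size (assumption)
batch_size = 256

exact_iters_per_epoch = math.ceil(train_images / batch_size)
iters_per_epoch = 5_000    # rounded value used in the issue

decay_iterations = [200_000, 400_000, 500_000]
total_iterations = 530_000

lr_milestones = [it // iters_per_epoch for it in decay_iterations]
epochs = total_iterations // iters_per_epoch

print(exact_iters_per_epoch)  # 5005
print(lr_milestones)          # [40, 80, 100]
print(epochs)                 # 106
```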

pretrained network

Hello, if I want to use my own dataset, is there a model pretrained on ImageNet available?

ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).

I ran your code and hit the error below:

ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Traceback (most recent call last):
File "train_pre.py", line 52, in <module>
for i, (images, labels) in enumerate(train_loader):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 275, in __next__
idx, batch = self._get_batch()
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 254, in _get_batch
return self.data_queue.get()
File "/usr/lib/python3.5/multiprocessing/queues.py", line 343, in get
res = self._reader.recv_bytes()
File "/usr/lib/python3.5/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 175, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 50) is killed by signal: Bus error.

The run environment is Python 3.5, TensorFlow 1.0.1 and PyTorch 0.3.1. I have searched for solutions, and I think this may be caused by version conflicts.

Can you tell us your run environment, and do you have any other suggestions? Thanks.

new() received an invalid combination of arguments

E TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:
E * (torch.device device)
E * (torch.Storage storage)
E * (Tensor other)
E * (tuple of ints size, torch.device device)
E * (object data, torch.device device)

How can I fix it? Thanks

Errors when I run train.py

File "/home//ResidualAttentionNetwork-pytorch-master/Residual-Attention-Network/model/attention_module.py", line 249, in forward
out_interp3 = self.interpolation3(out_softmax3) + out_softmax2
RuntimeError: The size of tensor a (14) must match the size of tensor b (2) at non-singleton dimension 3
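
One possible reading of this error, offered as a guess: the attention module upsamples to a fixed size (14 here) that matches a 224*224 input, while the tensor it is added to came from a smaller input and is only 2 wide. The arithmetic below assumes a stem that reduces the input by a factor of 4 and max-pooling layers with kernel 3, stride 2 and padding 1; these settings and the helper names are assumptions, not values read from attention_module.py.

```python
def pool_out(size, kernel=3, stride=2, padding=1):
    """Output width of a pooling layer (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def mask_branch_sizes(input_size, stem_reduction=4, num_pools=3):
    """Feature-map widths after each successive pooling step (assumed layout)."""
    size = input_size // stem_reduction
    sizes = []
    for _ in range(num_pools):
        size = pool_out(size)
        sizes.append(size)
    return sizes


print(mask_branch_sizes(224))  # [28, 14, 7]
print(mask_branch_sizes(32))   # [4, 2, 1]
```

With a 224 input the second pooling step yields 14, which matches the fixed upsampling target; with a 32 input it yields 2, which would reproduce the 14 versus 2 mismatch. If that is the situation, a model variant built for the actual input size (such as ResidualAttentionModel_92_32input_update for 32*32 images) would be the thing to try.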
