
Comments (7)

tengshaofeng avatar tengshaofeng commented on August 9, 2024

@Neo96Mav, which network did you use? Or did you modify the network yourself based on my code?

from residualattentionnetwork-pytorch.

jain-avi avatar jain-avi commented on August 9, 2024

I have used your network and the official Caffe network for reference, and implemented my own small network. I am not using attention modules at 4x4 because I feel the feature maps are too small, and I am only using one attention module at 8x8. My network is relatively small, and it's for CIFAR images only.
Can you let me know the intuition behind this -
[screenshot: attention module upsampling code, 2018-06-15]
You have added the output of the residual block, as well as the output of the skip connection, to the upsampled layer!
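The pattern being asked about can be sketched as follows. This is a minimal sketch, not the repo's exact code: the function and argument names are illustrative, and the repo uses its own interpolation layers rather than a direct `F.interpolate` call.

```python
import torch
import torch.nn.functional as F

def upsample_and_merge(deep, residual_out, skip):
    """Sketch of the step under discussion: the deeper (lower-resolution)
    feature map is upsampled, then BOTH the residual-block output at the
    target resolution AND the skip connection are added to it.
    Names are illustrative, not the repo's actual identifiers."""
    up = F.interpolate(deep, size=skip.shape[-2:], mode='bilinear',
                       align_corners=False)  # e.g. 4x4 -> 8x8
    return up + residual_out + skip          # the two adds in question

# Example: merge a 4x4 map into the 8x8 stage of the soft mask branch.
x4 = torch.randn(2, 64, 4, 4)
r8 = torch.randn(2, 64, 8, 8)
s8 = torch.randn(2, 64, 8, 8)
out = upsample_and_merge(x4, r8, s8)
```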


tengshaofeng avatar tengshaofeng commented on August 9, 2024

@Neo96Mav, this follows the Caffe network; I think it is added to preserve more detailed information. You can remove it to test its effectiveness.


josianerodrigues avatar josianerodrigues commented on August 9, 2024

Hi @Neo96Mav,
Did you test the model using only one 8x8 Attention module? Was the accuracy better?


jain-avi avatar jain-avi commented on August 9, 2024

Hi @josianerodrigues, I added the 4x4 attention module as well. I am stuck at 89.5% accuracy. Maybe my model is not big enough, or I am not using the exact same configuration, but I feel that should not have affected it so much. @tengshaofeng, do you have any ideas why we can't match the authors' performance?


tengshaofeng avatar tengshaofeng commented on August 9, 2024

@Neo96Mav, the paper only gives the architecture details of Attention-92 for ImageNet with 224x224 input, not for CIFAR-10. So I built the net ResidualAttentionModel_92_32input following my own understanding.
I have tested it on the CIFAR-10 test set; the result is as follows:
Accuracy of the model on the test images: 0.9354

Maybe some details are not right. You can refer to the data preprocessing in the paper and keep it the same as the authors', or tune the hyperparameters for better performance. You can also remove the add operation to test the network.
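For reference, CIFAR-10 training pipelines for this family of networks are commonly reproduced as 4-pixel padding, a random 32x32 crop, and a random horizontal flip. A torch-only sketch of that recipe follows; the exact preprocessing used in the paper or in this repo may differ, so treat the numbers as an assumption to verify.

```python
import torch
import torch.nn.functional as F

def augment_cifar(img, pad=4):
    """Common CIFAR-10 training augmentation: pad, random crop, random
    horizontal flip. `img` is a (3, 32, 32) float tensor. This is a
    sketch of the usual recipe, not necessarily the repo's pipeline."""
    c, h, w = img.shape
    # Zero-pad the last two (spatial) dimensions by `pad` on each side.
    padded = F.pad(img, (pad, pad, pad, pad), mode='constant', value=0)
    # Pick a random top-left corner for the crop back to (h, w).
    top = torch.randint(0, 2 * pad + 1, (1,)).item()
    left = torch.randint(0, 2 * pad + 1, (1,)).item()
    crop = padded[:, top:top + h, left:left + w]
    # Flip left-right with probability 0.5.
    if torch.rand(1).item() < 0.5:
        crop = torch.flip(crop, dims=[2])
    return crop

sample = torch.randn(3, 32, 32)
augmented = augment_cifar(sample)
```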


tengshaofeng avatar tengshaofeng commented on August 9, 2024

@Neo96Mav @josianerodrigues
The result is now 0.954.

