
Comments (7)

tengshaofeng avatar tengshaofeng commented on August 9, 2024

@Neo96Mav, which network did you use? Or did you modify the network yourself based on my code?

from residualattentionnetwork-pytorch.

jain-avi avatar jain-avi commented on August 9, 2024

I have used your network and the official Caffe network for reference, and implemented my own small network. I am not using attention modules at 4x4 because I feel the feature maps are too small, and I am only using one attention module at 8x8. My network is relatively small, and it's for CIFAR images only.
Can you let me know the intuition behind this -
[screenshot: attention module upsampling code, 2018-06-15]
You have added the output of the residual block, as well as the output of the skip connection, to the upsampled layer!
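The pattern being asked about can be sketched as follows. This is a minimal sketch, not the repo's exact code: the function and argument names are illustrative, and the repo uses its own interpolation layers rather than a direct `F.interpolate` call.

```python
import torch
import torch.nn.functional as F

def upsample_and_merge(deep, residual_out, skip):
    """Sketch of the step under discussion: the deeper (lower-resolution)
    feature map is upsampled, then BOTH the residual-block output at the
    target resolution AND the skip connection are added to it.
    Names are illustrative, not the repo's actual identifiers."""
    up = F.interpolate(deep, size=skip.shape[-2:], mode='bilinear',
                       align_corners=False)  # e.g. 4x4 -> 8x8
    return up + residual_out + skip          # the two adds in question

# Example: merge a 4x4 map into the 8x8 stage of the soft mask branch.
x4 = torch.randn(2, 64, 4, 4)
r8 = torch.randn(2, 64, 8, 8)
s8 = torch.randn(2, 64, 8, 8)
out = upsample_and_merge(x4, r8, s8)
```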


tengshaofeng avatar tengshaofeng commented on August 9, 2024

@Neo96Mav, this follows the Caffe network; I think it is added to preserve more detailed information. You can remove it to test its effectiveness.


josianerodrigues avatar josianerodrigues commented on August 9, 2024

Hi @Neo96Mav,
Did you test the model using only one 8x8 Attention module? Was the accuracy better?


jain-avi avatar jain-avi commented on August 9, 2024

Hi @josianerodrigues, I added the 4x4 attention module as well. I am stuck at 89.5% accuracy. Maybe my model is not big enough, or I am not using the exact same configuration, but I feel that should not have affected it so much. @tengshaofeng, do you have any ideas why we can't match the authors' performance?


tengshaofeng avatar tengshaofeng commented on August 9, 2024

@Neo96Mav, the paper only gives the architecture details of Attention-92 for ImageNet with 224x224 input, not for CIFAR-10. So I built the net ResidualAttentionModel_92_32input following my own understanding.
I have tested it on the CIFAR-10 test set; the result is as follows:
Accuracy of the model on the test images: 0.9354

Maybe some details are not right. You can refer to the data preprocessing in the paper and keep it the same as the authors', or tune the hyperparameters for better performance. You can also remove the add operation to test the network.
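For reference, CIFAR-10 training pipelines for this family of networks are commonly reproduced as 4-pixel padding, a random 32x32 crop, and a random horizontal flip. A torch-only sketch of that recipe follows; the exact preprocessing used in the paper or in this repo may differ, so treat the numbers as an assumption to verify.

```python
import torch
import torch.nn.functional as F

def augment_cifar(img, pad=4):
    """Common CIFAR-10 training augmentation: pad, random crop, random
    horizontal flip. `img` is a (3, 32, 32) float tensor. This is a
    sketch of the usual recipe, not necessarily the repo's pipeline."""
    c, h, w = img.shape
    # Zero-pad the last two (spatial) dimensions by `pad` on each side.
    padded = F.pad(img, (pad, pad, pad, pad), mode='constant', value=0)
    # Pick a random top-left corner for the crop back to (h, w).
    top = torch.randint(0, 2 * pad + 1, (1,)).item()
    left = torch.randint(0, 2 * pad + 1, (1,)).item()
    crop = padded[:, top:top + h, left:left + w]
    # Flip left-right with probability 0.5.
    if torch.rand(1).item() < 0.5:
        crop = torch.flip(crop, dims=[2])
    return crop

sample = torch.randn(3, 32, 32)
augmented = augment_cifar(sample)
```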


tengshaofeng avatar tengshaofeng commented on August 9, 2024

@Neo96Mav @josianerodrigues
The result is now 0.954.

