Hi, thank you very much for open-sourcing such wonderful work! I have a question for

Question about the visualization for VIT? about transformer-mm-explainability HOT 2 CLOSED

hila-chefer commented on May 19, 2024

Question about the visualization for VIT?

from transformer-mm-explainability.

Comments (2)

hila-chefer commented on May 19, 2024

Hi @zhaoxin94, thanks for your interest in our work!
What you are referring to is a normalization we use for the multi-modal update rule.
For pure self-attention architectures, the lack of normalization will indeed cause the values on the diagonal to be larger than those off it. However, for self-attention we take the row corresponding to the CLS token and disregard the CLS score itself.
So basically, the values on the diagonal do not influence the visualization in the self-attention case, and normalization has little to no impact.
This question was raised by the reviewers as well, so we ran experiments normalizing Eq. 6 and found the results to be very similar, with only a marginal difference.
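
For readers who want to see the read-out step concretely, here is a minimal sketch (not the repository's code). It assumes the CLS token sits at index 0 of the relevance matrix and that the patches form a square grid. The function name `cls_relevance_map`, the toy matrix, and the min-max scaling for display are all hypothetical choices made for illustration, and plain Python lists stand in for tensors so the snippet has no dependencies. It only illustrates how the CLS row is read out, not how relevance is propagated across layers.

```python
def cls_relevance_map(relevance, grid_size):
    """Return the CLS row of a relevance matrix as a grid_size x grid_size map.

    relevance: square matrix (nested lists) of size 1 + grid_size**2, with the
    CLS token assumed to be at index 0 (an assumption of this sketch).
    """
    n_patches = grid_size * grid_size
    if len(relevance) != n_patches + 1:
        raise ValueError("expected one CLS token plus grid_size**2 patches")
    # Take the CLS row and drop its first entry, the CLS-to-CLS score.
    patch_scores = relevance[0][1:]
    # Min-max scale for display only (an assumption, not stated in the thread).
    lo, hi = min(patch_scores), max(patch_scores)
    span = (hi - lo) or 1.0
    scaled = [(s - lo) / span for s in patch_scores]
    # Reshape the flat patch scores into rows of the patch grid.
    return [scaled[r * grid_size:(r + 1) * grid_size] for r in range(grid_size)]


# Toy 5x5 relevance matrix: identity plus small off-diagonal scores.
n = 5
R = [[(1.0 if i == j else 0.0) + 0.01 * (i + 2 * j) for j in range(n)]
     for i in range(n)]
before = cls_relevance_map(R, grid_size=2)

# Inflate the CLS diagonal entry; the extracted patch map is unaffected.
R[0][0] += 100.0
after = cls_relevance_map(R, grid_size=2)
print(before == after)
```

Because the CLS-to-CLS entry is sliced away before anything is scaled or reshaped, inflating it leaves the extracted map untouched, which is the point made above about the CLS row.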

I hope this answers your question. Please let me know if you need any further clarification.

Best,
Hila.

hila-chefer commented on May 19, 2024

@zhaoxin94 closing this issue due to inactivity. Please reopen if necessary.
