Hi All, Thanks for providing the code. I come across the mismatc

code mismatch with the theory about cvt HOT 10 OPEN

microsoft commented on July 4, 2024

code mismatch with the theory

from cvt.

Comments (10)

basavaraj-hampiholi commented on July 4, 2024

Hi @diaodeyi,
In the present code, the get_cls_model function is called through registry.py. You can use build_model in build.py to build the model. Alternatively, you can remove the registry and call get_cls_model directly. Both ways should work.

Good luck.
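To make the two options above concrete, here is a minimal sketch of the registry pattern in plain Python. It is not the repository's code: the registry dict, the key used for lookup (the function name), the config dictionary, and the dummy return value are all assumptions made for illustration, and the real build_model and get_cls_model may take different arguments.

```python
# Hypothetical, simplified registry. Names mirror the thread, but the
# signatures and the lookup key are assumptions, not the repository's API.
_model_entrypoints = {}


def register_model(fn):
    """Decorator that records a model factory under its function name."""
    _model_entrypoints[fn.__name__] = fn
    return fn


def build_model(name, config):
    """Look up a registered factory by name and call it with the config."""
    if name not in _model_entrypoints:
        raise ValueError(f"Unknown model: {name}")
    return _model_entrypoints[name](config)


@register_model
def get_cls_model(config):
    # Stand-in for the real factory: returns a description, not a network.
    return {"type": "cls_model", "num_classes": config["num_classes"]}


config = {"num_classes": 1000}

# Option 1: go through the registry.
via_registry = build_model("get_cls_model", config)

# Option 2: skip the registry and call the factory directly.
direct = get_cls_model(config)

assert via_registry == direct
```

Either path ends up calling the same factory; the registry only adds a lookup by name, which is convenient when the model is selected from a config file.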

askerlee commented on July 4, 2024

This is after conv_proj_q, conv_proj_k and conv_proj_v. But I'm not sure why the authors still use the pointwise projections after the conv projections.

basavaraj-hampiholi commented on July 4, 2024

@askerlee, I think it is part of depthwise separable convolutions. Depthwise convolutions followed by pointwise projections.

diaodeyi commented on July 4, 2024

I want to know how to call the get_cls_model function in cls_cvt.py.

diaodeyi commented on July 4, 2024

By the way, does self.proj = nn.Linear(dim_out, dim_out) mean the FFN is only a projection with the same dimension?

basavaraj-hampiholi commented on July 4, 2024

@diaodeyi It's the single linear layer (with the same in/out dimension) right after the attention calculation. The FFN in this code is class MLP (line 53).
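The distinction can be sketched with a generic transformer block in PyTorch. This is an illustration of the standard layout, not the CvT source: the class names TinyAttention, TinyMlp and TinyBlock, the head count, and the hidden ratio are made up for the example, and the convolutional projections discussed in this thread are left out.

```python
import torch
import torch.nn as nn


class TinyAttention(nn.Module):
    """Generic multi-head self-attention (illustrative, not the CvT code)."""

    def __init__(self, dim, num_heads=2):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        # Output projection: one linear layer with equal in/out dimension,
        # applied right after the attention-weighted sum. It is not the FFN.
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        b, n, c = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, c // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4).unbind(0)  # each (b, heads, n, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, c)
        return self.proj(out)


class TinyMlp(nn.Module):
    """The FFN: two linear layers with a nonlinearity in between."""

    def __init__(self, dim, hidden_ratio=4):
        super().__init__()
        self.fc1 = nn.Linear(dim, dim * hidden_ratio)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(dim * hidden_ratio, dim)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))


class TinyBlock(nn.Module):
    """Pre-norm block: attention (with its proj), then the separate FFN."""

    def __init__(self, dim):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = TinyAttention(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = TinyMlp(dim)

    def forward(self, x):
        x = x + self.attn(self.norm1(x))
        x = x + self.mlp(self.norm2(x))
        return x


torch.manual_seed(0)
block = TinyBlock(dim=8)
tokens = torch.randn(2, 5, 8)  # (batch, tokens, channels)
out = block(tokens)
assert out.shape == tokens.shape
```

In this layout proj belongs to the attention module and mixes the concatenated heads, while the FFN is a separate sub-layer with its own residual connection.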

diaodeyi commented on July 4, 2024

Thanks, there are so many linear projections that aren't mentioned in the paper.

basavaraj-hampiholi commented on July 4, 2024

@diaodeyi Yes. I think they left them out on the presumption that the reader already has a good understanding of the basic transformer architecture.

diaodeyi commented on July 4, 2024

> @askerlee, I think it is part of depthwise separable convolutions. Depthwise convolutions followed by pointwise projections.

No, I think proj_q/k/v are exactly the things the paper does not mention.

Markin-Wang commented on July 4, 2024

>> @askerlee, I think it is part of depthwise separable convolutions. Depthwise convolutions followed by pointwise projections.

> No, I think proj_q/k/v are exactly the things the paper does not mention.

Hi, the depthwise separable conv contains two parts: a depth-wise conv and a point-wise conv. The authors implemented the point-wise conv via the linear layer, maybe because it is convenient for the ablation study. The only difference between them is the bias term.
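A quick way to check that a point-wise (1x1) convolution and a linear layer compute the same thing is to copy the weights from one into the other and compare outputs. The sketch below does this in PyTorch; the tensor shapes and layer sizes are arbitrary choices for the example, and it says nothing about how the bias is configured in the repository.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch, c_in, c_out, height, width = 2, 8, 16, 4, 4
x = torch.randn(batch, c_in, height, width)  # feature map (B, C, H, W)

# Point-wise convolution: a 1x1 kernel mixes channels at each position.
pointwise = nn.Conv2d(c_in, c_out, kernel_size=1, bias=True)

# Linear layer applied to each token (each spatial position) independently.
linear = nn.Linear(c_in, c_out, bias=True)

# Give both layers identical parameters.
with torch.no_grad():
    linear.weight.copy_(pointwise.weight.view(c_out, c_in))
    linear.bias.copy_(pointwise.bias)

# Conv path: convolve, then flatten H*W into a token axis -> (B, H*W, C_out).
y_conv = pointwise(x).flatten(2).transpose(1, 2)

# Linear path: flatten to tokens first, then project -> (B, H*W, C_out).
tokens = x.flatten(2).transpose(1, 2)
y_linear = linear(tokens)

assert torch.allclose(y_conv, y_linear, atol=1e-5)
```

With matching weights and bias the two paths agree up to floating-point error, so choosing a linear layer over a 1x1 convolution is an implementation detail rather than a change to the operation.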
