Code for Learned Thresholds Token Merging and Pruning for Vision Transformers (LTMP). A technique that reduces Vision Transformers to any desired computational budget with minimal loss of accuracy.
Hello, may I ask which part of the code implements the paper's core idea of "learning thresholds for merging and pruning"?
Hi, I have a question about implementation details regarding learned threshold merging.
In this line of code, you detach the generated mask, which still receives gradients via the straight-through trick.
In my understanding, the threshold can still be learned through the FLOPs loss. Is there another reason for applying a stop-gradient to the mask before multiplying it into the features? Does training become harder if no stop-gradient is applied?
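For context, the straight-through pattern being discussed can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `straight_through_mask` and the sigmoid relaxation with a `temperature` parameter are assumptions for the example. The forward pass uses a hard binary mask, while gradients flow to the learned threshold through a soft sigmoid surrogate:

```python
import torch

def straight_through_mask(scores: torch.Tensor,
                          threshold: torch.Tensor,
                          temperature: float = 0.1) -> torch.Tensor:
    # Soft mask: differentiable, so the learned threshold receives gradients
    soft = torch.sigmoid((scores - threshold) / temperature)
    # Hard mask: the binary keep/drop decision used in the forward pass
    hard = (scores > threshold).float()
    # Straight-through estimator: forward value equals `hard`,
    # backward gradient flows through `soft`
    return hard.detach() - soft.detach() + soft

# Hypothetical usage: keep tokens whose importance score exceeds the threshold
scores = torch.tensor([0.2, 0.8])
threshold = torch.tensor(0.5, requires_grad=True)
mask = straight_through_mask(scores, threshold)
```

The question above then concerns whether one should compute `features * mask` (task-loss gradients also reach the threshold through the features) or `features * mask.detach()` (the threshold is updated only by a regularizer such as a FLOPs loss).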