

smt's People

Contributors

afeng-x


smt's Issues

mmdet

Your work is great; thank you for your contribution to science. However, I am having trouble reproducing it: TypeError: RetinaNet: init_weights() got an unexpected keyword argument 'pretrained'. May I ask which version of mmdet you used? I hit this error with both Mask R-CNN and RetinaNet.
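For anyone hitting the same error: this TypeError usually indicates an mmdet/mmcv version mismatch, since newer mmdet releases removed the pretrained argument from init_weights() in favor of init_cfg. A hedged sketch of the newer-style config follows; the backbone type name and checkpoint path are placeholders, not values confirmed by this repo:

```python
# Hypothetical mmdet config snippet illustrating the init_cfg style that
# replaced `pretrained` in newer mmdet releases. The backbone `type` and
# `checkpoint` path are placeholders, not taken from this repo.
model = dict(
    type='RetinaNet',
    backbone=dict(
        type='SMT',  # placeholder backbone name
        init_cfg=dict(type='Pretrained', checkpoint='path/to/smt_pretrained.pth'),
    ),
)
```

Alternatively, pinning mmdet/mmcv to the versions the authors developed against should avoid the mismatch entirely.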

OutOfMemoryError

Hi author, your code is great, but when I plug your SMT module into my training I always run out of memory at the attention computation, attn = (q @ k.transpose(-2, -1)) * self.scale, even with the batch size set to 1. Could you give me some ideas on how to modify it? I am only using the stage-3 structure.
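One common workaround (a minimal sketch of standard query-chunked attention, not code from this repo) is to avoid materializing the full N x N attention matrix at once. The softmax over keys is independent per query row, so processing queries in chunks is mathematically exact while bounding peak memory:

```python
import torch

def chunked_attention(q, k, v, scale, chunk_size=1024):
    # q, k, v: (B, heads, N, head_dim). Processes queries in chunks so only a
    # (chunk_size x N) slice of the attention matrix exists at any one time.
    out = []
    for start in range(0, q.shape[-2], chunk_size):
        q_chunk = q[..., start:start + chunk_size, :]
        attn = (q_chunk @ k.transpose(-2, -1)) * scale  # (B, heads, chunk, N)
        attn = attn.softmax(dim=-1)
        out.append(attn @ v)                            # (B, heads, chunk, head_dim)
    return torch.cat(out, dim=-2)
```

On PyTorch 2.0+, torch.nn.functional.scaled_dot_product_attention also provides memory-efficient kernels that avoid building the explicit attention matrix.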

About the pth file

Thanks for your team's efforts. Could you please share this weight file?

How to draw the Fig.4?

Thank you for sharing your work. May I ask how Figure 4 in the paper was drawn? Could you provide the code?

Query about the motivation

Hi there! This is nice work, but I have a small query about the motivation for the architecture design. The paper states: "Based on the research conducted in [11, 4], which performed a quantitative analysis of different depths of self-attention blocks and discovered that shallow blocks tend to capture short-range dependencies while deeper ones capture long-range dependencies".

To my knowledge, a transformer can always model globally and can attain a high effective receptive field even in its initial stages. Why, then, do you say that shallow blocks capture short-range dependencies while only the deeper ones capture long-range dependencies, rather than both capturing long-range dependencies from the start?

DW_Conv

Thank you very much for sharing your great work. I noticed in the paper that DW_Conv is used in the SAM module. If I replace DW_Conv with vanilla convolution, will it improve performance, beyond increasing the parameter count?
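For context on the parameter cost of that swap, here is a quick comparison (my own illustration with an assumed channel count of 64 and a 3x3 kernel, not values from the paper):

```python
import torch.nn as nn

c, k = 64, 3  # assumed channel count and kernel size, for illustration only
dw = nn.Conv2d(c, c, k, padding=k // 2, groups=c)  # depthwise conv
vanilla = nn.Conv2d(c, c, k, padding=k // 2)       # standard conv

n_params = lambda m: sum(p.numel() for p in m.parameters())
print(n_params(dw), n_params(vanilla))  # 640 vs. 36928: roughly a C-fold increase
```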

Training smt_t on IN-1K from scratch: 65% top-1 after 40 epochs

I am not sure I am training correctly. My settings:

  • optimizer: AdamW
  • base lr: 1e-3, with warmup over the first 5 epochs
  • batch size: 128
  • total epochs: 150 (not 300, because GPU time is too long: about 10 days for 150 epochs on 4 T4 GPUs)
  • lr scheduler: cosine from the 5th epoch (a sketch of this schedule appears after the questions below)

After 40 epochs I am at 65% top-1, and the accuracy is now increasing slowly each epoch. Apart from the total epochs (150 instead of the default 300), everything is almost the same as in this repo. I would like to know two things:

1. Will setting the total epochs to 150 have a big influence on the final result?
2. Is 65% top-1 at 40 epochs normal? Training takes a long time, so I want a sanity check up front; if something is wrong, I will stop early.
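A minimal sketch of the schedule described above (my reconstruction in plain PyTorch, not the repo's actual config, which presumably uses timm's scheduler): 5 warmup epochs ramping linearly to the base lr, then cosine decay over the remaining epochs.

```python
import math
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

def make_scheduler(optimizer, warmup_epochs=5, total_epochs=150):
    # Linear warmup for the first `warmup_epochs`, then cosine decay to 0.
    def lr_lambda(epoch):
        if epoch < warmup_epochs:
            return (epoch + 1) / warmup_epochs
        progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
        return 0.5 * (1.0 + math.cos(math.pi * progress))
    return LambdaLR(optimizer, lr_lambda)

model = torch.nn.Linear(8, 8)  # placeholder model, not SMT
opt = AdamW(model.parameters(), lr=1e-3)
sched = make_scheduler(opt)    # call sched.step() once per epoch
```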

Long training time

Why does the time to train 10 batches increase several-fold when the backbone is switched from CrossFormer-S to SMT-T, even though SMT-T has far fewer parameters and much less computation than CrossFormer-S?

I'm having a problem, can anyone help me with it please?

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2694) of binary: /usr/bin/python
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/run.py", line 713, in run
    )(*cmd_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

ImportError: cannot import name 'TorchDispatchMode' from 'torch.utils._python_dispatch'

Thank you for your sharing! When I followed the instructions to try this project and ran evaluation on ImageNet-1K, I got this error: ImportError: cannot import name 'TorchDispatchMode' from 'torch.utils._python_dispatch'. My conda list is below (a version check sketch follows after the list):

Name Version Build Channel

_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
ca-certificates 2024.3.11 h06a4308_0
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_1
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
ncurses 6.4 h6a678d5_0
numpy 1.24.1 pypi_0 pypi
opencv-python 4.4.0.46 pypi_0 pypi
openssl 3.0.13 h7f8727e_2
pillow 10.2.0 pypi_0 pypi
pip 24.0 py38h06a4308_0
ptflops 0.7.3 pypi_0 pypi
python 3.8.19 h955ad1f_0
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h5eee18b_0
scipy 1.10.1 pypi_0 pypi
setuptools 69.5.1 py38h06a4308_0
sqlite 3.45.3 h5eee18b_0
termcolor 1.1.0 pypi_0 pypi
thop 0.1.1-2209072238 pypi_0 pypi
timm 0.4.12 pypi_0 pypi
tk 8.6.14 h39e8969_0
torch 1.10.0+cu113 pypi_0 pypi
torchaudio 0.10.0+cu113 pypi_0 pypi
torchvision 0.11.1+cu113 pypi_0 pypi
typing-extensions 4.9.0 pypi_0 pypi
wheel 0.43.0 py38h06a4308_0
xz 5.4.6 h5eee18b_1
yacs 0.1.8 pypi_0 pypi
zlib 1.2.13 h5eee18b_1
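For reference (my own check, not from this thread): TorchDispatchMode was added to torch.utils._python_dispatch in a later PyTorch release, and the torch 1.10.0 listed above predates it, so any dependency that imports it (the FLOPs counters in this environment are likely candidates) will fail. A minimal way to confirm:

```python
# Guarded import to confirm whether the installed PyTorch provides
# TorchDispatchMode (absent in torch 1.10.0, present in later releases).
import torch

print(torch.__version__)
try:
    from torch.utils._python_dispatch import TorchDispatchMode
    print("TorchDispatchMode is available")
except ImportError:
    print("TorchDispatchMode is missing: upgrade torch, or pin the "
          "package that imports it to an older version")
```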

Visualization of modulation values, relative receptive field

Thank you for your great work. I am quite interested in your visualizations; they would contribute a great deal to the research community.
It would be great if the authors could provide the code to visualize feature maps across heads, and to compute and draw relative receptive fields.
Thank you so much in advance.
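While waiting for the authors' code, here is a minimal sketch of the common gradient-based way to estimate an effective receptive field (my own illustration, assuming the model returns a (B, C, H, W) feature map; this is not the authors' visualization code):

```python
import torch

def effective_receptive_field(model, input_size=(1, 3, 224, 224)):
    # Backprop from the spatial center of the output feature map to the input;
    # the input-gradient magnitude shows which pixels influence that unit.
    x = torch.randn(input_size, requires_grad=True)
    y = model(x)                          # assumed shape: (B, C, H, W)
    h, w = y.shape[-2:]
    y[..., h // 2, w // 2].sum().backward()
    return x.grad.abs().sum(dim=1)[0]     # (H_in, W_in) saliency map
```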
