
interformer's Introduction

InterFormer

This repo is the official implementation of "InterFormer: Real-time Interactive Image Segmentation"

Introduction

InterFormer follows a new pipeline that addresses the low computational efficiency of existing pipelines. It decouples the computationally expensive part, i.e. image processing, from the interaction loop: a large vision transformer (ViT) on high-performance devices preprocesses images in parallel, and a lightweight module called interactive multi-head self-attention (I-MSA) then performs the interactive segmentation. This design enables InterFormer to achieve real-time, high-quality interactive segmentation on CPU-only devices.
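
As a rough sketch of this decoupling (the module names, shapes, and click encoding below are illustrative placeholders, not the repo's actual classes), the expensive encoder runs once per image while only the lightweight decoder re-runs per click:

import torch
import torch.nn as nn

# Hypothetical stand-ins for InterFormer's two stages; the real repo's
# modules differ, this only illustrates the pipeline split.
class HeavyEncoder(nn.Module):  # plays the role of the large ViT
    def __init__(self, dim=256):
        super().__init__()
        self.proj = nn.Conv2d(3, dim, kernel_size=16, stride=16)

    def forward(self, image):
        return self.proj(image)  # feature map computed once per image

class LightDecoder(nn.Module):  # plays the role of the I-MSA module
    def __init__(self, dim=256):
        super().__init__()
        self.head = nn.Conv2d(dim + 2, 1, kernel_size=1)

    def forward(self, feats, click_maps):
        return self.head(torch.cat([feats, click_maps], dim=1))

encoder, decoder = HeavyEncoder(), LightDecoder()
image = torch.rand(1, 3, 512, 512)
with torch.no_grad():
    feats = encoder(image)  # expensive: run once, e.g. on a GPU server
    for _ in range(3):      # interaction loop: cheap enough per click on CPU
        click_maps = torch.zeros(1, 2, *feats.shape[-2:])  # pos/neg click maps
        mask = decoder(feats, click_maps).sigmoid()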

Demo

The demo GIF animations in the repository were recorded on CPU-only devices.

Usage

Install

Requirements

Ensure the following requirements are met before proceeding with the installation (a quick version check is sketched after the list):

  • Python 3.8+
  • PyTorch 1.12.0
  • mmcv-full 1.6.0
  • mmsegmentation 0.26.0
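
A minimal sanity check of these pins, run inside the target environment (it only prints the installed versions):

# Print the installed versions against the pinned requirements.
import sys
import torch, mmcv, mmseg

print("Python        :", sys.version.split()[0])  # expect 3.8+
print("PyTorch       :", torch.__version__)       # expect 1.12.0
print("mmcv-full     :", mmcv.__version__)        # expect 1.6.0
print("mmsegmentation:", mmseg.__version__)       # expect 0.26.0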

Install PyTorch

To install PyTorch, please refer to the following resource: INSTALLING PREVIOUS VERSIONS OF PYTORCH

Install mmcv-full

pip install -U openmim
mim install mmcv-full==1.6.0

Install mmsegmentation

cd mmsegmentation
pip install -e .

Install Additional Dependency

pip install -r requirements.txt

Data preparation

COCO Dataset

To download the COCO dataset, please refer to cocodataset. Download the 2017 Train images, the 2017 Val images, and the 2017 Panoptic Train/Val annotations into data.

Alternatively, you can use the following script:

cd data/coco2017
bash coco2017.sh

The data is organized as follows:

data/coco2017/
├── annotations
│   ├── panoptic_train2017 [118287 entries exceeds filelimit, not opening dir]
│   ├── panoptic_train2017.json
│   ├── panoptic_val2017 [5000 entries exceeds filelimit, not opening dir]
│   └── panoptic_val2017.json
├── coco2017.sh
├── train2017 [118287 entries exceeds filelimit, not opening dir]
└── val2017 [5000 entries exceeds filelimit, not opening dir]
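
A minimal layout check against the tree above (paths follow the listing shown; adjust the root if your data lives elsewhere):

# Verify that the expected COCO files and folders exist under data/coco2017.
from pathlib import Path

root = Path("data/coco2017")
expected = [
    "annotations/panoptic_train2017",
    "annotations/panoptic_train2017.json",
    "annotations/panoptic_val2017",
    "annotations/panoptic_val2017.json",
    "train2017",
    "val2017",
]
for rel in expected:
    print("ok     " if (root / rel).exists() else "MISSING", rel)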

LVIS Dataset

To download the LVIS images and annotations, please refer to lvisdataset.

The data is organized as follows:

data/lvis/
├── lvis_v1_train.json
├── lvis_v1_train.json.zip
├── lvis_v1_val.json
├── lvis_v1_val.json.zip
├── train2017 [118287 entries exceeds filelimit, not opening dir]
├── train2017.zip
├── val2017 [5000 entries exceeds filelimit, not opening dir]
└── val2017.zip

SBD Dataset

To download the SBD dataset, please refer to SBD.

The data is organized as follows:

data/sbd/
├── benchmark_RELEASE
│   ├── dataset
│   │   ├── cls [11355 entries exceeds filelimit, not opening dir]
│   │   ├── img [11355 entries exceeds filelimit, not opening dir]
│   │   ├── inst [11355 entries exceeds filelimit, not opening dir]
│   │   ├── train.txt
│   │   └── val.txt
└── benchmark.tgz

DAVIS & GrabCut & Berkeley Datasets

Please download the DAVIS, GrabCut, and Berkeley datasets from Reviving Iterative Training with Mask Guidance for Interactive Segmentation.

The data is organized as follows:

data/
├── berkeley
│   └── Berkeley
│       ├── gt [100 entries exceeds filelimit, not opening dir]
│       ├── img [100 entries exceeds filelimit, not opening dir]
│       └── list
│           └── val.txt
├── davis
│   └── DAVIS
│       ├── gt [345 entries exceeds filelimit, not opening dir]
│       ├── img [345 entries exceeds filelimit, not opening dir]
│       └── list
│           ├── val_ctg.txt
│           └── val.txt
└── grabcut
    └── GrabCut
        ├── gt [50 entries exceeds filelimit, not opening dir]
        ├── img [50 entries exceeds filelimit, not opening dir]
        └── list
            └── val.txt

Training

MAE-Pretrained Weight

To download and transform the MAE-pretrained weights into mmseg-style, please refer to MAE.

For example:

python tools/model_converters/beit2mmseg.py https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth pretrain/mae_pretrain_vit_base_mmcls.pth

The required weight files are located in the pretrain directory and are organized as follows:

pretrain
├── mae_pretrain_vit_base_mmcls.pth
└── mae_pretrain_vit_large_mmcls.pth
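
To sanity-check a converted file, a quick load suffices (a minimal sketch; the key names printed depend on what the converter produced, so nothing here assumes a particular schema):

# Load the converted checkpoint on CPU and peek at its contents.
import torch

ckpt = torch.load("pretrain/mae_pretrain_vit_base_mmcls.pth", map_location="cpu")
state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
print(len(state), "entries; first keys:", list(state)[:5])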

Start Training

To start the training of InterFormer-Light, run the following script:

CUDA_VISIBLE_DEVICES=0,1,2,3 bash tools/dist_train.sh configs/interformer_light_coco_lvis_320k.py 4 --seed 42 --no-validate

To train InterFormer-Tiny, use the following script:

CUDA_VISIBLE_DEVICES=0,1,2,3 bash tools/dist_train.sh configs/interformer_tiny_coco_lvis_320k.py 4 --seed 42 --no-validate

The trained weights are stored in work_dirs/interformer_light_coco_lvis_320k or work_dirs/interformer_tiny_coco_lvis_320k.

Evaluation

The trained weights are available at InterFormer

To start the evaluation on the GrabCut, Berkeley, SBD, or DAVIS dataset, use the following script:

CUDA_VISIBLE_DEVICES=0,1,2,3 bash tools/dist_clicktest.sh ${CHECKPOINT_FILE} ${GPU_NUM} [--dataset ${DATASET_NAME}] [--size_divisor ${SIZE_DIVISOR}]

where CHECKPOINT_FILE is the path to the trained weight file, GPU_NUM is the number of GPUs used for evaluation, DATASET_NAME is the name of the dataset to evaluate on, and SIZE_DIVISOR is the divisor used to pad the input image. The script looks for the config file (a .py file) in the same folder as CHECKPOINT_FILE, as sketched below.
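
A sketch of that config lookup (it mirrors the description above, not necessarily the repo's exact logic):

# Pick the single .py config sitting next to the checkpoint file.
from pathlib import Path

def find_config(checkpoint_file):
    folder = Path(checkpoint_file).parent
    configs = sorted(folder.glob("*.py"))
    if len(configs) != 1:
        raise FileNotFoundError(
            f"expected exactly one .py config in {folder}, found {len(configs)}")
    return configs[0]

# find_config("work_dirs/interformer_tiny_coco_lvis_320k/iter_320000.pth")
# -> work_dirs/interformer_tiny_coco_lvis_320k/interformer_tiny_coco_lvis_320k.py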

For example, assume the data is organized as follows:

work_dirs/
└── interformer_tiny_coco_lvis_320k
    ├── interformer_tiny_coco_lvis_320k.py
    └── iter_320000.pth

To evaluate on SBD with InterFormer-Tiny, run:

CUDA_VISIBLE_DEVICES=0,1,2,3 bash tools/dist_clicktest.sh work_dirs/interformer_tiny_coco_lvis_320k/iter_320000.pth 4 --dataset sbd --size_divisor 32

This command will start the evaluation by specifying the trained weight file work_dirs/interformer_tiny_coco_lvis_320k/iter_320000.pth and loading the configuration file interformer_tiny_coco_lvis_320k.py in the same folder.

The results are stored in work_dirs/interformer_tiny_coco_lvis_320k/clicktest_sbd_iter_320000_xxxx.json.
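
The result files can be inspected directly. The JSON schema is not documented here, so this sketch only loads each matching file and prints its top-level structure:

# List clicktest result files and show what each contains at the top level.
import json
from pathlib import Path

results_dir = Path("work_dirs/interformer_tiny_coco_lvis_320k")
for path in sorted(results_dir.glob("clicktest_sbd_iter_320000_*.json")):
    with open(path) as f:
        results = json.load(f)
    top = list(results)[:5] if isinstance(results, dict) else f"list of {len(results)} entries"
    print(path.name, "->", top)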

Running Demo

To run the demo directly with Python, use the following command in your terminal:

python demo/main.py path/to/checkpoint --device [cpu|cuda:0]

where:

  • path/to/checkpoint specifies the path to the checkpoint file that will be loaded before running the program.
  • --device specifies the device to use: cpu or a CUDA device such as cuda:0.

Here's an example script to run the demo:

python demo/main.py work_dirs/interformer_tiny_coco_lvis_320k/iter_320000.pth --device cpu

interformer's People

Contributors

youhuang67


interformer's Issues

Reproducing results

Dear authors, @YouHuang67

Thanks for releasing the code and models. I am trying to rerun your released light model on DAVIS, but the numbers are better than those reported in the paper. For example, I got 5.54 for NoC90, whereas you reported 6.19. Also, your code crashes on case 008.jpg, so I had to remove it from the evaluation. Missing one case should not cause such a big difference, so I may need your help to figure out the issue.

  1. How did you handle the crashing case 008.jpg?
  2. Did you merge all the objects in an image for evaluation, as previous works did?
  3. Did you do any processing on the original DAVIS345 dataset? I saw you at least renamed the masks.

By the way, the results for Berkeley are the same. I look forward to hearing from you! Thanks in advance.

Training code

Sorry to bother you again. I have been tracing your code, but I still cannot find the code that implements the complete training flow. Could you give me a hint?

Training Flow

[Figure: comparison of the SimpleClick and InterFormer training flows]
In SimpleClick, each image undergoes 1 to 3 iterations, and during those iterations the previous outputs and the new coordinate features are processed through the entire model each time. In your architecture, the previous outputs and the new coordinate features are fed into 'Feature Decoding', so I believe the image features from 'Feature Encoding' are reused across the iterations for one image.
Is my understanding correct?

About different image sizes in Table 2

In Table 2 of the paper, you adopt different image sizes for different methods (512 for InterFormer). Wouldn't that make the performance (NoC) comparison unfair?

AttributeError: 'ConfigDict' object has no attribute 'test'

Hello author, thank you very much for your outstanding contribution.
Following the Evaluation section, I ran the command to evaluate on SBD with InterFormer-Tiny: CUDA_VISIBLE_DEVICES=0,1,2,3 bash tools/dist_clicktest.sh work_dirs/interformer_tiny_coco_lvis_320k/iter_320000.pth 4 --dataset sbd --size_divisor 32.
It fails with the following error:

Traceback (most recent call last):
File "/home/student1/wbj/InterFormer/tools/clicktest.py", line 316, in <module>
main()
File "/home/student1/wbj/InterFormer/tools/clicktest.py", line 219, in main
dataset = build_dataset(cfg.data.test)
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/mmcv/utils/config.py", line 50, in __getattr__
raise ex
AttributeError: 'ConfigDict' object has no attribute 'test'

How should I solve it? (See the config sketch after the full log below.)

The full output of the error:
(py38_wbj) student1@user-PowerEdge-R555:~$ CUDA_VISIBLE_DEVICES=0,1 bash /home/student1/wbj/InterFormer/tools/dist_clicktest.sh "/home/student1/wbj/InterFormer/work_dirs/interformer_tiny_coco_lvis_320k/iter_320000.pth" 2 --dataset berkeley --size_divisor 32
/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

warnings.warn(
WARNING:torch.distributed.run:


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


2023-07-05 01:55:19,481 - mmseg - INFO - Multi-processing start method is None
2023-07-05 01:55:19,482 - mmseg - INFO - OpenCV num_threads is 32
2023-07-05 01:55:19,482 - mmseg - INFO - OMP num threads is 1
2023-07-05 01:55:20,168 - mmseg - INFO - Multi-processing start method is None
2023-07-05 01:55:20,169 - mmseg - INFO - OpenCV num_threads is 32
2023-07-05 01:55:20,169 - mmseg - INFO - OMP num threads is 1
Traceback (most recent call last):
File "/home/student1/wbj/InterFormer/tools/clicktest.py", line 316, in <module>
main()
File "/home/student1/wbj/InterFormer/tools/clicktest.py", line 219, in main
dataset = build_dataset(cfg.data.test)
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/mmcv/utils/config.py", line 50, in __getattr__
raise ex
AttributeError: 'ConfigDict' object has no attribute 'test'
Traceback (most recent call last):
File "/home/student1/wbj/InterFormer/tools/clicktest.py", line 316, in <module>
main()
File "/home/student1/wbj/InterFormer/tools/clicktest.py", line 219, in main
dataset = build_dataset(cfg.data.test)
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/mmcv/utils/config.py", line 50, in __getattr__
raise ex
AttributeError: 'ConfigDict' object has no attribute 'test'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 11982) of binary: /home/student1/anaconda3/envs/py38_wbj/bin/python
Traceback (most recent call last):
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in
main()
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/student1/anaconda3/envs/py38_wbj/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

/home/student1/wbj/InterFormer/tools/clicktest.py FAILED

Failures:
[1]:
time : 2023-07-05_01:55:22
host : user-PowerEdge-R555
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 15984)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2023-07-05_01:55:22
host : user-PowerEdge-R555
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 15982)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
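
For context on the error above: build_dataset(cfg.data.test) requires the loaded config to define a data dict with a test entry. An mmseg-0.x-style fragment looks roughly like the sketch below; the dataset type, paths, and pipeline are placeholders, not InterFormer's actual test config.

# Illustrative mmseg-0.x-style config fragment (placeholders only).
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=2,
    test=dict(
        type='CustomDataset',                      # placeholder dataset class
        data_root='data/sbd',                      # placeholder path
        img_dir='benchmark_RELEASE/dataset/img',   # placeholder path
        ann_dir='benchmark_RELEASE/dataset/inst',  # placeholder path
        pipeline=[]))                              # test pipeline omitted here

# The AttributeError above is raised when the config picked up next to the
# checkpoint does not define this 'test' entry.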
