
Comments (14)

wondervictor commented on September 14, 2024

Besides, reporting the inference time and the corresponding accuracy with single-scale input, without cropping or flipping, would be a fairer comparison with other methods.


wondervictor commented on September 14, 2024

Just to be clear, SFNet adopts single-scale testing with a 1024x2048 input, and BiSeNet adopts a downsampled version of the 1024x2048 input. Neither of them adopts multi-scale testing, sliding-window evaluation, or flipping at test time. Notably, we fixed AttaNet's input size to 1024x2048 or 512x1024 but failed to reach the results reported in the paper (see the discussions above; I'm not alone). Moreover, measuring speed without torch.cuda.synchronize() is a serious bug and leads to wrong inference times (time w/ synchronize >> time w/o synchronize).
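
For reference, a minimal timing sketch with proper synchronization. This is not the repo's inference.py; the model and input shape are placeholders, and only the timing pattern matters:

```python
import time
import torch

@torch.no_grad()
def measure_fps(model, input_shape=(1, 3, 1024, 2048), warmup=50, iters=500):
    # Dummy GPU input; replace with the real preprocessing pipeline.
    x = torch.randn(*input_shape, device="cuda")
    model.eval().cuda()

    # Warm up so cuDNN autotuning and lazy allocations don't skew the timing.
    for _ in range(warmup):
        model(x)

    # CUDA kernels launch asynchronously, so synchronize before reading the
    # clock and again before stopping it; otherwise the CPU timer stops
    # before the GPU has actually finished the work.
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    print(f"Inference time {1000 * elapsed / iters:.3f} ms, FPS {iters / elapsed:.2f}")
```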

The actual speed and accuracy of the proposed AttaNet deserve more attention. Providing correct evaluation scripts is urgent, since the repo has been open sourced for several months.
Thanks.


wondervictor commented on September 14, 2024

Further, I've downloaded the code & models and evaluated the speed and accuracy on my local machine.
Specs: GPU: NVIDIA Titan Xp, CPU: 2 Intel Xeon E5-2620 v3.

Model: AttaNet w/ ResNet-18

Speed: 1024x2048 input size

inference.py outputs:

load resnet
start warm up
warm up done
=======================================
FPS: 24.972443
Inference time 40.044140 ms

Accuracy: 1024x2048 input size, w/o crop and flip

evaluate.py outputs:

================================================================================
evaluating the model ...

setup and restore model
load resnet
compute the mIOU
100%|██████████| 500/500 [06:09<00:00, 1.35it/s]
[0.98095101 0.84992488 0.91837538 0.64723809 0.65401004 0.5827008
 0.61811279 0.75313067 0.91223381 0.68902523 0.92999311 0.77462518
 0.5425117  0.94219012 0.85731529 0.87147848 0.79053572 0.52289506
 0.739424  ]
0.7671932295569451
mIOU is: 0.767193


ydhongHIT commented on September 14, 2024

I find that the speed test code does not use torch.cuda.synchronize().


wondervictor commented on September 14, 2024

Hi @liuzhidemaomao, your results (76.7 mIoU and 55.2 FPS with a 1024x2048 input) are consistent with mine regardless of the GPU. (The results from Table 1 of the original paper are 78.5 mIoU and 130 FPS on a 1080Ti, which is much slower than a 2080Ti.)
In my opinion, reporting the speed under the same setting (single-scale or test-time augmentation) as the accuracy evaluation is more convincing and reasonable.
However, using test-time augmentation (crop and flip in evaluate.py) to reach higher accuracy while reporting the speed under a different setting (512x1024 input) is misleading for the community.
Moreover, the other methods cited in Table 1 and Figure 1 adopt the same setting for both accuracy and speed evaluation, as far as I know.


songqi-github commented on September 14, 2024
  • MS/Flip
    To be clear, SFNet uses MS/Flip when testing on ADE20K (see Table 5 in SFNet). In Table 5 of our paper, nearly all the compared methods use MS/Flip; to compare with those methods, we also use MS/Flip on ADE20K.

  • w/ synchronize
    In our code, running w/ or w/o synchronize doesn't change the measured inference speed.

  • Real-time evaluation
    We will upload the weights and the evaluation file for real-time testing soon; please wait for that.


songqi-github commented on September 14, 2024

Hi, thanks for your attention to our paper. There must be some problem with your FPS testing. We have tested several times on our GPU, and we can achieve at least 120 FPS even though our GPU is slower than other cards of the same type. Please check your code and environment. Regarding the accuracy testing, we mainly follow the evaluation protocol of BiSeNetV2 and SFNet to ensure fairness. We use this file for multi-scale testing of the ResNet-50/101 models, not for the real-time accuracy results in our paper. The inference time and the corresponding accuracy use the same testing settings in our paper. We will upload the script used for real-time accuracy testing as soon as possible.


wondervictor commented on September 14, 2024

Hi @songqi-github
Indeed, AttaNet is a great work with some efficient designs.
To compare with SFNet, FANet, etc., which adopt the 1024x2048 input, I modified the script inference.py by changing the input size to 1024x2048 and removing the downsample operation; the model achieved 25 FPS with 76.7 mIoU.
Besides, BiSeNetV2 adopts a 512x1024 input to evaluate both mIoU and inference speed, without evaluation tricks:

We do not adopt any evaluation tricks, e.g., sliding-window evaluation and multi-scale testing, which can improve accuracy but are time-consuming. With the input of 2048 × 1024 resolution, we first resize it to 1024 × 512 resolution to inference and then resize the prediction to the original size of the input. We measure the inference time with only one GPU card and repeat 5000 iterations to eliminate the error fluctuation. We note that the time of resizing is included in the inference time measurement. In other words, when measuring the inference time, the practical input size is 2048 × 1024.
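
For illustration, a minimal sketch of that measurement protocol (downscale, run the model, upscale the prediction, with both resizes inside the timed region). This is not the repo's script; it assumes the model returns a single logits tensor, and the sizes are placeholders:

```python
import time
import torch
import torch.nn.functional as F

@torch.no_grad()
def timed_downscaled_inference(model, image, infer_size=(512, 1024), iters=5000):
    # image: a (1, 3, 1024, 2048) tensor already on the GPU.
    model.eval().cuda()
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        # Resize down, infer, then resize the prediction back to full
        # resolution; both interpolations are counted in the measured time.
        small = F.interpolate(image, size=infer_size, mode="bilinear", align_corners=False)
        logits = model(small)  # assumed single-tensor output
        F.interpolate(logits, size=image.shape[-2:], mode="bilinear", align_corners=False)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    return iters / elapsed  # FPS, with resizing included
```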

In my opinion, reporting the inference time and mIoU without test-time augmentations is more convincing. Otherwise, the time spent inferring the cropped or flipped copies of each input should be added to the reported inference time.
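
As a concrete illustration (a sketch, not the repo's evaluate.py), horizontal-flip TTA alone already doubles the number of forward passes per image, so its cost should show up in any reported inference time:

```python
import torch

@torch.no_grad()
def flip_tta_logits(model, image):
    # Average the logits over the original and the horizontally flipped input.
    # Two forward passes per image, which is why TTA accuracy should not be
    # paired with single-pass speed numbers.
    logits = model(image)
    flipped_logits = model(torch.flip(image, dims=[-1]))
    return (logits + torch.flip(flipped_logits, dims=[-1])) / 2
```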


liuzhidemaomao commented on September 14, 2024

Further, I've downloaded the code & models and evaluated the speed and accuracy on my local machine.
Specs: GPU: NVIDIA Titan Xp, CPU: 2 Intel Xeon E5-2620 v3.

Model: AttaNet w/ ResNet-18

Speed: 1024x2048 input size

inference.py outputs:

load resnet
start warm up
warm up done
=======================================
FPS: 24.972443
Inference time 40.044140 ms

Accuracy: 1024x2048 input size, w/o crop and flip

evaluate.py outputs:

================================================================================
evaluating the model ...

setup and restore model
load resnet
compute the mIOU
100%|██████████| 500/500 [06:09<00:00, 1.35it/s]
[0.98095101 0.84992488 0.91837538 0.64723809 0.65401004 0.5827008
 0.61811279 0.75313067 0.91223381 0.68902523 0.92999311 0.77462518
 0.5425117  0.94219012 0.85731529 0.87147848 0.79053572 0.52289506
 0.739424  ]
0.7671932295569451
mIOU is: 0.767193

I have re-trained and re-evaluated the code on my own machine without any changes.
My environment:
GPU: GeForce RTX 2080ti, CPU: Intel(R) Core(TM) i9-10900X CPU @ 3.70GHz

inference.py output: (screenshot)
evaluate.py output: (screenshot)

After changing inference.py to use an input size of 1024x2048:
The result: (screenshot)


lxtGH commented on September 14, 2024

@wondervictor

Hi @liuzhidemaomao, your results (76.7 mIoU and 55.2 FPS with a 1024x2048 input) are consistent with mine regardless of the GPU. (The results from Table 1 of the original paper are 78.5 mIoU and 130 FPS on a 1080Ti, which is much slower than a 2080Ti.)
In my opinion, reporting the speed under the same setting (single-scale or test-time augmentation) as the accuracy evaluation is more convincing and reasonable.
However, using test-time augmentation (crop and flip in evaluate.py) to reach higher accuracy while reporting the speed under a different setting (512x1024 input) is misleading for the community.
Moreover, the other methods cited in Table 1 and Figure 1 adopt the same setting for both accuracy and speed evaluation, as far as I know.

I agree with you. I cannot reproduce this work using my own codebase. With a 1024x2048 input, I obtain 76.8 mIoU. With a 512x1024 input, the result is very bad.
What are your results using a 512x1024 input?


lxtGH commented on September 14, 2024

I find that the speed test code does not use torch.cuda.synchronize().

Hi @ydhongHIT, interesting. Did you test the speed using torch.cuda.synchronize()?
wutianyiRosun/CGNet#2


ydhongHIT commented on September 14, 2024

I find that the speed test code does not use torch.cuda.synchronize().

Hi @ydhongHIT, interesting. Did you test the speed using torch.cuda.synchronize()?
wutianyiRosun/CGNet#2

I didn't test the speed, but I think it may explain why your measured speed differs from the author's.


songqi-github commented on September 14, 2024

In the previous reply, we already said that we used the same settings for both speed testing and performance evaluation. The given evaluate.py is used for multi-scale testing of the heavy models. We are still working on this repo, and we'll try to release the full code soon. Please check how to implement SAM and AFM first.


BUAA-LKG commented on September 14, 2024

@wondervictor

Hi @liuzhidemaomao, your results (76.7 mIoU and 55.2 FPS with a 1024x2048 input) are consistent with mine regardless of the GPU. (The results from Table 1 of the original paper are 78.5 mIoU and 130 FPS on a 1080Ti, which is much slower than a 2080Ti.)
In my opinion, reporting the speed under the same setting (single-scale or test-time augmentation) as the accuracy evaluation is more convincing and reasonable.
However, using test-time augmentation (crop and flip in evaluate.py) to reach higher accuracy while reporting the speed under a different setting (512x1024 input) is misleading for the community.
Moreover, the other methods cited in Table 1 and Figure 1 adopt the same setting for both accuracy and speed evaluation, as far as I know.

I agree with you. I cannot reproduce this work using my own codebase. With a 1024x2048 input, I obtain 76.8 mIoU. With a 512x1024 input, the result is very bad. What are your results using a 512x1024 input?

Have you managed to reproduce this work?

