
Comments (4)

stefanopini commented on August 25, 2024

Hi,

When using pose_hrnet_w32_256x192.pth, you have to change the number of channels from 48 (default) to 32 and the image resolution from 384x288 to 256x192 (options -c and -r).
Please check the documentation of the script live-demo.py with the command python scripts/live-demo.py --help or the documentation of the class SimpleHRNet for further details.

Yes, you can switch to the YOLOv3-tiny weights by passing their paths to the SimpleHRNet class. If you're using the live-demo.py script, you can replace lines 46 to 56 with:

    model = SimpleHRNet(
        hrnet_c,
        hrnet_j,
        hrnet_weights,
        model_name=hrnet_m,
        resolution=image_resolution,
        multiperson=not single_person,
        return_bounding_boxes=not disable_tracking,
        max_batch_size=max_batch_size,
        # use the YOLOv3-tiny detector config and weights instead of full YOLOv3
        yolo_model_def="./models/detectors/yolo/config/yolov3-tiny.cfg",
        yolo_weights_path="./models/detectors/yolo/weights/yolov3-tiny.weights",
        device=device
    )

It's hard to guess the difference in inference time between these settings; the best option is to try them both.
Please post your inference times with pose_hrnet_w32_256x192.pth and with YOLOv3-tiny! I'll post mine soon.

from simple-hrnet.

antithing commented on August 25, 2024

Thank you!
I am timing only the predict call, like this:

start = time.time()
pts = model.predict(frame)
end = time.time()
print(end - start)
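If you want more stable numbers, a small helper with a warm-up phase and a mean ± std summary might look like this (just a sketch; the `benchmark` name is made up, and `model.predict` / `frame` are the objects from the snippet above):

```python
import statistics
import time

def benchmark(fn, *args, warmup=2, runs=10):
    """Time fn(*args) over several runs, discarding warm-up iterations."""
    for _ in range(warmup):
        fn(*args)  # warm-up: the first calls are often slower (CUDA init, caches)
    times = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic and higher resolution than time.time()
        fn(*args)
        times.append(time.perf_counter() - start)
    return statistics.mean(times), statistics.stdev(times)

# hypothetical usage with the model above:
# mean, std = benchmark(model.predict, frame)
# print(f"{mean:.4f} ± {std:.4f} s")
```

If inference runs on a GPU, calling `torch.cuda.synchronize()` before each clock read makes the timings more accurate, since CUDA calls can return before the kernels actually finish.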

Running on the same image in a loop 10 times with the pose_hrnet_w48_384x288 weights, I get these inference times (in seconds):

0.23575758934020996
0.2087860107421875
0.19779682159423828
0.20479035377502441
0.20778775215148926
0.20179319381713867
0.1938016414642334
0.20778727531433105
0.18680858612060547
0.19979572296142578

Switching to the pose_hrnet_w32_256x192 weights and running the same thing, the drawn keypoints look worse, and the times are much the same:

0.21777725219726562
0.1957993507385254
0.21677803993225098
0.19879746437072754
0.19679784774780273
0.2067883014678955
0.18481040000915527
0.1957998275756836
0.19080471992492676
0.19679927825927734
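For easier comparison, the two runs above can be condensed to mean ± standard deviation (values rounded from the lists above):

```python
import statistics

# per-frame predict() timings in seconds, rounded from the runs above
w48_384x288 = [0.23576, 0.20879, 0.19780, 0.20479, 0.20779,
               0.20179, 0.19380, 0.20779, 0.18681, 0.19980]
w32_256x192 = [0.21778, 0.19580, 0.21678, 0.19880, 0.19680,
               0.20679, 0.18481, 0.19580, 0.19080, 0.19680]

for name, times in (("w48_384x288", w48_384x288), ("w32_256x192", w32_256x192)):
    print(f"{name}: {statistics.mean(times):.4f} ± {statistics.stdev(times):.4f} s")
```

Both means come out around 0.20 s and within one standard deviation of each other, which matches the "much the same" observation.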

I have not tested YOLOv3-tiny yet, as I wanted to change one thing at a time and log the results.

Do you see the same thing?

Thanks!


stefanopini commented on August 25, 2024

Hello @antithing, sorry for the late reply.

Yes, I've experienced a similar behaviour when running in single-person mode, but quite different timings when running in multi-person mode.

Here's the inference time on a GTX 1060 as mean and standard deviation over 10 predictions:

pose_hrnet_w48_384x288

  • Single-person: 0.160240 ± 0.004197
  • Multi-person, frame without people: 0.049684 ± 0.000645 (this is just the forward pass of YOLOv3)
  • Multi-person, frame with 1 person: 0.216628 ± 0.008854
  • Multi-person, frame with many people (~8): 0.483446 ± 0.032138

pose_hrnet_w32_256x192

  • Single-person: 0.158238 ± 0.003195
  • Multi-person, frame without people: 0.049384 ± 0.000488 (this is just the forward pass of YOLOv3)
  • Multi-person, frame with 1 person: 0.211480 ± 0.008367
  • Multi-person, frame with many people (~8): 0.242922 ± 0.004263

As you can see, in single-person mode the difference between the two architectures is minimal, while in multi-person mode the difference is considerable when many people are in the scene.
This difference is likely caused mainly by the pre-processing steps (which include the resize operations).

Regarding YOLO, the inference time of YOLOv3 is 0.049684 ± 0.000645, while that of YOLOv3-tiny is 0.015586 ± 0.000668.
I think you can expect a similar detection speedup (~3x) on your hardware configuration too.
On less powerful hardware, such as my laptop, live-demo.py runs at ~5 fps overall with YOLOv3 and ~9 fps with YOLOv3-tiny.
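The ~3x figure follows directly from the two detector-only timings:

```python
yolov3 = 0.049684       # mean YOLOv3 detection time (s)
yolov3_tiny = 0.015586  # mean YOLOv3-tiny detection time (s)
speedup = yolov3 / yolov3_tiny
print(f"~{speedup:.1f}x")  # → ~3.2x
```

Note this is the detector speedup only; the end-to-end fps gain is smaller because the HRNet forward pass is unchanged.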

By the way, I've just added a new option --use_tiny_yolo to the scripts live-demo.py and extract-keypoints.py! Please check out the latest commit.


antithing commented on August 25, 2024

Thank you! I will grab the new commit. Much appreciated.

