I want to track on MOT16 using public detection results, without running detection using

How to track directly using MOT16 public detection results? about tracking_wo_bnw HOT 7 CLOSED

phil-bergmann commented on August 15, 2024

How to track directly using MOT16 public detection results?

from tracking_wo_bnw.

Comments (7)

timmeinhardt commented on August 15, 2024

Tracktor relies on the underlying object detector to produce identity-preserving frame-to-frame tracks. If you do not want to use the object detector at all, not even for bounding box regression, then you could probably do tracking with reID alone. But I assume the performance would suffer quite a bit, depending on the reID method.

There is already an option to switch between public and private detections. But in order to not use the object detector at all, one has to change the code significantly. This goes beyond pointing out a few lines of code that have to be modified. Try to familiarize yourself with our code and you should be able to modify it.


DarkstartsUp commented on August 15, 2024

Thanks for your reply! Another question: in obj_detect.test_rois(), is the convolution run on each RoI cropped from the source image, or is it run on the whole source image, with each RoI then cropped from the feature map?
I am running tracking on very large frames (resolution is about 30000x20000), but the objects I want to track are of normal size (about 100x100), so there are hundreds of objects to track at the same time. Feeding the raw frame directly into the backbone network is therefore unrealistic. Could you please share your suggestions on what I should do? Thanks a lot!


DarkstartsUp commented on August 15, 2024

In the current tracking results, performance on relatively large objects is OK, but performance on relatively small objects is very bad: most small objects are missed. The running speed is also very slow; it takes 1 hour and 40 minutes to process 240 frames.


timmeinhardt commented on August 15, 2024

Convolutions are applied to the entire image. However, we use Region of Interest Pooling to get the features for a given region of interest.
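To illustrate the idea, here is a minimal pure-Python sketch of region-of-interest max pooling. It is not the repository's implementation: `roi_max_pool`, the toy feature map, and the RoI coordinates are all made up for illustration. The point it shows is that the feature map is computed once for the whole image and each region only indexes into it.

```python
def roi_max_pool(feature_map, roi, output_size=2):
    """Max-pool one region of a 2-D feature map into output_size x output_size bins.

    feature_map: list of rows (H x W), computed once for the whole image.
    roi: (x1, y1, x2, y2) in feature-map cells, with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for by in range(output_size):
        # Row range of this bin; every bin covers at least one cell.
        ys = y1 + (by * roi_h) // output_size
        ye = max(ys + 1, y1 + ((by + 1) * roi_h) // output_size)
        row = []
        for bx in range(output_size):
            xs = x1 + (bx * roi_w) // output_size
            xe = max(xs + 1, x1 + ((bx + 1) * roi_w) // output_size)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


# Toy 4x6 "feature map" standing in for the backbone output of the full image.
features = [[r * 6 + c for c in range(6)] for r in range(4)]

# Two hypothetical regions; both reuse the same feature map.
rois = [(0, 0, 4, 4), (2, 1, 6, 3)]
pooled = [roi_max_pool(features, roi) for roi in rois]
print(pooled)  # [[[7, 9], [19, 21]], [[9, 11], [15, 17]]]
```

Because the expensive convolution happens once per frame, adding more regions is cheap, but the cost of the backbone still grows with the frame size.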

Processing very large frames will directly translate into an increase in runtime. One could try to run an object detector with a smaller backbone or look at the code and try to optimise our Tracktor code directly.

Did you retrain the object detector on your dataset? If not, this might be the reason why it missed many of the objects.


DarkstartsUp commented on August 15, 2024

I did not retrain the object detector, but the objects I want to track are also pedestrians, so I think the pre-trained detector should work well. I guess the primary reason for the low recall is the downsampling applied before the very large frames are fed into the backbone network. It makes the small objects far smaller than the anchors on the feature map.
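A rough calculation makes the size of this effect concrete. The sketch below assumes the detector resizes each frame so that its longer side is at most 1333 pixels; that limit is a common Faster R-CNN default and is an assumption here, not a value confirmed in this thread. The helper name `downsampled_object_size` is hypothetical.

```python
def downsampled_object_size(frame_w, frame_h, obj_side, max_side):
    """Object side length in pixels after the longer frame side is resized to max_side."""
    scale = max_side / max(frame_w, frame_h)
    return obj_side * scale


# Frame and object sizes are the ones quoted in this thread; max_side is assumed.
size = downsampled_object_size(30000, 20000, 100, max_side=1333)
print(round(size, 1))  # 4.4
```

Under that assumption, a 100x100 pedestrian shrinks to roughly 4 pixels per side before it reaches the backbone.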
So I will try other tracking methods that directly use public detection results. Thanks a lot for your help! I will close this issue soon.


timmeinhardt commented on August 15, 2024

Yes, the downsampling could be the reason for losing the small objects. You could apply the object detector to the image without downsampling. And if this is too large for a GPU, you could split the image into multiple parts.
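One way to split the image could look like the sketch below. It is only an illustration under stated assumptions: `make_tiles` and `to_global` are hypothetical helpers, and the tile size (2000) and overlap (200) are arbitrary example values. The overlap should be at least as large as the objects so that every object lies fully inside some tile.

```python
def make_tiles(width, height, tile, overlap):
    """Return (x1, y1, x2, y2) tiles that cover the frame, with neighbours overlapping."""
    stride = tile - overlap
    tiles = []
    for y in range(0, max(height - overlap, 1), stride):
        for x in range(0, max(width - overlap, 1), stride):
            # Clip the last tile in each row and column to the frame border.
            tiles.append((x, y, min(x + tile, width), min(y + tile, height)))
    return tiles


def to_global(box, tile):
    """Shift a tile-local (x1, y1, x2, y2) box into full-frame coordinates."""
    x1, y1, x2, y2 = box
    return (x1 + tile[0], y1 + tile[1], x2 + tile[0], y2 + tile[1])


tiles = make_tiles(30000, 20000, tile=2000, overlap=200)
print(len(tiles))  # 187

# A box detected inside the second tile, mapped back to the full frame.
print(to_global((10, 20, 110, 120), tiles[1]))  # (1810, 20, 1910, 120)
```

Objects in the overlap regions will be detected twice, so the merged detections would still need duplicate removal (for example non-maximum suppression) before they are passed to the tracker.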

Our method uses the public detections directly. You will run into the same issue with any other method that applies tracking-by-detection, because your input image has to be processed by an object detector first. So this is not an issue of the tracker but rather of the object detector. Once you have solved the issue, you can easily apply our tracker.
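For reference, MOT16 public detections ship as a `det/det.txt` file per sequence, with comma-separated rows of the form `frame, id, bb_left, bb_top, bb_width, bb_height, conf, x, y, z` (the id is -1 for detections). The following is a minimal sketch of reading such a file into per-frame boxes. It is not the repository's loader; `load_public_detections` is a hypothetical helper and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def load_public_detections(lines):
    """Group MOT-style detection rows by frame as (x1, y1, x2, y2, conf) tuples."""
    detections = defaultdict(list)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        frame = int(float(row[0]))
        left, top, width, height, conf = map(float, row[2:7])
        detections[frame].append((left, top, left + width, top + height, conf))
    return dict(detections)


# Made-up rows in the det.txt layout; a real file would be opened with open(path).
sample = io.StringIO(
    "1,-1,100.0,50.0,40.0,80.0,0.9,-1,-1,-1\n"
    "1,-1,300.0,60.0,30.0,70.0,0.5,-1,-1,-1\n"
    "2,-1,102.0,51.0,40.0,80.0,0.8,-1,-1,-1\n"
)
dets = load_public_detections(sample)
print(dets[1][0])  # (100.0, 50.0, 140.0, 130.0, 0.9)
```

Any external detector, including one run tile by tile on very large frames, could write its output in this layout so that the tracker consumes it the same way as the public detections.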


DarkstartsUp commented on August 15, 2024

Got it. Thanks a lot!

