DCFNet: Discriminant Correlation Filters Network for Visual Tracking
License: MIT License
In the training stage, DCFNet uses the target and the search image as input, with a Gaussian label centered on the map (then moved to the top-left). Would it be better to generate the Gaussian label according to the actual position of the target in the search image? That is, if the target is not centered, generate a correspondingly shifted label.
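For what it's worth, the shifted label described above can be built by circularly shifting a centered Gaussian, which keeps the circular-shift assumption of correlation filters intact. A minimal NumPy sketch (the function name and parameters are mine, not from the repo):

```python
import numpy as np

def gaussian_label(sz, sigma, center):
    """Gaussian response label for an (h, w) map, peaked at `center` (row, col).

    Built by rolling a centered Gaussian; illustrative only, not DCFNet code.
    """
    h, w = sz
    rs, cs = np.meshgrid(np.arange(h) - h // 2, np.arange(w) - w // 2,
                         indexing='ij')
    g = np.exp(-0.5 * (rs ** 2 + cs ** 2) / sigma ** 2)
    # circularly shift so the peak sits at `center` instead of the middle
    return np.roll(g, (center[0] - h // 2, center[1] - w // 2), axis=(0, 1))
```

The same roll with a `(0, 0)` target reproduces the "move to the top-left" label the current code uses.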
Hi, there:
You did a good job, and thanks for sharing the code!
When I tried to replicate the work in the preprint, I encountered two issues:
When loading UAV123, the variable (or function?) named configSeqs
is not defined when it is called at this line.
When loading NUS_PRO, the file named seq_list_with_gt.csv
does not exist in this repo, as referenced at this line.
These two issues block us from following your paper, i.e., training DCFNet on the three datasets.
So, could you please check them out and fix them?
Thanks.
Hello, when I run train_DCFNet.m, the command window displays content like this:
train: epoch 08: 226/572: 84.8 (84.8) HZ objective: 44.632
train: epoch 08: 227/572: 84.8 (84.8) HZ objective: 44.628
train: epoch 08: 228/572: 84.8 (84.8) HZ objective: 44.549
train: epoch 08: 229/572: 84.8 (84.8) HZ objective: 44.570
......
The objective values stay between 44 and 46, with no obvious increase or decrease. Is this normal?
I use a subset of VID: the first 100 sequences in the first folder.
Thanks for your attention.
Excuse me, I want to ask how to implement a pooling layer or other functions we define ourselves? @foolwood
Hi,
I noticed that during training, the CNN output (before the DCF layer) is not multiplied by a cosine (Hann) window, in contrast to the tracking pipeline, which does multiply the CNN output by it.
Why?
Thank you~~~
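For reference, the cosine-window multiplication done in the tracking pipeline amounts to the following (an illustrative NumPy sketch, not the repo's code):

```python
import numpy as np

def apply_hann(feat):
    """feat: (h, w, c) feature map; returns the map tapered by a 2-D Hann window.

    The window suppresses boundary responses, matching the circular-shift
    assumption of the correlation filter. Illustrative only.
    """
    h, w = feat.shape[:2]
    win = np.outer(np.hanning(h), np.hanning(w))  # separable 2-D Hann window
    return feat * win[:, :, None]                 # broadcast over channels
```

During training the crops are already centered on the target, which is presumably why the window can be omitted there; only the authors can confirm the reasoning.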
Hi,
It seems that your work is closely related to CFNet, yet your performance is much better: 0.624 vs. 0.568 on OTB100. Can you elaborate on what makes such a big difference, given that CFNet also uses VID as the training set and an exponential-decay learning-rate schedule?
Thanks
In your tracking process, the output feature size is 125x125x32 before the DCF layer.
However, in your training code, networkType is set to 12, and the output size after the two convolutional layers is smaller than the input image size by 4 (feature_sz = input_size - [4,4]).
I wonder why the saved network (.mat file) can generate a different feature size after training?
Looking forward to your early reply, and thank you very much.
Hi,
Thank you for the great job.
I have a few questions:
Thank you
Hello author, I'd like to ask which model trained under your training directory should be used for testing. I got model/DCFNet-net-12-125-2.mat, plus model_mulit, plus data/DCFNet-net-12-125-2.0/net-epoch-1 through net-epoch-50. I think it is the one under model/DCFNet, but I am not sure. Thanks.
Hello, does the training use VID's latest 86 GB training set? Why do I get out-of-memory errors when running train_DCFNet.m?
What is the function of the LRN layer? I notice that some Siamese trackers use a BatchNorm layer instead.
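For context, a cross-channel LRN of the AlexNet style divides each activation by an energy term over neighboring channels, so unusually strong channels are damped. A rough NumPy sketch with illustrative parameter values (not the ones DCFNet uses):

```python
import numpy as np

def lrn(x, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Cross-channel local response normalization for x of shape (h, w, c).

    Each channel i is divided by (k + alpha * sum of squares over a window of
    n neighboring channels) ** beta. Parameter values are illustrative.
    """
    h, w, c = x.shape
    sq = x ** 2
    out = np.empty_like(x)
    for i in range(c):
        lo, hi = max(0, i - n // 2), min(c, i + n // 2 + 1)
        out[..., i] = x[..., i] / (k + alpha * sq[..., lo:hi].sum(-1)) ** beta
    return out
```

Unlike BatchNorm, this has no learned parameters and no batch statistics, which is one reason later Siamese trackers swapped it out.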
Hi,
I noticed that the current train_DCFNet.m
script trains networkType = 12, while the released model (DCFNet-net-7-125-2.mat) uses networkType = 7. So my question is: how do I retrain this model?
For now, the meta data of the released model is:
train_DCFNet.m
, then am I good to go? Thanks
Hello, I'd like to ask how to test on VOT. Is there an evaluation script? Thanks. This is great work, thank you very much.
Thanks for your work and code.
When I use VID2015 as training data, it generates 464,873 training pairs; even with the batch size set to 32, that is still more than 10,000 iterations per epoch (464,873 / 32 ≈ 14,527). How did you handle this when training the network?
Thanks~
Why is loading so slow when I want to run the demo?
Thanks for the good work as usual~
Take the type-7 and type-12 networks for example. I find that the padding of the conv layers is all 0 during training (input size: 125, output size: 121), but during tracking the padding is set to 1 (input size: 125, output size: 125) while using the same conv parameters. Can you explain why you do this? Is it a theoretical choice (better in theory) or just an experimental result (better performance in experiments)?
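For reference, standard convolution output-size arithmetic reproduces both numbers, assuming 3x3 kernels with stride 1 (a sketch, not code from the repo):

```python
# Standard convolution output-size formula (not code from the repo):
#   out = (in + 2*pad - kernel) // stride + 1
def conv_out(size, kernel=3, pad=0, stride=1):
    return (size + 2 * pad - kernel) // stride + 1

size = 125
for _ in range(2):        # two 3x3 conv layers with pad 0, as in training
    size = conv_out(size, pad=0)
# size is now 121, matching the 121x121 training feature map

size2 = 125
for _ in range(2):        # the same convs with pad 1, as in tracking
    size2 = conv_out(size2, pad=1)
# size2 is now 125, matching the 125x125 tracking feature map
```

Since the conv weights are the same either way, only the border pixels of the feature map differ between the two settings.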
I use this network for training:
elseif networkType == 31
%% target
conv1 = dagnn.Conv('size', [5 5 3 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv1', conv1, {'target'}, {'conv1'}, {'conv1f', 'conv1b'}) ;
net.addLayer('relu1', dagnn.ReLU(), {'conv1'}, {'conv1r'});
conv2 = dagnn.Conv('size', [5 5 32 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv2', conv2, {'conv1r'}, {'conv2'}, {'conv2f', 'conv2b'}) ;
net.addLayer('relu2', dagnn.ReLU(), {'conv2'}, {'conv2r'});
conv3 = dagnn.Conv('size', [5 5 32 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv3', conv3, {'conv2r'}, {'conv3'}, {'conv3f', 'conv3b'}) ;
conv4 = dagnn.Conv('size', [5 5 32 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv4', conv4, {'conv3'}, {'conv4'}, {'conv4f', 'conv4b'}) ;
conv5 = dagnn.Conv('size', [5 5 32 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv5', conv5, {'conv4'}, {'conv5'}, {'conv5f', 'conv5b'}) ;
net.addLayer('conv5_dropout' ,dagnn.DropOut('rate', 0.2),{'conv5'},{'x'});
%% search
conv1s = dagnn.Conv('size', [5 5 3 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv1s', conv1s, {'search'}, {'conv1s'}, {'conv1f', 'conv1b'}) ;
net.addLayer('relu1s', dagnn.ReLU(), {'conv1s'}, {'conv1sr'});
conv2s = dagnn.Conv('size', [5 5 32 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv2s', conv2s, {'conv1sr'}, {'conv2s'}, {'conv2f', 'conv2b'}) ;
net.addLayer('relu2s', dagnn.ReLU(), {'conv2s'}, {'conv2sr'});
conv3s = dagnn.Conv('size', [5 5 32 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv3s', conv3s, {'conv2sr'}, {'conv3s'}, {'conv3f', 'conv3b'}) ;
conv4s = dagnn.Conv('size', [5 5 32 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv4s', conv4s, {'conv3s'}, {'conv4s'}, {'conv4f', 'conv4b'}) ;
conv5s = dagnn.Conv('size', [5 5 32 32],'pad', 2, 'stride', 1, 'dilate', 1, 'hasBias', true) ;
net.addLayer('conv5s', conv5s, {'conv4s'}, {'conv5s'}, {'conv5f', 'conv5b'}) ;
net.addLayer('conv5s_dropout' ,dagnn.DropOut('rate', 0.2),{'conv5s'},{'z'});
window_sz = [125,125];
end
Question: the CLE and the objective (both train and val) decrease during training. But when I test the saved models, I am surprised to find that the tracker fails to track the target as the epoch increases, while an earlier model from a lower epoch tracks successfully. So I suspected overfitting and tested one video from the training set: if the model were overfitting, the tracker should perform excellently on the training set, yet it fails to track the target at an early stage. Is there a bug in the DCF layer?
Another question is about getlmdbRAM.m: imdb.images.set(randperm(num_all_frame,100)) = int8(2); randomly selects 100 frames from the whole frame sequence as the validation set, but some validation frames may be adjacent to training frames. Would it be better to build the validation set from videos that do not appear in the training set?
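A video-level split along the lines suggested above could look like this (an illustrative Python sketch with hypothetical names, not code from the repo):

```python
import random

def split_by_video(frame_video_ids, val_fraction=0.1, seed=0):
    """Hold out whole videos for validation instead of random frames,
    so no validation frame is adjacent to a training frame.

    frame_video_ids: list mapping each frame index to its video id.
    Returns one flag per frame: 1 = train, 2 = val (MatConvNet convention).
    """
    videos = sorted(set(frame_video_ids))
    rng = random.Random(seed)
    rng.shuffle(videos)
    n_val = max(1, int(len(videos) * val_fraction))
    val_videos = set(videos[:n_val])
    return [2 if v in val_videos else 1 for v in frame_video_ids]
```

All frames of a held-out video get flag 2, so the validation loss is measured on sequences the network has never seen.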
In one of the issues, #5, you mentioned that the resolution of the response map is an important factor in why DCFNet performs better than CFNet. Could you elaborate on this? How much AUC would be lost if the resolution of the feature map were smaller, such as the 33 or 63 mentioned there?
I met the same problem in the same environment. After changing the default setting to nonrecursive, the problem still exists; although the message changes, the underlying problem seems unchanged.
Error Message below:
Error using vl_argparse (line 108)
Expected either a param-value pair or a structure.
Error in run_DCFNet>DCFNet_initialize (line 54)
state = vl_argparse(state, param, 'nonrecursive');
Error in run_DCFNet (line 27)
[state, ~] = DCFNet_initialize(im{1}, init_rect, param);
Error in demo_DCFNet (line 15)
res = run_DCFNet(subS,0,0,param);
It seems that param must be a structure; otherwise the function would recurse, but since I do not allow it to recurse ('nonrecursive'), it has to stop because of the lack of a structured parameter. Can you help me?
Thanks a lot.
Originally posted by @tzjtatata in #9 (comment)
Hey,
Is there any newer implementation of the Discriminant Correlation Filters that works with recent PyTorch versions?
PyTorch changed a lot in its complex-number calculations, so the code does not work with newer versions.
Thanks a lot.
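For reference, the core single-channel correlation-filter step can be written with NumPy's FFT, independent of any PyTorch complex-number API. This is a sketch of the standard ridge-regression formulation, not a drop-in replacement for the DCFNet layer:

```python
import numpy as np

def dcf_response(z, x, y, lam=1e-4):
    """z: template patch, x: search patch, y: desired label (all (h, w)).

    Learns the closed-form ridge-regression filter from z in the Fourier
    domain, then evaluates it on x. Single-channel sketch only.
    """
    Z, X, Y = np.fft.fft2(z), np.fft.fft2(x), np.fft.fft2(y)
    W = np.conj(Z) * Y / (np.conj(Z) * Z + lam)   # filter learned from z
    return np.real(np.fft.ifft2(W * X))           # response map on x
```

When x equals z, the response reproduces the label y almost exactly, which is a handy sanity check when porting the complex arithmetic across framework versions.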
Dear Qiang Wang:
Can you explain how imgcrop_multiscale.m works?
I cannot grasp your idea, though DCFNet indeed achieves better results.
All the best,
Heng.
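My own guess at what a multi-scale crop routine does (purely illustrative, not the actual imgcrop_multiscale.m logic; the scale factors are hypothetical): extract patches of size scale times the base size around a center, replicating edge pixels when the window leaves the image:

```python
import numpy as np

def crop_multiscale(img, center, base_sz, scales=(0.985, 1.0, 1.015)):
    """Crop one patch per scale factor around `center` (row, col).

    Out-of-bounds coordinates are clamped to the image border, which
    replicates edge pixels. Names and scale values are illustrative.
    """
    h, w = img.shape[:2]
    crops = []
    for s in scales:
        ch = int(round(base_sz[0] * s))
        cw = int(round(base_sz[1] * s))
        rows = np.clip(np.arange(ch) - ch // 2 + center[0], 0, h - 1)
        cols = np.clip(np.arange(cw) - cw // 2 + center[1], 0, w - 1)
        crops.append(img[np.ix_(rows, cols)])
    return crops
```

In a tracker, each crop would then be resized to the fixed network input size and the scale with the highest response kept.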
Hi, thanks for your sharing.
When I run the code in MATLAB 2016a with MatConvNet beta25, there is always an error:
error using vl_argparse (line 63)
OPTS must be a structure
error vl_argparse (line 160)
opts.(field) = vl_argparse(opts.(field), value,
'merge') ;
error vl_argparse (line 97)
opts = vl_argparse(opts, vertcat(params,values),
varargin{:}) ;
error DCFNet_initialize (line 16)
state = vl_argparse(state, param);
error demo_DCFNet (line 43)
[state, ~] = DCFNet_initialize(im{1}, init_rect, opts);
After checking the relevant code, the error can be avoided by changing 'state.net = [];' to 'state.net = param.net;' in DCFNet_initialize.m.
Does the error occur because of my MatConvNet version, or something else?
Thanks!
Dear @foolwood,
Is there any Python equivalent of your code?