Long-term Temporal Convolutions (LTC)

This is the Torch code for the following paper:

Gül Varol, Ivan Laptev and Cordelia Schmid, Long-term Temporal Convolutions for Action Recognition, arXiv:1604.04494, 2016.

Check the project page for more materials.

Contact: Gül Varol.

Preparation

Install Torch with cuDNN support.
Download UCF101 and/or HMDB51 datasets.
Data pre-processing. TO-DO
C3D model in Torch. TO-DO

Running the code

Run th main.lua to start training with the default parameters. The following examples show how to set parameters for different scenarios:

#### From scratch experiments on UCF101
# Run with default parameters (UCF101 dataset, split 1, 100-frame 58x58 resolution flow network with 0.9 dropout)
th  main.lua -expName flow_100f_d9

# Continue training from epoch 10
th  main.lua -expName flow_100f_d9 -continue -epochNumber 10

# Test final prediction accuracy for model number 20
th  main.lua -expName flow_100f_d9 -evaluate -modelNo 20

# Train 100-frame RGB network from scratch on UCF101 dataset
th  main.lua -nFrames 100 -loadHeight 67  -loadWidth 89  -sampleHeight 58  -sampleWidth 58  \
-stream rgb  -expName rgb_100f_d5  -dataset UCF101 -dropout 0.5 -LRfile LR/UCF101/rgb_d5.lua

# Train 71x71 spatial resolution flow network
th  main.lua -nFrames 100 -loadHeight 81  -loadWidth 108 -sampleHeight 71  -sampleWidth 71  \
-stream flow -expName flow_100f_d5 -dataset UCF101 -dropout 0.5 -LRfile LR/UCF101/flow_d5.lua

# Train 16-frame 112x112 spatial resolution flow network
th  main.lua -nFrames 16  -loadHeight 128 -loadWidth 171 -sampleHeight 112 -sampleWidth 112 \
-stream flow -expName flow_16f_d5 -dataset UCF101 -dropout 0.5 -LRfile LR/UCF101/flow_d5.lua

#### Fine-tune HMDB51 from UCF101
# Train the last layer and freeze the lower layers
th main.lua -expName flow_100f_58_d9/finetune/last             \
-loadHeight 67 -loadWidth 89 -sampleHeight 58 -sampleWidth 58  \
-dataset HMDB51                                                \
-LRfile LR/HMDB51/flow_d9_last.lua                             \
-finetune last                                                 \
-retrain log/UCF101/flow_100f_58_d9/model_50.t7

# Fine-tune the whole network
th main.lua -expName flow_100f_58_d9/finetune/whole            \
-loadHeight 67 -loadWidth 89 -sampleHeight 58 -sampleWidth 58  \
-dataset HMDB51                                                \
-LRfile LR/HMDB51/flow_d9_whole.lua                            \
-finetune whole                                                \
-lastlayer log/HMDB51/flow_100f_58_d9/finetune/last/model_3.t7 \
-retrain log/UCF101/flow_100f_58_d9/model_50.t7
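
The checkpoint paths passed to -retrain and -lastlayer in the commands above appear to follow the pattern log/<dataset>/<expName>/model_<N>.t7. The helper below is a small sketch that assumes this layout, which is inferred from the examples and not confirmed against the code. The name model_path is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: build a checkpoint path, assuming the layout
# log/<dataset>/<expName>/model_<N>.t7 seen in the examples above.
model_path() {
    # $1 = dataset, $2 = experiment name (-expName), $3 = model number
    printf 'log/%s/%s/model_%s.t7\n' "$1" "$2" "$3"
}

model_path UCF101 flow_100f_58_d9 50                 # prints log/UCF101/flow_100f_58_d9/model_50.t7
model_path HMDB51 flow_100f_58_d9/finetune/last 3    # prints log/HMDB51/flow_100f_58_d9/finetune/last/model_3.t7
```

The output can be substituted into a command, for example -retrain "$(model_path UCF101 flow_100f_58_d9 50)".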

Note that the results are sensitive to the learning rate (LR) schedule. You can define your own schedule in a Lua file and pass it with -LRfile. The following observations may be useful:

RGB networks converge faster than flow networks.
Networks with high dropout take longer to converge.
The HMDB51 dataset trains faster.
Networks with fewer input frames train faster.
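
The -loadHeight and -loadWidth values in the training examples above are numerically consistent with scaling a 171-pixel width (the value paired with the 112x112 crop) in proportion to the crop size, then taking 3/4 of the rounded width as the height. This is only an observation about the three example settings, not a documented rule. The sketch below reproduces those numbers, and load_size is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: derive "-loadHeight -loadWidth" from a square crop size.
# Assumption: width = round(171 * crop / 112), height = round(3 * width / 4).
load_size() {
    # $1 = square crop size (-sampleHeight = -sampleWidth)
    w=$(( (171 * $1 + 56) / 112 ))   # integer rounding of 171 * crop / 112
    h=$(( (3 * w + 2) / 4 ))         # integer rounding of 3 * w / 4
    echo "$h $w"
}

load_size 58     # prints "67 89"
load_size 71     # prints "81 108"
load_size 112    # prints "128 171"
```

For other crop sizes the result is only a starting point; check it against the data pre-processing before relying on it.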

Pre-trained models

TO-DO

Citation

If you use this code, please cite the following:

@article{varol16a,
      TITLE = {{Long-term Temporal Convolutions for Action Recognition}},
      AUTHOR = {Varol, G{\"u}l and Laptev, Ivan and Schmid, Cordelia},
      JOURNAL = {arXiv:1604.04494},
      YEAR = {2016}
}

Acknowledgements

This code is largely built on the ImageNet training example https://github.com/soumith/imagenet-multiGPU.torch by Soumith Chintala.
