Giter Site home page Giter Site logo

longlong-jing / ltc Goto Github PK

View Code? Open in Web Editor NEW

This project forked from gulvarol/ltc

0.0 2.0 0.0 22 KB

Long-term Temporal Convolutions for Action Recognition

Home Page: http://www.di.ens.fr/willow/research/ltc

License: MIT License

Lua 92.92% HTML 7.08%

ltc's Introduction

Long-term Temporal Convolutions (LTC)

This is the Torch code for the following paper:

Gül Varol, Ivan Laptev and Cordelia Schmid, Long-term Temporal Convolutions for Action Recognition, arxiv:1604.04494, 2016.

Check the project page for more materials.

Contact: Gül Varol.

Preparation

  1. Install Torch with cuDNN support.

  2. Download UCF101 and/or HMDB51 datasets.

  3. Data pre-processing. TO-DO

  4. C3D model in Torch. TO-DO

Running the code

You can simply run th main.lua to start a training with the default parameters. Following are several examples on how to set parameters in different scenarios:

#### From scratch experiments on UCF101
# Run with default parameters (UCF101 dataset, split 1, 100-frame 58x58 resolution flow network with 0.9 dropout)
th  main.lua -expName flow_100f_d9

# Continue training from epoch 10
th  main.lua -expName flow_100f_d9 -continue -epochNumber 10

# Test final prediction accuracy for model number 20
th  main.lua -expName flow_100f_d9 -evaluate -modelNo 20

# Train 100-frame RGB network from scratch on UCF101 dataset
th  main.lua -nFrames 100 -loadHeight 67  -loadWidth 89  -sampleHeight 58  -sampleWidth 58  \
-stream rgb  -expName rgb_100f_d5  -dataset UCF101 -dropout 0.5 -LRfile LR/UCF101/rgb_d5.lua

# Train 71x71 spatial resolution flow network
th  main.lua -nFrames 100 -loadHeight 81  -loadWidth 108 -sampleHeight 71  -sampleWidth 71  \
-stream flow -expName flow_100f_d5 -dataset UCF101 -dropout 0.5 -LRfile LR/UCF101/flow_d5.lua

# Train 16-frame 112x112 spatial resolution flow network
th  main.lua -nFrames 16  -loadHeight 128 -loadWidth 171 -sampleHeight 112 -sampleWidth 112 \
-stream flow -expName flow_100f_d5 -dataset UCF101 -dropout 0.5 -LRfile LR/UCF101/flow_d5.lua

#### Fine-tune HMDB51 from UCF101
# Train the last layer and freeze the lower layers
th main.lua -expName flow_100f_58_d9/finetune/last             \
-loadHeight 67 -loadWidth 89 -sampleHeight 58 -sampleWidth 58  \
-dataset HMDB51                                                \
-LRfile LR/HMDB51/flow_d9_last.lua                             \
-finetune last                                                 \
-retrain log/UCF101/flow_100f_58_d9/model_50.t7

# Fine-tune the whole network
th main.lua -expName flow_100f_58_d9/finetune/whole            \
-loadHeight 67 -loadWidth 89 -sampleHeight 58 -sampleWidth 58  \
-dataset HMDB51                                                \
-LRfile LR/HMDB51/flow_d9_whole.lua                            \
-finetune whole                                                \
-lastlayer log/HMDB51/flow_100f_58_d9/finetune/last/model_3.t7 \
-retrain log/UCF101/flow_100f_58_d9/model_50.t7

Note that the results are sensitive to the learning rate (LR) schedule. You can set your own LR by writing a -LRfile. Following are a few observations that can be useful:

  • RGB networks converge faster than flow networks.
  • High dropout takes longer to converge.
  • HMDB51 dataset trains faster.
  • Fewer number of frames trains faster.

Pre-trained models

TO-DO

Citation

If you use this code, please cite the following:

@article{varol16a,
      TITLE = {{Long-term Temporal Convolutions for Action Recognition}},
      AUTHOR = {Varol, G{"u}l and Laptev, Ivan and Schmid, Cordelia},
      JOURNAL = {arXiv:1604.04494},
      YEAR = {2016}
}

Acknowledgements

This code is largely built on the ImageNet training example https://github.com/soumith/imagenet-multiGPU.torch by Soumith Chintala.

ltc's People

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.