Giter Site home page Giter Site logo

ryokaw / fpl Goto Github PK

View Code? Open in Web Editor NEW

This project forked from takumayagi/fpl

0.0 0.0 0.0 11.47 MB

Code/data of the paper "Future Person Localization in First-Person Videos" (CVPR2018)

Home Page: https://artilects.net/projects.html

License: MIT License

Python 99.73% Shell 0.27%

fpl's Introduction

Future Person Localization in First-Person Videos (CVPR2018)

This repository contains the code and data (caution: no raw image provided) for the paper "Future Person Localization in First-Person Videos" by Takuma Yagi, Karttikeya Mangalam, Ryo Yonetani and Yoichi Sato.

Prediction examples

Requirements

We confirmed the code works correctly in below versions.

Installation

Download data

You can download our dataset from below link:
(caution: no raw image provided!)
Download link (processed data)

If you wish downloading via terminal, consider using custom script.

Extract the downloaded tar.gz file at the root directory.

tar xvf fpl.tar.gz

Pseudo-video

Since we cannot release the raw images, we prepared sample pseudo-video below.
The video shows the automatically extracted location histories, poses. The number shown in the bounding box corresponds to the person id in the processed data.
Background colors are the result from pre-trained dilated CNN trained with MIT Scene Parsing Benchmark.
Download link (pseudo-video)

Create dataset

Run dataset generation script to preprocess raw locations/poses/egomotions.
A single processed file will be generated in datasets/.

# Test (debug) data
python utils/create_dataset.py utils/id_test.txt --traj_length 20 --traj_skip 2 --nb_splits 5 --seed 1701 --traj_skip_test 5
# All data
python utils/create_dataset.py utils/id_list_20.txt --traj_length 20 --traj_skip 2 --nb_splits 5 --seed 1701 --traj_skip_test 5

Prepare training script

Modify the "in_data" arguments in scripts/5fold.json.

Running the code

Directory structure

    .
    +---data (feature files)
    +---dataset (processed data)
    +---experiments (logging)
    +---gen_scripts (automatically generated scripts for cross validation)
    +---models
    +---scripts (configuration)
    |   +---5fold.json
    +---utils
        +---run.py (training script)
        +---eval.py (evaluation script)

Training

In our environment (a single TITAN X Pascal w/ CUDA 8, cuDNN 5.1), it took approximately 40 minutes per split.

# Train proposed model and ablation models
python utils/run.py scripts/5fold.json run <gpu id>
# Train proposed model only
python utils/run.py scripts/5fold_proposed_only.json run <gpu id>

Evaluation

python utils/eval.py experiments/5fold_yymmss_HHMMSS/ 17000 run <gpu id> 10

Prediction visualization using pseudo-video

We provided visualization code using pseudo-video.
Download below pseudo-videos and run the following code:
Download link (pseudo-videos for visualization)

# Run this code after placing <video_id>.mp4 into data/pseudo_viz/
# Extract images from video
python utils/video2img_all.py data/pseudo_viz/
# Plot images
python utils/plot_prediction.py <experiment>/<fold> --traj_type 0
# Write videos
python utils/write_video.py <experiment>/<fold> --vid GOPRXXXXU20 --frame XXXX --pid XXX

License and Citation

The dataset provided in this repository is only to be used for non-commercial scientific purposes. If you used this dataset in scientific publication, cite the following paper:

Takuma Yagi, Karttikeya Mangalam, Ryo Yonetani and Yoichi Sato. Future Person Localization in First-Person Videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.

@InProceedings{yagi2018future,
    title={Future Person Localization in First-Person Videos},
    author={Yagi, Takuma and Mangalam, Karttikeya and Yonetani, Ryo and Sato, Yoichi},
    booktitle={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2018}
}

fpl's People

Contributors

takumayagi avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.