This project forked from jaywongwang/densevideocaptioning

Official Tensorflow Implementation of the paper "Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning" in CVPR 2018

License: MIT License

DenseVideoCaptioning

Tensorflow Implementation of the Paper Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning by Jingwen Wang et al. in CVPR 2018.

Data Preparation

Please download the annotation data and C3D features from the ActivityNet Captions website. The ActivityNet C3D features with a stride of 64 frames (used in my paper) can be found here.

Please follow the script dataset/ActivityNet_Captions/preprocess/anchors/get_anchors.py to obtain the clustered anchors and their pos/neg weights (for handling the class imbalance problem). The generated files are already in dataset/ActivityNet_Captions/preprocess/anchors/.
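To illustrate what pos/neg weights for class imbalance can look like, here is a minimal sketch. It is not taken from get_anchors.py: the function name anchor_class_weights, the input layout (one row of 0/1 labels per sample, one column per anchor), and the weighting rule (the rarer class gets the larger weight) are all assumptions made for illustration.

```python
def anchor_class_weights(labels):
    """Return (pos_weights, neg_weights), one value per anchor.

    labels: list of rows, each row holding one 0/1 label per anchor.
    Assumed rule (hypothetical, not read from get_anchors.py):
    pos_weight = fraction of negatives, neg_weight = fraction of positives.
    """
    num_samples = len(labels)
    num_anchors = len(labels[0])
    pos_weights, neg_weights = [], []
    for k in range(num_anchors):
        # Fraction of samples for which anchor k is a positive match.
        pos_fraction = sum(row[k] for row in labels) / float(num_samples)
        pos_weights.append(1.0 - pos_fraction)
        neg_weights.append(pos_fraction)
    return pos_weights, neg_weights


# Toy labels: anchor 0 is mostly positive, anchor 1 is mostly negative.
toy_labels = [[1, 0], [1, 0], [0, 1], [1, 0]]
pos_w, neg_w = anchor_class_weights(toy_labels)
```

With this rule, an anchor that is rarely positive gets a large positive weight, so its few positive examples are not drowned out by the negatives in the proposal loss.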

Please follow the script dataset/ActivityNet_Captions/preprocess/build_vocab.py to build the word dictionary and the encoded train/val/test sentence data.
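As a rough picture of what building a word dictionary and encoding sentences involves, the sketch below may help. It is an illustration only and does not reproduce build_vocab.py: the special tokens, the whitespace tokenization, the min_count threshold, and the function names build_vocab and encode are all assumptions.

```python
from collections import Counter

# Hypothetical special tokens; the real script may use different ones.
SPECIAL_TOKENS = ['<PAD>', '<START>', '<END>', '<UNK>']


def build_vocab(sentences, min_count=1):
    """Map each sufficiently frequent word to an integer id."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    words = sorted(w for w, c in counts.items() if c >= min_count)
    return {w: i for i, w in enumerate(SPECIAL_TOKENS + words)}


def encode(sentence, vocab, max_len):
    """Encode a sentence as exactly max_len ids: START, words, END, padding."""
    ids = [vocab['<START>']]
    ids += [vocab.get(w, vocab['<UNK>']) for w in sentence.lower().split()]
    # Truncate so that the END token always fits within max_len.
    ids = ids[:max_len - 1] + [vocab['<END>']]
    ids += [vocab['<PAD>']] * (max_len - len(ids))
    return ids


vocab = build_vocab(["a man plays guitar", "a woman sings"])
encoded = encode("a man dances", vocab, max_len=6)
```

Words that never appear in the training captions ("dances" here) fall back to the unknown token, and every encoded sentence has the same fixed length so that sentences can be batched.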

Hyper Parameters

The configuration (from my experiments) is given in opt.py, including model setup, training options, and testing options.

Training

Train the dense-captioning model using the script train.py.

First pre-train the proposal module for around 5 epochs by setting train_proposal=True and train_caption=False. Then train the whole dense-captioning model by setting train_proposal=True and train_caption=True. To understand the proposal module, I refer you to the original SST paper and my TensorFlow implementation of SST.
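The two-stage schedule can be summarized in a few lines. This is a sketch only: the helper stage_options and its pretrain_epochs parameter are hypothetical and not part of train.py or opt.py. Only the flag names train_proposal and train_caption come from the description above, where the flags are switched by hand between the two runs.

```python
def stage_options(epoch, pretrain_epochs=5):
    """Return the flag values described above for a given 0-based epoch.

    Hypothetical helper for illustration; not part of the repository.
    """
    return {
        # The proposal module is trained in both stages.
        'train_proposal': True,
        # The captioning module joins once proposal pre-training is done.
        'train_caption': epoch >= pretrain_epochs,
    }


for epoch in (0, 4, 5):
    print(epoch, stage_options(epoch))
```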

Prediction

Follow the script test.py to make proposal predictions and to evaluate the predictions.
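Temporal proposals are commonly scored against ground-truth segments by their temporal intersection over union (tIoU). The sketch below shows that computation for two (start, end) segments in seconds. It is an illustration under that assumption and is not copied from test.py; the function name temporal_iou is hypothetical.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments, both in seconds."""
    # Length of the overlap, or 0 if the segments do not overlap.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


overlap = temporal_iou((0.0, 10.0), (5.0, 15.0))   # partially overlapping
disjoint = temporal_iou((0.0, 5.0), (5.0, 10.0))   # touching, no overlap
```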

Evaluation

Please note that the official evaluation metric has been updated (Line 194). The paper reports the old metric. You can still compare results across methods, since all CVPR 2018 papers report the old metric.

Results

The predicted results for val/test set can be found here.

Dependencies

tensorflow==1.0.1

python==2.7.5

Other versions may also work.

Update:

  1. I corrected some naming errors and simplified the proposal loss using a TensorFlow built-in function.
  2. I uploaded C3D features with a stride of 64 frames (used in my paper). You can find them here.
  3. I uploaded val/test results of both without joint ranking and with joint ranking.
  4. I uploaded video_fps.json and updated test.py.
  5. Due to the large file size limit, you may need to download data/paraphrase-en.gz here and put it in densevid_eval-master/coco-caption/pycocoevalcap/meteor/data/.
  6. I corrected the multi-RNN mistake caused by the get_rnn_cell() function (see model.py).
