Hobin Ryu, Sunghun Kang, Haeyong Kang, and Chang D. Yoo. AAAI 2021. [arxiv]
- Ubuntu 16.04
- CUDA 9.2
- cuDNN 7.4.2
- Java 8
- Python 2.7.12
- PyTorch 1.1.0
- Other python packages specified in requirements.txt
$ pip install -r requirements.txt
- Download the GloVe embedding from here and place it at `data/Embeddings/GloVe/GloVe_300.json`.
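Judging from the path above, the embedding file appears to be a JSON mapping each vocabulary word to its 300-dimensional GloVe vector; a minimal loading sketch under that assumption (the toy file stands in for the real `GloVe_300.json`):

```python
import json

def load_glove(fpath):
    # Assumed format: {"word": [300 floats], ...} -- verify against your copy.
    with open(fpath) as f:
        return json.load(f)

# Tiny synthetic file standing in for the real data/Embeddings/GloVe/GloVe_300.json.
toy = {"video": [0.1] * 300, "caption": [0.2] * 300}
with open("GloVe_300.json", "w") as f:
    json.dump(toy, f)

glove = load_glove("GloVe_300.json")
print(len(glove["video"]))  # 300
```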
- Extract features from the datasets and place them at `data/<DATASET>/features/<NETWORK>.hdf5`. For example, the ResNet101 features of the MSVD dataset will be located at `data/MSVD/features/ResNet101.hdf5`. I refer to this repo for extracting the ResNet101 features, and this repo for extracting the 3D-ResNeXt101 features.
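A sketch of reading such a feature file with `h5py`. The layout is an assumption (one dataset per video ID, shaped `(num_frames, feature_dim)`); the tiny file created here stands in for the real extraction output, so check the structure against your own file:

```python
import h5py
import numpy as np

# Create a small stand-in feature file (the real one comes from the
# extraction repos referenced above). Assumed layout: one dataset per
# video ID with shape (num_frames, feature_dim).
with h5py.File("ResNet101.hdf5", "w") as f:
    f.create_dataset("vid1", data=np.random.rand(28, 2048).astype("float32"))

# Read the features back for a given video ID.
with h5py.File("ResNet101.hdf5", "r") as f:
    feats = f["vid1"][:]
print(feats.shape)  # (28, 2048)
```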
- Split the features into train, val, and test sets by running the following commands.

$ python -m split.MSVD
$ python -m split.MSR-VTT
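Roughly, a split script like `split.MSVD` partitions the video IDs by the standard MSVD split (1200 train / 100 val / 670 test) and groups the features accordingly; a simplified sketch of that partitioning (the ID naming is an assumption, and the real module operates on the HDF5 file directly):

```python
# Hypothetical sketch of the splitting step. IDs are assumed to look like
# "vid1"..."vid1970"; the standard MSVD split assigns the first 1200 videos
# to train, the next 100 to val, and the remaining 670 to test.
def split_msvd(video_ids):
    splits = {"train": [], "val": [], "test": []}
    for vid in video_ids:
        idx = int(vid.replace("vid", ""))
        if idx <= 1200:
            splits["train"].append(vid)
        elif idx <= 1300:
            splits["val"].append(vid)
        else:
            splits["test"].append(vid)
    return splits

splits = split_msvd([f"vid{i}" for i in range(1, 1971)])
print(len(splits["train"]), len(splits["val"]), len(splits["test"]))  # 1200 100 670
```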
You can skip steps 2-3 and download the files below:
- MSVD
- MSR-VTT
Clone the evaluation code from the official coco-caption repo.
$ git clone https://github.com/tylin/coco-caption.git
$ mv coco-caption/pycocoevalcap .
$ rm -rf coco-caption
$ python extract_negative_videos.py
Alternatively, you can skip this step, as the output files are already provided at `data/<DATASET>/metadata/neg_vids_<SPLIT>.json`.
$ python train.py
You can change some hyperparameters by modifying `config.py`.
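For orientation, a hypothetical sketch of the kind of values a training config holds; the attribute names here are illustrative, not the repo's actual `config.py` contents:

```python
# Illustrative config object; check config.py for the real attribute names
# and default values before editing.
class TrainConfig:
    dataset = "MSVD"           # or "MSR-VTT"
    feature = "ResNet101"      # which visual feature file to load
    batch_size = 32
    learning_rate = 1e-4
    epochs = 30

cfg = TrainConfig()
print(cfg.dataset, cfg.batch_size)
```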
$ python evaluate.py --ckpt_fpath <MODEL_CHECKPOINT_PATH>
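The evaluation script takes the checkpoint path as a flag; a minimal `argparse` sketch of that command-line interface (assumed shape, and the example checkpoint path is hypothetical; the real `evaluate.py` may define additional flags):

```python
import argparse

# Minimal sketch of the interface shown above.
parser = argparse.ArgumentParser(description="Evaluate a trained captioning model.")
parser.add_argument("--ckpt_fpath", required=True, help="path to the model checkpoint")

# Example invocation with a hypothetical checkpoint path.
args = parser.parse_args(["--ckpt_fpath", "checkpoints/best.ckpt"])
print(args.ckpt_fpath)  # checkpoints/best.ckpt
```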
The source code in this repository is released under the MIT License.