This repository contains the code for the following papers:
- Daqing Liu, Zheng-Jun Zha, Hanwang Zhang, Yongdong Zhang, Feng Wu, Context-Aware Visual Policy Network for Sequence-Level Image Captioning. In ACM MM, 2018. (PDF)
- Zheng-Jun Zha, Daqing Liu, Hanwang Zhang, Yongdong Zhang, Feng Wu, Context-Aware Visual Policy Network for Fine-Grained Image Captioning. In TPAMI, 2019. (Extended journal version. PDF)
- Install PyTorch and torchvision:
pip3 install torch torchvision
- Install Java JDK (for METEOR Metric):
apt install default-jdk
- Clone with Git, and then enter the root directory:
git clone --recursive https://github.com/daqingliu/CAVP.git && cd CAVP
- Download the image features (download link coming soon) extracted with bottom-up-attention into the data directory and unzip them.
- Download the COCO annotations (h5 and json) into the data directory.
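Before training, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of the repository: the function name `check_setup` is hypothetical, and it assumes the COCO annotation files sit directly inside the data directory with `.h5` and `.json` extensions. The image features are not checked because their file names are not stated here.

```python
import importlib.util
import shutil
from pathlib import Path


def check_setup(data_dir="data"):
    """Return a dict mapping each prerequisite to True (found) or False (missing).

    Hypothetical helper: assumes annotations are stored directly in data_dir.
    """
    data = Path(data_dir)
    return {
        # Python packages installed with pip3
        "torch": importlib.util.find_spec("torch") is not None,
        "torchvision": importlib.util.find_spec("torchvision") is not None,
        # A Java runtime on PATH is needed for the METEOR metric
        "java": shutil.which("java") is not None,
        # COCO annotations: at least one .h5 and one .json file (assumed layout)
        "h5_annotations": data.is_dir() and any(data.glob("*.h5")),
        "json_annotations": data.is_dir() and any(data.glob("*.json")),
    }


if __name__ == "__main__":
    # Print one status line per prerequisite
    for name, ok in check_setup().items():
        print(f"{name:18s} {'ok' if ok else 'MISSING'}")
```

Running the script from the repository root prints one line per prerequisite; anything marked MISSING points back to the corresponding setup step.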
Run the training and evaluation scripts:
bash run_train.sh
bash run_eval.sh
@article{zha2019context,
title={Context-aware visual policy network for fine-grained image captioning},
author={Zha, Zheng-Jun and Liu, Daqing and Zhang, Hanwang and Zhang, Yongdong and Wu, Feng},
journal={IEEE transactions on pattern analysis and machine intelligence},
year={2019},
}
Part of this repository is built upon self-critical.pytorch.