jialinwu17 / self_critical_vqa

Code for NeurIPS 2019 paper ``Self-Critical Reasoning for Robust Visual Question Answering''

vqa interpretable-deep-learning interpretable-ai explainable-ai visual-question-answering

self_critical_vqa's Introduction

Code for Self-Critical Reasoning for Robust Visual Question Answering (NeurIPS 2019 Spotlight)

This repo contains code for ''Self-Critical Reasoning for Robust Visual Question Answering'', trained with VQA-X human textual explanations. The code is modified from here; many thanks!

Prerequisites

Python 3.7.1
PyTorch 1.1.0
spaCy (we use en_core_web_lg spaCy model)
h5py, pickle, json, cv2

Preprocessing

Please download the detection features from this google drive and put them in the 'data' folder
Please run bash tools/download.sh to download other useful data files, including the VQA QA pairs and GloVe embeddings
Please run bash tools/preprocess.sh to preprocess the data
mkdir saved_models

Training

The training process is split into three stages or two stages:

Three-stage version (pretrain on VQA-CP, fine-tune using the influential strengthening loss, then fine-tune with both.)

(1) Pretrain on the VQA-CP train dataset by running
CUDA_VISIBLE_DEVICES=0 python main.py --load_hint -1 --use_all 1 --learning_rate 0.001 --split v2cp_train --split_test v2cp_test --max_epochs 40
After pretraining, you will have a saved model in saved_models, named by the start training time.
Alternatively, you can directly download a model from here.
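Since checkpoints are named by the start training time, a minimal sketch of how such a path might be constructed (the function name and timestamp format here are assumptions for illustration, not the repo's actual code):

```python
import os
import time

def make_save_dir(root="saved_models", start_time=None):
    """Return a checkpoint directory named by the training start time."""
    start_time = start_time or time.localtime()
    run_name = time.strftime("%Y-%m-%d_%H-%M-%S", start_time)
    return os.path.join(root, run_name)
```

Knowing the naming convention helps locate the checkpoint you need to reference in the later fine-tuning stages.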

(2) Fine-tune using the influential strengthening loss
Here, please replace line 86 of train.py with the path to your VQA-CP pretrained model.
Then, please run the following line to strengthen the most influential object.
CUDA_VISIBLE_DEVICES=0 python main.py --load_hint 0 --use_all 0 --learning_rate 0.00001 --split v2cp_train_vqx --split_test v2cp_test --max_epochs 12 --hint_loss_weight 20
After this stage, you will have another saved model in saved_models, named by the start training time.
Alternatively, you can directly download a model from here.
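As a loose illustration only (not the paper's exact formulation), an influential strengthening objective can be sketched as a hinge penalty that pushes the sensitivity of the human-annotated most influential object above that of the other proposals; every name below is hypothetical:

```python
def strengthen_loss(sensitivities, influential_idx, margin=0.0):
    """Hinge penalty incurred whenever another object's sensitivity
    exceeds the most influential object's sensitivity (plus a margin)."""
    s_star = sensitivities[influential_idx]
    return sum(max(0.0, s - s_star + margin)
               for i, s in enumerate(sensitivities)
               if i != influential_idx)
```

The loss is zero when the influential object already dominates, which matches the intent of "strengthening" rather than re-ranking everything.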

(3) Training with the self-critical objectives.
Here, please replace line 82 of train.py with the path to your influence-strengthened pretrained model.
Then, please run the following line for training.
CUDA_VISIBLE_DEVICES=0 python main.py --load_hint 1 --use_all 0 --learning_rate 0.00001 --split v2cp_train_vqx --split_test v2cp_test --max_epochs 5 --hint_loss_weight 20 --compare_loss_weight 1500
Alternatively, you can directly download a model from here.
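The --hint_loss_weight and --compare_loss_weight flags suggest the final objective is a weighted sum of the base VQA loss and the two auxiliary terms; a hedged sketch of that combination (variable names are assumptions, not the repo's actual code):

```python
def total_loss(vqa_loss, hint_loss, compare_loss,
               hint_loss_weight=20.0, compare_loss_weight=1500.0):
    """Weighted combination of the base VQA loss with the influential
    strengthening (hint) term and the self-critical (compare) term,
    using the default weights from the stage-3 command above."""
    return (vqa_loss
            + hint_loss_weight * hint_loss
            + compare_loss_weight * compare_loss)
```

The large compare weight implies the raw compare term is small in magnitude relative to the VQA loss.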

self_critical_vqa's People

Contributors: jialinwu17, jialinwu1717

self_critical_vqa's Issues

Could you provide answer to label maps?

Would it be possible to provide the trainval_ans2label.pkl and trainval_label2ans.pkl files? The files generated by compute_softscore.py do not seem to be consistent with the labels in the VQA_caption_{name}dataset.pkl files.

For instance, for qid 166207000, the GT answers should be robot/beep*, but the label returned by the dataset is 142, which corresponds to the answer: 'real' in label2ans.
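For anyone debugging a mismatch like this, a small consistency check can be sketched, assuming the common convention in bottom-up-attention VQA codebases where ans2label is a dict and label2ans the inverse list (the data below is a toy stand-in, not the real files):

```python
import pickle

# Toy stand-ins for trainval_ans2label.pkl / trainval_label2ans.pkl.
label2ans = ["net", "real", "robot"]
ans2label = {ans: i for i, ans in enumerate(label2ans)}

# Round-trip through pickle as the real files would be loaded.
ans2label = pickle.loads(pickle.dumps(ans2label))
label2ans = pickle.loads(pickle.dumps(label2ans))

def maps_consistent(ans2label, label2ans):
    """True iff the two maps are mutual inverses."""
    return all(label2ans[i] == ans for ans, i in ans2label.items())
```

Running the same check against the generated files and the dataset pickles would pinpoint whether the two were built from different answer vocabularies.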

Reproduction Results

Following the instructions in the readme file and using the first two pre-trained networks does not reproduce the 49.5% accuracy reported in the paper.
Following the experimental settings in the paper also doesn't reproduce the results.
I tried 20 different seeds, and the best score is 49.

Can you please share the exact settings you use to get 49.5?

Thanks!

Is there a bug in create_vqx_hint.py?

Hi, Jialin!
Recently, I have been trying to introduce 'the most influential objects' into my model. However, when I checked 'create_vqx_hint.py', I didn't understand the code at lines 187-189.

if cosine_similarity(exp_emb[attr_token:attr_token+1], atts[j:j+1]) > 0.3:
    if hint_score_attr[j] <= cosine_similarity(exp_emb[attr_token:attr_token+1], atts[j:j+1]):
        hint_score[j] = cosine_similarity(exp_emb[attr_token:attr_token+1], atts[j:j+1])

I also found that 'hint_score_attr' is never used afterwards, yet this code overwrites 'hint_score[j]', which was already assigned at line 179. Wouldn't it be more reasonable to use 'hint_score_attr[j]' in place of 'hint_score[j]' at line 189, like this:

if cosine_similarity(exp_emb[attr_token:attr_token+1], atts[j:j+1]) > 0.3:
    if hint_score_attr[j] <= cosine_similarity(exp_emb[attr_token:attr_token+1], atts[j:j+1]):
        hint_score_attr[j] = cosine_similarity(exp_emb[attr_token:attr_token+1], atts[j:j+1])
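To make the proposed fix concrete, here is a self-contained sketch of the update with a minimal cosine_similarity stand-in (the real code uses a version operating on 2-D slices; the embeddings below are toy data):

```python
import math

def cosine_similarity(u, v):
    """Minimal stand-in for the similarity used in create_vqx_hint.py."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

exp_emb = [1.0, 0.0]                  # toy embedding of one explanation token
atts = [[1.0, 0.1], [0.0, 1.0]]       # toy attribute embeddings of 2 objects
hint_score_attr = [0.0, 0.0]

for j in range(len(atts)):
    sim = cosine_similarity(exp_emb, atts[j])
    # Proposed fix: write to hint_score_attr[j], leaving hint_score intact.
    if sim > 0.3 and hint_score_attr[j] <= sim:
        hint_score_attr[j] = sim
```

With this change, only the attribute-specific scores are updated, so the scores assigned to hint_score earlier are preserved.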
