stanfordnlp / mac-network

Implementation for the paper "Compositional Attention Networks for Machine Reasoning" (Hudson and Manning, ICLR 2018)

License: Apache License 2.0

Python 100.00%
attention clevr machine-reasoning compositional-attention-networks tensorflow question-answering vqa

mac-network's Introduction

StanfordNLP: A Python NLP Library for Many Human Languages


⚠️ Note ⚠️

All development, issues, ongoing maintenance, and support have been moved to our new GitHub repository, as the toolkit has been renamed to Stanza as of version 1.0.0. Please visit our new website for more information. You can still download stanfordnlp via pip, but newer versions of this package will be made available as stanza. This repository is kept for archival purposes.

The Stanford NLP Group's official Python NLP library. It contains packages for running our latest fully neural pipeline from the CoNLL 2018 Shared Task and for accessing the Java Stanford CoreNLP server. For detailed information please visit our official website.

References

If you use our neural pipeline including the tokenizer, the multi-word token expansion model, the lemmatizer, the POS/morphological features tagger, or the dependency parser in your research, please kindly cite our CoNLL 2018 Shared Task system description paper:

@inproceedings{qi2018universal,
 address = {Brussels, Belgium},
 author = {Qi, Peng  and  Dozat, Timothy  and  Zhang, Yuhao  and  Manning, Christopher D.},
 booktitle = {Proceedings of the {CoNLL} 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies},
 month = {October},
 pages = {160--170},
 publisher = {Association for Computational Linguistics},
 title = {Universal Dependency Parsing from Scratch},
 url = {https://nlp.stanford.edu/pubs/qi2018universal.pdf},
 year = {2018}
}

The PyTorch implementation of the neural pipeline in this repository is due to Peng Qi and Yuhao Zhang, with help from Tim Dozat and Jason Bolton.

This release is not the same as Stanford's CoNLL 2018 Shared Task system. The tokenizer, lemmatizer, morphological features, and multi-word term systems are a cleaned up version of the shared task code, but in the competition we used a Tensorflow version of the tagger and parser by Tim Dozat, which has been approximately reproduced in PyTorch (though with a few deviations from the original) for this release.

If you use the CoreNLP server, please cite the CoreNLP software package and the respective modules as described here ("Citing Stanford CoreNLP in papers"). The CoreNLP client is mostly written by Arun Chaganty, and Jason Bolton spearheaded merging the two projects together.

Issues and Usage Q&A

To ask questions, report issues or request features, please use the GitHub Issue Tracker.

Setup

StanfordNLP supports Python 3.6 or later. We strongly recommend that you install StanfordNLP from PyPI. If you already have pip installed, simply run:

pip install stanfordnlp

This should also resolve all of the dependencies of StanfordNLP, for instance PyTorch 1.0.0 or above.

If you currently have a previous version of stanfordnlp installed, use:

pip install stanfordnlp -U

Alternatively, you can also install from the source of this git repository, which will give you more flexibility in developing on top of StanfordNLP and training your own models. For this option, run

git clone https://github.com/stanfordnlp/stanfordnlp.git
cd stanfordnlp
pip install -e .

Running StanfordNLP

Getting Started with the neural pipeline

To run your first StanfordNLP pipeline, simply follow these steps in your Python interactive interpreter:

>>> import stanfordnlp
>>> stanfordnlp.download('en')   # This downloads the English models for the neural pipeline
# IMPORTANT: The above line prompts you before downloading, which doesn't work well in a Jupyter notebook.
# To avoid a prompt when using notebooks, instead use: >>> stanfordnlp.download('en', force=True)
>>> nlp = stanfordnlp.Pipeline() # This sets up a default neural pipeline in English
>>> doc = nlp("Barack Obama was born in Hawaii.  He was elected president in 2008.")
>>> doc.sentences[0].print_dependencies()

The last command will print out the words in the first sentence of the input string (or Document, as it is represented in StanfordNLP), along with, for each word, the index of the word that governs it in the Universal Dependencies parse of that sentence (its "head") and the dependency relation between the two words. The output should look like:

('Barack', '4', 'nsubj:pass')
('Obama', '1', 'flat')
('was', '4', 'aux:pass')
('born', '0', 'root')
('in', '6', 'case')
('Hawaii', '4', 'obl')
('.', '4', 'punct')

Note: If you are running into issues like OSError: [Errno 22] Invalid argument, it's very likely that you are affected by a known Python issue, and we recommend upgrading to Python 3.6.8 or later (for Python 3.6) or Python 3.7.2 or later (for Python 3.7).

We also provide a multilingual demo script that demonstrates how to use StanfordNLP in languages other than English, for example Chinese (traditional):

python demo/pipeline_demo.py -l zh
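
The equivalent calls in the interactive interpreter look roughly like the following (a minimal sketch mirroring the English example above; the 'zh' language code selects the default traditional Chinese models, and the example sentence is illustrative):

>>> import stanfordnlp
>>> stanfordnlp.download('zh')             # downloads the Chinese models (prompts before downloading)
>>> nlp = stanfordnlp.Pipeline(lang='zh')  # sets up a default Chinese pipeline
>>> doc = nlp("達沃斯世界經濟論壇是每年全球政商界領袖聚在一起的年度盛事。")
>>> doc.sentences[0].print_dependencies()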

See our getting started guide for more details.

Access to Java Stanford CoreNLP Server

Aside from the neural pipeline, this project also includes an official wrapper for accessing the Java Stanford CoreNLP server with Python code.

There are a few initial setup steps.

  • Download Stanford CoreNLP and models for the language you wish to use
  • Put the model jars in the distribution folder
  • Tell the python code where Stanford CoreNLP is located: export CORENLP_HOME=/path/to/stanford-corenlp-full-2018-10-05

We provide another demo script that shows how one can use the CoreNLP client and extract various annotations from it.
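
As a rough illustration, a client session looks like the following (a minimal sketch; the annotator list and memory setting are illustrative choices, and CORENLP_HOME must already be exported as described above):

from stanfordnlp.server import CoreNLPClient

text = "Barack Obama was born in Hawaii."

# The client launches the Java CoreNLP server in the background using CORENLP_HOME,
# annotates the text, and shuts the server down when the block exits.
with CoreNLPClient(annotators=['tokenize', 'ssplit', 'pos', 'lemma', 'ner'],
                   timeout=30000, memory='4G') as client:
    ann = client.annotate(text)
    # ann is a protobuf Document; print the part-of-speech tag of each token
    for token in ann.sentence[0].token:
        print(token.word, token.pos)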

Online Colab Notebooks

To get you started, we also provide interactive Jupyter notebooks in the demo folder. You can also open these notebooks and run them interactively on Google Colab. To view all available notebooks, follow these steps:

  • Go to the Google Colab website
  • Navigate to File -> Open notebook, and choose GitHub in the pop-up menu
  • Note that you do not need to give Colab access permission to your GitHub account
  • Type stanfordnlp/stanfordnlp in the search bar, and press Enter

Trained Models for the Neural Pipeline

We currently provide models for all of the treebanks in the CoNLL 2018 Shared Task. You can find instructions for downloading and using these models here.
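
For instance, a model trained on a specific treebank can be downloaded and used roughly as follows (a minimal sketch; en_ewt is one example treebank code from the shared task, and the treebank keyword shown here follows the pipeline documentation rather than anything stated above):

>>> import stanfordnlp
>>> stanfordnlp.download('en_ewt')   # download the models trained on the English EWT treebank
>>> nlp = stanfordnlp.Pipeline(lang='en', treebank='en_ewt')
>>> doc = nlp("Barack Obama was born in Hawaii.")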

Batching To Maximize Pipeline Speed

To maximize speed, it is essential to run the pipeline on batches of documents. Running a for loop over one sentence at a time will be very slow. The best approach at this time is to concatenate documents together, with each document separated by a blank line (i.e., two line breaks \n\n). The tokenizer will recognize blank lines as sentence breaks. We are actively working on improving multi-document processing.
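
A minimal sketch of this pattern (the document list is illustrative):

import stanfordnlp

nlp = stanfordnlp.Pipeline()  # default English pipeline, as above

documents = [
    "Barack Obama was born in Hawaii. He was elected president in 2008.",
    "Stanford University was founded in 1885. It is located in California.",
]

# Join all documents into one string separated by blank lines,
# then run the pipeline once instead of once per document.
doc = nlp("\n\n".join(documents))

for sentence in doc.sentences:
    sentence.print_dependencies()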

Training your own neural pipelines

All neural modules in this library, including the tokenizer, the multi-word token (MWT) expander, the POS/morphological features tagger, the lemmatizer and the dependency parser, can be trained with your own CoNLL-U format data. Currently, we do not support model training via the Pipeline interface. Therefore, to train your own models, you need to clone this git repository and set up from source.

For detailed step-by-step guidance on how to train and evaluate your own models, please visit our training documentation.

LICENSE

StanfordNLP is released under the Apache License, Version 2.0. See the LICENSE file for more details.

mac-network's People

Contributors

dorarad, drtonyr, kamalkraj

mac-network's Issues

About MAC on GQA-like images

Hello,

I would like to run the model on images that are not in the GQA dataset, but as if they were in GQA (basically I just want to replace some images of the dataset with other images and keep asking the same questions). For running the model on GQA I simply followed the instructions on the GQA branch, which consist of downloading the spatial features and the object features and then merging them.

But how do I extract those features from other images? I saw the extract_features.py script, but I don't fully understand how to use it to extract both spatial and object features. And what about the other parameters (image_height, image_width, model_stage, batch_size)? What should I use to extract features in the same way as the ones that you generated and made available for download?

Thanks in advance.

ValueError while training on data1.2

I followed the instructions on the readme.md and the training part worked with no error with data.zip before.

However, when I cloned the current version of the gqa branch and followed the instructions in the same readme from the beginning (downloading everything again and merging), I got the following error in the training part just after the first epoch:

ValueError: Index (148690) out of range (0-108076)

Full stack trace after the first epoch is as follows:

eb  1, 78 (10000 / 10000), t = 0.32 (0.00+0.23), lr 0.0003, l = 1.9574, a = 0.5469, avL = 1.9961, avA = 0.5434, g = -1.0000, emL = 2.0224, emA = 0.5367; gqaEx
Traceback (most recent call last):
  File "main.py", line 848, in <module>
    main()
  File "main.py", line 710, in main
    evalRes = runEvaluation(sess, model, data["main"], dataOps, epoch, getPreds = getPreds, prevRes = evalRes)
  File "main.py", line 251, in runEvaluation
    minLoss = prevRes["val"]["minLoss"] if prevRes else float("inf"))
  File "main.py", line 571, in runEpoch
    imagesBatch = loadImageBatch(data["images"], batch)
  File "main.py", line 363, in loadImageBatch
    imageBatch[i, 0:numObjects] = toFile(imageId)["features"][imageId["idx"], 0:numObjects]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/dataset.py", line 553, in __getitem__
    selection = sel.select(self.shape, args, dsid=self.id)
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/selections.py", line 94, in select
    sel[args]
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/selections.py", line 261, in __getitem__
    start, count, step, scalar = _handle_simple(self.shape,args)
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/selections.py", line 457, in _handle_simple
    x,y,z = _translate_int(int(arg), length)
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/selections.py", line 477, in _translate_int
    raise ValueError("Index (%s) out of range (0-%s)" % (exp, length-1))
ValueError: Index (148690) out of range (0-108076)

About fine-tuning on CLEVR-Humans

Hello again!

If you don't mind, I have one more question about the detailed procedure for fine-tuning on the CLEVR-Humans dataset.

I was able to reproduce the 12-step MAC's accuracy (98.9%) using PyTorch, but failed to reproduce the CLEVR-Humans result after fine-tuning (I got 76.6%, lower than the paper's 81.5%).

My fine-tuning was done by (1) loading the model fully trained on CLEVR, (2) initializing the new words' embedding vectors just like the original words, and (3) re-training the model ONLY on the CLEVR-Humans train set, following the original model's learning schedule.

It seems your fine-tuning code trains the model on a mixture of the CLEVR and CLEVR-Humans train sets rather than only the CLEVR-Humans train set (sorry if I misread again 😢), so I'm guessing this difference might be the reason.

Since using the mixture of both datasets will take longer than just using CLEVR-Humans, I'm opening this issue thinking you might have encountered the same problem and could help me out.

Thanks!

About Evaluation

For the evaluation I ran the command given in the readme with the "--test" parameter, but it gives an "Index out of range" error. What might be the cause?

Testing on epoch 25...
2 (0.00+0.24), lr 0.003, l = 2.3097, a = 0.5703, avL = 2.5296, avA = 0.6206, g = -1.0000, emL = 2.4391, emA = 0.6370; gqaExperiment
Traceback (most recent call last):
  File "main.py", line 850, in <module>
    main()
  File "main.py", line 777, in main
    evalRes = runEvaluation(sess, model, data["main"], dataOps, epoch, evalTest = False, getPreds = True)
  File "main.py", line 258, in runEvaluation
    minLoss = prevRes["test"]["minLoss"] if prevRes else float("inf"))
  File "main.py", line 573, in runEpoch
    imagesBatch = loadImageBatch(data["images"], batch)
  File "main.py", line 365, in loadImageBatch
    imageBatch[i, 0:numObjects] = toFile(imageId)["features"][imageId["idx"], 0:numObjects]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/dataset.py", line 553, in __getitem__
    selection = sel.select(self.shape, args, dsid=self.id)
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/selections.py", line 94, in select
    sel[args]
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/selections.py", line 261, in __getitem__
    start, count, step, scalar = _handle_simple(self.shape,args)
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/selections.py", line 457, in _handle_simple
    x,y,z = _translate_int(int(arg), length)
  File "/home/ec2-user/conda/envs/tf_gpu/lib/python3.6/site-packages/h5py/_hl/selections.py", line 477, in _translate_int
    raise ValueError("Index (%s) out of range (0-%s)" % (exp, length-1))
ValueError: Index (150458) out of range (0-148854)

questions are not related to images

Hi,

I just trained a baseline (LSTM+CNN) and checked the predictions, and I noticed that in the generated json file several images have questions that do not correspond to the objects in them. For example, this image (id: 2359959) has the questions:
Is there a sandwich in the image?
What kind of food is it?
Is the sandwich on the right?
Are there any clocks or flags?

But actually there is no food in the image.

Also, in this image (id: 2371593) the question is:
In which part of the picture is the cat, the bottom or the top?

But actually the object is a person.

The questions are automatically generated for the images using the scene graph, and I am confused about the step in which these mistakes could happen.

Thanks!

Train on full GQA dataset

Hi, I currently work on this dataset.
I followed the guide and successfully trained on Data1.2.zip and the CLEVR version,
but now I want to train on the full dataset from the website (70 GB).

  1. Is Data1.2.zip a small subset of the dataset on the website, or does it have all questions and images (with just some unnecessary parts removed)?
  2. If it is a subset, what should I do to run the baseline model on the "full" dataset?

thanks for your great work

Scene graph

Just wondering, does anyone have the scene graphs for all splits of the GQA dataset?

Training on VQAv2

I see that there is an option to choose the dataset to be VQA. I wanted to know whether I can train it on VQAv2 and, if so, how.

-bash: fork: Cannot allocate memory

Thanks for the repo

On trying to evaluate the model using python main.py --expName "gqaExperiment" --finalTest --testedNum 1000 --netLength 4 -r --submission --getPreds @configs/gqa/gqa.txt, it always seems to stop preprocessing at 64% and gives the error -bash: fork: Cannot allocate memory

Does this have something to do with having only 16 GB of RAM or would this be because of some other issue?

Thanks

KeyError: '11183447'

Hi, I uploaded test.json to EvalAI (test2019 phase) and got the following error:

Traceback (most recent call last):
  File "/code/scripts/workers/submission_worker.py", line 336, in run_submission
    submission_metadata=submission_serializer.data,
  File "/tmp/tmpnr9tlxl6/compute/challenge_data/challenge_225/main.py", line 96, in evaluate
    output["result"].append({tier: getScores(questions, questions, predictions, tier, kwargs['submission_metadata']['method_name'])})
  File "/tmp/tmpnr9tlxl6/compute/challenge_data/challenge_225/main.py", line 315, in getScores
    predicted = predictions[qid]
KeyError: '11183447'

I find that question id '11183447' is in the validation split, not in the test split.
So it is strange that there is a KeyError here in the test2019 phase.

AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'rnn_cell'

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
Traceback (most recent call last):
  File "main.py", line 23, in <module>
    from model import MACnet
  File "/home/gpuuser/shikha_phd/shikha/mac-network_figure_qa/model.py", line 6, in <module>
    import ops
  File "/home/gpuuser/shikha_phd/shikha/mac-network_figure_qa/ops.py", line 5, in <module>
    from mi_gru_cell import MiGRUCell
  File "/home/gpuuser/shikha_phd/shikha/mac-network_figure_qa/mi_gru_cell.py", line 4, in <module>
    class MiGRUCell(tf.nn.rnn_cell.RNNCell):
AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'rnn_cell'

preprocess.py extraDataset

First of all, I am trying to use extraDataset. There is one bug I found in the preprocessData function:
for tier in extraData should be extraDataset

Secondly, it is unclear how to use the extra options, especially what extraVal means. Does it mean that only the extra validation set is used in validation? The code seems to only train on the validation set, so does it mean that the model only trains on the validation set of the extra data?

Not found: Key macModel/MACnetwork/MACCell/linearLayerqInput10/biases/bias not found in checkpoint

Could not run the evaluation code: python main.py --expName "clevrExperiment" --finalTest --testedNum 10000 --netLength 16 -r --getPreds --getAtt @configs/args.txt
Due to error: Not found: Key macModel/MACnetwork/MACCell/linearLayerqInput10/biases/bias not found in checkpoint.
Checking main.py, I found that the bias was not saved.
Can you suggest how to fix this issue, and where in the code the bias is named and saved to the checkpoint? Thank you very much.

NotFoundError: 2 root error(s) found.
(0) Not found: Key macModel/MACnetwork/MACCell/linearLayerqInput10/biases/bias not found in checkpoint
[[{{node save/RestoreV2}}]]
(1) Not found: Key macModel/MACnetwork/MACCell/linearLayerqInput10/biases/bias not found in checkpoint
[[{{node save/RestoreV2}}]]
[[save/RestoreV2/_309]]
0 successful operations.
0 derived errors ignored.

I can't reproduce the performance reported in the paper

Thanks for your interesting work. I ran into some confusion when running your code.
I ran mac-network on the GQA dataset following your GitHub guidance, but the validation accuracy is lower than the paper reports. The performance I get is listed below:
mac-network valid accuracy: 43.82
GQA-LSTM valid accuracy: 45.54
GQA-LSTMCNN valid accuracy: 42.03
Am I doing something wrong?

About memory's variational dropout

First, thanks for sharing your great work.

While following your code, a question came up about the variational dropout on the memory vector. (https://github.com/stanfordnlp/mac-network/blob/master/mac_cell.py#L215, https://github.com/stanfordnlp/mac-network/blob/master/mac_cell.py#L590)

It seems the mask is generated once when building the graph, keeping its shape (64, 512) afterwards, and is always applied, whether the model is in training or evaluation.

Since this kind of dropout produces stochastic results during evaluation and accepts only a fixed batch size, I am wondering whether it is OK to apply such a method.

Correct me if I'm reading the code wrong.
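
(For readers less familiar with the term, here is a minimal standalone sketch of variational dropout in TensorFlow 1.x, where one mask is sampled, reused across recurrent steps, and disabled at evaluation time via a training flag. This only illustrates the general technique being discussed; it is not the repository's mac_cell.py code, and the function name and flag are hypothetical.)

import tensorflow as tf

def variational_dropout(x, keep_prob, is_training, mask=None):
    # Sample a single inverted-dropout mask with the same shape as x; reusing
    # the returned mask at every recurrent step ties the dropout pattern in time.
    if mask is None:
        mask = tf.floor(tf.random_uniform(tf.shape(x)) + keep_prob) / keep_prob
    dropped = x * mask
    # is_training is a scalar boolean tensor/placeholder: apply the mask only
    # during training and return x unchanged at evaluation time.
    out = tf.cond(is_training, lambda: dropped, lambda: x)
    return out, mask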

how to submit to test server

Thanks for this nice repo. I have run experiments on GQA with no problems. After training the model, I could not find instructions on how to create the .json file for submitting to the EvalAI test server. Maybe you mentioned it somewhere, but I did not find it. It would be great if you could let me know how such a .json file can be created in order to submit to the test server. Thank you!

Scene graph baseline for GQA

Hello,

Is there a way to run the scene graph baseline reported in the paper or are there any available details on how to implement it?

About Grounding in GQA

Thanks for the repo

I see that you did the grounding experiments on the GQA dataset in the paper, but I'm confused about them. For example, how did you transform GQA questions into grounding sentences, and what is the corresponding answer for a GQA question in the grounding experiments? Could you give more details? Thanks.

GQA 2020 submission

I generated submit_predict.json and submitted it to the GQA evaluation server. However, I got an accuracy of 0 in the test phase, while the result in the dev phase makes sense. Is it possible that I predicted all wrong answers on the test split?

What is wrong with the submission file?

About object features

Hi~, thank you for your great work. I have one question about the object features. The object feature files just contain the object features and the objects' bounding boxes; how can I know the object classes and attributes? I know that sceneGraphs.zip provides information about the objects, attributes and relations in the image, but how does this information correspond to the object feature files?
Looking forward to your response, thank you~

using it for custom questions?

Hey,

After training it, I want to use the model to take custom questions given some image from the dataset.

I am not sure if I am right, but it feels like the code does not have that mode; in particular, it loads the json files for val/test and re-initializes the embeddings. Also, the vocab file takes quite a while to load from the .pkl files.
Can you please help me with this?
Can you please help me with this?

Loaded runtime CuDNN library: 7102 (compatibility version 7100) but source was compiled with 7004 (compatibility version 7000).

Hi, I'm trying to run the baseline:
CUDA_VISIBLE_DEVICES=1,2 python main.py --expName "gqaLSTM-CNN" --train --testedNum 10000 --epochs 25 @configs/gqa/gqaLSTMCNN.txt

I used tensorflow 1.5 with cudnn-7.3.1 and cuda-toolkit-9.0, but I got the error:
Preprocess data...
load dictionaries
Loading data...
Reading tier train
Reading tier val
Reading tier testdev
took 26.13 seconds
Loading word vectors...
loaded embs from file
took 0.02 seconds
Vectorizing data...
took 6.98 seconds
answerWordsNum
1845
took 35.19 seconds
Building model...
took 4.80 seconds
2019-04-08 09:23:06.386644: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-04-08 09:23:11.112367: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:06:00.0
totalMemory: 7.93GiB freeMemory: 7.81GiB
2019-04-08 09:23:11.251205: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 1 with properties:
name: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076
pciBusID: 0000:09:00.0
totalMemory: 11.93GiB freeMemory: 2.15GiB
2019-04-08 09:23:11.251260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Device peer to peer matrix
2019-04-08 09:23:11.251277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] DMA: 0 1
2019-04-08 09:23:11.251322: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 0: Y N
2019-04-08 09:23:11.251329: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 1: N Y
2019-04-08 09:23:11.251341: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:06:00.0, compute capability: 6.1)
2019-04-08 09:23:11.251352: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: GeForce GTX TITAN X, pci bus id: 0000:09:00.0, compute capability: 5.2)
Initializing weights
Training epoch 1...
2019-04-08 09:23:22.341741: E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7102 (compatibility version 7100) but source was compiled with 7004 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2019-04-08 09:23:22.342735: F tensorflow/core/kernels/conv_ops.cc:717] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
Aborted

How could I fix it?

Best,
Ziyan

Pretrained mac network

Thanks for this awesome code base and dataset! :)
Do you plan to release pretrained weights for the mac network?

consistency, validity, and plausibility in GQA

Dear @dorarad, I have encountered several problems while running a project on GQA. Could you please help me?

  1. Consistency evaluation. Which .json should be used to evaluate consistency? I used testdev_balanced_questions.json, but a KeyError occurred for ['2062326']. I found that this id is included in testdev_all_questions.json.

  2. Validity and Plausibility. According to the provided eval.py, the json files should be train_choices.json and val_choices.json. A KeyError: '201497576' is triggered at the line: valid = belongs(predicted, choices[qid]["valid"], question). Also, the two files have no ["valid"] or ["plausible"] entries.

Could you please help me to solve these problems? Thank you

Number of unique answers.

Thank you very much for the nice dataset!

I have a question about the number of unique answers in the GQA dataset.
When computing the number of unique answers I get:

  • 1845 answers for the training split (based on combining each 'answer' of the 10 training files)
  • 1852 answers for train + valid
  • 1853 for train + valid + test_dev splits.

In the paper you mention that there are 1878; is this discrepancy caused by some answers being present only in the test split?

Have a great day :)

Yana

License?

Hi,

Awesome paper :) Question: what is the license for the code?

Hugh

About object number

Hi,
I just processed the scene graphs and the object features, but I found that the number of objects in scenegraph.json is not equal to that in the gqa_objects.h5 file. For example, the number of objects for image '2386621' is 16 in train_sceneGraphs.json but 18 in the gqa_objects.h5 file. Is there anything wrong with my processing? And how can I match the object numbers to the features?
Thanks!

Evaluation error

Hello! Thank you so much for putting up this beautiful work!

After training on the args.txt configuration, I proceeded to evaluate as per the instructions. I obtained the following error. Do you know what is going on?

2020-06-26 00:53:56.788567: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-26 00:53:56.795400: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2599845000 Hz
2020-06-26 00:53:56.795894: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0xecc6dc0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-26 00:53:56.795930: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-06-26 00:53:56.801185: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-06-26 00:53:56.923729: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0xecf9b60 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-06-26 00:53:56.923796: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P40, Compute Capability 6.1
2020-06-26 00:53:56.925923: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:84:00.0
2020-06-26 00:53:56.927409: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /share/apps/python3/3.7.3/intel/lib:/share/apps/gcc/6.3.0/lib64:/share/apps/gcc/6.3.0/lib:/share/apps/mpc/1.0.3/gnu/lib:/share/apps/mpfr/3.1.5/gnu/lib:/share/apps/gmp/6.1.2/gnu/lib:/share/apps/intel/19.0.1/mkl/lib/intel64:/share/apps/intel/19.0.1/lib/intel64:/share/apps/centos/7/usr/lib64:/opt/slurm/lib64
2020-06-26 00:53:56.928514: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /share/apps/python3/3.7.3/intel/lib:/share/apps/gcc/6.3.0/lib64:/share/apps/gcc/6.3.0/lib:/share/apps/mpc/1.0.3/gnu/lib:/share/apps/mpfr/3.1.5/gnu/lib:/share/apps/gmp/6.1.2/gnu/lib:/share/apps/intel/19.0.1/mkl/lib/intel64:/share/apps/intel/19.0.1/lib/intel64:/share/apps/centos/7/usr/lib64:/opt/slurm/lib64
2020-06-26 00:53:56.929673: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /share/apps/python3/3.7.3/intel/lib:/share/apps/gcc/6.3.0/lib64:/share/apps/gcc/6.3.0/lib:/share/apps/mpc/1.0.3/gnu/lib:/share/apps/mpfr/3.1.5/gnu/lib:/share/apps/gmp/6.1.2/gnu/lib:/share/apps/intel/19.0.1/mkl/lib/intel64:/share/apps/intel/19.0.1/lib/intel64:/share/apps/centos/7/usr/lib64:/opt/slurm/lib64
2020-06-26 00:53:56.930791: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /share/apps/python3/3.7.3/intel/lib:/share/apps/gcc/6.3.0/lib64:/share/apps/gcc/6.3.0/lib:/share/apps/mpc/1.0.3/gnu/lib:/share/apps/mpfr/3.1.5/gnu/lib:/share/apps/gmp/6.1.2/gnu/lib:/share/apps/intel/19.0.1/mkl/lib/intel64:/share/apps/intel/19.0.1/lib/intel64:/share/apps/centos/7/usr/lib64:/opt/slurm/lib64
2020-06-26 00:53:56.932004: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /share/apps/python3/3.7.3/intel/lib:/share/apps/gcc/6.3.0/lib64:/share/apps/gcc/6.3.0/lib:/share/apps/mpc/1.0.3/gnu/lib:/share/apps/mpfr/3.1.5/gnu/lib:/share/apps/gmp/6.1.2/gnu/lib:/share/apps/intel/19.0.1/mkl/lib/intel64:/share/apps/intel/19.0.1/lib/intel64:/share/apps/centos/7/usr/lib64:/opt/slurm/lib64
2020-06-26 00:53:56.933170: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /share/apps/python3/3.7.3/intel/lib:/share/apps/gcc/6.3.0/lib64:/share/apps/gcc/6.3.0/lib:/share/apps/mpc/1.0.3/gnu/lib:/share/apps/mpfr/3.1.5/gnu/lib:/share/apps/gmp/6.1.2/gnu/lib:/share/apps/intel/19.0.1/mkl/lib/intel64:/share/apps/intel/19.0.1/lib/intel64:/share/apps/centos/7/usr/lib64:/opt/slurm/lib64
2020-06-26 00:53:56.933611: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /share/apps/python3/3.7.3/intel/lib:/share/apps/gcc/6.3.0/lib64:/share/apps/gcc/6.3.0/lib:/share/apps/mpc/1.0.3/gnu/lib:/share/apps/mpfr/3.1.5/gnu/lib:/share/apps/gmp/6.1.2/gnu/lib:/share/apps/intel/19.0.1/mkl/lib/intel64:/share/apps/intel/19.0.1/lib/intel64:/share/apps/centos/7/usr/lib64:/opt/slurm/lib64
2020-06-26 00:53:56.933650: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-06-26 00:53:56.933691: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-26 00:53:56.933715: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-06-26 00:53:56.933736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
Preprocess data...
Loading data...
took 76.19 seconds
Loading word vectors...
0
{'': 0, '': 1, '': 2, '': 3, 'are': 4, 'there': 5, 'more': 6, 'big': 7, 'green': 8, 'things': 9, 'than': 10, 'large': 11, 'purple': 12, 'shiny': 13, 'cubes': 14, 'how': 15, 'many': 16, 'other': 17, 'of': 18, 'the': 19, 'same': 20, 'shape': 21, 'as': 22, 'tiny': 23, 'cyan': 24, 'matte': 25, 'object': 26, 'is': 27, 'color': 28, 'sphere': 29, 'cube': 30, 'what': 31, 'material': 32, 'that': 33, 'right': 34, 'brown': 35, 'cylinder': 36, 'and': 37, 'left': 38, 'gray': 39, 'on': 40, 'side': 41, 'small': 42, 'rubber': 43, 'behind': 44, 'thing': 45, 'to': 46, 'metallic': 47, 'size': 48, 'any': 49, 'have': 50, 'block': 51, 'blue': 52, 'yellow': 53, 'a': 54, ';': 55, 'it': 56, 'ball': 57, 'its': 58, 'in': 59, 'front': 60, 'does': 61, 'number': 62, 'red': 63, 'spheres': 64, 'made': 65, 'metal': 66, 'cylinders': 67, 'both': 68, 'balls': 69, 'or': 70, 'blocks': 71, 'objects': 72, 'visible': 73, 'another': 74, 'has': 75, 'greater': 76, 'fewer': 77, 'less': 78, 'either': 79, 'anything': 80, 'else': 81, 'do': 82, 'an': 83, 'equal': 84}
85
{'yes': 0, '2': 1, 'no': 2, 'rubber': 3, 'large': 4, '0': 5, 'sphere': 6, 'gray': 7, 'cube': 8, 'blue': 9, 'brown': 10, '1': 11, 'yellow': 12, 'purple': 13, 'cylinder': 14, 'small': 15, 'green': 16, 'metal': 17, '3': 18, '4': 19, 'cyan': 20, '6': 21, 'red': 22, '5': 23, '8': 24, '7': 25, '9': 26, '10': 27}
28
{'': 0, '': 1, '': 2, '': 3, 'are': 4, 'there': 5, 'more': 6, 'big': 7, 'green': 8, 'things': 9, 'than': 10, 'large': 11, 'purple': 12, 'shiny': 13, 'cubes': 14, 'yes': 15, 'how': 16, 'many': 17, 'other': 18, 'of': 19, 'the': 20, 'same': 21, 'shape': 22, 'as': 23, 'tiny': 24, 'cyan': 25, 'matte': 26, 'object': 27, '2': 28, 'is': 29, 'color': 30, 'sphere': 31, 'cube': 32, 'no': 33, 'what': 34, 'material': 35, 'that': 36, 'right': 37, 'brown': 38, 'cylinder': 39, 'and': 40, 'left': 41, 'rubber': 42, 'gray': 43, 'on': 44, 'side': 45, 'small': 46, 'behind': 47, 'thing': 48, '0': 49, 'to': 50, 'metallic': 51, 'size': 52, 'any': 53, 'have': 54, 'block': 55, 'blue': 56, 'yellow': 57, 'a': 58, ';': 59, 'it': 60, 'ball': 61, 'its': 62, 'in': 63, 'front': 64, 'does': 65, 'number': 66, 'red': 67, 'spheres': 68, 'made': 69, 'metal': 70, 'cylinders': 71, '1': 72, 'both': 73, 'balls': 74, 'or': 75, 'blocks': 76, 'objects': 77, 'visible': 78, 'another': 79, 'has': 80, 'greater': 81, 'fewer': 82, 'less': 83, '3': 84, '4': 85, 'either': 86, 'anything': 87, 'else': 88, 'do': 89, '6': 90, 'an': 91, 'equal': 92, '5': 93, '8': 94, '7': 95, '9': 96, '10': 97}
98
took 0.00 seconds
Vectorizing data...
took 13.70 seconds
took 89.90 seconds
Building model...
took 17.59 seconds
Traceback (most recent call last):
  File "main.py", line 802, in <module>
    main()
  File "main.py", line 691, in main
    epoch = loadWeights(sess, saver, init)
  File "main.py", line 190, in loadWeights
    config.restoreEpoch, config.lr = lastLoggedEpoch()
  File "main.py", line 62, in lastLoggedEpoch
    epoch = int(lastLine[0])
ValueError: invalid literal for int() with base 10: 'epoch'

Thank you very much in advance for the help! :) Take care :)
