
csqa_code's Introduction

Step 1: Download https://drive.google.com/file/d/1ccZSys8u4F_mqNJ97OOlSLe3fjpFLhdv/view?usp=sharing, extract it, and rename the extracted folder to lucene_dir.

Step 2: Download the files ent_embed.pkl.npy, rel_embed.pkl.npy, id_ent_map.pickle, and id_rel_map.pickle from https://zenodo.org/record/4052427#.X2_hWXRKhQI and place them in a directory named transe_dir.

Step 3: Download https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing and put it in a folder named glove_dir.

Step 4: Download the wikidata JSONs from https://zenodo.org/record/4052427#.X2_hWXRKhQI and put them in a folder named wikidata_dir.

Step 5: Put the correct (complete) paths to wikidata_dir, lucene_dir, transe_dir, and glove_dir in params.py and params_test.py.

Step 6: In both params.py and params_test.py, set param['type_of_loss'] = "decoder" (a sketch of these settings is shown below).
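
For reference, the relevant entries in params.py (and params_test.py) might look roughly like the following sketch. Only param['type_of_loss'] is taken from these steps; the key names used for the directory paths are assumptions and should be checked against the actual params.py.

    # Hypothetical sketch of params.py settings. The path key names are
    # assumptions; verify them against the real params.py.
    param['wikidata_dir'] = '/absolute/path/to/wikidata_dir'
    param['lucene_dir'] = '/absolute/path/to/lucene_dir'
    param['transe_dir'] = '/absolute/path/to/transe_dir'
    param['glove_dir'] = '/absolute/path/to/glove_dir'
    param['type_of_loss'] = "decoder"  # use "kvmem" for the key-value memory version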

Step 7: Create a folder, say 'Target_Model_decoder', where you want the decoder model to be dumped, and create two folders inside it, 'dump' and 'model' (e.g. 'mkdir Target_Model_decoder/dump' and 'mkdir Target_Model_decoder/model').

Step 8: Put the params.py and params_test.py from Step 6 inside the Target_Model_decoder folder.

Step 9: Create another version of params.py and params_test.py, this time setting param['type_of_loss'] = "kvmem".

Step 10: Create a folder, say 'Target_Model_kvmem', where you want the kvmem model to be dumped, and create two folders inside it, 'dump' and 'model' (e.g. 'mkdir Target_Model_kvmem/dump' and 'mkdir Target_Model_kvmem/model').

Step 11: Download train_preprocessed.zip from https://drive.google.com/file/d/1HmLOGTV_v18grW_hXpu_s6MdogEJDM_a/view?usp=sharing, extract it, and put the contents (preprocessed pickle files of the train data) into Target_Model_decoder/dump and Target_Model_kvmem/dump.

Step 12: Download valid_preprocessed.zip from https://drive.google.com/file/d/1uoBUjjidyDks0pEUehxX-ofB5B_trdpP/view?usp=sharing, extract it, and put the contents (preprocessed pickle files of the valid data) into Target_Model_decoder/dump and Target_Model_kvmem/dump.

Step 13: Download test_preprocessed.zip from https://drive.google.com/file/d/1PMOE_jQJM_avY3MItAdEI0s3GJU_Km31/view?usp=sharing, extract it, and put the contents (preprocessed pickle files of the test data) into Target_Model_decoder/dump and Target_Model_kvmem/dump.

Step 14: Run ./run.sh for training (as shown in the run.sh file), where dump_dir is the 'Target_Model_decoder' folder created earlier and data_dir is the directory containing the downloaded data.

Step 15: Run ./run_test.sh for testing (as shown in the run_test.sh file).

Step 16: To evaluate the model separately on each question type, run the following:
./run_test.sh Target_Model_decoder verify
./run_test.sh Target_Model_decoder quantitative_count
./run_test.sh Target_Model_decoder comparative_count
./run_test.sh Target_Model_kvmem simple
./run_test.sh Target_Model_kvmem logical
./run_test.sh Target_Model_kvmem quantitative
./run_test.sh Target_Model_kvmem comparative

csqa_code's People

Contributors

amritasaha1812, vardaan123


csqa_code's Issues

Restricted access to train_preprocessed.zip and valid_preprocessed.zip?

Hi,

I am unable to access train_preprocessed.zip and valid_preprocessed.zip through the links given in the README.

Step7: Download train_preprocessed.zip from https://drive.google.com/file/d/1HmLOGTV_v18grW_hXpu_s6MdogEJDM_a/view?usp=sharing and extract and put the contents (preprocessed pickle files of the train data) into Target_Model/dump
Step8: Download valid_preprocessed.zip from https://drive.google.com/file/d/1uoBUjjidyDks0pEUehxX-ofB5B_trdpP/view?usp=sharing and extract and put the contents (preprocessed pickle files of the valid data) into Target_Model/dump

Lucene index no longer supported

Hi @amritasaha1812,

I'm trying to replicate the experiments and I have this error:

org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(SimpleFSIndexInput(path="/Users/dthai/Codes/csqa-rl/CSQA_Code/lucene_index_9m/segments_1"))): 3 (needs to be between 6 and 9). This version of Lucene only supports indexes created with release 6.0 and later.

Can you share the code used to generate the Lucene index, or upload an index generated with a newer version?

Thank you!

Add requirements.txt

Add a file containing ALL dependencies.
Most importantly, add
tensorflow=0.10.0
pattern.en
lucene

The process of evaluation

Hey @vardaan123, would you mind explaining the whole evaluation process? I am a little bit confused about the evaluation part.

From the script here

Step16: For evaluating the model separately on each question type, run the following:
./run_test.sh Target_Model_decoder verify
./run_test.sh Target_Model_decoder quantitative_count
./run_test.sh Target_Model_decoder comparative_count
./run_test.sh Target_Model_kvmem simple
./run_test.sh Target_Model_kvmem logical
./run_test.sh Target_Model_kvmem quantitative
./run_test.sh Target_Model_kvmem comparative

I think you feed different types of questions to the network using different decoding methods (seq2seq/kvmem). Is this the correct way to evaluate? Assuming we get a question, how can we know its type before feeding it into the corresponding decoder?

Can you explain a little bit about the evaluation process?

BTW, is the precision/recall calculation based on the all_entities field or the entities_in_utterance field in the SYSTEM JSON response?
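
For context, entity-level precision and recall are usually computed by comparing the predicted entity set against a gold entity set. The sketch below is only illustrative; which JSON field supplies the gold set is exactly the open question above.

    # Illustrative set-based precision/recall over entity IDs.
    # 'predicted' and 'gold' are lists of entity IDs; which JSON field
    # ('all_entities' or 'entities_in_utterance') provides 'gold' is
    # the question raised above.
    def precision_recall(predicted, gold):
        predicted, gold = set(predicted), set(gold)
        if not predicted or not gold:
            return 0.0, 0.0
        overlap = len(predicted & gold)
        return overlap / float(len(predicted)), overlap / float(len(gold))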

Training without pickle dump files

Hi @amritasaha1812, if I try to run without the train data pickle files, I get the following runtime error.

Traceback (most recent call last):
  File "run_model.py", line 455, in <module>
    main()
  File "run_model.py", line 449, in main
    get_dialog_dict(param)
  File "/home/sanyam/notebooks/csqa/read_data.py", line 33, in get_dialog_dict
    ques_type_id = param['ques_type_id']
KeyError: 'ques_type_id'

To reproduce this bug, you can comment out the lines that load the existing dictionaries in run_model.py:

    # if isinstance(param['train_data_file'], list) and isinstance(param['valid_data_file'], list) and all([os.path.exists(x) for x in param['train_data_file']]) and all([os.path.exists(x) for x in param['valid_data_file']]):
	# print 'dictionary already exists'
    #     sys.stdout.flush()
    # elif isinstance(param['train_data_file'], str) and isinstance(param['valid_data_file'], str) and os.path.exists(param['train_data_file']) and os.path.exists(param['valid_data_file']):# and os.path.exists(param['test_data_file']):
    #     print 'dictionary already exists'
    #     sys.stdout.flush()
    # else:
    get_dialog_dict(param)
    print 'dictionary formed'
    sys.stdout.flush()
    run_training(param)
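
A minimal sketch of a possible workaround (an assumption, not a fix confirmed by the authors; the correct value of 'ques_type_id' is not documented here) would be to define the missing key in params.py before get_dialog_dict is called:

    # Hypothetical addition to params.py; the value below is a placeholder
    # and must be replaced by the id of the intended question type.
    param['ques_type_id'] = None  # placeholder, not a known-good default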

Complex Questions

Hi,

Does this model only work for questions whose answers are entities? Can it handle count, comparison and "Yes or No" questions? Looking forward to your reply.

Best regards
Sirui

load_wikidata_wfn.py

Some paths are hard-coded to cluster paths; these must be changed. Or, if the file is not required, those files can be removed.

`decoder_loss` runtime error

Hi,

I got a runtime error when training the model. The error was caused by this statement:

prob = tf.nn.softmax(logits)

It turns out that the variable logits is a list of tensors instead of a single tensor; as a result, applying tf.nn.softmax to the list raises an error.

Could you please tell me how to fix this? Thank you so much!
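
One possible workaround, assuming each element of logits is a per-time-step logit tensor as returned by the older seq2seq decoders (an assumption, not a confirmed fix), is to apply the softmax per step:

    # Hypothetical per-step softmax; assumes 'logits' is a Python list of 2-D tensors.
    prob = [tf.nn.softmax(step_logits) for step_logits in logits]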

Training time

Hi @amritasaha1812, I was wondering if you could help me with some training details.

  1. For how many epochs must we train the model?

  2. How much time does each epoch take? It has been about 12 hours since I started training and I am still waiting for the second epoch to complete. I am using two Nvidia 1080 Ti GPUs. I observe that GPU usage is low most of the time while CPU usage is at 100%. Is there a way to use multiple CPUs to increase the training speed?

Thanks,
