csqa_code's Introduction
Step 1: Download https://drive.google.com/file/d/1ccZSys8u4F_mqNJ97OOlSLe3fjpFLhdv/view?usp=sharing and extract it (renaming the folder to lucene_dir).
Step 2: Download the files ent_embed.pkl.npy, rel_embed.pkl.npy, id_ent_map.pickle, id_rel_map.pickle from https://zenodo.org/record/4052427#.X2_hWXRKhQI and place them in a directory named transe_dir.
Step 3: Download https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing and put it in a folder glove_dir.
Step 4: Download the wikidata JSONs from https://zenodo.org/record/4052427#.X2_hWXRKhQI and put them in a folder wikidata_dir.
Step 5: Put the correct (complete) paths to wikidata_dir, lucene_dir, transe_dir and glove_dir in params.py and params_test.py.
Step 6: In both params.py and params_test.py, set param['type_of_loss']="decoder".
Step 7: Create a folder, say Target_Model_decoder, where you want the decoder model to be dumped, and make two folders inside it, dump and model (e.g. mkdir Target_Model_decoder/dump and mkdir Target_Model_decoder/model).
Step 8: Put the params.py and params_test.py from Step 6 inside the Target_Model_decoder folder.
Step 9: Create another version of params.py and params_test.py, this time setting param['type_of_loss']="kvmem".
Step 10: Create a folder, say Target_Model_kvmem, where you want the kvmem model to be dumped, and make two folders inside it, dump and model (e.g. mkdir Target_Model_kvmem/dump and mkdir Target_Model_kvmem/model).
Step 11: Download train_preprocessed.zip from https://drive.google.com/file/d/1HmLOGTV_v18grW_hXpu_s6MdogEJDM_a/view?usp=sharing, extract it, and put the contents (preprocessed pickle files of the train data) into Target_Model_decoder/dump and Target_Model_kvmem/dump.
Step 12: Download valid_preprocessed.zip from https://drive.google.com/file/d/1uoBUjjidyDks0pEUehxX-ofB5B_trdpP/view?usp=sharing, extract it, and put the contents (preprocessed pickle files of the valid data) into Target_Model_decoder/dump and Target_Model_kvmem/dump.
Step 13: Download test_preprocessed.zip from https://drive.google.com/file/d/1PMOE_jQJM_avY3MItAdEI0s3GJU_Km31/view?usp=sharing, extract it, and put the contents (preprocessed pickle files of the test data) into Target_Model_decoder/dump and Target_Model_kvmem/dump.
Step 14: Run ./run.sh for training (the way it has been shown in the run.sh file), where dump_dir is the Target_Model_decoder folder created earlier and data_dir is the directory containing the downloaded data.
Step 15: Run ./run_test.sh for testing (the way it has been shown in the run_test.sh file).
Step 16: To evaluate the model separately on each question type, run the following:
./run_test.sh Target_Model_decoder verify
./run_test.sh Target_Model_decoder quantitative_count
./run_test.sh Target_Model_decoder comparative_count
./run_test.sh Target_Model_kvmem simple
./run_test.sh Target_Model_kvmem logical
./run_test.sh Target_Model_kvmem quantitative
./run_test.sh Target_Model_kvmem comparative
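The folder scaffolding from Steps 7 and 10 can be sketched in Python (a convenience sketch; the folder names are the ones used in the steps above, and each variant then gets its own copy of params.py / params_test.py):

```python
import os

# Sketch of the layout from Steps 7 and 10: each model variant gets its
# own dump/ and model/ subfolders. The two variants' params.py and
# params_test.py differ only in param['type_of_loss']
# ("decoder" vs "kvmem").
for model_dir in ("Target_Model_decoder", "Target_Model_kvmem"):
    for sub in ("dump", "model"):
        os.makedirs(os.path.join(model_dir, sub), exist_ok=True)
```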
csqa_code's Issues
Code for pre-processing wikidata json dump?
Could you please also share the code that you use for pre-processing the wikidata json dump? This would be an enormous help. Thanks!
Restricted access train_preprocessed.zip and valid_preprocessed.zip ?
Hi,
I am unable to access train_preprocessed.zip and valid_preprocessed.zip through the links given in the README.
Step7: Download train_preprocessed.zip from https://drive.google.com/file/d/1HmLOGTV_v18grW_hXpu_s6MdogEJDM_a/view?usp=sharing and extract and put the contents (preprocessed pickle files of the train data) into Target_Model/dump
Step8: Download valid_preprocessed.zip from https://drive.google.com/file/d/1uoBUjjidyDks0pEUehxX-ofB5B_trdpP/view?usp=sharing and extract and put the contents (preprocessed pickle files of the valid data) into Target_Model/dump
Lucene index no longer supported
Hi @amritasaha1812,
I'm trying to replicate the experiments and I have this error:
org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(SimpleFSIndexInput(path="/Users/dthai/Codes/csqa-rl/CSQA_Code/lucene_index_9m/segments_1"))): 3 (needs to be between 6 and 9). This version of Lucene only supports indexes created with release 6.0 and later.
Can you share the code to generate Lucene index or upload the generated index of the newer version?
Thank you!
Add requirements.txt
Add a file containing ALL dependencies.
Most importantly, add
tensorflow==0.10.0
pattern.en
lucene
What is the background of CSQA? Is there any paper or document explaining it?
Where to download the dataset
Hi, where can I download the CSQA question-answer dataset?
`kvmem` vs `decoder` loss?
What is the difference between `kvmem` loss and `decoder` loss? How do you decide which to use?
The process of evaluation
Hey @vardaan123, would you mind explaining the whole evaluation process? I am a little bit confused about the evaluation part.
From the script here
Step16: For evaluating the model separately on each question type, run the following:
./run_test.sh Target_Model_decoder verify
./run_test.sh Target_Model_decoder quantitative_count
./run_test.sh Target_Model_decoder comparative_count
./run_test.sh Target_Model_kvmem simple
./run_test.sh Target_Model_kvmem logical
./run_test.sh Target_Model_kvmem quantitative
./run_test.sh Target_Model_kvmem comparative
I think you feed different types of questions to the network using different decoding methods (seq2seq/kvmem). Is this the correct way to evaluate? Given a question, how can we know its type before feeding it into the corresponding decoder?
Can you explain a little bit about the evaluation process?
BTW, is the precision/recall calculation based on the `all_entities` field or the `entities_in_utterance` field in the SYSTEM JSON response?
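Whichever field turns out to be the gold reference, entity-level precision/recall reduces to set overlap between predicted and gold entity ids; a minimal sketch (the function name is hypothetical, not from this repository):

```python
def entity_precision_recall(predicted, gold):
    # Set-overlap precision/recall between predicted and gold entity ids.
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # true positives: entities in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# One shared entity out of two predicted and two gold:
p, r = entity_precision_recall({"Q1", "Q2"}, {"Q2", "Q3"})  # p=0.5, r=0.5
```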
Training without pickle dump files
Hi @amritasaha1812, if I try to run without the train data pickle files, I get the following runtime error:
Traceback (most recent call last):
File "run_model.py", line 455, in <module>
main()
File "run_model.py", line 449, in main
get_dialog_dict(param)
File "/home/sanyam/notebooks/csqa/read_data.py", line 33, in get_dialog_dict
ques_type_id = param['ques_type_id']
KeyError: 'ques_type_id'
To reproduce this bug, you can comment out the lines in run_model.py that check whether the dictionaries already exist:
# if isinstance(param['train_data_file'], list) and isinstance(param['valid_data_file'], list) and all([os.path.exists(x) for x in param['train_data_file']]) and all([os.path.exists(x) for x in param['valid_data_file']]):
# print 'dictionary already exists'
# sys.stdout.flush()
# elif isinstance(param['train_data_file'], str) and isinstance(param['valid_data_file'], str) and os.path.exists(param['train_data_file']) and os.path.exists(param['valid_data_file']):# and os.path.exists(param['test_data_file']):
# print 'dictionary already exists'
# sys.stdout.flush()
# else:
get_dialog_dict(param)
print 'dictionary formed'
sys.stdout.flush()
run_training(param)
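Until running without the pickle dumps is supported, one possible workaround (a sketch only; the valid values of `ques_type_id` are not documented here, and the helper name is hypothetical) is to make the lookup in read_data.py tolerant of the missing key:

```python
# Hypothetical guard for get_dialog_dict in read_data.py: fall back to a
# default instead of raising KeyError when 'ques_type_id' was never set.
def get_ques_type_id(param, default=None):
    return param.get('ques_type_id', default)

param = {'train_data_file': 'dump/train.pkl'}  # no 'ques_type_id' key
ques_type_id = get_ques_type_id(param)  # returns None instead of raising
```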
"id_entity_map.pickle" File not found in wikidata_dir
The wikidata_dir does not contain the file being loaded on this line:
https://github.com/amritasaha1812/CSQA_Code/blob/master/run_model.py#L314
Complex Questions
Hi,
Does this model only work for questions whose answers are entities? Can it handle count, comparison and "Yes or No" questions? Looking forward to your reply.
Best regards
Sirui
load_wikidata_wfn.py
Some paths are hard-coded to cluster paths; these must be changed. Alternatively, if a file is not required, the references to it can be removed.
`decoder_loss` runtime error
Hi,
I got a runtime error when training the model. The error was caused by this statement:
Line 267 in 0b297bd
It turns out that the variable `logits` is a list of tensors instead of a single tensor; as a result, applying tf.nn.softmax to the list raises an error.
Could you please tell me how to fix this? Thank you so much!
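A likely repair (a sketch, not verified against this commit): since `logits` is a per-timestep list, apply the softmax to each element rather than to the list as a whole. Plain Python lists stand in for tensors here so the shape logic is easy to check:

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of logits.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

# In old TF seq2seq code, `logits` is a list with one entry per decoder
# timestep. Applying softmax to the list itself fails; mapping it over
# each timestep works:
logits = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
probs = [softmax(step) for step in logits]
```

In the actual TensorFlow code, the analogous change would be applying tf.nn.softmax to each list element (or stacking the list into a single tensor first).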
How to get the knowledge graph
Hi, How can I get the corresponding knowledge graph for this dataset?
Missing processed test data file
Hi,
Could you please tell me where to download the processed test data files? I didn't find it in README. Thank you so much!
Training time
Hi @amritasaha1812, I was wondering if you could help me with some training details.
-
For how many epochs must we train the model?
-
How much time does each epoch take? It has been about 12 hours since I started training, and I am still waiting for the second epoch to complete. I am using two Nvidia 1080 Ti GPUs. I observe that GPU usage is low most of the time while CPU usage is at 100%. Is there a way to use multiple CPUs and increase the training speed?
Thanks,