
csqa_code's Introduction

Step 1: Download https://drive.google.com/file/d/1ccZSys8u4F_mqNJ97OOlSLe3fjpFLhdv/view?usp=sharing, extract it, and rename the extracted folder to lucene_dir.

Step 2: Download the files ent_embed.pkl.npy, rel_embed.pkl.npy, id_ent_map.pickle, and id_rel_map.pickle from https://zenodo.org/record/4052427#.X2_hWXRKhQI and place them in a directory named transe_dir.

Step 3: Download https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing and put it in a folder named glove_dir.

Step 4: Download the wikidata JSONs from https://zenodo.org/record/4052427#.X2_hWXRKhQI and put them in a folder named wikidata_dir.

Step 5: Put the correct (complete) paths to wikidata_dir, lucene_dir, transe_dir, and glove_dir in params.py and params_test.py.

Step 6: In both params.py and params_test.py, set param['type_of_loss'] = "decoder" (a sketch of these settings is shown below).
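
For reference, the relevant entries in params.py (and params_test.py) might look roughly like the following sketch. Only param['type_of_loss'] is taken from these steps; the key names used for the directory paths are assumptions and should be checked against the actual params.py.

    # Hypothetical sketch of params.py settings. The path key names are
    # assumptions; verify them against the real params.py.
    param['wikidata_dir'] = '/absolute/path/to/wikidata_dir'
    param['lucene_dir'] = '/absolute/path/to/lucene_dir'
    param['transe_dir'] = '/absolute/path/to/transe_dir'
    param['glove_dir'] = '/absolute/path/to/glove_dir'
    param['type_of_loss'] = "decoder"  # use "kvmem" for the key-value memory version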

Step 7: Create a folder, say 'Target_Model_decoder', where you want the decoder model to be dumped, and create two folders inside it, 'dump' and 'model' (e.g. 'mkdir Target_Model_decoder/dump' and 'mkdir Target_Model_decoder/model').

Step 8: Put the params.py and params_test.py from Step 6 inside the Target_Model_decoder folder.

Step 9: Create another version of params.py and params_test.py, this time setting param['type_of_loss'] = "kvmem".

Step 10: Create a folder, say 'Target_Model_kvmem', where you want the kvmem model to be dumped, and create two folders inside it, 'dump' and 'model' (e.g. 'mkdir Target_Model_kvmem/dump' and 'mkdir Target_Model_kvmem/model').

Step 11: Download train_preprocessed.zip from https://drive.google.com/file/d/1HmLOGTV_v18grW_hXpu_s6MdogEJDM_a/view?usp=sharing, extract it, and put the contents (preprocessed pickle files of the train data) into Target_Model_decoder/dump and Target_Model_kvmem/dump.

Step 12: Download valid_preprocessed.zip from https://drive.google.com/file/d/1uoBUjjidyDks0pEUehxX-ofB5B_trdpP/view?usp=sharing, extract it, and put the contents (preprocessed pickle files of the valid data) into Target_Model_decoder/dump and Target_Model_kvmem/dump.

Step 13: Download test_preprocessed.zip from https://drive.google.com/file/d/1PMOE_jQJM_avY3MItAdEI0s3GJU_Km31/view?usp=sharing, extract it, and put the contents (preprocessed pickle files of the test data) into Target_Model_decoder/dump and Target_Model_kvmem/dump.

Step 14: Run ./run.sh for training (as shown in the run.sh file), where dump_dir is the 'Target_Model_decoder' folder created earlier and data_dir is the directory containing the downloaded data.

Step 15: Run ./run_test.sh for testing (as shown in the run_test.sh file).

Step 16: To evaluate the model separately on each question type, run the following:
./run_test.sh Target_Model_decoder verify
./run_test.sh Target_Model_decoder quantitative_count
./run_test.sh Target_Model_decoder comparative_count
./run_test.sh Target_Model_kvmem simple
./run_test.sh Target_Model_kvmem logical
./run_test.sh Target_Model_kvmem quantitative
./run_test.sh Target_Model_kvmem comparative

csqa_code's People

Contributors

amritasaha1812, vardaan123


csqa_code's Issues

Restricted access to train_preprocessed.zip and valid_preprocessed.zip?

Hi,

I am unable to access train_preprocessed.zip and valid_preprocessed.zip through the links given in the README.

Step7: Download train_preprocessed.zip from https://drive.google.com/file/d/1HmLOGTV_v18grW_hXpu_s6MdogEJDM_a/view?usp=sharing and extract and put the contents (preprocessed pickle files of the train data) into Target_Model/dump
Step8: Download valid_preprocessed.zip from https://drive.google.com/file/d/1uoBUjjidyDks0pEUehxX-ofB5B_trdpP/view?usp=sharing and extract and put the contents (preprocessed pickle files of the valid data) into Target_Model/dump

Lucene index no longer supported

Hi @amritasaha1812,

I'm trying to replicate the experiments and I have this error:

org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(SimpleFSIndexInput(path="/Users/dthai/Codes/csqa-rl/CSQA_Code/lucene_index_9m/segments_1"))): 3 (needs to be between 6 and 9). This version of Lucene only supports indexes created with release 6.0 and later.

Can you share the code used to generate the Lucene index, or upload an index generated with a newer version?

Thank you!

Add requirements.txt

Add a file containing ALL dependencies.
Most importantly, add
tensorflow=0.10.0
pattern.en
lucene

The process of evaluation

Hey @vardaan123, would you mind explaining the whole evaluation process? I am a little bit confused about the evaluation part.

From the script here

Step16: For evaluating the model separately on each question type, run the following:
./run_test.sh Target_Model_decoder verify
./run_test.sh Target_Model_decoder quantitative_count
./run_test.sh Target_Model_decoder comparative_count
./run_test.sh Target_Model_kvmem simple
./run_test.sh Target_Model_kvmem logical
./run_test.sh Target_Model_kvmem quantitative
./run_test.sh Target_Model_kvmem comparative

I think you feed different types of questions to the network using different decoding methods (seq2seq/kvmem). Is this the correct way to evaluate? Assuming we get a question, how can we know its type before feeding it into the corresponding decoder?

Can you explain a little bit about the evaluation process?

BTW, is the precision/recall calculation based on the all_entities field or the entities_in_utterance field in the SYSTEM JSON response?
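
For context, entity-level precision and recall are usually computed by comparing the predicted entity set against a gold entity set. The sketch below is only illustrative; which JSON field supplies the gold set is exactly the open question above.

    # Illustrative set-based precision/recall over entity IDs.
    # 'predicted' and 'gold' are lists of entity IDs; which JSON field
    # ('all_entities' or 'entities_in_utterance') provides 'gold' is
    # the question raised above.
    def precision_recall(predicted, gold):
        predicted, gold = set(predicted), set(gold)
        if not predicted or not gold:
            return 0.0, 0.0
        overlap = len(predicted & gold)
        return overlap / float(len(predicted)), overlap / float(len(gold))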

Training without pickle dump files

Hi @amritasaha1812, if I try to run without the train data pickle files, I get the following runtime error.

Traceback (most recent call last):
  File "run_model.py", line 455, in <module>
    main()
  File "run_model.py", line 449, in main
    get_dialog_dict(param)
  File "/home/sanyam/notebooks/csqa/read_data.py", line 33, in get_dialog_dict
    ques_type_id = param['ques_type_id']
KeyError: 'ques_type_id'

To reproduce this bug, you can comment out the lines that load the existing dictionaries in run_model.py:

    # if isinstance(param['train_data_file'], list) and isinstance(param['valid_data_file'], list) and all([os.path.exists(x) for x in param['train_data_file']]) and all([os.path.exists(x) for x in param['valid_data_file']]):
	# print 'dictionary already exists'
    #     sys.stdout.flush()
    # elif isinstance(param['train_data_file'], str) and isinstance(param['valid_data_file'], str) and os.path.exists(param['train_data_file']) and os.path.exists(param['valid_data_file']):# and os.path.exists(param['test_data_file']):
    #     print 'dictionary already exists'
    #     sys.stdout.flush()
    # else:
    get_dialog_dict(param)
    print 'dictionary formed'
    sys.stdout.flush()
    run_training(param)
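
A minimal sketch of a possible workaround (an assumption, not a fix confirmed by the authors; the correct value of 'ques_type_id' is not documented here) would be to define the missing key in params.py before get_dialog_dict is called:

    # Hypothetical addition to params.py; the value below is a placeholder
    # and must be replaced by the id of the intended question type.
    param['ques_type_id'] = None  # placeholder, not a known-good default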

Complex Questions

Hi,

Does this model only work for questions whose answers are entities? Can it handle count, comparison and "Yes or No" questions? Looking forward to your reply.

Best regards
Sirui

load_wikidata_wfn.py

Some paths are hard-coded to cluster paths; these must be changed. Or, if the file is not required, those files can be removed.

`decoder_loss` runtime error

Hi,

I got a runtime error when training the model. The error was caused by this statement:

prob = tf.nn.softmax(logits)

It turns out that the variable logits is a list of tensors instead of a single tensor; as a result, applying tf.nn.softmax to the list raises an error.

Could you please tell me how to fix this? Thank you so much!
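
One possible workaround, assuming each element of logits is a per-time-step logit tensor as returned by the older seq2seq decoders (an assumption, not a confirmed fix), is to apply the softmax per step:

    # Hypothetical per-step softmax; assumes 'logits' is a Python list of 2-D tensors.
    prob = [tf.nn.softmax(step_logits) for step_logits in logits]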

Training time

Hi @amritasaha1812, I was wondering if you could help me with some training details.

  1. For how many epochs must we train the model?

  2. How much time does each epoch take? It has been about 12 hours since I started training and I am still waiting for the second epoch to complete. I am using two Nvidia 1080 Ti GPUs. I observe that GPU usage is low most of the time while CPU usage is at 100%. Is there a way to use multiple CPUs to increase the training speed?

Thanks,
