Question_Answering_Models

This repo collects and reproduces models from the domains of question answering and machine reading comprehension.

It is still a work in progress and will continue to be supplemented.

Community QA

Dataset

WikiQA, TrecQA, InsuranceQA

Data preprocessing on WikiQA

cd cQA
bash download.sh
python preprocess_wiki.py

Siamese-NN model

This model is a simple implementation of a Siamese feed-forward (NN) QA model trained in a pointwise manner.

See the corresponding directory in this repo for details.
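A minimal sketch of the idea in tf.keras (not the repo's actual code; the vocabulary size, sequence length, and layer sizes are illustrative assumptions): the question and the answer pass through one shared encoder, and each (question, answer) pair is scored independently against a binary relevance label.

```python
import tensorflow as tf

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 20000, 100, 40  # illustrative values, not the repo's settings

# Shared (Siamese) encoder: one set of weights used for both inputs.
embed = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)
pool = tf.keras.layers.GlobalAveragePooling1D()
proj = tf.keras.layers.Dense(128, activation="tanh")

def encode(tokens):
    return proj(pool(embed(tokens)))

question = tf.keras.Input(shape=(MAX_LEN,), dtype="int32")
answer = tf.keras.Input(shape=(MAX_LEN,), dtype="int32")
q_vec, a_vec = encode(question), encode(answer)

# Pointwise head: each (question, answer) pair gets its own 0/1 relevance label.
interaction = tf.keras.layers.multiply([q_vec, a_vec])
merged = tf.keras.layers.concatenate([q_vec, a_vec, interaction])
score = tf.keras.layers.Dense(1, activation="sigmoid")(merged)

model = tf.keras.Model(inputs=[question, answer], outputs=score)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Training then reduces to ordinary binary classification over labeled (question, answer) pairs, which is what "pointwise" refers to.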

train model

python siamese.py --train

test model

python siamese.py --test

Siamese-CNN model

This model is a simple implementation of a Siamese CNN QA model trained in a pointwise manner.

See the corresponding directory in this repo for details.

train model

python siamese.py --train

test model

python siamese.py --test

Siamese-RNN model

This model is a simple implementation of a Siamese RNN/LSTM/GRU QA model trained in a pointwise manner.

See the corresponding directory in this repo for details.

train model

python siamese.py --train

test model

python siamese.py --test

note

All three models above are based on the vanilla Siamese structure. You can easily combine these basic deep learning modules to build your own models, as sketched below.
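As a rough illustration (not the repo's code; the layer choices here are assumptions), the three variants can be seen as the same Siamese skeleton with different shared encoders over the embedded sequences, so swapping that module is enough to switch between them:

```python
import tensorflow as tf

def make_encoder(kind, num_units=128):
    """Return a shared sentence encoder over embedded sequences of shape
    (batch, length, embed_dim). `kind` and the sizes are illustrative."""
    if kind == "nn":
        return tf.keras.Sequential([
            tf.keras.layers.GlobalAveragePooling1D(),
            tf.keras.layers.Dense(num_units, activation="tanh"),
        ])
    if kind == "cnn":
        return tf.keras.Sequential([
            tf.keras.layers.Conv1D(num_units, kernel_size=3, activation="relu"),
            tf.keras.layers.GlobalMaxPooling1D(),
        ])
    if kind == "rnn":
        return tf.keras.Sequential([
            tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(num_units)),
        ])
    raise ValueError("unknown encoder kind: %s" % kind)
```

Any of these can be dropped into the shared encoding step of the pointwise sketch above.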

QACNN

Given a question, a positive answer, and a negative answer, this pairwise model learns to rank the correct answer higher than the incorrect one.
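A minimal sketch of the pairwise objective (the cosine scoring and the margin value are illustrative assumptions, not necessarily the repo's exact choices): both answers are scored against the question, and a hinge loss pushes the positive answer to win by at least a margin.

```python
import tensorflow as tf

def pairwise_hinge_loss(q_vec, pos_vec, neg_vec, margin=0.1):
    """q_vec, pos_vec, neg_vec: encoded question / positive / negative answers
    from a shared encoder. margin=0.1 is an illustrative choice."""
    def cosine(a, b):
        a = tf.nn.l2_normalize(a, axis=-1)
        b = tf.nn.l2_normalize(b, axis=-1)
        return tf.reduce_sum(a * b, axis=-1)

    pos_score = cosine(q_vec, pos_vec)
    neg_score = cosine(q_vec, neg_vec)
    # zero loss only when the positive answer beats the negative one by `margin`
    return tf.reduce_mean(tf.maximum(0.0, margin - pos_score + neg_score))
```

At test time, candidate answers are simply ranked by their similarity score to the question.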

See the corresponding directory in this repo for details.

train model

python qacnn.py --train

test model

python qacnn.py --test

Refer to:

Decomposable Attention Model

See the corresponding directory in this repo for details.
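As a rough sketch of how the decomposable attention mechanism works (simplified relative to both the original paper and this repo's code; layer sizes are assumptions): each token of one sentence softly attends over the tokens of the other, aligned token pairs are compared with a small feed-forward network, and the comparison vectors are summed before classification.

```python
import tensorflow as tf

def attend_compare_aggregate(a, b, hidden=200):
    """a: [batch, len_a, dim], b: [batch, len_b, dim] embedded sentences.
    Layer sizes and choices are illustrative, not the repo's exact ones."""
    f = tf.keras.layers.Dense(hidden, activation="relu")
    g = tf.keras.layers.Dense(hidden, activation="relu")

    # Attend: unnormalized alignment scores between all token pairs.
    e = tf.matmul(f(a), f(b), transpose_b=True)                         # [batch, len_a, len_b]
    beta = tf.matmul(tf.nn.softmax(e, axis=-1), b)                      # b aligned to each token of a
    alpha = tf.matmul(tf.nn.softmax(e, axis=1), a, transpose_a=True)    # a aligned to each token of b

    # Compare: each token together with its aligned counterpart.
    v1 = g(tf.concat([a, beta], axis=-1))
    v2 = g(tf.concat([b, alpha], axis=-1))

    # Aggregate: sum over tokens and concatenate both directions.
    return tf.concat([tf.reduce_sum(v1, axis=1), tf.reduce_sum(v2, axis=1)], axis=-1)
```

The aggregated vector is then fed to a small classifier that scores the (question, answer) pair.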

train model

python decomp_att.py --train

test model

python decomp_att.py --test

Refer to:

Compare-Aggregate Model with Multi-Compare

See the corresponding directory in this repo for details.

train model

python seq_match_seq.py --train

test model

python seq_match_seq.py --test

Refer to:

BiMPM

See the corresponding directory in this repo for details.

train model

python bimpm.py --train

test model

python bimpm.py --test

Refer to:

Machine Reading Comprehension

Dataset

CNN/Daily Mail, CBT, SQuAD, MS MARCO, RACE

GA Reader

To be done


Refer to:

SA Reader

To be done


Refer to:

AoA Reader

To be done


Refer to:

  • Attention-over-Attention Neural Networks for Reading Comprehension

BiDAF

See the corresponding directory in this repo for details.


The results on the dev set (single model) under my experimental environment are as follows:

| training steps | batch size | hidden size | EM (%) | F1 (%) | speed | device |
| --- | --- | --- | --- | --- | --- | --- |
| 120k | 32 | 75 | 67.7 | 77.3 | 3.40 it/s | 1 GTX 1080 Ti |
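EM and F1 in the result tables are the standard SQuAD metrics. A rough sketch of how they are computed for a single prediction/reference pair (simplified: the official evaluation script also lowercases, strips punctuation and articles, and takes the maximum over multiple reference answers):

```python
from collections import Counter

def exact_match(prediction, reference):
    # 1.0 if the predicted span matches the reference exactly, else 0.0
    return float(prediction.strip() == reference.strip())

def f1_score(prediction, reference):
    # token-level overlap between the predicted and reference spans
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```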

Refer to:

RNet

See the corresponding directory in this repo for details.


The results on the dev set (single model) under my experimental environment are as follows:

| training steps | batch size | hidden size | EM (%) | F1 (%) | speed | device | RNN type |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 120k | 32 | 75 | 69.1 | 78.2 | 1.35 it/s | 1 GTX 1080 Ti | cuDNNGRU |
| 60k | 64 | 75 | 66.1 | 75.6 | 2.95 s/it | 1 GTX 1080 Ti | SRU |

RNet trained with cuDNNGRU:

RNet trained with SRU (without operator-level efficiency optimization):

Refer to:

QANet

See the corresponding directory in this repo for details.


The results on the dev set (single model) under my experimental environment are as follows:

| training steps | batch size | attention heads | hidden size | EM (%) | F1 (%) | speed | device |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 60k | 32 | 1 | 96 | 70.2 | 79.7 | 2.4 it/s | 1 GTX 1080 Ti |
| 120k | 32 | 1 | 75 | 70.1 | 79.4 | 2.4 it/s | 1 GTX 1080 Ti |

Experimental records for the first experiment:

Experimental records for the second experiment (without smoothing):

Refer to:

  • QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
  • github repo of NLPLearn/QANet

Hybrid Network

See the corresponding directory in this repo for details.

This directory contains my experiments and attempts at MRC problems; I'm still working on it.

| training steps | batch size | hidden size | EM (%) | F1 (%) | speed | device | description |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 120k | 32 | 100 | 70.1 | 78.9 | 1.6 it/s | 1 GTX 1080 Ti | |
| 120k | 32 | 75 | 70.0 | 79.1 | 1.8 it/s | 1 GTX 1080 Ti | |
| 120k | 32 | 75 | 69.5 | 78.8 | 1.8 it/s | 1 GTX 1080 Ti | with spatial dropout on embeddings |

Experimental records for the first experiment (without smoothing):

Experimental records for the second experiment (without smoothing):

Information

For more information, please visit http://skyhigh233.com/blog/2018/04/26/cqa-intro/.


Issues

Performance issue in the definition of cudnn_gru in MRC/BiDAF/layers.py (P1)

Hello, I found a performance issue in the definition of cudnn_gru in MRC/BiDAF/layers.py: tf.zeros([1, batch_size, num_units]) will be created repeatedly during program execution, resulting in reduced efficiency. I think it should be created before the loop.
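A hypothetical sketch of the suggested change (not the actual layers.py code; the layer structure here is illustrative): since the zero initial state is a constant, it can be created once before the loop and reused for every layer.

```python
import tensorflow as tf

def cudnn_gru_stack(inputs, batch_size, num_units, num_layers):
    """inputs is assumed time-major, as CudnnGRU expects."""
    # Built once, before the loop, instead of being re-created per iteration.
    init_state = tf.zeros([1, batch_size, num_units])
    outputs = [inputs]
    for layer in range(num_layers):
        with tf.variable_scope("gru_layer_%d" % layer):
            gru = tf.contrib.cudnn_rnn.CudnnGRU(1, num_units)
            out, _ = gru(outputs[-1], initial_state=(init_state,))
            outputs.append(out)
    return outputs[-1]
```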

The same issue exists in:

Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.

QANet

What is the reference paper for the QANet project?

About the Compare-Aggregate results

Hello, when implementing Compare-Aggregate, did you manage to reach the performance reported in the paper? The paper reports a MAP of 0.743 and an MRR of 0.754 on WikiQA. I tried both my own code and yours, and neither result is very satisfactory. If you have any implementation details, could you share them? Thanks!

Performance issues in /MRC (by P3)

Hello! I've found a performance issue in /MRC: batch() should be called before map(), which could make your program more efficient. Here is the TensorFlow documentation that supports this.

Detailed description is listed below:

  • /BiDAF/util.py: dataset.batch(config.batch_size) (here) should be called before .map(parser, num_parallel_calls=num_threads) (here).
  • /MRC/BiDAF/util.py: .batch(config.batch_size) (here) should be called before .map(parser, num_parallel_calls=num_threads) (here).
  • /MRC/Hybrid/util.py: dataset.batch(config.batch_size) (here) should be called before .map(parser, num_parallel_calls=num_threads) (here).
  • /MRC/Hybrid/util.py: .batch(config.batch_size) (here) should be called before .map(parser, num_parallel_calls=num_threads) (here).
  • /MRC/QANet/util.py: dataset.batch(config.batch_size) (here) should be called before .map(parser, num_parallel_calls=num_threads) (here).
  • /MRC/QANet/util.py: .batch(config.batch_size) (here) should be called before .map(parser, num_parallel_calls=num_threads) (here).
  • /MRC/RNet/util.py: dataset.batch(config.batch_size) (here) should be called before .map(parser, num_parallel_calls=num_threads) (here).
  • /MRC/RNet/util.py: .batch(config.batch_size) (here) should be called before .map(parser, num_parallel_calls=num_threads) (here).

Besides, you need to check whether the function called in map() (e.g., parser in .map(parser, num_parallel_calls=num_threads)) is affected, so that the changed code still works properly. For example, if parser expected data of shape (x, y, z) as input before the fix, it will receive data of shape (batch_size, x, y, z) afterwards.
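A hypothetical sketch of the suggested reordering (the filename and feature spec below are made up for illustration; the repo's parser differs): the dataset is batched first, and the parser is rewritten to handle a whole batch at once, e.g. with tf.parse_example.

```python
import tensorflow as tf

# Illustrative feature spec and filename, not the repo's actual ones.
features = {
    "tokens": tf.FixedLenFeature([50], tf.int64),
    "label": tf.FixedLenFeature([], tf.int64),
}

def parse_batch(serialized_batch):
    # parses a whole batch of serialized tf.Examples in a single op
    return tf.parse_example(serialized_batch, features)

dataset = tf.data.TFRecordDataset("train.tfrecord")
# batch() before map(): the vectorized parser now runs once per batch
# instead of once per example.
dataset = dataset.batch(32).map(parse_batch, num_parallel_calls=4)
```

Note that num_parallel_calls then parallelizes over batches rather than over individual examples.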

Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.

BiDAF

What are the Python, TensorFlow, and CUDA versions used for your BiDAF project?
