SeqMatchSeq

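Implementations of the three models described in the following three papers on sequence matching:

  • Learning Natural Language Inference with LSTM
  • Machine Comprehension Using Match-LSTM and Answer Pointer
  • A Compare-Aggregate Model for Matching Text Sequences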

Learning Natural Language Inference with LSTM

Requirements

Datasets

Usage

sh preprocess.sh snli
cd main
th main.lua -task snli -model mLSTM -dropoutP 0.3 -num_classes 3

sh preprocess.sh snli downloads the datasets and preprocesses the SNLI corpus into the files train.txt, dev.txt and test.txt under the path "data/snli/sequence", one example per line, in the format:

sequence1(premise) \t sequence2(hypothesis) \t label(from 1 to num_classes) \n

main.lua first converts the preprocessed data and word embeddings into a Torch format and then runs the algorithm. "dropoutP" is the main parameter we tuned.
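To make the line format above concrete, here is a minimal Python sketch that reads one line of such a file. It is not part of this repository (whose code is Lua/Torch); the names PairExample and parse_pair_line are hypothetical, and splitting each sequence on whitespace is an assumption about how the preprocessed tokens are separated.

```python
from typing import List, NamedTuple


class PairExample(NamedTuple):
    """One line of train.txt / dev.txt / test.txt (hypothetical container)."""
    premise: List[str]
    hypothesis: List[str]
    label: int  # kept 1-based, as stored in the file


def parse_pair_line(line: str, num_classes: int = 3) -> PairExample:
    """Parse 'sequence1 \\t sequence2 \\t label' into tokens and an integer label."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(fields)}")
    premise, hypothesis, label = (field.strip() for field in fields)
    label_id = int(label)
    # Labels run from 1 to num_classes according to the format description.
    if not 1 <= label_id <= num_classes:
        raise ValueError(f"label {label_id} outside 1..{num_classes}")
    # Assumption: tokens inside a sequence are separated by whitespace.
    return PairExample(premise.split(), hypothesis.split(), label_id)


example = parse_pair_line("a man sleeps\ta person rests\t1\n")
print(example.premise, example.hypothesis, example.label)
```

A check like this can be run over a whole file to catch malformed lines before starting a long training run.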

Docker

You may try to use Docker for running the code.

After installing Docker, run the following commands (replace /PATH/SeqMatchSeq with the path to your local copy of the repository):

docker run -it -v /PATH/SeqMatchSeq:/opt --rm -w /opt      shuohang/seqmatchseq:1.0 /bin/bash -c "sh preprocess.sh snli"
docker run -it -v /PATH/SeqMatchSeq:/opt --rm -w /opt/main shuohang/seqmatchseq:1.0 /bin/bash -c "th main.lua"

Machine Comprehension Using Match-LSTM and Answer Pointer

Requirements

Datasets

Usage

sh preprocess.sh squad
cd main
th mainDt.lua 

sh preprocess.sh squad downloads the datasets and preprocesses the SQuAD corpus into the files train.txt and dev.txt under the path "data/squad/sequence", one example per line, in the format:

sequence1(Document) \t sequence2(Question) \t sequence of the positions where the answer appears in the Document (e.g. 3 4 5 6) \n
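As an illustration of this format, the following Python sketch splits one line into document tokens, question tokens and answer positions, then looks up the answer tokens. It is not taken from the repository; parse_squad_line and answer_tokens are hypothetical names, whitespace tokenization is an assumption, and treating the positions as 1-based (the Lua convention) is also an assumption that should be checked against the generated files.

```python
from typing import List, Tuple


def parse_squad_line(line: str) -> Tuple[List[str], List[str], List[int]]:
    """Split 'document \\t question \\t positions' into its three parts."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(fields)}")
    document = fields[0].split()
    question = fields[1].split()
    positions = [int(p) for p in fields[2].split()]
    return document, question, positions


def answer_tokens(document: List[str], positions: List[int],
                  one_based: bool = True) -> List[str]:
    """Return the document tokens at the given answer positions.

    Assumption: positions are 1-based, as is usual in Lua code.
    Pass one_based=False if the files turn out to be 0-based.
    """
    offset = 1 if one_based else 0
    return [document[p - offset] for p in positions]


doc, question, pos = parse_squad_line(
    "the cat sat on the mat\twhere did the cat sit\t4 5 6\n")
print(answer_tokens(doc, pos))
```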

mainDt.lua first converts the preprocessed data and word embeddings into a Torch format and then runs the algorithm. Because this code runs on multiple CPU cores, the initial parameters are set in the file "main/init.lua".

  • opt.num_processes: 5. The number of threads used.
  • opt.batch_size: 6. Batch size for each thread. (The effective mini-batch size is then 5*6 = 30.)
  • opt.model : boundaryMPtr / sequenceMPtr
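The settings above can be illustrated with a short Python sketch. It computes the effective mini-batch size and shows how one list of answer positions could be turned into targets for the two pointer variants described in the paper: the boundary model predicts only the start and end positions, while the sequence model predicts every answer position followed by a special stop value. All function names here are hypothetical, and the stop value is passed in explicitly because how the repository encodes it has not been checked.

```python
from typing import List, Tuple


def effective_batch_size(num_processes: int, batch_size: int) -> int:
    """Each thread processes batch_size examples, so one update sees the product."""
    return num_processes * batch_size


def boundary_target(positions: List[int]) -> Tuple[int, int]:
    """Boundary variant: keep only the first and last answer positions."""
    if not positions:
        raise ValueError("answer positions must not be empty")
    return positions[0], positions[-1]


def sequence_target(positions: List[int], stop_index: int) -> List[int]:
    """Sequence variant: every answer position, then a stop marker.

    Assumption: stop_index is whatever value marks the end of the answer
    (the paper uses document length + 1); this is not read from the repo.
    """
    return list(positions) + [stop_index]


print(effective_batch_size(5, 6))
print(boundary_target([3, 4, 5, 6]))
print(sequence_target([3, 4, 5, 6], stop_index=10))
```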

Docker

You may try to use Docker for running the code.

After installing Docker, run the following commands (replace /PATH/SeqMatchSeq with the path to your local copy of the repository):

docker run -it -v /PATH/SeqMatchSeq:/opt --rm -w /opt      shuohang/seqmatchseq:1.0 /bin/bash -c "sh preprocess.sh squad"
docker run -it -v /PATH/SeqMatchSeq:/opt --rm -w /opt/main shuohang/seqmatchseq:1.0 /bin/bash -c "th mainDt.lua"

A Compare-Aggregate Model for Matching Text Sequences

Requirements

Datasets

For now, this code only supports the SNLI and WikiQA datasets.

Usage

SNLI task (the preprocessed format is the same as described above):

sh preprocess.sh snli
cd main
th main.lua -task snli -model compAggSNLI -comp_type submul -learning_rate 0.002 -mem_dim 150 -dropoutP 0.3 

WikiQA task:

Please first download the file "WikiQACorpus.zip" to the path SeqMatchSeq/data/wikiqa/ from https://www.microsoft.com/en-us/download/details.aspx?id=52419, then run:

sh preprocess.sh wikiqa
cd main
th main.lua -task wikiqa -model compAggWikiqa -comp_type mul -learning_rate 0.004 -dropoutP 0.04 -batch_size 10 -mem_dim 150

  • model (model name): compAggSNLI / compAggWikiqa
  • comp_type (8 different types of word comparison): submul / sub / mul / weightsub / weightmul / bilinear / concate / cos
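To give a feel for what some of the comp_type options compare, here is a Python sketch of the parameter-free parts of four of them, following the definitions in the paper: element-wise squared difference (sub), element-wise product (mul), their concatenation (the input to submul) and cosine similarity (cos). It is an illustration on plain Python lists, not the repository's Lua code; the function names only borrow the option names, the learned layers that follow these features (and the weighted and bilinear variants) are omitted, and the exact behaviour of each option in the repository may differ.

```python
import math
from typing import List

Vector = List[float]


def _check(a: Vector, h: Vector) -> None:
    if len(a) != len(h):
        raise ValueError("vectors must have the same length")


def sub(a: Vector, h: Vector) -> Vector:
    """Element-wise squared difference, (a - h) * (a - h)."""
    _check(a, h)
    return [(x - y) ** 2 for x, y in zip(a, h)]


def mul(a: Vector, h: Vector) -> Vector:
    """Element-wise product, a * h."""
    _check(a, h)
    return [x * y for x, y in zip(a, h)]


def submul_features(a: Vector, h: Vector) -> Vector:
    """Concatenation of sub and mul; the paper feeds this to a learned layer."""
    return sub(a, h) + mul(a, h)


def cos(a: Vector, h: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    _check(a, h)
    dot = sum(x * y for x, y in zip(a, h))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in h))
    return dot / norm if norm else 0.0


a, h = [1.0, 2.0], [3.0, 2.0]
print(sub(a, h), mul(a, h), submul_features(a, h), cos(a, h))
```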

Docker

You may try to use Docker for running the code.

After installing Docker, run the following commands (replace /PATH/SeqMatchSeq with the path to your local copy of the repository):

For SNLI:

docker run -it -v /PATH/SeqMatchSeq:/opt --rm -w /opt      shuohang/transition:1.1 /bin/bash -c "sh preprocess.sh snli"
docker run -it -v /PATH/SeqMatchSeq:/opt --rm -w /opt/main shuohang/transition:1.1 /bin/bash -c "th main.lua -task snli -model compAggSNLI -comp_type submul -learning_rate 0.002 -mem_dim 150 -dropoutP 0.3"

For WikiQA:

docker run -it -v /PATH/SeqMatchSeq:/opt --rm -w /opt      shuohang/seqmatchseq:1.0 /bin/bash -c "sh preprocess.sh wikiqa"
docker run -it -v /PATH/SeqMatchSeq:/opt --rm -w /opt/main shuohang/seqmatchseq:1.0 /bin/bash -c "th main.lua -task wikiqa -model compAggWikiqa -comp_type mul -learning_rate 0.004 -dropoutP 0.04 -batch_size 10 -mem_dim 150"

Copyright

Copyright 2015 Singapore Management University (SMU). All Rights Reserved.
