Giter Site home page Giter Site logo

musst's Introduction

Multi-span Style Extraction for Generative Reading Comprehension

Code for the paper: Multi-span Style Extraction for Generative Reading Comprehension
Junjie Yang, Zhuosheng Zhang, Hai Zhao framework

Dependencies

The code was tested with python 3.7 and pytorch 1.2.0. If you use conda, you can create an env and install them as follows:

conda create --name marco python==3.7
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch

Install the required packages:

pip install -r requirements.txt

Install standford corenlp: https://stanfordnlp.github.io/CoreNLP/

Datasets

You can download MS Marco v2.1 dataset here: https://microsoft.github.io/msmarco/. Put all the files into a data directory.

Now preprocess the datasets:

cd preprocessing
# Spilt 'qa' and 'nlg' subsets
python dataset_spilt.py --data_dir=${data_dir}

# Create dev reference files for evaluation
python create_dev_ref.py --data_dir=${data_dir} --task='qa'
python create_dev_ref.py --data_dir=${data_dir} --task='nlg'

Ranking

Marco provides mutiple reading passages for each question, so before answering question, we need to select the most relevant one.

Train a ranker on 4 GPUs:

cd ../ranker
python main.py \
    --model_name_or_path='albert-base-v2' \
    --data_dir=${data_dir} \
    --output_dir=${expr_dir}/ranker
    --do_train \
    --learning_rate=1e-05 \
    --num_train_epochs=3.0 \
    --warmup_steps=2497 \
    --per_gpu_train_batch_size=32 \
    --eval_steps=8324 \
    --logging_steps=100 \
    --max_seq_length=256 \
    --seed=96 \
    --weight_decay=0.01

Eval the trained model on dev set:

export TRAINED_MODEL=${expr_dir}/ranker
python main.py \
    --data_dir=${data_dir}  \
    --model_name_or_path=$TRAINED_MODEL \
    --output_dir=$TRAINED_MODEL/res  \
    --max_seq_len=256 \
    --do_eval \
    --per_gpu_pred_batch_size=128  

You will find the evaluation results in a file named dev_eval_results.txt in the output_dir:

map = 0.7109500464755606
mrr = 0.715590335907891

Now select the most relevant passages with our ranker on dev set:

python select_best_passage.py \
    --data_dir=${data_dir} \
    --ranking_res_file=${expr_dir}/ranker/dev_best_passage_pred.json \
    --set_type=dev

This will generate a file named as dev_from_self_ranker.jsonl.

The following scripts run the experiments on NLG subset, for QA subset, you just need to change argument task or task_name from "nlg" to "qa".

Syntactic multi-span answer annotator

Now we need to transform the original answers in the training set to annotated spans.

Start Stanford CoreNLP Parser server:

java -mx20g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -preload tokenize,ssplit,pos,parse \
    -port 8889 

Get annotated answer spans:

cd qa_multi_span
python annotator.py \
    --train_data_file=... \
    --model_name="albert-xlarge-v2" \
    --parser_url="http://localhost:8889" \
    --set_type="train" \
    --task="nlg" \
    --output_file="train_span_annotation.json"

Question-answering module

Training:

python main.py \
    --model_name_or_path=albert-xxlarge-v2 \
    --do_train \
    --data_dir=${data_dir}  \
    --output_dir=${expr_dir} \
    --eval_file=dev_from_self_ranker.jsonl \
    --span_annotation_file="train_span_annotation.json" \
    --overwrite_output_dir  \
    --per_gpu_train_batch_size=8   \
    --per_gpu_pred_batch_size=8  \
    --num_train_epochs=5.0 \
    --learning_rate=3e-5 \
    --evaluate_during_training \
    --eval_steps=1846\
    --max_seq_len=256 \
    --max_num_spans=9 \
    --ed_threshold=8\
    --task_name=nlg \
    --seed=1996

Evaluate on dev set with passages selected by our trained ranker:

python main.py \
    --model_name_or_path=${expr_dir} \
    --output_dir=${expr_dir}/dev_with_ranker  \
    --do_eval \
    --data_dir=${data_dir}  \
    --eval_file=dev_from_self_ranker.jsonl \
    --per_gpu_pred_batch_size=32   \
    --max_seq_len=512 \
    --max_num_spans=9\
    --reference_file=dev_ref.json \
    --task_name=nlg

You will get a result file named as dev_eval_results.txt in the output_dir:

 Dev F1 = 1.0
 Dev bleu_1 = 0.642251479046639
 Dev bleu_2 = 0.5687728163471556
 Dev bleu_3 = 0.5226526089835294
 Dev bleu_4 = 0.4883366779450553
 Dev rouge_l = 0.6624096330217961

The predictions are in the file dev_prediction.json and vebose_dev_prediction.json. Here are some prediction examples:

{
  "15177": {
    "score": 0.6119573291265018,
    "span_pos": [
      [4, 5],
      [21, 22],
      [6, 9],
      [29, 30],
      [22, 26],
      [255, 255]
    ],
    "span_texts": ["population", "of", "albany, minnesota", "is", "2,662"],
    "candiate_answer": "population of albany, minnesota is 2,662.",
    "original_answer": ["The population of Albany, Minnesota is 2,662. "],
    "query": "albany mn population",
    "passage": "Albany, Minnesota, as per 2017 US Census estimate, has a community population of 2,662 people. Albany is located in Stearns County, 20 miles west of St. Cloud and 80 miles northwest of Minneapolis/St. Paul on Interstate 94 (I-94). Albany has direct access to State Highway 238, which originates in Albany."
  },
  "114414": {
    "score": 0.4104470265124703,
    "span_pos": [
      [1, 6],
      [27, 35],
      [255, 255]
    ],
    "span_texts": ["current weather in volcano,", "is 48 degrees and patchy rain possible"],
    "candiate_answer": "current weather in volcano, is 48 degrees and patchy rain possible.",
    "original_answer": ["The Volcano forecast for Apr 12 is 52 degrees and Patchy light rain."],
    "query": "current weather in volcano, ca",
    "passage": "Hourly Forecast Detailed. 1  0am:The Volcano, CA forecast for Apr 03 is 48 degrees and Patchy rain possible. 2  3am:The Volcano, CA forecast for Apr 03 is 44 degrees and Clear. 3  6am:The Volcano, CA forecast for Apr 03 is 41 degrees and Clear.  9am:The Volcano, CA forecast for Apr 03 is 48 degrees and Sunny."
  },
  "9083": {
    "score": 0.6377308023817662,
    "span_pos": [
      [14, 24],
      [255, 255]
    ],
    "span_texts": ["hippocrates is considered the father of modern medicine"],
    "candiate_answer": "hippocrates is considered the father of modern medicine.",
    "original_answer": ["Hippocrates is considered the father of modern medicine."],
    "query": "____________________ is considered the father of modern medicine.",
    "passage": "TRUE. Hippocrates is considered the father of modern medicine because he did not believe that illness was a punishment inflicted by the gods. True False. Weegy: TRUE. [ "
  }
}

License

MIT

musst's People

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.