Giter Site home page Giter Site logo

qasper-led-baseline's Introduction

QASPER benchmark

Baselines tested in qasper.ipynb.

Differences of our implementation over the original implementation:

  1. We use the dataset provided at https://huggingface.co/datasets/allenai/qasper since it doesn't require manually downloading files.
  2. We remove usage of allennlp since the Python package cannot be installed anymore.
  3. We add baselines to qasper/models. Currently, we have
    • QASPER (Longformer Encoder Decoder)
    • GPT-3.5-Turbo
    • TODO: RAG (with R=TF-IDF or Contriever) implemented in LangChain?
  4. We replace allennlp special tokens with the special tokens of the HF transformer tokenizer:
    • paragraph separator: '' -> tokenizer.sep_token
    • sequence pair start tokens: _tokenizer.sequence_pair_start_tokens -> tokenizer.bos_token

Setup

# create conda env
conda env create -f environment.yml

# install the qasper library
conda develop .

Dependencies:

pytorch
transformers
datasets

Data: original data

Experiments (from original repo)

With evidence selection scaffold

Albert: Is this implemented?

The configuration file to use is training_config/led_base_with_evidence_scaffold.jsonnet. Remember to set the data paths before training.

allennlp train training_config/led_base_with_evidence_scaffold.jsonnet -s <PATH TO SERIALIZATION DIRECTORY> --include-package qasper_baselines

At the end of training, you will see results on the development set. best_validation_answer_f1 and best_validation_evidence_f1 should give you the Answer F1 and Evidence F1 reported in the paper.

If you do not have a GPU, you will need to set cuda_device to -1.

Without evidence scaffold

Just set use_evidence_scaffold in the model section of the configuration to false.

Experiments on shorter contexts

The paper also reports results of training and evaluating models given contexts shorter than the full text of the paper. Use the configuration file training_config/led_base_smaller_context.jsonnet for these experiments, and set the context field in the dataset_reader and validation_dataset_reader sections of the configuration to appropriate values.

Heuristic evidence baselines

The script scripts/evidence_retrieval_heuristic_baselines.py contains these baselines. Just run

python scripts/evidence_retrieval_heuristic_baselines.py <PATH TO DEV DATA>

You will need to install sklearn for this script.

Feel free to open pull requests if find any thing that needs fixing.

Experiments with LED-large

You can run these by changing the value of transformer_model variable to allenai/led-large-16384. Note that as stated in the paper, the answer_f1 value will be very low (less than 20 F1 points).

qasper-led-baseline's People

Contributors

pdasigi avatar ag2435 avatar bogoliubon avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.