Cross-Dataset QA Performance

Source code corresponding to the research paper: "Testing BERT for Generality in Cross-dataset Question Answering Performance", by Bootsma, Gaasbeek, 't Lam, Sekar and Weijts.

The training procedures used in this notebook are based on Scheider's BERT training examples and Devlin's paper introducing BERT.

Abstract

We adopt BERT (Bidirectional Encoder Representations from Transformers), an existing pre-trained model, to create state-of-the-art models for a specific downstream task. BERT is a transformer-based machine learning technique for Natural Language Processing that learns deep bidirectional representations by pre-training on unlabeled text. We fine-tune BERT-based models for Question Answering on different industry-standard datasets. Afterwards, we evaluate each model on the evaluation sets of the other datasets to test generality in cross-dataset Question Answering performance. We find that a model trained on a specific dataset outperforms the other models on that dataset's evaluation set by a significant margin, even when the datasets are very similar.

Overview

The notebooks in this repository are intended to be run on Google Colab with GPU acceleration. However, they can easily be modified to run locally.

To fine-tune the BERT-base model, select the required training set and change the path where weights.h5 is stored after training.
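One way to keep the training set choice and the weights path in a single place is sketched below. This is only an illustration: the Drive folder layout, the `TRAIN_FILES` mapping, the file names in it and the `training_paths` helper are all assumptions and do not come from the notebooks themselves.

```python
from pathlib import Path
from typing import Tuple

# Hypothetical Drive layout; adjust to wherever your files actually live.
DRIVE_ROOT = Path("/content/drive/MyDrive/cross-dataset-qa")

# Hypothetical mapping from a dataset key to its training file name.
TRAIN_FILES = {
    "squad1.1": "train-v1.1.json",
    "squad2.0": "train-v2.0.json",
    "coqa": "coqa-train-v1.0.json",
}


def training_paths(dataset: str, root: Path = DRIVE_ROOT) -> Tuple[Path, Path]:
    """Return (training set path, weights output path) for one dataset."""
    if dataset not in TRAIN_FILES:
        raise KeyError(f"unknown dataset {dataset!r}; expected one of {sorted(TRAIN_FILES)}")
    train_path = root / "data" / TRAIN_FILES[dataset]
    # One weights.h5 per training set, so runs do not overwrite each other.
    weights_path = root / "weights" / dataset / "weights.h5"
    return train_path, weights_path


train_path, weights_path = training_paths("squad2.0")
print(train_path)
print(weights_path)
```

Storing each model's weights.h5 in its own folder keeps the fine-tuned models apart, which matters when every model is later evaluated on every other dataset.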

For evaluation, the Google Drive paths of the weights and evaluation sets need to be changed to point to the correct files. The repository includes versions of the SQuAD 1.1, SQuAD 2.0 and CoQA dev sets trimmed to only include questions whose total tokenized length is smaller than BERT's maximum sequence length of 512. The predictions generated by this notebook can then be evaluated using the evaluation script provided by SQuAD 2.0.
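The trimming step could look roughly like the sketch below. It is not the script used to produce the files in this repository. It assumes SQuAD-style JSON (`data` → `paragraphs` → `context` / `qas`), counts one `[CLS]` and two `[SEP]` tokens on top of the question and context, and uses whitespace splitting as a stand-in tokenizer; for real use, pass in the WordPiece tokenizer of the BERT model. The `trim_squad` name and its parameters are hypothetical.

```python
import copy
from typing import Callable, List


def trim_squad(dataset: dict, max_len: int = 512,
               tokenize: Callable[[str], List[str]] = str.split) -> dict:
    """Keep only questions whose question + context length stays below max_len."""
    trimmed = copy.deepcopy(dataset)  # leave the input untouched
    for article in trimmed["data"]:
        for paragraph in article["paragraphs"]:
            n_context = len(tokenize(paragraph["context"]))
            paragraph["qas"] = [
                qa for qa in paragraph["qas"]
                # +3 accounts for the [CLS] token and the two [SEP] tokens.
                if len(tokenize(qa["question"])) + n_context + 3 < max_len
            ]
    return trimmed


# Toy example with a tiny limit so the filter is visible.
toy = {"data": [{"paragraphs": [{
    "context": "BERT is a transformer model .",
    "qas": [
        {"id": "q1", "question": "What is BERT ?"},
        {"id": "q2", "question": "What kind of model is BERT exactly ?"},
    ],
}]}]}

small = trim_squad(toy, max_len=14)
kept = [qa["id"] for qa in small["data"][0]["paragraphs"][0]["qas"]]
print(kept)  # ['q1']
```

Paragraphs whose questions are all removed are left in place with an empty `qas` list; whether to drop them entirely is a separate choice.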

Paper

The full paper is included in this repository, and can be read here or downloaded from the repository.
