Forked from epwalsh/nlp-models.


License: MIT License



nlp-models


NLP research experiments, built on PyTorch within the AllenNLP framework.


The goal of this project is to provide an example of a high-quality personal research library. It provides modularity, continuous integration, high test coverage, a code base that emphasizes readability, and a host of scripts that make reproducing any experiment here as easy as running a few make commands. I also strive to make nlp-models useful by implementing practical modules and models that extend AllenNLP. Sometimes I'll contribute pieces of what I work on here back to AllenNLP after they have been thoroughly tested.

Overview

At a high level, the structure of this project mimics that of AllenNLP. That is, the submodules in nlpete are organized in exactly the same way as in allennlp. But I've also provided a set of scripts that automate frequently used command sequences, such as running tests or experiments. The Makefile serves as the common interface to these scripts:

  • make train: Train a model. This is basically a wrapper around allennlp train, but provides a default serialization directory and automatically creates subdirectories of the serialization directory for different runs of the same experiment.
  • make test: Equivalent to running make typecheck, make lint, make unit-test, and make check-scripts.
  • make typecheck: Runs the mypy typechecker.
  • make lint: Runs pydocstyle and pylint.
  • make unit-test: Runs all unit tests with pytest.
  • make check-scripts: Runs a few other scripts that check miscellaneous things not covered by the other tests.
  • make create-branch: A wrapper around the git functionality to create a new branch and push it upstream. You can name a branch after an issue number with make create-branch issue=NUM or give it an arbitrary name with make create-branch name="my-branch".
  • make data/DATASETNAME.tar.gz: Extract a dataset into the data/ directory. Just replace DATASETNAME with the basename of one of the .tar.gz files in that directory.

Getting started

The modules implemented here are built and tested nightly against the master branch of AllenNLP. Therefore it is recommended that you install AllenNLP from source. The easiest way to do that is as follows:

git clone https://github.com/allenai/allennlp.git && cd allennlp
./scripts/install_requirements.sh
python setup.py develop

NOTE: If you're not already familiar with AllenNLP, I would suggest starting with their excellent tutorial.

After AllenNLP is installed, you can define your own experiments with an AllenNLP model config file, and then run

make train
# ... follow the prompts to specify the path to your model config and serialization directory.

As an example that you should be able to run immediately, I've provided an implementation of CopyNet and an artificial dataset to experiment with. To train this model, run the following:

# Extract data.
make data/greetings.tar.gz

# Train model. When prompted for the model file, enter "experiments/greetings/copynet.json".
# Training takes roughly 3-5 minutes on a single GTX 1070.
make train

NOTE: All of the model configs in the experiments/ folder are defined to run on GPU #0. So if you don't have a GPU available or want to use a different GPU, you'll need to modify the trainer.cuda_device field in the experiment's config file.
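For example, a minimal override of the trainer section might look like this (a sketch of the relevant fragment only; set cuda_device to -1 to train on CPU, or to another GPU index as needed):

```json
{
  "trainer": {
    "cuda_device": -1
  }
}
```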

Models implemented

CopyNet: A sequence-to-sequence model that incorporates a copying mechanism, which enables the model to copy tokens from the source sentence into the target sentence even if they are not part of the target vocabulary. This architecture has shown promising results on machine translation and semantic parsing tasks. For examples of its use, see the configs under the experiments/ folder.
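At a glance, the core idea can be sketched in a few lines of NumPy. This is a deliberately simplified illustration of the decoding step, not the actual implementation in nlpete; the helper name `copynet_distribution` is hypothetical:

```python
import numpy as np

def copynet_distribution(gen_scores, copy_scores, source_ids, vocab_size):
    """Combine generation and copy scores into one distribution over an
    extended vocabulary (the fixed vocab plus out-of-vocab source tokens).

    gen_scores:  unnormalized scores over the fixed vocab, shape (vocab_size,)
    copy_scores: unnormalized scores over source positions, shape (src_len,)
    source_ids:  extended-vocab id of each source token (>= vocab_size if OOV)
    """
    # Extended vocab is the fixed vocab plus any OOV source tokens.
    extended_size = vocab_size + max(
        (i - vocab_size + 1 for i in source_ids if i >= vocab_size), default=0
    )
    # A single softmax over the concatenated generate and copy scores.
    all_scores = np.concatenate([gen_scores, copy_scores])
    probs = np.exp(all_scores - all_scores.max())
    probs /= probs.sum()
    # Generation mass goes to the fixed vocab; copy mass is added per source
    # position, so a token appearing in the source gains extra probability.
    dist = np.zeros(extended_size)
    dist[:vocab_size] = probs[:vocab_size]
    for pos, tok_id in enumerate(source_ids):
        dist[tok_id] += probs[vocab_size + pos]
    return dist
```

An out-of-vocabulary source token (id >= vocab_size) can only receive probability through the copy term, which is exactly what lets the model emit names it has never seen.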

Datasets available

For convenience, this project provides a handful of training-ready datasets, as well as scripts to pull and preprocess some other useful datasets. Here is the list so far:

Greetings: A simple made-up dataset of greetings (the source sentences) and replies (the target sentences). The greetings are things like "Hi, my name is Jon Snow" and the replies are in the format "Nice to meet you, Jon Snow!". This is completely artificial and is just meant to show the usefulness of the copy mechanism in CopyNet.

# Extract data.
make data/greetings.tar.gz
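To see why copying matters here, note that the reply must reproduce the name verbatim even when that name never appears in the training vocabulary. A toy generator for pairs in this style (the on-disk format of the extracted dataset may differ; `make_pair` is a hypothetical helper for illustration):

```python
def make_pair(name):
    """Build one (source, target) greeting pair in the style of the dataset.

    The name must be copied verbatim from source to target, which is what
    CopyNet's copy mechanism handles even for out-of-vocabulary names.
    """
    return (f"Hi, my name is {name}", f"Nice to meet you, {name}!")

print(make_pair("Jon Snow"))
# → ('Hi, my name is Jon Snow', 'Nice to meet you, Jon Snow!')
```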

NL2Bash: A challenging dataset that consists of bash one-liners along with corresponding expert descriptions. The goal is to translate the natural language descriptions into the bash commands.

# Extract data.
make data/nl2bash.tar.gz

WMT 2015: Hosted by fast.ai, this is a dataset of 22.5 million English / French sentence pairs that can be used to train an English to French or French to English machine translation system.

# Download, extract, and preprocess data (big file, may take around 10 minutes).
./scripts/data/pull_wmt.sh

Issues and improvements

If you've found a bug or have any questions, please feel free to submit an issue on GitHub. I always appreciate pull requests as well.

