unbabel / openkiwi

Open-Source Machine Translation Quality Estimation in PyTorch

Home Page: https://unbabel.github.io/OpenKiwi/

License: GNU Affero General Public License v3.0

Quality estimation (QE) is one of the missing pieces of machine translation: its goal is to evaluate a translation system’s quality without access to reference translations. We present OpenKiwi, a PyTorch-based open-source framework that implements the best QE systems from the WMT 2015-18 shared tasks, making it easy to experiment with these models under the same framework. Using OpenKiwi and a stacked combination of these models, we have achieved state-of-the-art results on word-level QE on the WMT 2018 English-German dataset.

News

  • An experimental demonstration interface called OpenKiwi Tasting has been released on GitHub and can be checked out in Streamlit Share.

  • A new major version (2.0.0) of OpenKiwi has been released, introducing HuggingFace Transformers support and adopting PyTorch Lightning. For a condensed view of the changes, check the changelog.

  • Following our nomination in early July, we are happy to announce we won the Best Demo Paper at ACL 2019! Congratulations to the whole team and huge thanks to supporters and issue reporters.

  • Check out the published paper.

  • We have released the OpenKiwi tutorial we presented at MT Marathon 2019.

Features

  • Framework for training QE models and using pre-trained models for evaluating MT.
  • Supports both word-level and sentence-level (HTER or z-score) quality estimation.
  • Implementation of five QE systems in PyTorch: NuQE [2, 3], predictor-estimator [4, 5], BERT-Estimator [6], XLM-Estimator [6] and XLMR-Estimator.
  • Older systems only supported in versions <=2.0.0: QUETCH [1], APE-QE [3] and a stacked ensemble with a linear system [2, 3].
  • Easy-to-use API. Import it as a package in other projects or run it from the command line.
  • Easy to track and reproduce experiments via YAML configuration files.
  • Based on PyTorch Lightning, making the code easier to scale, use and keep up-to-date with engineering advances.
  • Implemented using the HuggingFace Transformers library to allow easy access to state-of-the-art pre-trained models.
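
The sentence-level labels named in the feature list (HTER and z-score) can be illustrated with a small standalone sketch. It is not OpenKiwi code and uses no OpenKiwi API; the function names are hypothetical. It assumes a simplified HTER (word-level edit distance between the MT output and its post-edit, divided by the post-edit length and capped at 1, without the shift operation that full TER allows) and a plain z-score over whatever group of raw scores the caller passes in.

```python
from statistics import mean, pstdev


def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def approx_hter(mt, post_edit):
    """Simplified HTER: edits per post-edited word, capped at 1 (no shifts)."""
    mt_tokens, pe_tokens = mt.split(), post_edit.split()
    if not pe_tokens:
        return 0.0 if not mt_tokens else 1.0
    return min(1.0, word_edit_distance(mt_tokens, pe_tokens) / len(pe_tokens))


def z_scores(raw_scores):
    """Standardize raw scores to zero mean and unit (population) variance."""
    mu, sigma = mean(raw_scores), pstdev(raw_scores)
    if sigma == 0:
        return [0.0 for _ in raw_scores]
    return [(s - mu) / sigma for s in raw_scores]


if __name__ == "__main__":
    # One missing word out of six post-edited words.
    print(approx_hter("the cat sat on mat", "the cat sat on the mat"))
    print(z_scores([60, 70, 80]))
```

A lower HTER means fewer edits were needed, so the translation is better; a higher z-score means the translation was rated above the group average.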

Quick Installation

To install OpenKiwi as a package, simply run

pip install openkiwi

You can now

import kiwi

inside your project or run in the command line

kiwi

Optionally, if you'd like to take advantage of our MLflow integration, simply install it in the same virtualenv as OpenKiwi:

pip install "openkiwi[mlflow]"
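
To confirm that the installation steps above took effect, a minimal check like the following may help. It relies only on the Python standard library (3.8+) and does not import OpenKiwi itself; `describe_install` is a hypothetical helper name, and the only facts taken from this README are the distribution name `openkiwi` and the import name `kiwi`.

```python
import importlib.util
from importlib import metadata


def describe_install(dist_name="openkiwi", module_name="kiwi"):
    """Report the installed version of a distribution and whether its module can be found."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment
    importable = importlib.util.find_spec(module_name) is not None
    return {"distribution": dist_name, "version": version, "importable": importable}


if __name__ == "__main__":
    print(describe_install())                    # OpenKiwi itself
    print(describe_install("mlflow", "mlflow"))  # optional MLflow integration
```

If `version` is `None` or `importable` is `False`, the package was probably installed into a different virtualenv than the one running the script.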

Getting Started

Detailed usage examples and instructions can be found in the Full Documentation.

Contributing

We welcome contributions to improve OpenKiwi. Please refer to CONTRIBUTING.md for quick instructions, or to the contributing instructions for more detail on how to set up your development environment.

License

OpenKiwi is Affero GPL licensed. You can see the details of this license in LICENSE.

Citation

If you use OpenKiwi, please cite the following paper: OpenKiwi: An Open Source Framework for Quality Estimation.

@inproceedings{openkiwi,
    author = {Fábio Kepler and
              Jonay Trénous and
              Marcos Treviso and
              Miguel Vera and
              André F. T. Martins},
    title  = {Open{K}iwi: An Open Source Framework for Quality Estimation},
    year   = {2019},
    booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics--System Demonstrations},
    pages  = {117--122},
    month  = {July},
    address = {Florence, Italy},
    url    = {https://www.aclweb.org/anthology/P19-3020},
    organization = {Association for Computational Linguistics},
}

References