
This project forked from jladcr/moses-for-mere-mortals


Machine translation for the real world

License: Other



Moses-for-Mere-Mortals: Machine translation for the real world

THIS SITE IS NO LONGER SUPPORTED. IT WAS BRIEFLY A COMPONENT OF MOSES SMT AND ITS AUTHORS WERE PART OF THE MOSES SMT DEVELOPMENT TEAM. IT WAS NICE WHILE IT LASTED, BUT THE TEAM NO LONGER EXISTS AND THE SOFTWARE HASN'T BEEN UPDATED FOR SEVERAL YEARS. IT IS THEREFORE APPROPRIATE TO SIGNAL THAT THE PROJECT HAS ENDED.

Please use the https://github.com/jladcr/Moses-for-Mere-Mortals/releases link to download the latest stable release.

A set of Linux bash scripts that, together, create a basic translation chain prototype capable of processing very large corpora. It uses Moses, a widely known statistical machine translation (SMT) system.

The idea is to help build a translation chain for the real world, but it should also enable a quick evaluation of Moses for actual translation work and guide users in their first steps of using Moses.

A Tutorial and a demonstration corpus are available. The corpus is too small to do justice to the quality of the results that can be achieved with Moses, but it gives a realistic view of the relative duration of the steps involved. Moses for Mere Mortals has been tested and used in a professional translation context.

If you want to use the latest stable and tested version of Moses for Mere Mortals, click the Releases button at the top of this page and choose the release you are interested in. Moses for Mere Mortals is meant to be run in an Ubuntu environment. The Windows add-ins should be installed and run in Microsoft Windows.

Moses for Mere Mortals (MMM) has been tested with the following 64-bit (AMD64) Linux distributions:

  • Ubuntu 14.04
  • Ubuntu 12.04

Documents used for corpus training should be perfectly aligned and saved in UTF-8 character encoding. Documents to be translated should also be in UTF-8 format. The expectation is that users of these scripts, perhaps after trying the provided demonstration corpus, can move straight to the real corpora they are interested in and get results with them.
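
The two requirements above (UTF-8 encoding and alignment) can be checked before training. The following is a minimal sketch in Python, not part of the Moses for Mere Mortals scripts. It assumes that "aligned" means one segment per line, with line i of the source file translating line i of the target file. The function names are hypothetical.

```python
from pathlib import Path


def read_utf8_lines(path):
    """Return the lines of a file; raise UnicodeDecodeError if it is not valid UTF-8."""
    text = Path(path).read_bytes().decode("utf-8")  # strict decoding
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # ignore the empty string after a trailing newline
    return lines


def check_parallel_corpus(source_path, target_path):
    """Return a list of problems found; an empty list means these basic checks passed."""
    problems = []
    sides = {}
    for label, path in (("source", source_path), ("target", target_path)):
        try:
            sides[label] = read_utf8_lines(path)
        except UnicodeDecodeError as err:
            problems.append(f"{label}: not valid UTF-8 ({err.reason} at byte {err.start})")
    if len(sides) == 2:
        src, tgt = sides["source"], sides["target"]
        # Assumption: one segment per line, so both files need the same line count.
        if len(src) != len(tgt):
            problems.append(f"line counts differ: {len(src)} source vs {len(tgt)} target")
        # An empty line facing a non-empty one usually signals a misalignment.
        for i, (s, t) in enumerate(zip(src, tgt), start=1):
            if bool(s.strip()) != bool(t.strip()):
                problems.append(f"line {i}: one side is empty")
    return problems
```

For example, `check_parallel_corpus("corpus.en", "corpus.pt")` (hypothetical file names) returns an empty list when both files decode as UTF-8, have the same number of lines, and have no empty line facing a non-empty one. These checks cannot tell whether the paired lines are actually translations of each other.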

The two Windows add-ins allow the creation of Moses input files from *.TMX translation memories (Extract_TMX_Corpus.exe), as well as the creation of *.TMX files from Moses output files (Moses2TMX.exe). A synergy between machine translation and translation memories is therefore created.
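
To illustrate the TMX-to-Moses direction, here is a minimal Python sketch of how parallel segments can be pulled out of a TMX document. It does not reproduce the behavior of Extract_TMX_Corpus.exe. The function name and the exact-match handling of language codes are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this namespaced key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_to_parallel(tmx_text, source_lang, target_lang):
    """Return (source_lines, target_lines) for translation units containing both languages."""
    source_lang, target_lang = source_lang.lower(), target_lang.lower()
    root = ET.fromstring(tmx_text)
    source_lines, target_lines = [], []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; older files may carry a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup elements;
                # split/join collapses whitespace so each segment stays on one line.
                segments[lang] = " ".join("".join(seg.itertext()).split())
        # Keep only units where both sides are present and non-empty,
        # so the two output lists stay aligned line by line.
        if segments.get(source_lang) and segments.get(target_lang):
            source_lines.append(segments[source_lang])
            target_lines.append(segments[target_lang])
    return source_lines, target_lines
```

Writing each returned list to its own UTF-8 file, one segment per line, gives a pair of aligned files of the kind described above. Language codes are compared exactly after lowercasing, so "en" does not match "en-US". A real converter would need a policy for regional variants.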

Contributors

jladcr, ypeels
