Giter Site home page Giter Site logo

icml21_oaxe's Introduction

Order-Agnostic Cross Entropy (OAXE) for Non-Autoregressive Machine Translation

Note

We based this code heavily on the original code of mask-predict and disco.

Download model

Description Dataset Model
CMLM-OAXE [WMT14 English-German] download (.tar.bz2)
CMLM-OAXE [WMT14 German-English] download (.tar.bz2)
CMLM-OAXE [WMT16 English-Romanian] download (.tar.bz2)
CMLM-OAXE [WMT16 Romanian-English] download (.tar.bz2)
CMLM-OAXE [WMT17 English-Chinese] download (.tar.bz2)
CMLM-OAXE [WMT17 Chinese-English] download (.tar.bz2)

Preprocess

TBD

Train

Here we offer the XE finetuning strategy, you have to copy a pretrained CMLM model into your model dir. We adopt the pretrain models from Mask-Predict. It is better to keep update-freq x GPU NUM >= 8.

data_dir = data dir
model_dir = model dir
python3 -u train.py ${data_dir} --arch bert_transformer_seq2seq \
    --criterion oaxe --lr 5e-6  \
    --optimizer adam --adam-betas '(0.9, 0.999)' --adam-eps 1e-6 --task translation_self \
    --max-tokens 16384 --weight-decay 0.01 --dropout 0.2 --encoder-layers 6 --encoder-embed-dim 512 \
    --decoder-layers 6 --decoder-embed-dim 512 --max-source-positions 10000 \
    --max-target-positions 10000 --max-epoch 245 --seed 1 \
    --save-dir ${model_dir} \
    --keep-last-epochs 10 --share-all-embeddings \
    --keep-interval-updates 1 --fp16 \
    --update-freq 1 --ddp-backend=no_c10d \
    --warmup-init-lr 1e-7 --min-lr 1e-9 --lr-scheduler inverse_sqrt --warmup-updates 500 \
    --label-smoothing 0.1 --reset-optimizer --skip 0.15 \

Evaluation

TBD

License

Following MASK-PREDICT, our code is also CC-BY-NC 4.0. The license applies to the pre-trained models as well.

Citation

Please cite as:

@inproceedings{Du2021OAXE,
  title = {Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation},
  author = {Cunxiao Du and Zhaopeng Tu and Jing Jiang},
  booktitle = {Proc. of ICML},
  year = {2021},
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.