constrained-levt's Introduction

Constrained-LevT

This repository contains the code for the ACL-20 paper: Lexically Constrained Neural Machine Translation with Levenshtein Transformer. If you use this repository in your work, please cite:

@article{susanto2020lexically,
  title={Lexically Constrained Neural Machine Translation with Levenshtein Transformer},
  author={Susanto, Raymond Hendy and Chollampatt, Shamil and Tan, Liling},
  journal={arXiv preprint arXiv:2004.12681},
  year={2020}
}

Requirements and Installation

  • PyTorch version >= 1.2.0
  • Python version >= 3.6
git clone https://github.com/raymondhs/constrained-levt
cd constrained-levt
pip install --editable .
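
As an optional sanity check, you can confirm that the editable install picked up the fairseq code bundled with this repository (a minimal sketch; the exact version string depends on the bundled code):

# check that fairseq imports from this repository's source tree
import fairseq
print(fairseq.__file__)     # should point inside the constrained-levt checkout
print(fairseq.__version__)  # version string of the bundled fairseq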

Usage

To replicate the experiments in our paper, you can download our pretrained models and evaluation sets into the root directory of this repository. These models were trained following the original instructions for training a Levenshtein Transformer model. To preserve each constraint in the output, use --preserve-constraint. For example:

mkdir -p data-bin
tar -xvzf const_levt_en_de.tgz -C data-bin
cat data-bin/const_levt_en_de/newstest2014-wikt.en \
| python interactive_with_constraints.py \
    data-bin/const_levt_en_de \
    -s en -t de \
    --task translation_lev \
    --path data-bin/const_levt_en_de/checkpoint_best.pt \
    --iter-decode-max-iter 9 \
    --iter-decode-eos-penalty 0 \
    --beam 1 \
    --print-step \
    --batch-size 400 \
    --buffer-size 4000 \
    --preserve-constraint | tee /tmp/gen.out
# ...
# | Translated 3003 sentences (87040 tokens) in 11.5s (261.37 sentences/s, 7575.50 tokens/s)

# Compute term usage rate
cat /tmp/gen.out \
| grep ^H \
| sed 's/^H\-//' \
| sort -n -k 1 \
| cut -f 3 > /tmp/gen.out.sys
python scripts/term_usage_rate.py \
    -i data-bin/const_levt_en_de/newstest2014-wikt.en \
    -s /tmp/gen.out.sys
# Term use rate: 100.000
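
For reference, the term usage rate is the percentage of target-side constraints that appear in the corresponding system outputs. Below is a minimal, illustrative sketch of such a metric (this is an assumption about the computation, not the bundled scripts/term_usage_rate.py; use the script above for reported numbers):

# hypothetical re-implementation of the term usage rate, for illustration only
def term_usage_rate(input_path, sys_path):
    total, used = 0, 0
    with open(input_path, encoding="utf-8") as f_in, \
         open(sys_path, encoding="utf-8") as f_sys:
        for src_line, hyp in zip(f_in, f_sys):
            columns = src_line.rstrip("\n").split("\t")
            for constraint in columns[1:]:           # constraints follow the source text
                target = constraint.split("|||")[1]  # target side of source|||target
                total += 1
                if target in hyp:                    # constraint preserved in the output
                    used += 1
    return 100.0 * used / max(total, 1)

print("Term use rate: %.3f" % term_usage_rate(
    "data-bin/const_levt_en_de/newstest2014-wikt.en", "/tmp/gen.out.sys"))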

Each input line is tab-separated: the first column is the source text and the remaining columns contain the constraints. Each constraint is given in the format source|||target. A preprocessing script (tokenize.sh) is provided in case you want to try it on your own input. It runs tokenization, BPE segmentation, and additional preprocessing for Romanian. For example:

echo 'Hello world!' | ./tokenize.sh en data-bin/const_levt_en_de/ende.code
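
For illustration, a constrained input file can be constructed programmatically as sketched below (the sentence, constraint, and file name are hypothetical; the text still needs to be tokenized and BPE-segmented, e.g. with tokenize.sh, to match the model vocabulary):

# build a one-line constrained input file (hypothetical example)
src = "The patient suffers from hypertension ."
constraints = ["hypertension|||Bluthochdruck"]  # source|||target pairs
line = "\t".join([src] + constraints)           # tab-separated columns
with open("my_input.en", "w", encoding="utf-8") as f:
    f.write(line + "\n")
# my_input.en can then be piped into interactive_with_constraints.py
# the same way as newstest2014-wikt.en in the example above.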

License

The code and models in this repository are licensed under the MIT License. The evaluation datasets are licensed under CC-BY-SA 3.0.

constrained-levt's Issues

Incompatible with current fairseq version?

Hi, I trained a Levenshtein Transformer NMT model for German to English following the fairseq instructions, and now I'm trying to use your code to generate translations with constraints, but I get errors. I saw you're using fairseq version 0.8.0, so I thought it might be a version incompatibility, but I tried training with versions 0.10.0 and 0.9.0 too and still get errors. Version 0.8.0 had no translation_lev task at all, so that didn't work either. What am I missing?
This is the command I used for training:

fairseq-train data-bin/prepared_data \
    --save-dir checkpoints \
    --ddp-backend=legacy_ddp \
    --task translation_lev \
    --criterion nat_loss \
    --arch levenshtein_transformer \
    --noise random_delete \
    --share-all-embeddings \
    --optimizer adam --adam-betas '(0.9,0.98)' \
    --lr 0.0002 --lr-scheduler reduce_lr_on_plateau \
    --stop-min-lr '1e-09' --warmup-updates 10000 \
    --warmup-init-lr '1e-07' --label-smoothing 0.1 \
    --dropout 0.3 --weight-decay 0.01 \
    --decoder-learned-pos \
    --encoder-learned-pos \
    --apply-bert-init \
    --log-format 'simple' --log-interval 50 \
    --log-file log \
    --fixed-validation-seed 7 \
    --max-tokens 2048 \
    --save-interval-updates 4000 \
    --max-update 300000 \
    --patience 4 \
    --skip-invalid-size-inputs-valid-test

This is the command I'm using for generation:

python interactive_with_constraints.py \
    data-bin/prepared_data \
    -s de -t en \
    --input data/test_three.de \
    --task translation_lev \
    --path checkpoints/checkpoint_best.pt \
    --iter-decode-max-iter 9 \
    --iter-decode-eos-penalty 0 \
    --beam 1 \
    --print-step \
    --batch-size 400 \
    --buffer-size 4000 \
    --preserve-constraint

These are the error tracebacks:
With version 0.10.2 (master):

Namespace(allow_insertion_constraint=False, beam=1, bpe=None, buffer_size=4000, cpu=False, criterion='cross_entropy', data='/content/drive/MyDrive/susanto_model/data-bin/prepared_data', dataset_impl=None, decoding_format=None, diverse_beam_groups=-1, diverse_beam_strength=0.5, empty_cache_freq=0, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, gen_subset='test', input='/content/drive/MyDrive/susanto_model/data/test_three.de', iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=9, lazy_load=False, left_pad_source='True', left_pad_target='False', lenpen=1, load_alignments=False, log_format=None, log_interval=1000, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_sentences=400, max_source_positions=1024, max_target_positions=1024, max_tokens=None, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides='{}', momentum=0.99, nbest=1, no_beamable_mm=False, no_early_stop=False, no_progress_bar=False, no_repeat_ngram_size=0, noise='random_delete', num_shards=1, num_workers=1, optimizer='nag', path='/content/drive/MyDrive/susanto_model/checkpoints_susanto/checkpoint_best.pt', prefix_size=0, preserve_constraint=True, print_alignment=False, print_step=True, quiet=False, raw_text=False, remove_bpe=None, replace_unk=None, required_batch_size_multiple=8, results_path=None, sacrebleu=False, sampling=False, sampling_topk=-1, sampling_topp=-1.0, score_reference=False, seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, source_lang='de', target_lang='en', task='translation_lev', tbmf_wrapper=False, temperature=1.0, tensorboard_logdir='', threshold_loss_scale=None, tokenizer=None, unkpen=0, unnormalized=False, upsample_primary=1, user_dir=None, warmup_updates=0, weight_decay=0.0)
| [de] dictionary: 8544 types
| [en] dictionary: 8544 types
| loading model(s) from checkpoints/checkpoint_best.pt

Traceback (most recent call last):
  File "interactive_with_constraints.py", line 234, in <module>
    cli_main()
  File "interactive_with_constraints.py", line 230, in cli_main
    main(args)
  File "interactive_with_constraints.py", line 101, in main
    task=task,
  File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 167, in load_model_ensemble
    ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
  File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 178, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 154, in load_checkpoint_to_cpu
    state = _upgrade_state_dict(state)
  File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 323, in _upgrade_state_dict
    state['args'].task = 'translation'
AttributeError: 'NoneType' object has no attribute 'task'

With versions 0.10.0 and 0.9.0:

Traceback (most recent call last):
  File "interactive_with_constraints.py", line 234, in <module>
    cli_main()
  File "interactive_with_constraints.py", line 230, in cli_main
    main(args)
  File "interactive_with_constraints.py", line 101, in main
    task=task,
  File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 167, in load_model_ensemble
    ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
  File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 186, in load_model_ensemble_and_task
    model.load_state_dict(state['model'], strict=True)
  File "/content/constrained-levt/fairseq/models/fairseq_model.py", line 69, in load_state_dict
    return super().load_state_dict(state_dict, strict)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for LevenshteinTransformerModel:
	Missing key(s) in state_dict: "encoder.layers.0.self_attn.in_proj_weight", "encoder.layers.0.self_attn.in_proj_bias", "encoder.layers.1.self_attn.in_proj_weight", "encoder.layers.1.self_attn.in_proj_bias", "encoder.layers.2.self_attn.in_proj_weight", [...], "decoder.layers.5.encoder_attn.in_proj_bias". 
	Unexpected key(s) in state_dict: "encoder.layers.0.self_attn.k_proj.weight", "encoder.layers.0.self_attn.k_proj.bias", "encoder.layers.0.self_attn.v_proj.weight", "encoder.layers.0.self_attn.v_proj.bias", "encoder.layers.0.self_attn.q_proj.weight", "encoder.layers.0.self_attn.q_proj.bias", "encoder.layers.1.self_attn.k_proj.weight", "encoder.layers.1.self_attn.k_proj.bias", "encoder.layers.1.self_attn.v_proj.weight", "encoder.layers.1.self_attn.v_proj.bias", "encoder.layers.1.self_attn.q_proj.weight", "encoder.layers.1.self_attn.q_proj.bias",    [...]    "decoder.layers.5.encoder_attn.v_proj.bias", "decoder.layers.5.encoder_attn.q_proj.weight", "decoder.layers.5.encoder_attn.q_proj.bias". 
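
The missing/unexpected keys in this traceback point to a change in fairseq's MultiheadAttention checkpoint layout: the fairseq bundled with this repository expects fused in_proj_weight/in_proj_bias parameters per attention module, whereas newer fairseq versions save separate q_proj/k_proj/v_proj parameters (the first traceback likewise suggests a newer checkpoint format, which stores its configuration under cfg rather than args). A rough, untested sketch of fusing such a checkpoint back into the older layout is given below, assuming the q/k/v ordering used by fairseq's own upgrade code; the file names are placeholders, other incompatibilities may remain, and the result should be verified before relying on it:

import re
import torch

ckpt = torch.load("checkpoints/checkpoint_best.pt", map_location="cpu")
model = ckpt["model"]

# find every attention prefix that uses the newer split q/k/v layout
prefixes = {m.group(1) for k in list(model)
            for m in [re.match(r"(.*attn)\.q_proj\.weight$", k)] if m}

for p in prefixes:
    for kind in ("weight", "bias"):
        # concatenate q, k, v into the fused in_proj parameter
        parts = [model.pop("%s.%s_proj.%s" % (p, proj, kind)) for proj in ("q", "k", "v")]
        model["%s.in_proj_%s" % (p, kind)] = torch.cat(parts, dim=0)

torch.save(ckpt, "checkpoints/checkpoint_best_fused.pt")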

How to specify constraints?

Hi, I would like to apply the method to my own dataset, but I cannot figure out how to specify constraints. In the README, the input newstest2014-wikt.en does not seem to contain any constraints. Thanks in advance.

Question in replicate experiment

Hello, I tried to follow the README to replicate your experiments, but when I run python interactive_with_constraints.py it hangs. When I check the GPU status I can see the program running on the GPU. I think this should be a fast process; how can I solve it?
My environment is Python 3.6, PyTorch 1.4.0, CUDA 10.1, and driver version 430.64.
Hoping for your answer.
