Comments (15)
@akanshajainn for now the simplest way would be to take the sentences you want to translate and define them as a test set. If you tokenize these sentences, split them into BPE pieces, and binarize them into a .pth file, you can then load this data as if it were your evaluation set.
Then you can translate your new test set with the exact same command you used for training, plus these few extra arguments: `--reload_model $MODEL_PATH --reload_enc 1 --reload_dec 1 --eval_only 1 --para_dataset $PARA_DATASET`, where `MODEL_PATH` is the path of your trained model, and `PARA_DATASET` can be the same as before, but with the original test set replaced by your own. The model will then dump the translations of your new test set in the new `dump_path` folder.
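The steps above can be sketched as a shell sequence. This is only a sketch under assumptions: the tool paths follow the conventions of the repo's data scripts (Moses `tokenizer.perl`, fastBPE's `fast applybpe`, `preprocess.py`), and the file names, BPE codes, and vocabulary are placeholders you must adapt to your own setup.

```shell
# 1. Tokenize the sentences you want to translate (Moses tokenizer)
cat my_test.en | ./tools/mosesdecoder/scripts/tokenizer/tokenizer.perl -l en > my_test.en.tok

# 2. Split into BPE pieces using the codes learned during training
./tools/fastBPE/fast applybpe my_test.en.tok.bpe my_test.en.tok bpe_codes

# 3. Binarize into a .pth file with the training vocabulary
python preprocess.py ./data/mono/vocab.en.60000 my_test.en.tok.bpe

# 4. Translate: same command as training, plus the reload/eval flags,
#    with your new test set plugged into --para_dataset
python main.py ... \
  --reload_model $MODEL_PATH --reload_enc 1 --reload_dec 1 --eval_only 1 \
  --para_dataset 'en-fr:,./data/para/dev/valid.XX.pth,my_test.XX.pth'
```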
from unsupervisedmt.
@glample thank you for this detailed explanation; this is the best and most well-documented repo for NMT. It would be very useful to have inference from stdin, like Marian NMT does, or like StarSpace and fastText do. That would make it easy to test newly trained models live.
glample/fastBPE@fea9a45 should help for inference. For the translation script, please check the new repo where I added this functionality: facebookresearch/XLM@0b193eb
XE-en-en is the reconstruction loss of the adversarial auto-encoder for English
XE-es-es is the reconstruction loss of the adversarial auto-encoder for Spanish
It's noted XE-xx-xx where XE means cross-entropy (because we use a cross-entropy reconstruction loss).
XE-en-es-en is the back-translation loss when taking an English sentence, translating it to Spanish, and back to English
You will not have an XE-es-en loss unless you provide Spanish-English parallel data.
ENC-L2-en is the average norm of the encoded English latent vectors. This is not very relevant; you can ignore this metric.
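As a reading aid, the naming convention and the loss behind it can be sketched in a few lines of Python. This is illustrative only: `loss_name` is a hypothetical helper (not part of the repo), and the cross-entropy is shown on a toy single-token distribution rather than the model's real batched loss.

```python
import math

def loss_name(*langs):
    """Hypothetical helper: build a metric name in the XE-xx-... scheme,
    e.g. XE-en-en (auto-encoding) or XE-en-es-en (back-translation)."""
    return "XE-" + "-".join(langs)

def cross_entropy(probs, target):
    """Toy token-level cross-entropy: negative log-probability of the target."""
    return -math.log(probs[target])

print(loss_name("en", "en"))        # English auto-encoder reconstruction loss
print(loss_name("en", "es", "en"))  # back-translation loss: en -> es -> en
print(cross_entropy([0.1, 0.7, 0.2], 1))
```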
No, the order does not matter for `--pivo_directions`. If you swap the order, it just means that at each iteration the model will perform one direction before the other, but this has no impact in practice.
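A minimal sketch of how such an argument might be parsed (a hypothetical parser, not the repo's actual code) shows why the order is cosmetic: each comma-separated triplet is handled independently, so swapping them only changes which direction runs first within an iteration.

```python
def parse_pivo_directions(arg):
    """Parse a string like 'en-fr-en,fr-en-fr' into a list of
    (source, pivot, target) triplets, one per back-translation direction."""
    return [tuple(d.split("-")) for d in arg.split(",")]

a = parse_pivo_directions("en-fr-en,fr-en-fr")
b = parse_pivo_directions("fr-en-fr,en-fr-en")
# Same triplets either way; only the iteration order differs.
assert set(a) == set(b) and a == b[::-1]
```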
Hey! @mohammedayub44 I was interrupted by a CUDA out-of-memory error. I managed to reach a BLEU of 17. Now I am stuck on how to run inference with the trained model. @glample
I opened #65 for interactive mode and/or run-time querying specifically.
Okay thanks, now it makes sense. There is one more point of confusion about the --pivo_directions parameter of NMT/main.py. We have to give the back-translation directions; does the order matter? If it does, which one is correct: tgt-src-tgt,src-tgt-src or src-tgt-src,tgt-src-tgt? And the same question for three languages.
The confusion arises (assuming src = en and tgt = fr) because in the complete description of the main.py parameters it is given as (src-tgt-src, tgt-src-tgt):
## back-translation directions
--pivo_directions 'en-fr-en,fr-en-fr' # back-translation directions (en->fr->en and fr->en->fr)
while below, in the section "Some parameters must respect a particular format:", it is written as (tgt-src-tgt, src-tgt-src):
pivo_directions
A list of triplets on which we want to perform back-translation.
fr-en-fr,en-fr-en will train the model on the fr->en->fr and en->fr->en directions.
en-fr-de,de-fr-en will train the model on the en->fr->de and de->fr->en directions (assuming that fr is the unknown language, and that English-German parallel data is provided).
and in the combined command, the directions given are (tgt-src-tgt, src-tgt-src):
python main.py --exp_name test --transformer True --n_enc_layers 4 --n_dec_layers 4 --share_enc 3 --share_dec 3 --share_lang_emb True --share_output_emb True --langs 'en,fr' --n_mono -1 --mono_dataset 'en:./data/mono/all.en.tok.60000.pth,,;fr:./data/mono/all.fr.tok.60000.pth,,' --para_dataset 'en-fr:,./data/para/dev/newstest2013-ref.XX.60000.pth,./data/para/dev/newstest2014-fren-src.XX.60000.pth' --mono_directions 'en,fr' --word_shuffle 3 --word_dropout 0.1 --word_blank 0.2 --pivo_directions 'fr-en-fr,en-fr-en' --pretrained_emb './data/mono/all.en-fr.60000.vec' --pretrained_out True --lambda_xe_mono '0:1,100000:0.1,300000:0' --lambda_xe_otfd 1 --otf_num_processes 30 --otf_sync_params_every 1000 --enc_optimizer adam,lr=0.0001 --epoch_size 500000 --stopping_criterion bleu_en_fr_valid,10
Kindly help me clear this up.
@akanshajainn did you happen to get any benchmarks with the Spanish data?
@akanshajainn Thanks for that.
Hi @glample, are there benchmark results you have run for Spanish data?
Hi,
No, I have not tried Spanish, but I would expect it to perform similarly to En-Fr. The BLEU might be different, though that may be due to your test set being more difficult. Even on En-Fr, the BLEU varies quite a lot depending on the version of newstest.
Hey @glample, thanks for the information. Could you please also explain how to run inference with the trained model? Another thing I am exploring is inference on CPU; is that also available?
@glample I'm guessing an inference tutorial would be really helpful for everyone.
@mohammedayub44 Yes sorry about that. I'm very busy these days, but I will try to make a script to execute all these steps soon.
@glample no worries at all. Thanks for your active help.