Comments (15)

glample avatar glample commented on July 19, 2024 5

@akanshajainn for now the simplest way would be to take the sentences you want to translate and define them as a test set. If you tokenize these sentences, split them into BPE pieces, and binarize them into a .pth file, you can then load this data as if it were your evaluation set.

Then you can simply translate your new test set with the exact same command you used for training, but with these few extra arguments: --reload_model $MODEL_PATH --reload_enc 1 --reload_dec 1 --eval_only 1 --para_dataset $PARA_DATASET. Here MODEL_PATH is the path of your trained model, and PARA_DATASET can be the same as before, except that the original test set is replaced by your own test set.

Then the model will dump the translations of your new test set in the new dump_path folder.
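
The step of reusing the training command with the extra arguments can be sketched in Python. This is a minimal illustration, not a script from the repository: `build_eval_command`, the example training command, and the file paths are all hypothetical, and only the flags quoted in the comment above come from this thread.

```python
import shlex


def build_eval_command(train_cmd, model_path, para_dataset):
    """Turn a training command into an evaluation-only command.

    Hypothetical helper: it drops any existing --para_dataset value and
    appends the extra arguments quoted in the comment above.
    """
    cleaned = []
    skip_next = False
    for arg in shlex.split(train_cmd):
        if skip_next:
            # This is the old --para_dataset value; discard it.
            skip_next = False
            continue
        if arg == "--para_dataset":
            skip_next = True
            continue
        cleaned.append(arg)
    cleaned += [
        "--reload_model", model_path,
        "--reload_enc", "1",
        "--reload_dec", "1",
        "--eval_only", "1",
        "--para_dataset", para_dataset,
    ]
    return cleaned


if __name__ == "__main__":
    # Placeholder command and paths, for illustration only.
    train_cmd = (
        "python main.py --exp_name test --langs 'en,fr' "
        "--para_dataset 'en-fr:,valid.XX.pth,test.XX.pth'"
    )
    cmd = build_eval_command(
        train_cmd, "./model.pth", "en-fr:,valid.XX.pth,my_test.XX.pth"
    )
    print(shlex.join(cmd))
```

The helper returns a list of arguments, so it can be passed to `subprocess.run` directly or printed with `shlex.join` for inspection before running.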

from unsupervisedmt.

loretoparisi avatar loretoparisi commented on July 19, 2024 2

@glample thank you for this detailed explanation; this is the best and most well-documented repo for NMT. It would be very valuable to have inference from stdin, like Marian NMT does or like StarSpace and fastText do; that would help with testing newly trained models live.

glample avatar glample commented on July 19, 2024 2

glample/fastBPE@fea9a45 should help for inference. For the translation script, please check the new repo where I added this functionality: facebookresearch/XLM@0b193eb

glample avatar glample commented on July 19, 2024 1

XE-en-en is the reconstruction loss of the adversarial auto-encoder for English
XE-es-es is the reconstruction loss of the adversarial auto-encoder for Spanish
It is written XE-xx-xx, where XE means cross-entropy (because we use a cross-entropy reconstruction loss).

XE-en-es-en is the back-translation loss when taking an English sentence, translating it to Spanish, and back to English.
You will not have an XE-es-en loss unless you provide Spanish-English parallel data.

ENC-L2-en is the average norm of the encoded English latent vectors. This is not very relevant, you can ignore this metric.
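
The naming scheme described above can be sketched as a small lookup. This is an illustration only: `describe_metric` is a hypothetical helper, not part of the repository, and it covers only the metric patterns mentioned in this comment.

```python
def describe_metric(name):
    """Describe a training-log metric name using the conventions above."""
    parts = name.split("-")
    if parts[0] == "XE":
        langs = parts[1:]
        if len(langs) == 2 and langs[0] == langs[1]:
            # XE-en-en: auto-encoder reconstruction loss
            return f"auto-encoder reconstruction loss for {langs[0]}"
        if len(langs) == 2:
            # XE-es-en: only present when parallel data is provided
            return f"cross-entropy loss on {langs[0]}->{langs[1]} parallel data"
        if len(langs) == 3:
            # XE-en-es-en: back-translation loss
            return f"back-translation loss {langs[0]}->{langs[1]}->{langs[2]}"
    if parts[:2] == ["ENC", "L2"] and len(parts) == 3:
        return f"average norm of the encoded {parts[2]} latent vectors"
    return "unknown metric"


for metric in ["XE-en-en", "XE-es-es", "XE-en-es-en", "ENC-L2-en"]:
    print(metric, "->", describe_metric(metric))
```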

glample avatar glample commented on July 19, 2024 1

No, the order does not matter for --pivo_directions. If you swap the order it just means that at each iteration the model will perform one direction before the other, but this won't have any impact in practice.
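
To make the point concrete, here is a minimal sketch of splitting a `--pivo_directions` value into triplets. It is not the repository's parsing code, and `parse_pivo_directions` is a hypothetical name. Both orderings yield the same set of back-translation directions; only the order in which they are listed differs.

```python
def parse_pivo_directions(value):
    """Split a --pivo_directions string into (lang1, lang2, lang3) triplets."""
    triplets = []
    for item in value.split(","):
        langs = tuple(item.strip().split("-"))
        if len(langs) != 3:
            raise ValueError(f"expected a triplet like 'en-fr-en', got {item!r}")
        triplets.append(langs)
    return triplets


a = parse_pivo_directions("en-fr-en,fr-en-fr")
b = parse_pivo_directions("fr-en-fr,en-fr-en")
print(a)                 # [('en', 'fr', 'en'), ('fr', 'en', 'fr')]
print(set(a) == set(b))  # True
```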

akanshajainn avatar akanshajainn commented on July 19, 2024 1

Hey! @mohammedayub44 My training was interrupted by a CUDA out-of-memory error. I reached a BLEU of up to 17. Now I am stuck on how to run inference with the trained model. @glample

bittlingmayer avatar bittlingmayer commented on July 19, 2024 1

I opened #65 for interactive mode and/or run-time querying specifically.

akanshajainn avatar akanshajainn commented on July 19, 2024

Okay, thanks, now it makes sense. There is one more point of confusion, about the --pivo_directions parameter of NMT/main.py. We have to give the back-translation directions. Does the order matter? If it does, which one is correct: tgt-src-tgt, src-tgt-src or src-tgt-src, tgt-src-tgt? Is it the same for three languages?

The confusion arises as follows (assuming src is en and tgt is fr). In the complete description of the parameters of main.py, the order given is (src-tgt-src, tgt-src-tgt):

## back-translation directions
--pivo_directions 'en-fr-en,fr-en-fr'       # back-translation directions (en->fr->en and fr->en->fr) 

and below, in the section "Some parameters must respect a particular format:", it is written as (tgt-src-tgt, src-tgt-src):

pivo_directions

    A list of triplets on which we want to perform back-translation.
    fr-en-fr,en-fr-en will train the model on the fr->en->fr and en->fr->en directions.
    en-fr-de,de-fr-en will train the model on the en->fr->de and de->fr->en directions (assuming that fr is the unknown language, and that English-German parallel data is provided).

and in the combined command the directions given are (tgt-src-tgt, src-tgt-src):

python main.py --exp_name test --transformer True --n_enc_layers 4 --n_dec_layers 4 --share_enc 3 --share_dec 3 --share_lang_emb True --share_output_emb True --langs 'en,fr' --n_mono -1 --mono_dataset 'en:./data/mono/all.en.tok.60000.pth,,;fr:./data/mono/all.fr.tok.60000.pth,,' --para_dataset 'en-fr:,./data/para/dev/newstest2013-ref.XX.60000.pth,./data/para/dev/newstest2014-fren-src.XX.60000.pth' --mono_directions 'en,fr' --word_shuffle 3 --word_dropout 0.1 --word_blank 0.2 --pivo_directions 'fr-en-fr,en-fr-en' --pretrained_emb './data/mono/all.en-fr.60000.vec' --pretrained_out True --lambda_xe_mono '0:1,100000:0.1,300000:0' --lambda_xe_otfd 1 --otf_num_processes 30 --otf_sync_params_every 1000 --enc_optimizer adam,lr=0.0001 --epoch_size 500000 --stopping_criterion bleu_en_fr_valid,10

Kindly help me clear this up.

mohammedayub44 avatar mohammedayub44 commented on July 19, 2024

@akanshajainn did you happen to get any benchmarks with the Spanish data?

mohammedayub44 avatar mohammedayub44 commented on July 19, 2024

@akanshajainn Thanks for that.

Hi @glample, are there any benchmark results that you have run on Spanish data?

glample avatar glample commented on July 19, 2024

Hi,

No, I have not tried Spanish. I expect it would perform similarly to En-Fr. BLEU might be different, but that may be because your test set is more difficult. Even in En-Fr the BLEU varies quite a lot depending on the version of newstest.

akanshajainn avatar akanshajainn commented on July 19, 2024

Hey @glample, thanks for the information. Could you please also explain how to run inference with the trained model? Another thing I am exploring is inference on CPU. Is that also supported?

mohammedayub44 avatar mohammedayub44 commented on July 19, 2024

@glample I'm guessing an inference tutorial would be really helpful for everyone.

glample avatar glample commented on July 19, 2024

@mohammedayub44 Yes sorry about that. I'm very busy these days, but I will try to make a script to execute all these steps soon.

mohammedayub44 avatar mohammedayub44 commented on July 19, 2024

@glample no worries at all. Thanks for your active help.
