Comments (2)
That is weird. Maybe fine-tuning ran for too long on a very small data set, so the model heavily overfitted to the fine-tuning data and forgot everything else? Did you see strange perplexity scores during fine-tuning?
from opus-mt-train.
[2022-09-01 15:00:07] Allocating memory for Adam-specific shards
[2022-09-01 15:00:07] [memory] Reserving 343 MB, device cpu0
[2022-09-01 15:06:27] Seen 2,467 samples
[2022-09-01 15:06:27] Starting data epoch 2 in logical epoch 2
[2022-09-01 15:12:58] Seen 2,467 samples
[2022-09-01 15:12:58] Starting data epoch 3 in logical epoch 3
[2022-09-01 15:19:30] Seen 2,467 samples
[2022-09-01 15:19:30] Starting data epoch 4 in logical epoch 4
[2022-09-01 15:26:01] Seen 2,467 samples
[2022-09-01 15:26:01] Starting data epoch 5 in logical epoch 5
[2022-09-01 15:32:32] Seen 2,467 samples
[2022-09-01 15:32:32] Starting data epoch 6 in logical epoch 6
[2022-09-01 15:32:32] Training finished
[2022-09-01 15:32:51] Saving model weights and runtime parameters to /OPUS-MT-train/work-tatoeba/mul-eng/opus-tuned4afr2eng.spm1k-spm1k.transformer-align.model1.npz.best-perplexity.npz
[2022-09-01 15:32:51] [valid] Ep. 6 : Up. 150 : perplexity : 700.626 : new best
[2022-09-01 15:32:51] Saving model weights and runtime parameters to /OPUS-MT-train/work-tatoeba/mul-eng/opus-tuned4afr2eng.spm1k-spm1k.transformer-align.model1.npz
[2022-09-01 15:32:52] Saving Adam parameters
[2022-09-01 15:32:54] [training] Saving training checkpoint to /OPUS-MT-train/work-tatoeba/mul-eng/opus-tuned4afr2eng.spm1k-spm1k.transformer-align.model1.npz and /OPUS-MT-train/work-tatoeba/mul-eng/opus-tuned4afr2eng.spm1k-spm1k.transformer-align.model1.npz.optimizer.npz
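To answer the perplexity question, the `[valid]` lines can be pulled out of a log like the one above with a short script. This is only a minimal sketch: the regular expression is written against the single validation line shown here, and `validation_scores` is a hypothetical helper name, not part of OPUS-MT-train or Marian.

```python
import re

# Matches validation lines in the format seen in the log above, e.g.
# "[...] [valid] Ep. 6 : Up. 150 : perplexity : 700.626 : new best"
# (assumption: other validation metrics are logged in the same layout).
VALID_RE = re.compile(
    r"\[valid\] Ep\. (?P<epoch>\d+) : Up\. (?P<updates>\d+) : "
    r"(?P<metric>[\w-]+) : (?P<value>\S+)"
)


def validation_scores(log_lines):
    """Return (epoch, updates, metric, value) for every [valid] line."""
    scores = []
    for line in log_lines:
        match = VALID_RE.search(line)
        if match:
            scores.append(
                (
                    int(match["epoch"]),
                    int(match["updates"]),
                    match["metric"],
                    float(match["value"]),
                )
            )
    return scores


sample = [
    "[2022-09-01 15:32:32] Training finished",
    "[2022-09-01 15:32:51] [valid] Ep. 6 : Up. 150 : perplexity : 700.626 : new best",
]
for epoch, updates, metric, value in validation_scores(sample):
    print(f"epoch {epoch}, update {updates}: {metric} = {value}")
```

In this log there is only one validation point (update 150), so there is no trend to compare against; a run that validates more often would show whether perplexity rises or falls during fine-tuning.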
From the log, it looks like it only went through one round? What is even weirder is that the compare file (Tatoeba-test-v2021-08-07.afr-eng.opus-tuned4afr2eng.spm1k-spm1k1.transformer-align.afr.eng) shows the translations as long runs of the letter "s" and blank lines:
sssssssssssssssssssssssssssssssss
sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss
sssssssssssssssssssssssssssssssss
sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss
sssssssssssssssssssssssssssssssss
sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss
And then, of course, the eval file records the BLEU score as 0. I double-checked all the data (I used about 1500 lines of afr-eng data to fine-tune the mul-eng model). I am really at a loss here because I can tune monolingual models just fine using the same steps. Do you have any more insight?
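A quick way to measure how much of the output has collapsed like this is to count hypothesis lines that are blank or made of a single repeated character. The sketch below assumes the translated lines have already been extracted into a list of strings; `is_degenerate`, `degenerate_fraction`, and the `min_run` threshold are hypothetical names chosen for illustration.

```python
def is_degenerate(line, min_run=10):
    """True if the line is blank or one character repeated at least min_run times."""
    text = line.strip()
    if not text:
        return True
    return len(text) >= min_run and len(set(text)) == 1


def degenerate_fraction(lines):
    """Fraction of lines that are blank or a single repeated character."""
    lines = list(lines)
    if not lines:
        return 0.0
    return sum(is_degenerate(line) for line in lines) / len(lines)


hypotheses = ["s" * 33, "", "This is a normal sentence."]
print(f"{degenerate_fraction(hypotheses):.2f} of lines look degenerate")
```

A fraction close to 1 on the test set would be consistent with a BLEU score of 0 and points at the model or its vocabulary rather than at the evaluation step.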