
vasistalodagala / whisper-finetune


Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.

License: MIT License

Python 100.00%
asr jax pytorch speech-recognition transformers whisper


whisper-finetune's Issues

how to add a new language?

Dear All,

I would like to recognize Taiwanese Hakka speech using fine-tuned Whisper. However, Hakka is not supported by WhisperTokenizer. Any idea?

Here is my code and log:

ngpu=10  # number of GPUs to perform distributed training on.

torchrun --nproc_per_node=${ngpu} train/fine-tune_on_custom_dataset.py \
--model_name vasista22/whisper-telugu-base \
--language hakka \
--sampling_rate 16000 \
--num_proc 4 \
--train_strategy epoch \
--learning_rate 3e-3 \
--warmup 1000 \
--train_batchsize 16 \
--eval_batchsize 8 \
--num_epochs 20 \
--resume_from_ckpt None \
--output_dir op_dir_epoch \
--train_datasets output_data/train  \
--eval_datasets output_data/dev output_data/test


ValueError: Unsupported language: hakka. Language should be one of: ['english', 'chinese', 'german', 'spanish', 'russian', 'korean', 'french', 'japanese', 'portuguese', 'turkish', 'polish', 'catalan', 'dutch', 'arabic', 'swedish', 'italian', 'indonesian', 'hindi', 'finnish', 'vietnamese', 'hebrew', 'ukrainian', 'greek', 'malay', 'czech', 'romanian', 'danish', 'hungarian', 'tamil', 'norwegian', 'thai', 'urdu', 'croatian', 'bulgarian', 'lithuanian', 'latin', 'maori', 'malayalam', 'welsh', 'slovak', 'telugu', 'persian', 'latvian', 'bengali', 'serbian', 'azerbaijani', 'slovenian', 'kannada', 'estonian', 'macedonian', 'breton', 'basque', 'icelandic', 'armenian', 'nepali', 'mongolian', 'bosnian', 'kazakh', 'albanian', 'swahili', 'galician', 'marathi', 'punjabi', 'sinhala', 'khmer', 'shona', 'yoruba', 'somali', 'afrikaans', 'occitan', 'georgian', 'belarusian', 'tajik', 'sindhi', 'gujarati', 'amharic', 'yiddish', 'lao', 'uzbek', 'faroese', 'haitian creole', 'pashto', 'turkmen', 'nynorsk', 'maltese', 'sanskrit', 'luxembourgish', 'myanmar', 'tibetan', 'tagalog', 'malagasy', 'assamese', 'tatar', 'hawaiian', 'lingala', 'hausa', 'bashkir', 'javanese', 'sundanese', 'burmese', 'valencian', 'flemish', 'haitian', 'letzeburgesch', 'pushto', 'panjabi', 'moldavian', 'moldovan', 'sinhalese', 'castilian'].
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/datasets/utils/py_utils.py", line 1353, in _write_generator_to_queue
    for i, result in enumerate(func(**kwargs)):
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 3358, in _map_single
    example = apply_function_on_filtered_inputs(example, i, offset=offset)
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 3261, in apply_function_on_filtered_inputs
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "/usr1/liao/whisper-hakka/train/fine-tune_on_custom_dataset.py", line 198, in prepare_dataset
    batch["labels"] = processor.tokenizer(transcription).input_ids
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2538, in __call__
    encodings = self._call_one(text=text, text_pair=text_pair, **all_kwargs)
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2644, in _call_one
    return self.encode_plus(
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2717, in encode_plus
    return self._encode_plus(
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/transformers/tokenization_utils.py", line 652, in _encode_plus
    return self.prepare_for_model(
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 3156, in prepare_for_model
    total_len = len_ids + len_pair_ids + (self.num_special_tokens_to_add(pair=pair) if add_special_tokens else 0)
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/transformers/tokenization_utils.py", line 479, in num_special_tokens_to_add
    return len(self.build_inputs_with_special_tokens(token_ids_0, token_ids_1 if pair else None))
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/transformers/models/whisper/tokenization_whisper.py", line 428, in build_inputs_with_special_tokens
    return self.prefix_tokens + token_ids_0 + [self.eos_token_id]
  File "/home/liao/anaconda3/envs/pytorch/lib/python3.9/site-packages/transformers/models/whisper/tokenization_whisper.py", line 406, in prefix_tokens
    raise ValueError(
ValueError: Unsupported language: hakka. Language should be one of: ['english', 'chinese', 'german', 'spanish', 'russian', 'korean', 'french', 'japanese', 'portuguese', 'turkish', 'polish', 'catalan', 'dutch', 'arabic', 'swedish', 'italian', 'indonesian', 'hindi', 'finnish', 'vietnamese', 'hebrew', 'ukrainian', 'greek', 'malay', 'czech', 'romanian', 'danish', 'hungarian', 'tamil', 'norwegian', 'thai', 'urdu', 'croatian', 'bulgarian', 'lithuanian', 'latin', 'maori', 'malayalam', 'welsh', 'slovak', 'telugu', 'persian', 'latvian', 'bengali', 'serbian', 'azerbaijani', 'slovenian', 'kannada', 'estonian', 'macedonian', 'breton', 'basque', 'icelandic', 'armenian', 'nepali', 'mongolian', 'bosnian', 'kazakh', 'albanian', 'swahili', 'galician', 'marathi', 'punjabi', 'sinhala', 'khmer', 'shona', 'yoruba', 'somali', 'afrikaans', 'occitan', 'georgian', 'belarusian', 'tajik', 'sindhi', 'gujarati', 'amharic', 'yiddish', 'lao', 'uzbek', 'faroese', 'haitian creole', 'pashto', 'turkmen', 'nynorsk', 'maltese', 'sanskrit', 'luxembourgish', 'myanmar', 'tibetan', 'tagalog', 'malagasy', 'assamese', 'tatar', 'hawaiian', 'lingala', 'hausa', 'bashkir', 'javanese', 'sundanese', 'burmese', 'valencian', 'flemish', 'haitian', 'letzeburgesch', 'pushto', 'panjabi', 'moldavian', 'moldovan', 'sinhalese', 'castilian'].
"""


Fine-tuning the Whisper model on a custom dataset -- trainer error.

I am running the fine-tuning script on a custom dataset. In the trainer initialization,

trainer = Seq2SeqTrainer(
    training_args,
    model=model,
    train_dataset=raw_dataset["train"],
    eval_dataset=raw_dataset["eval"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=processor.feature_extractor,
)

I am getting a TypeError: __init__() got multiple values for argument 'model'.
Can anyone help me with this?
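A likely cause (a sketch of the fix, not an official answer from the repo): Seq2SeqTrainer's first positional parameter is model, so passing training_args positionally while also passing model= as a keyword gives Python two values for model. Passing everything by keyword avoids the error:

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=raw_dataset["train"],
    eval_dataset=raw_dataset["eval"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=processor.feature_extractor,
)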

Information On Batch Size And Learning Rate

The Discord link in the README does not work for me.

Do you have any information on what batch size or learning rate to use? I could only find the maximum learning rate that was used in the paper. Experimentally, I found that too small a batch size seems to cause issues.

What batch size and learning rate do you recommend and why?
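Not an official recommendation, but for reference, a hedged sketch of keeping a large effective batch size on limited hardware through gradient accumulation (the concrete values are assumptions):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="op_dir",                  # hypothetical output directory
    per_device_train_batch_size=8,        # micro-batch that fits in memory
    gradient_accumulation_steps=4,        # effective batch size = 8 * 4 per GPU
    learning_rate=1e-5,                   # assumption: a small fraction of the pre-training peak LR
    warmup_steps=500,
)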

dataset with segment metadata

Hi,
Thank you for this code.

If my dataset contains long WAV files with segment metadata, how can I prepare it?
For example:
wav_1 path_wave

seg_1 wav_1 beginning_segment end_segment
seg_1 wav_1 1.2 3.2
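A minimal sketch of one way to prepare such data (assumptions: soundfile is installed, the segment table has the columns shown above, and the column names below are hypothetical):

import soundfile as sf
from datasets import Dataset, Audio

# (segment_id, wav_path, start_sec, end_sec) rows parsed from the metadata file
segments = [("seg_1", "path_wave", 1.2, 3.2)]

rows = []
for seg_id, wav_path, start, end in segments:
    audio, sr = sf.read(wav_path)                            # load the long recording
    rows.append({
        "audio": {"array": audio[int(start * sr):int(end * sr)], "sampling_rate": sr},
        "sentence": "",                                       # transcription of this segment
    })

ds = Dataset.from_list(rows).cast_column("audio", Audio(sampling_rate=16000))
ds.save_to_disk("output_data/train")                          # directory layout is an assumption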

pytorch_model.bin not getting saved

While fine-tuning the vasista22/whisper-kannada-small model on a customized dataset, after training all the other JSON and bin files are saved in the output directory except pytorch_model.bin, and when saving it says some keys were missing while saving the model ([proj_out.weight]). Why that happens I have no clue; I actually run the whole thing on Google Colab.

import torch
import evaluate
from dataclasses import dataclass
from typing import Any, Dict, List, Union
from datasets import DatasetDict, Audio, load_from_disk, concatenate_datasets
from transformers.models.whisper.english_normalizer import BasicTextNormalizer
from transformers import (
    WhisperFeatureExtractor,
    WhisperTokenizer,
    WhisperProcessor,
    WhisperForConditionalGeneration,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

model_name = 'vasista22/whisper-kannada-small'
language = 'Kannada'
sampling_rate = 16000
num_proc = 1
train_strategy = 'steps'
learning_rate = 1.75e-5*0.1
warmup = 20
train_batchsize = 16
eval_batchsize = 8

num_epochs = 20

num_steps = 50
resume_from_ckpt = None
output_dir = 'model_1'
train_datasets = ['/content/drive/MyDrive/Children/prepared_data']
eval_datasets = ['/content/drive/MyDrive/Children/prepared_data']

gradient_checkpointing = True
freeze_feature_encoder = False
freeze_encoder = False
do_normalize_eval = True
do_lower_case = False
do_remove_punctuation = False
normalizer = BasicTextNormalizer()

feature_extractor = WhisperFeatureExtractor.from_pretrained(model_name)
tokenizer = WhisperTokenizer.from_pretrained(model_name, language=language, task="transcribe")
processor = WhisperProcessor.from_pretrained(model_name, language=language, task="transcribe")
model = WhisperForConditionalGeneration.from_pretrained(model_name)

if model.config.decoder_start_token_id is None:
    raise ValueError("Make sure that config.decoder_start_token_id is correctly defined")

if freeze_feature_encoder:
    model.freeze_feature_encoder()

if freeze_encoder:
    model.freeze_encoder()
    model.model.encoder.gradient_checkpointing = False

model.config.forced_decoder_ids = None
model.config.suppress_tokens = []

if gradient_checkpointing:
    model.config.use_cache = False

def load_custom_dataset(split):
    ds = []
    if split == 'train':
        for dset in train_datasets:
            ds.append(load_from_disk(dset))
    if split == 'eval':
        for dset in eval_datasets:
            ds.append(load_from_disk(dset))

    ds_to_return = concatenate_datasets(ds)
    ds_to_return = ds_to_return.shuffle(seed=22)
    return ds_to_return

def prepare_dataset(batch):
    # load and (possibly) resample audio data to 16kHz
    audio = batch["audio"]

    # compute log-Mel input features from input audio array
    batch["input_features"] = processor.feature_extractor(audio["array"], sampling_rate=audio["sampling_rate"]).input_features[0]
    # compute input length of audio sample in seconds
    batch["input_length"] = len(audio["array"]) / audio["sampling_rate"]

    # optional pre-processing steps
    transcription = batch["sentence"]
    if do_lower_case:
        transcription = transcription.lower()
    if do_remove_punctuation:
        transcription = normalizer(transcription).strip()

    # encode target text to label ids
    batch["labels"] = processor.tokenizer(transcription).input_ids
    return batch

max_label_length = model.config.max_length
min_input_length = 0.0
max_input_length = 30.0
def is_in_length_range(length, labels):
    return min_input_length < length < max_input_length and 0 < len(labels) < max_label_length

print('DATASET PREPARATION IN PROGRESS...')
raw_dataset = DatasetDict()
raw_dataset["train"] = load_custom_dataset('train')
raw_dataset["eval"] = load_custom_dataset('eval')

raw_dataset = raw_dataset.cast_column("audio", Audio(sampling_rate=sampling_rate))
raw_dataset = raw_dataset.map(prepare_dataset, num_proc=num_proc)

raw_dataset = raw_dataset.filter(
    is_in_length_range,
    input_columns=["input_length", "labels"],
    num_proc=num_proc,
)

@dataclass
class DataCollatorSpeechSeq2SeqWithPadding:
    processor: Any

    def __call__(self, features: List[Dict[str, Union[List[int], torch.Tensor]]]) -> Dict[str, torch.Tensor]:
        # split inputs and labels since they have to be of different lengths and need different padding methods
        # first treat the audio inputs by simply returning torch tensors
        input_features = [{"input_features": feature["input_features"]} for feature in features]
        batch = self.processor.feature_extractor.pad(input_features, return_tensors="pt")

        # get the tokenized label sequences
        label_features = [{"input_ids": feature["labels"]} for feature in features]
        # pad the labels to max length
        labels_batch = self.processor.tokenizer.pad(label_features, return_tensors="pt")

        # replace padding with -100 to ignore loss correctly
        labels = labels_batch["input_ids"].masked_fill(labels_batch.attention_mask.ne(1), -100)

        # if bos token is appended in previous tokenization step,
        # cut bos token here as it's appended later anyways
        if (labels[:, 0] == self.processor.tokenizer.bos_token_id).all().cpu().item():
            labels = labels[:, 1:]

        batch["labels"] = labels

        return batch

data_collator = DataCollatorSpeechSeq2SeqWithPadding(processor=processor)
print('DATASET PREPARATION COMPLETED')

metric = evaluate.load("wer")
def compute_metrics(pred):
    pred_ids = pred.predictions
    label_ids = pred.label_ids

    # replace -100 with the pad_token_id
    label_ids[label_ids == -100] = processor.tokenizer.pad_token_id

    # we do not want to group tokens when computing the metrics
    pred_str = processor.tokenizer.batch_decode(pred_ids, skip_special_tokens=True)
    label_str = processor.tokenizer.batch_decode(label_ids, skip_special_tokens=True)

    if do_normalize_eval:
        pred_str = [normalizer(pred) for pred in pred_str]
        label_str = [normalizer(label) for label in label_str]

    wer = 100 * metric.compute(predictions=pred_str, references=label_str)
    return {"wer": wer}

############################### TRAINING ARGS AND TRAINING ############################

if train_strategy == 'epoch':
    training_args = Seq2SeqTrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=train_batchsize,
        gradient_accumulation_steps=1,
        learning_rate=learning_rate,
        warmup_steps=warmup,
        gradient_checkpointing=gradient_checkpointing,
        fp16=True,
        evaluation_strategy="epoch",
        save_strategy="epoch",
        num_train_epochs=num_epochs,
        save_total_limit=10,
        per_device_eval_batch_size=eval_batchsize,
        predict_with_generate=True,
        generation_max_length=225,
        logging_steps=500,
        report_to=["tensorboard"],
        load_best_model_at_end=True,
        metric_for_best_model="wer",
        greater_is_better=False,
        optim="adamw_bnb_8bit",
        resume_from_checkpoint=resume_from_ckpt,
    )

elif train_strategy == 'steps':
    training_args = Seq2SeqTrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=train_batchsize,
        gradient_accumulation_steps=1,
        learning_rate=learning_rate,
        warmup_steps=warmup,
        gradient_checkpointing=gradient_checkpointing,
        fp16=True,
        evaluation_strategy="steps",
        eval_steps=50,
        save_strategy="steps",
        save_steps=50,
        max_steps=num_steps,
        save_total_limit=10,
        per_device_eval_batch_size=eval_batchsize,
        predict_with_generate=True,
        generation_max_length=225,
        logging_steps=500,
        report_to=["tensorboard"],
        load_best_model_at_end=True,
        metric_for_best_model="wer",
        greater_is_better=False,
        optim="adamw_bnb_8bit",
        resume_from_checkpoint=resume_from_ckpt,
    )

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=raw_dataset["train"],
    eval_dataset=raw_dataset["eval"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=processor.feature_extractor,
)

processor.save_pretrained(output_dir)
model.save_pretrained(output_dir)
print('TRAINING IN PROGRESS...')
trainer.train()
print('DONE TRAINING')

This was the code, please help
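Not a confirmed answer, but two likely explanations worth checking: recent transformers versions save weights as model.safetensors by default, so pytorch_model.bin does not appear unless safetensors serialization is disabled, and the proj_out.weight message is usually harmless because that projection is tied to the decoder embedding and is re-created on load. A hedged sketch of forcing the legacy file name:

# after trainer.train() has finished
trainer.save_model(output_dir)                                 # default (safetensors) format
model.save_pretrained(output_dir, safe_serialization=False)    # writes pytorch_model.bin explicitly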

Fine-tuning is not progressing; it just stops, and the remaining time shows as a question mark.

ARGUMENTS OF INTEREST:
{'model_name': 'openai/whisper-large-v3', 'language': 'hungarian', 'sampling_rate': 16000, 'num_proc': 2, 'train_strategy': 'epoch', 'learning_rate': 0.003, 'warmup': 1000, 'train_batchsize': 16, 'eval_batchsize': 8, 'num_epochs': 20, 'num_steps': 100000, 'resume_from_ckpt': 'None', 'output_dir': 'models', 'train_datasets': ['/data/train'], 'eval_datasets': ['/data/dev']}

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
DATASET PREPARATION IN PROGRESS...
Loading cached shuffled indices for dataset at /tmp/tmpiriwoix_/data//train/cache-b8d5b2a7daea0d9d.arrow
Loading cached shuffled indices for dataset at /tmp/tmpiriwoix_/data/dev/cache-fedae2d849cf9c83.arrow
Map (num_proc=2): 0%| | 0/1000 [00:00<?, ? examples/s]
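A hedged guess rather than a confirmed cause: datasets .map() with num_proc > 1 can hang in some environments when run inside torchrun. One way to rule that out is to prepare and cache the dataset once in a single process before launching distributed training (paths below are hypothetical):

from datasets import load_from_disk

ds = load_from_disk("/data/train")                 # path taken from the arguments above
ds = ds.map(lambda batch: batch, num_proc=1)       # stand-in for the real prepare_dataset function
ds.save_to_disk("/data/train_prepared")            # cached copy to train from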

Error while evaluating on hf dataset

I'm getting this error while evaluating on an HF dataset:
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/datasets/load.py:2554: FutureWarning: 'use_auth_token' was deprecated in favor of 'token' in version 2.14.0 and will be removed in 3.0.0.
You can remove this warning by passing 'token=<use_auth_token>' instead.
warnings.warn(
Decode Progress: 0it [00:01, ?it/s]
Traceback (most recent call last):
  File "/Users/prox/PycharmProjects/liveWhisper/testing.py", line 227, in <module>
    main(args)
  File "/Users/prox/PycharmProjects/liveWhisper/testing.py", line 102, in main
    for out in tqdm(whisper_asr(data(dataset), batch_size=args.batch_size), desc='Decode Progress'):
  File "/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
    item = next(self.iterator)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/transformers/pipelines/pt_utils.py", line 269, in __next__
    processed = self.infer(next(self.iterator), **self.params)
                           ^^^^^^^^^^^^^^^^^^^
  File "/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 675, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 42, in fetch
    return self.collate_fn(data)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/transformers/pipelines/base.py", line 194, in inner
    padded[key] = _pad(items, key, _padding_value, padding_side)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/transformers/pipelines/base.py", line 100, in _pad
    max_length = max(item[key].shape[1] for item in items)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/prox/PycharmProjects/liveWhisper/venv/lib/python3.11/site-packages/transformers/pipelines/base.py", line 100, in <genexpr>
    max_length = max(item[key].shape[1] for item in items)
                     ~~~~~~~~~~~~~~~^^^
IndexError: tuple index out of range

missing arguments

On line 10 of data_prep.py, the .txt extensions are missing from the file names, which causes file errors. Please update it to:
scp_entries = open(f"{args.source_data_dir}/audio_paths.txt", 'r').readlines()
txt_entries = open(f"{args.source_data_dir}/text.txt", 'r').readlines()
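For anyone hitting this, a hedged sketch (not code from the repo; the fallback helper open_first is hypothetical) that accepts either the extension-less names or the .txt variants:

import os

def open_first(*candidates):
    # return a handle for the first candidate path that actually exists
    for path in candidates:
        if os.path.exists(path):
            return open(path, "r")
    raise FileNotFoundError(candidates)

scp_entries = open_first(f"{args.source_data_dir}/audio_paths",
                         f"{args.source_data_dir}/audio_paths.txt").readlines()
txt_entries = open_first(f"{args.source_data_dir}/text",
                         f"{args.source_data_dir}/text.txt").readlines()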

Create LICENSE

Please add a license to this repo. Might I suggest an MIT license?

Insufficient VRAM

While trying to fine-tune the openai/whisper-medium model on the google/fleurs dataset, even when using only one language (Greek), I very quickly run out of VRAM on a GPU with 20 GB of VRAM.

Is there some way to reduce the VRAM consumption?
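Some memory savers that are commonly combined for Whisper fine-tuning; a hedged sketch, with values chosen as illustrations rather than recommendations (adamw_bnb_8bit additionally requires the bitsandbytes package):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="op_dir",                 # hypothetical output directory
    per_device_train_batch_size=4,       # smaller micro-batch
    gradient_accumulation_steps=8,       # keeps the effective batch size at 32
    gradient_checkpointing=True,         # trades compute for activation memory
    fp16=True,                           # half-precision training
    optim="adamw_bnb_8bit",              # 8-bit optimizer states
)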

fine-tuning does not seem to improve/converge

I managed to start the fine-tuning process with my own labelled English speech data, i.e. using
fine-tune_on_custom_dataset.py
(see the output below).

However, the process does not seem to converge and eval_wer stays stuck at a very high level.
Any idea what may be going wrong?
I am using the 'standard' parameters from the example code.
Question regarding the audio files: I assume that 16 kHz WAV files (16-bit integer samples) are expected (i.e. with a WAV header,
not headerless PCM in any particular byte order), right?

Thanks for any hint!
Kind regards

{'loss': 1.2395, 'learning_rate': 0.001488, 'epoch': 0.13}
{'loss': 1.8445, 'learning_rate': 0.002988, 'epoch': 0.27}
{'loss': 1.8692, 'learning_rate': 0.002979891891891892, 'epoch': 0.4}
{'loss': 1.8025, 'learning_rate': 0.0029596621621621622, 'epoch': 0.53}
{'loss': 1.7203, 'learning_rate': 0.002939391891891892, 'epoch': 0.67}
{'loss': 1.5855, 'learning_rate': 0.0029191621621621625, 'epoch': 0.8}
{'loss': 1.5751, 'learning_rate': 0.002900716216216216, 'epoch': 0.93}
{'eval_loss': nan, 'eval_wer': 100.0, 'eval_runtime': 22.4018, 'eval_samples_per_second': 2.232, 'eval_steps_per_second': 0.312, 'epoch': 1.0}
{'loss': 2.4114, 'learning_rate': 0.002896135135135135, 'epoch': 1.07}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 1.2}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 1.33}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 1.47}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 1.6}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 1.73}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 1.87}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 2.0}
{'eval_loss': nan, 'eval_wer': 100.0, 'eval_runtime': 21.6604, 'eval_samples_per_second': 2.308, 'eval_steps_per_second': 0.323, 'epoch': 2.0}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 2.13}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 2.27}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 2.4}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 2.53}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 2.67}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 2.8}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 2.93}
{'eval_loss': nan, 'eval_wer': 100.0, 'eval_runtime': 21.5522, 'eval_samples_per_second': 2.32, 'eval_steps_per_second': 0.325, 'epoch': 3.0}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 3.07}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 3.2}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 3.33}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 3.47}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 3.6}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 3.73}
{'loss': 0.0, 'learning_rate': 0.002896135135135135, 'epoch': 3.87}
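A hedged observation rather than a diagnosis: a training loss that collapses to exactly 0.0 together with eval_loss = nan usually points to divergence or fp16 overflow, and the ~3e-3 peak learning rate visible in the log is very high for fine-tuning. A more conservative configuration to try (the values are assumptions):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="op_dir_epoch",           # as in the example command earlier in this page
    learning_rate=1e-5,                  # assumption: far below the 3e-3 shown in the log
    warmup_steps=500,
    per_device_train_batch_size=16,
    fp16=True,
)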
