Comments (3)
The above table was produced with the --language "en" flag set in the short form bash scripts. Removing this flag and rerunning the evaluation gives lower eval/wer values.
E.g.:
model | eval/wer with --language "en" | eval/wer without option --language | HF model card WER |
---|---|---|---|
OpenAI Large-v2 | 3.1683 | 2.5685 | 3.0004 |
OpenAI Small | 4.0682 | 3.44541 | 3.4322 |
Without the --language flag:
- The Large-v2 model eval/wer is lower than the HuggingFace model card WER value, and lower than the original OpenAI paper result of 2.7% in Table 2.
- The Small model eval/wer is similar to the HuggingFace model card WER value.
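To make the comparison concrete, here is a minimal Python sketch that computes the differences from the numbers in the table above. The WER values are copied from the table; the dictionary keys and the `gaps` helper are hypothetical names used only for illustration, not part of the distil-whisper scripts.

```python
# WER values (%) copied from the table above.
# The key names are illustrative, not taken from the evaluation scripts.
results = {
    "OpenAI Large-v2": {
        "with_language": 3.1683,
        "without_language": 2.5685,
        "model_card": 3.0004,
    },
    "OpenAI Small": {
        "with_language": 4.0682,
        "without_language": 3.44541,
        "model_card": 3.4322,
    },
}


def gaps(row):
    """Return (flag_effect, card_gap) in absolute WER points.

    flag_effect: how much eval/wer drops when --language is removed.
    card_gap: eval/wer without --language minus the model card WER
              (negative means the rerun is lower than the model card).
    """
    flag_effect = row["with_language"] - row["without_language"]
    card_gap = row["without_language"] - row["model_card"]
    return round(flag_effect, 4), round(card_gap, 4)


for model, row in results.items():
    flag_effect, card_gap = gaps(row)
    print(f"{model}: flag effect {flag_effect:+.4f}, gap to model card {card_gap:+.4f}")
```

On these numbers, removing the flag lowers eval/wer by roughly 0.6 points for both models. The Small result then lands within about 0.01 points of its model card value, while the Large-v2 result sits about 0.43 points below its model card value.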
from distil-whisper.
Added Tiny model script and result here: https://github.com/guynich/distil-whisper/tree/main/training/scripts#summary.
I'm closing this issue: for the Small and Tiny models, the HF model card WER and the eval/wer without the --language option are aligned sufficiently for me.
(I don't understand the discrepancy in values for Large-v2, but I can leave that issue.)