Comments (3)
Hey @9throok, cool to see that you're using Distil-Whisper in combination with Faster-Whisper! I believe the .transcribe method in Faster-Whisper handles the long-form generation algorithm: https://github.com/guillaumekln/faster-whisper#usage Is this the API that you've been using? If you could share a reproducible code snippet that showcases the behaviour you're seeing, that would be great, thanks!
from distil-whisper.
@9throok, any update on the issue that you mentioned?
Hi, I have been working with faster-whisper and trying to use the distil-whisper model. However, distil-whisper supports 30-second audio chunks, and using it with faster-whisper only outputs the first 30 seconds.
I had the same issue: after the first chunk there was nothing in the output. I then looked at the debug output, and the distil model just hallucinated non-stop after the first chunk. The solution is to disable the context prompt; an initial prompt has a negative effect too.
How can it be used with the faster-whisper implementation?
Now it has official support -> SYSTRAN/faster-whisper@ad3c830
Or you can use the standalone executable -> https://github.com/Purfview/whisper-standalone-win
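For anyone landing here, the workaround described above (no context prompt, no initial prompt) could look roughly like the sketch below. It assumes that "context prompt" corresponds to the condition_on_previous_text argument of WhisperModel.transcribe in faster-whisper, and that the model name "distil-large-v2" resolves in your installed version; the helper names distil_transcribe_kwargs and transcribe_long_form are made up for illustration and are not part of either library.

```python
from typing import Any, Dict, List, Tuple


def distil_transcribe_kwargs(language: str = "en", beam_size: int = 5) -> Dict[str, Any]:
    """Build transcribe() options for a distil model (hypothetical helper)."""
    return {
        "language": language,
        "beam_size": beam_size,
        # Assumption: this is the "context prompt" mentioned in the thread.
        # Turning it off stops each window from being conditioned on the
        # text decoded for the previous window.
        "condition_on_previous_text": False,
        # No initial prompt, since it was reported to hurt as well.
        "initial_prompt": None,
    }


def transcribe_long_form(
    audio_path: str, model_name: str = "distil-large-v2"
) -> List[Tuple[float, float, str]]:
    """Transcribe a long file and return (start, end, text) tuples."""
    # Imported lazily so the options helper works without faster-whisper.
    from faster_whisper import WhisperModel

    # device and compute_type are example values; adjust for your hardware.
    model = WhisperModel(model_name, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, **distil_transcribe_kwargs())
    # segments is a generator; iterating it is what runs the decoding.
    return [(seg.start, seg.end, seg.text) for seg in segments]
```

Calling transcribe_long_form("audio.mp3") should then return segments covering the whole file rather than only the first 30 seconds, provided the assumptions above hold for your setup.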