Comments (2)
@agandhigoto I am not a Whisper expert, but after exploring the codebase a bit, this is what I found:
- The error is raised because of `forced_decoder_ids` in the distil-whisper model config. It did not fail with openai-whisper because that config does not set it by default. I opened a PR to fix it; until it gets merged, you can use this code as a workaround:
result = model.generate(**inputs, condition_on_prev_tokens=False, temperature=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0), logprob_threshold=-1.0, compression_ratio_threshold=1.35, forced_decoder_ids=None, return_timestamps=True)
- Whisper longform in pipeline cannot do batched generation right now. You can still pass multiple samples and keep `batch_size=1` (the default), in which case the inputs are processed one by one, sequentially. To use longform Whisper generation with a batch size greater than 1, you can instantiate a `WhisperForConditionalGeneration` model directly:
from transformers import WhisperForConditionalGeneration, AutoProcessor
processor = AutoProcessor.from_pretrained("distil-whisper/distil-large-v2")
model = WhisperForConditionalGeneration.from_pretrained("distil-whisper/distil-large-v2").to("cuda:0")
# batch_of_long_audios is not defined here; it is assumed to be a list of 1-D audio arrays sampled at 16 kHz.
# truncation=False keeps audio longer than 30 seconds intact so that longform generation is used.
inputs = processor(batch_of_long_audios, return_tensors="pt", truncation=False, padding=True, return_attention_mask=True, sampling_rate=16_000)
inputs = inputs.to("cuda:0")
result = model.generate(**inputs, return_timestamps=True)
decoded = processor.batch_decode(result, skip_special_tokens=True)
print(decoded)
- The statement you cited compares speed-ups between `batch_size=1` and larger batch sizes. As stated in the second point above, a batch size greater than 1 is not possible for pipelines, so please use the code above for batched generation. I measured the time on a toy sample of 50 audio clips, each 40-50 seconds long, with `openai/whisper-large-v2` and confirmed that higher batch sizes give a speed-up.
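For anyone who wants to repeat that kind of comparison, here is a minimal timing sketch. It is not the script I used for the measurement; `chunk`, `time_batched`, and `transcribe_batch` are hypothetical names introduced only for illustration. `transcribe_batch` is assumed to be a function you write yourself that wraps the processor call, `model.generate`, and `batch_decode` for one batch of audios.

```python
import time
from typing import Callable, Dict, List, Sequence


def chunk(items: Sequence, batch_size: int) -> List[Sequence]:
    """Split items into consecutive batches of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def time_batched(
    transcribe_batch: Callable[[Sequence], List[str]],
    audios: Sequence,
    batch_sizes: Sequence[int],
) -> Dict[int, float]:
    """Return the wall-clock seconds needed to process all audios, per batch size.

    transcribe_batch is a user-supplied (hypothetical) callable that takes one
    batch of audios and returns their transcriptions.
    """
    timings = {}
    for batch_size in batch_sizes:
        start = time.perf_counter()
        for batch in chunk(audios, batch_size):
            transcribe_batch(batch)
        timings[batch_size] = time.perf_counter() - start
    return timings
```

Calling `time_batched(transcribe_batch, batch_of_long_audios, [1, 4, 8])` would then return a dict mapping each batch size to its elapsed time. When running on GPU, it is probably worth calling `torch.cuda.synchronize()` at the end of `transcribe_batch`, since CUDA kernels run asynchronously and the clock could otherwise be read too early.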
Hope this helps to understand how to use batched longform generation in Whisper 🤗
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.