
Comments (9)

Gldkslfmsd commented on August 22, 2024

Yes. Option:

    --model_dir MODEL_DIR
        Dir where Whisper model.bin and other files are saved. This option overrides --model and --model_cache_dir parameters.

from whisper_streaming.

yilmazay74 commented on August 22, 2024

It looks like the model_dir parameter hasn't been implemented yet.
On line 58 of whisper_online.py it says this:

    def load_model(self, modelsize=None, cache_dir=None, model_dir=None):
        if model_dir is not None:
            print("ignoring model_dir, not implemented", file=sys.stderr)
        return whisper.load_model(modelsize, download_root=cache_dir)



yilmazay74 commented on August 22, 2024

May I ask whether there will be any solution for this issue in the near future?


Gldkslfmsd commented on August 22, 2024

It's implemented in FasterWhisperASR, but not in WhisperTimestampedASR. I recommend using FasterWhisperASR with faster-whisper backend because it is much faster.

Do you really need the whisper_timestamped backend, e.g. due to installation requirements? If yes, you can implement and test the option so that it passes model_dir correctly. I'm busy now and not planning to do it.
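As a sketch of what "passing model_dir correctly" could look like for the whisper_timestamped backend: whisper.load_model() accepts either a model name or a path to a checkpoint file, so the directory can be mapped to its checkpoint before the call. The helper name is hypothetical, and the assumption that the checkpoint inside model_dir is named model.bin comes from the option's help text:

```python
import os

def resolve_model_source(modelsize=None, cache_dir=None, model_dir=None):
    """Decide what to hand to whisper.load_model().

    If model_dir is given it overrides modelsize and cache_dir, mirroring
    the documented behaviour of --model_dir. We point load_model() at the
    checkpoint inside model_dir (assumed to be model.bin).
    """
    if model_dir is not None:
        return os.path.join(model_dir, "model.bin"), None
    return modelsize, cache_dir

# Inside WhisperTimestampedASR.load_model one could then write:
#   name_or_path, download_root = resolve_model_source(modelsize, cache_dir, model_dir)
#   return whisper.load_model(name_or_path, download_root=download_root)
```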


yilmazay74 commented on August 22, 2024

Well, since I am working on CPU, I thought I needed to use the whisper_timestamped backend.
Actually, with some tweaks to the code, I made it work with whisper_timestamped as well.
However, the accuracy results are not so satisfying.
When we test our fine-tuned model with normal whisper, it performs a lot better than the base whisper model.
So we were expecting similarly better results with whisper_streaming as well.
I will try faster-whisper and see if it makes any difference in accuracy.
Thanks anyway for the guidance.


Gldkslfmsd commented on August 22, 2024

faster-whisper also works on CPU.
You can also try a longer MinChunkSize. The longer it is, the higher the quality, but also the higher the latency. E.g. a MinChunkSize of 5 or 10 seconds should be very close to offline mode.
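For example, reusing the flags from the command later in this thread, a longer chunk would look like this (a sketch; paths and language are from the thread):

```shell
# Near-offline quality at the cost of ~10 s latency.
python whisper_online_server.py --lang tr --task transcribe \
    --backend faster-whisper --min-chunk-size 10
```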


yilmazay74 commented on August 22, 2024

Thanks a lot for your advice, that is definitely very useful.
Although I made it work with whisper_timestamped mode, it is not very promising.
However, I am also going to try longer chunk sizes as you suggested.
But mainly I want to focus on faster-whisper, since it is designed to work with fine-tuned models.
In faster-whisper mode, it works with the pretrained whisper base model.
Unfortunately I could not make faster-whisper work with my own fine-tuned model. It gives the following error:

    E:\SpeechTeam\Whispers\whisper_streaming\venv2\Scripts\python.exe E:\SpeechTeam\Whispers\whisper_streaming\whisper_online_server.py --lang tr --task transcribe --min-chunk-size 5 --backend faster-whisper --model_dir E:\DEPLOY\MODELS\Global\WhisperModels\sai-base-v13
    Loading Whisper base model for tr... Loading whisper model from model_dir E:\DEPLOY\MODELS\Global\WhisperModels\sai-base-v13. modelsize and cache_dir parameters are not used.
    Traceback (most recent call last):
      File "E:\SpeechTeam\Whispers\whisper_streaming\whisper_online_server.py", line 48, in <module>
        asr = asr_cls(modelsize=size, lan=language, cache_dir=args.model_cache_dir, model_dir=args.model_dir)
      File "E:\SpeechTeam\Whispers\whisper_streaming\whisper_online.py", line 33, in __init__
        self.model = self.load_model(modelsize, cache_dir, model_dir)
      File "E:\SpeechTeam\Whispers\whisper_streaming\whisper_online.py", line 107, in load_model
        model = WhisperModel(model_size_or_path, device=device, compute_type=compute_type, download_root=cache_dir)
      File "E:\SpeechTeam\Whispers\whisper_streaming\venv2\lib\site-packages\faster_whisper\transcribe.py", line 128, in __init__
        self.model = ctranslate2.models.Whisper(
    RuntimeError: Unsupported model binary version. This executable supports models with binary version v6 or below, but the model has binary version v67324752. This usually means that the model was generated by a later version of CTranslate2. (Forward compatibility is not guaranteed.)

    Process finished with exit code 1
It looks like faster-whisper is somehow bound to older versions of some dependencies.
I would very much appreciate it if you could give me some ideas about how to overcome this problem.
Thanks in advance.


yilmazay74 commented on August 22, 2024

As additional information: I updated the ctranslate2 lib to the latest version, but it didn't make any difference; it still gives the same error message.
As far as I understand, the error message suggests re-doing the fine-tuning of the whisper model with older versions of the dependency libs. But that seems too complicated to me, because downgrading one lib causes other dependent libs to stop working. So I think there must be an easier way.

Just as a simple thought, is it possible to update whisper_streaming so that it can handle newer fine-tuned models?


Gldkslfmsd commented on August 22, 2024

Hi, I have no idea which model versions faster-whisper needs. Maybe try whether your model works with the offline faster-whisper implementation, and if not, consult the faster-whisper authors.
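For reference, this error typically means the directory holds a plain Hugging Face/PyTorch checkpoint rather than a CTranslate2 model, and faster-whisper only loads the latter. A fine-tuned Transformers checkpoint can usually be converted with the converter that ships with the ctranslate2 package; this is a sketch, not from the thread, and the output path and quantization choice are examples:

```shell
pip install transformers ctranslate2

# Convert the fine-tuned Hugging Face checkpoint to CTranslate2 format,
# then point --model_dir at the converted output directory instead.
ct2-transformers-converter \
    --model E:/DEPLOY/MODELS/Global/WhisperModels/sai-base-v13 \
    --output_dir sai-base-v13-ct2 \
    --copy_files tokenizer.json preprocessor_config.json \
    --quantization int8
```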

