
Comments (9)

Gldkslfmsd commented on August 22, 2024

Yes. Option:

    --model_dir MODEL_DIR
        Dir where Whisper model.bin and other files are saved. This option overrides --model and --model_cache_dir parameters.

from whisper_streaming.

yilmazay74 commented on August 22, 2024

It looks like the model_dir parameter hasn't been implemented yet.
On line 58 of whisper_online.py it says this:

    def load_model(self, modelsize=None, cache_dir=None, model_dir=None):
        if model_dir is not None:
            print("ignoring model_dir, not implemented", file=sys.stderr)
        return whisper.load_model(modelsize, download_root=cache_dir)



yilmazay74 commented on August 22, 2024

May I ask whether there will be any solution for this issue in the near future?


Gldkslfmsd commented on August 22, 2024

It's implemented in FasterWhisperASR, but not in WhisperTimestampedASR. I recommend using FasterWhisperASR with faster-whisper backend because it is much faster.

Do you really need the whisper_timestamped backend, e.g. due to installation requirements? If yes, you can implement and test the option so that it passes model_dir correctly. I'm busy now and not planning to do it.
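As a sketch of what "passing model_dir correctly" could look like for the whisper_timestamped backend: whisper.load_model() accepts either a model name or a path to a checkpoint file, so the directory can be mapped to its checkpoint before the call. The helper name is hypothetical, and the assumption that the checkpoint inside model_dir is named model.bin comes from the option's help text:

```python
import os

def resolve_model_source(modelsize=None, cache_dir=None, model_dir=None):
    """Decide what to hand to whisper.load_model().

    If model_dir is given it overrides modelsize and cache_dir, mirroring
    the documented behaviour of --model_dir. We point load_model() at the
    checkpoint inside model_dir (assumed to be model.bin).
    """
    if model_dir is not None:
        return os.path.join(model_dir, "model.bin"), None
    return modelsize, cache_dir

# Inside WhisperTimestampedASR.load_model one could then write:
#   name_or_path, download_root = resolve_model_source(modelsize, cache_dir, model_dir)
#   return whisper.load_model(name_or_path, download_root=download_root)
```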


yilmazay74 commented on August 22, 2024

Well, since I am working on CPU, I thought I needed to use the whisper_timestamped backend.
Actually, with some tweaks to the code, I made it work with whisper_timestamped as well.
However, the accuracy results are not so satisfying.
When we test our fine-tuned model with normal whisper, it performs a lot better than the base whisper model.
So we were expecting similarly better results with whisper_streaming as well.
I will try faster-whisper and see if it makes any difference in accuracy.
Thanks anyway for the guidance.


Gldkslfmsd commented on August 22, 2024

faster-whisper also works on CPU.
You can also try a longer MinChunkSize. The longer it is, the higher the quality, but also the higher the latency. E.g. a MinChunkSize of 5 or 10 seconds should be very close to offline mode.
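For example, reusing the flags from the command later in this thread, a longer chunk would look like this (a sketch; paths and language are from the thread):

```shell
# Near-offline quality at the cost of ~10 s latency.
python whisper_online_server.py --lang tr --task transcribe \
    --backend faster-whisper --min-chunk-size 10
```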


yilmazay74 commented on August 22, 2024

Thanks a lot for your advice, that is definitely very useful.
Although I made it work with whisper_timestamped mode, it is not very promising.
However, I am also going to try longer chunk sizes as you suggested.
But mainly I want to focus on faster-whisper, since it is designed to work with fine-tuned models.
In faster-whisper mode, it works with the pretrained whisper base model.
Unfortunately I could not make faster-whisper work with my own fine-tuned model. It gives the following error:

    E:\SpeechTeam\Whispers\whisper_streaming\venv2\Scripts\python.exe E:\SpeechTeam\Whispers\whisper_streaming\whisper_online_server.py --lang tr --task transcribe --min-chunk-size 5 --backend faster-whisper --model_dir E:\DEPLOY\MODELS\Global\WhisperModels\sai-base-v13
    Loading Whisper base model for tr... Loading whisper model from model_dir E:\DEPLOY\MODELS\Global\WhisperModels\sai-base-v13. modelsize and cache_dir parameters are not used.
    Traceback (most recent call last):
      File "E:\SpeechTeam\Whispers\whisper_streaming\whisper_online_server.py", line 48, in <module>
        asr = asr_cls(modelsize=size, lan=language, cache_dir=args.model_cache_dir, model_dir=args.model_dir)
      File "E:\SpeechTeam\Whispers\whisper_streaming\whisper_online.py", line 33, in __init__
        self.model = self.load_model(modelsize, cache_dir, model_dir)
      File "E:\SpeechTeam\Whispers\whisper_streaming\whisper_online.py", line 107, in load_model
        model = WhisperModel(model_size_or_path, device=device, compute_type=compute_type, download_root=cache_dir)
      File "E:\SpeechTeam\Whispers\whisper_streaming\venv2\lib\site-packages\faster_whisper\transcribe.py", line 128, in __init__
        self.model = ctranslate2.models.Whisper(
    RuntimeError: Unsupported model binary version. This executable supports models with binary version v6 or below, but the model has binary version v67324752. This usually means that the model was generated by a later version of CTranslate2. (Forward compatibility is not guaranteed.)

    Process finished with exit code 1
It looks like faster-whisper is somehow bound to older versions of some dependencies.
I would very much appreciate it if you could give me some ideas about how to overcome this problem.
Thanks in advance.


yilmazay74 commented on August 22, 2024

As additional information: I updated the ctranslate2 lib to the latest version, but it didn't make any difference; it still gives the same error message.
As far as I understand, the error message suggests re-doing the fine-tuning of the whisper model with older versions of the dependency libs. But that seems too complicated to me, because downgrading one lib causes other dependent libs to stop working. So I think there must be an easier way.

Just as a simple thought, is it possible to update whisper_streaming so that it can handle newer fine-tuned models?


Gldkslfmsd commented on August 22, 2024

Hi, I have no idea which model versions faster-whisper needs. Maybe try whether your model works with the offline faster-whisper implementation, and if not, consult the faster-whisper authors.
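For reference, this error typically means the directory holds a plain Hugging Face/PyTorch checkpoint rather than a CTranslate2 model, and faster-whisper only loads the latter. A fine-tuned Transformers checkpoint can usually be converted with the converter that ships with the ctranslate2 package; this is a sketch, not from the thread, and the output path and quantization choice are examples:

```shell
pip install transformers ctranslate2

# Convert the fine-tuned Hugging Face checkpoint to CTranslate2 format,
# then point --model_dir at the converted output directory instead.
ct2-transformers-converter \
    --model E:/DEPLOY/MODELS/Global/WhisperModels/sai-base-v13 \
    --output_dir sai-base-v13-ct2 \
    --copy_files tokenizer.json preprocessor_config.json \
    --quantization int8
```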

