Comments (8)

jonatasgrosman commented on July 16, 2024

I've created a PR to fix this issue by forcing the model to ignore the embedded language model if one exists in the model's repo.

from whisperx.

rhenanbartels commented on July 16, 2024

I've looked at the code in whisperx/transcribe.py and the original exception is not the one that says align_model could not be found.

First, the original exception was:

ImportError:
Wav2Vec2ProcessorWithLM requires the pyctcdecode library but it was not found in your environment. You can install it with pip:
pip install pyctcdecode. Please note that you may need to restart your runtime after installation.

After running pip install pyctcdecode, the exception became:

name 'kenlm' is not defined
Error loading model from huggingface, check https://huggingface.co/models for finetuned wav2vec2.0 models

Running pip install kenlm did the trick (as a temporary solution).

I don't know the cause of the error yet, since the model is available here.


Another potential solution is to use the original source facebook/wav2vec2-large-xlsr-53-portuguese

Note: the same error is thrown with "nl" language.
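The two installs above can be checked before running whisperx. This is only a minimal sketch: the helper name `missing_lm_dependencies` is made up for illustration, and it tests importability only, not whether kenlm was built correctly.

```python
import importlib.util

# Optional packages this thread reports as needed when a model repo
# ships an embedded language model.
OPTIONAL_LM_DEPS = ("pyctcdecode", "kenlm")


def missing_lm_dependencies(names=OPTIONAL_LM_DEPS):
    """Return the names from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_lm_dependencies()
if missing:
    print("Not importable:", ", ".join(missing))
    print("Try: pip install " + " ".join(missing))
else:
    print("pyctcdecode and kenlm are importable")
```

If a package is listed as not importable right after a pip install, the install itself may have failed, so it is worth scrolling back through the pip output.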

vogelcodes commented on July 16, 2024

pip install kenlm failed and I didn't notice it.
I'm on Windows, so I had to download the Visual Studio C++ Build Tools.
I have also installed pyctcdecode, but I still get the same error:

New language found (pt)! Previous was (en), loading new alignment model for new language...
C:\Users\danie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\huggingface_hub\utils_deprecation.py:100: FutureWarning: Deprecated argument(s) used in 'snapshot_download': allow_regex. Will not be supported from version '0.12'.

Please use allow_patterns and ignore_patterns instead.
warnings.warn(message, FutureWarning)
Fetching 4 files: 100%|████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 4032.98it/s]
'charmap' codec can't decode byte 0x81 in position 33: character maps to <undefined>
Error loading model from huggingface, check https://huggingface.co/models for finetuned wav2vec2.0 models
Traceback (most recent call last):
File "C:\Users\danie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\whisperx\transcribe.py", line 428, in load_align_model
processor = AutoProcessor.from_pretrained(model_name)
File "C:\Users\danie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\transformers\models\auto\processing_auto.py", line 259, in from_pretrained
return processor_class.from_pretrained(
File "C:\Users\danie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\transformers\models\wav2vec2_with_lm\processing_wav2vec2_with_lm.py", line 161, in from_pretrained
decoder = BeamSearchDecoderCTC.load_from_hf_hub(
File "C:\Users\danie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pyctcdecode\decoder.py", line 831, in load_from_hf_hub
return cls.load_from_dir(cached_directory)
File "C:\Users\danie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pyctcdecode\decoder.py", line 792, in load_from_dir
alphabet = Alphabet.loads(fi.read())
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2544.0_x64__qbz5n2kfra8p0\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 33: character maps to <undefined>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\danie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\Scripts\whisperx-script.py", line 33, in <module>
sys.exit(load_entry_point('whisperx==1.0', 'console_scripts', 'whisperx')())
File "C:\Users\danie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\whisperx\transcribe.py", line 525, in cli
align_model, align_metadata = load_align_model(result["language"], device)
File "C:\Users\danie\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\whisperx\transcribe.py", line 433, in load_align_model
raise ValueError(f'The chosen align_model "{model_name}" could not be found in huggingface (https://huggingface.co/models) or torchaudio (https://pytorch.org/audio/stable/pipelines.html#id14)')
ValueError: The chosen align_model "jonatasgrosman/wav2vec2-large-xlsr-53-portuguese" could not be found in huggingface (https://huggingface.co/models) or torchaudio (https://pytorch.org/audio/stable/pipelines.html#id14)
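The first traceback ends in cp1252.py, which suggests a UTF-8 file was read with the Windows default encoding. The sketch below reproduces that kind of failure in isolation. The file name and contents are invented for illustration and are not the model's actual alphabet file.

```python
import os
import tempfile

# "Á" (U+00C1) is encoded in UTF-8 as the bytes 0xC3 0x81.
# Byte 0x81 has no mapping in cp1252, the codec named in the traceback.
payload = '{"labels": ["Á"]}'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "alphabet.json")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as fo:
        fo.write(payload)

    # Decoding the UTF-8 bytes as cp1252 raises UnicodeDecodeError.
    try:
        with open(path, encoding="cp1252") as fi:
            fi.read()
        cp1252_failed = False
    except UnicodeDecodeError as err:
        cp1252_failed = True
        print("cp1252 read failed:", err)

    # Passing the encoding explicitly reads the file back unchanged.
    with open(path, encoding="utf-8") as fi:
        text = fi.read()

print("utf-8 read ok:", text == payload)
```

Setting the environment variable PYTHONUTF8=1 (Python's UTF-8 mode) changes the default encoding used by open(). Whether that is enough to get past this particular whisperx error is not confirmed anywhere in this thread.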

DvGils commented on July 16, 2024

I get the same error. It persists after installing pyctcdecode, Visual Studio C++ Build Tools, and kenlm.

Jeronymous commented on July 16, 2024

I have the same issue when I try the Russian wav2vec model https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-russian

Traceback (most recent call last):
  File "/usr/local/bin/whisperx", line 33, in <module>
    sys.exit(load_entry_point('whisperx', 'console_scripts', 'whisperx')())
  File "/xxx/whisperX/whisperx/transcribe.py", line 453, in cli
    align_model, align_metadata = load_align_model(align_language, device, model_name=align_model)
  File "/xxx/whisperX/whisperx/alignment.py", line 62, in load_align_model
    raise ValueError(f'The chosen align_model "{model_name}" could not be found in huggingface (https://huggingface.co/models) or torchaudio (https://pytorch.org/audio/stable/pipelines.html#id14)')
ValueError: The chosen align_model "jonatasgrosman/wav2vec2-large-xlsr-53-russian" could not be found in huggingface (https://huggingface.co/models) or torchaudio (https://pytorch.org/audio/stable/pipelines.html#id14)

m-bain commented on July 16, 2024

These errors come from the Hugging Face model; maybe the author of these models, @jonatasgrosman, can help?

jonatasgrosman commented on July 16, 2024

Hi everyone! I don't know precisely what's happening here, but it may be because whisperx is using Hugging Face's pipelines for the ASR. Some rules in these pipelines force the use of a language model when one is present in the model repository (you can see a language_model folder in the reported repos).

That is why whisperx works for other models that don't have a language_model folder. For models that do have a language_model folder, you'll need to install the pyctcdecode and kenlm deps to make whisperx work.

@m-bain I think you could force the pipeline to ignore the embedded language model by default to prevent this kind of issue.
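The condition described above, a language_model folder in the model repository, can be checked from a repo's file listing. This is a rough sketch only: `has_embedded_language_model` and the sample listings are hypothetical, and how the listing is fetched is left out.

```python
def has_embedded_language_model(repo_files):
    """Return True if any path sits under a top-level language_model/ folder."""
    return any(
        path.replace("\\", "/").startswith("language_model/")
        for path in repo_files
    )


# Hypothetical listings for illustration, not copied from a real repository.
with_lm = ["config.json", "vocab.json", "language_model/unigrams.txt"]
without_lm = ["config.json", "vocab.json", "pytorch_model.bin"]

for label, files in (("with_lm", with_lm), ("without_lm", without_lm)):
    needs_deps = has_embedded_language_model(files)
    print(label, "-> likely needs pyctcdecode/kenlm:", needs_deps)
```

To ignore the language model rather than install the extra deps, one option in transformers is to load Wav2Vec2Processor directly instead of going through AutoProcessor, which the traceback in this thread shows resolving to the with-LM processor. Whether the PR mentioned in this thread takes that route is not stated here.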

m-bain commented on July 16, 2024

Thanks @jonatasgrosman !
