Comments (46)
Hi @Zacharie-Jacob , I tried an additional test on a 4-min clip with the same command line:
whisper-ctranslate2 $_ --model medium --language 'Japanese' --vad_filter True --device cuda --compute_type int8 --output_format srt --output_dir $directory --task translate --word_timestamps True --verbose True > $directory\$filename.md
Here are the results:
- 4min video mp4: No output.
- 4min audio wav: No output.
- 4min audio wav 16khz mono: Works! The srt was generated.
I repeated the same tests with --device cpu, and it worked well on all 3 cases above.
from whisper-ctranslate2.
I think it's exactly the same problem:
- The software finishes transcribing on GPU but no output files are created; you can still copy the results from the terminal.
- The software finishes transcribing on CPU and creates output files.
This is probably a bug in faster-whisper if you can't find any problems in your code.
I can confirm that I have the same problem. No output file is created. My command line is (in powershell):
whisper-ctranslate2 $_ --model medium --language 'Japanese' --vad_filter True --device cuda --compute_type int8 --output_format srt --output_dir $directory --task translate --word_timestamps True --verbose True > $directory\$filename.md
In my environment, I can almost reliably trigger the bug. It prints everything in the command line, but nothing is written to the current directory, and there is a Windows error "python has stopped working". The problem is probably a dictionary-referencing and memory-reclamation issue. My temporary solution is to move the writer call from whisper_ctranslate2.py into transcribe.py. Although it damages the code structure, what currently matters to me is that it works.
```python
# \Anaconda\envs\Lib\site-packages\src\whisper_ctranslate2\whisper_ctranslate2.py
def main():
    ...
    for audio_path in audio:
        result = Transcribe().inference(
            ...
            output_format,
            output_dir,
            audio_path,
        )
        # writer = get_writer(output_format, output_dir)
        # writer(result, audio_path)
```

```python
# \Anaconda\envs\Lib\site-packages\src\whisper_ctranslate2\transcribe.py
class Transcribe:
    ...
    def inference(
        ...
        output_format,
        output_dir,
        audio_path,
    ):
        ...
        result = dict(
            text=all_text,
            segments=list_segments,
            language=language_name,
        )
        from .writers import get_writer
        writer = get_writer(output_format, output_dir)
        writer(result, audio_path)
        # return result
```
The detailed process of my debugging

Environment:
- OS: Windows 10
- Python: 3.9.16150.1013
- GPU: GTX 1660 Ti (mobile)
- IDE: VS Code

Packages:
- numpy==1.23.3
- faster-whisper==0.4.1
- ctranslate2==3.11.0
- tqdm==4.65.0
- sounddevice==0.4.6
Trigger the bug
- Audio file: 5m.mp3, about 100 segments.
- Model: guillaumekln/faster-whisper-tiny or guillaumekln/faster-whisper-large-v2
- In cmd or powershell:
  whisper-ctranslate2 ".\5m.mp3" --language Japanese --model_directory "..\model\faster-whisper-tiny"
- It prints the results on the screen correctly. After that, "python has stopped working" appears and there are no output files.
Set the breakpoint

```python
# whisper_ctranslate2\whisper_ctranslate2.py
for audio_path in audio:
    result = Transcribe().inference(...)
    print(result)  # Setting a breakpoint here and hovering over `result` triggers "python has stopped working"
```
Error analysis (unconfirmed)
- A small audio file works well, but a large file fails.
- openai/whisper works well for me. The difference from openai/whisper is that in whisper_ctranslate2, def transcribe(...) has been changed to:

```python
class Transcribe:
    ...
    def inference(...):
        list_segments = []
        last_pos = 0
        accumated_inc = 0
        all_text = ""
        ...
        return dict(
            text=all_text,
            segments=list_segments,
            language=language_name,
        )
```

My suspicion is that list_segments is a local variable of Transcribe.inference, and that after calling result = Transcribe().inference(...), the memory-recycling mechanism causes the memory pointed to by result["segments"] to be reclaimed.

```python
list_segments = [
    { },
    ...
]
```
Some failed attempts

ucrtbase.dll
In Windows Event Viewer, we can see that the crash seems to be related to ucrtbase.dll. However, searching online turned up nothing relevant, and updating it also didn't help.

Writers
- Replacing the main content of whisper_ctranslate2/writer.py with openai/whisper/utils.py (with modifications): useless.
- Placing the content of writer.py directly in whisper_ctranslate2/whisper_ctranslate2.py: also useless.
No luck getting any kind of output, using a 16 kHz wav that I use for testing Const-me Whisper and whisper.cpp; the expected result is a 10-minute translation.
C:\Users\emcod>whisper-ctranslate2 c:\temp\test.wav --model medium
There are old cache files at `C:\Users\emcod\.cache\whisper-ctranslate2` which are no longer used. Consider deleting them
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Downloading (…)56e98277/config.json: 100%|█████████████████████████████████████████| 2.26k/2.26k [00:00<00:00, 752kB/s]
C:\python3100\lib\site-packages\huggingface_hub\file_download.py:133: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\emcod\.cache\huggingface\hub. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Downloading (…)98277/vocabulary.txt: 100%|██████████████████████████████████████████| 460k/460k [00:00<00:00, 2.17MB/s]
Downloading (…)98277/tokenizer.json: 100%|████████████████████████████████████████| 2.20M/2.20M [00:01<00:00, 2.18MB/s]
Downloading model.bin: 100%|██████████████████████████████████████████████████████| 1.53G/1.53G [03:02<00:00, 8.39MB/s]
C:\Users\emcod>
Or a try with default params:
C:\Users\emcod>whisper-ctranslate2 c:\temp\test.wav --language de
There are old cache files at `C:\Users\emcod\.cache\whisper-ctranslate2` which are no longer used. Consider deleting them
Detected language 'German' with probability 1.000000
Then I followed the instructions and deleted the "old cache files" at C:\Users\emcod\.cache\whisper-ctranslate2 (I deleted the whole .cache folder):
C:\Users\emcod>whisper-ctranslate2 c:\temp\test.wav --language de
Downloading (…)e94b4c8a/config.json: 100%|████████████████████████████████████████| 2.37k/2.37k [00:00<00:00, 1.19MB/s]
C:\python3100\lib\site-packages\huggingface_hub\file_download.py:133: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\emcod\.cache\huggingface\hub. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Downloading (…)b4c8a/vocabulary.txt: 100%|██████████████████████████████████████████| 460k/460k [00:00<00:00, 1.44MB/s]
Downloading (…)b4c8a/tokenizer.json: 100%|█████████████████████████████████████████| 2.20M/2.20M [00:07<00:00, 308kB/s]
Downloading model.bin: 100%|████████████████████████████████████████████████████████| 484M/484M [00:57<00:00, 8.35MB/s]
Detected language 'German' with probability 1.000000███████████████████████████████| 2.20M/2.20M [00:07<00:00, 309kB/s]
C:\Users\emcod>
Try to enable debug logging:
C:\Users\emcod>whisper-ctranslate2 --verbose true c:\temp\test.wav
whisper-ctranslate2: error: argument --verbose: invalid str2bool value: 'true'
same with
C:\Users\emcod>whisper-ctranslate2 --verbose 1 c:\temp\test.wav
whisper-ctranslate2: error: argument --verbose: invalid str2bool value: '1'
Now, reading some Python docs, I see that "true" is often written as "True":
C:\Users\emcod>whisper-ctranslate2 --verbose True c:\temp\test.wav
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
C:\Users\emcod>
OK, trying some other stuff:
C:\Users\emcod>whisper-ctranslate2 --verbose True c:\temp\test.wav --compute_type int8
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
C:\Users\emcod>
Version 0.2.6 should fix this.
- Please make sure that you use version 0.16. If you are not, please update to this version.
I'm on 0.17, checked by running whisper-ctranslate2 --version
- While the tool is running, can you see anything on the terminal? You should see the transcription as it is being done.
Yes, the transcription appears on the terminal.
- Can you try just "whisper-ctranslate2 [the video file] --model large-v2". Does it work?
OK, I've just tried this and noticed something. By the way, I decided to run on a 2-minute flac audio to speed things up. I ran the program using "whisper-ctranslate2 [the audio file] --model tiny": it didn't work. Then I ran with large-v2 and, to my surprise, it worked. Then I tried large-v2 again and it worked again. Then I went back to tiny and it stopped working. Then I tried base: it doesn't work. Then I finally tried large-v2 again and it worked. But previously even large-v2 was not working.
Hi, I found that this only happens on GPU; it produces output when I add "--device cpu".
I'm sorry I can't provide anything, because it's only happening with my personal videos.
Videos with clear, professional audio aren't looping and do produce output, so it's probably an issue with Whisper itself and not your software.
Is there a way to produce log files?
Hi Jordimas, I am having similar issues.
The first is that nothing gets output unless output type and location are set (though perhaps that is by design?)
The second is that unless I add "--device CPU" no data is returned; I just go back to the command prompt. This is true for a short clear wav, a longer mp4, English and Japanese.
I have a RTX 2080 Super with the current studio driver (531.61). I am able to use basic Whisper installations with CUDA as well as Const-me, etc. Is there something I need to set up here or in NVIDIA control panel?
For test video we can use the same one I shared before.
whisper-ctranslate2.exe --language ja --model "large-v2" --device CPU --output_dir "C:\Users\rsmit\Dropbox\Videos" --output_format "srt" "C:\Users\rsmit\Dropbox\Videos\10 MPantry final new titles 2.mp4"
This works, and actually very well in terms of quality! No issues at all.
Change to CUDA and it fails.
whisper-ctranslate2.exe --language ja --model "large-v2" --device CUDA --output_dir "C:\Users\rsmit\Dropbox\Videos" --output_format "srt" "C:\Users\rsmit\Dropbox\Videos\10 MPantry final new titles 2.mp4"
Base model, etc. also fail.
NVIDIA Control Panel reports I have NVIDIA CUDA 12.1.107 driver. It has a compute capability of 7.5.
I also installed the standalone cuda_12.1.0_531.14_windows.exe
The first is that nothing gets output unless output type and location are set (though perhaps that is by design?)
No, this is not by design. By design it outputs all formats and writes to the current directory you are in.
@rsmith02ct Could you please create a separate ticket for this issue? It's different from the other one. Thanks
I had this same problem. I was unable to pinpoint it to specifically whisper-ctranslate2, but the problem is exactly the same as yours. It displays the translation. There are no errors. No output files are written.
It does write out if I choose a very small file (like a minute or two long), but longer files just mysteriously do not have any outputs.
I do not know enough about the code itself to know if it makes sense that longer files would not produce outputs but shorter files will.
I can confirm that 16 kHz mono conversion works, but a lot of the information is lost and the output is very different from CPU on the original file.
I have the same problem. I can see all the text in PowerShell as it transcribes and translates; then, when it's done, nothing. No srt files are generated. whisper-ctranslate2 "file name here.mp4" --device cuda --device_index 0 --vad_filter true --vad_min_speech_duration_ms 50 --vad_min_silence_duration_ms 2000 --vad_max_speech_duration_s 10 --condition_on_previous_text False --language Japanese --task translate --output_format srt --model large-v2
@Qel0droma Are you using a GPU?
yes
Hi,
I think it's the same issue as SYSTRAN/faster-whisper#71 which I can now reproduce on Windows.
When the output files are missing, you can verify that the process crashed with a non-zero exit code:
PS > $LASTEXITCODE
-1073740791
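That exit code can be decoded into a Windows NTSTATUS value to identify the crash type (a small illustrative snippet, not part of the project):

```python
# Windows reports exit codes as signed 32-bit integers; masking with
# 0xFFFFFFFF reinterprets the value as the unsigned NTSTATUS code.
exit_code = -1073740791
status = exit_code & 0xFFFFFFFF
print(hex(status))  # 0xc0000409, STATUS_STACK_BUFFER_OVERRUN (fail-fast abort)
```

0xC0000409 is raised by Windows fail-fast checks in native code, which is consistent with a crash during native model teardown rather than a Python exception, and with the ucrtbase.dll crash mentioned earlier in the thread.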
The process crashes when the model is unloaded but only when the transcription triggered the temperature fallback. If you disable the temperature fallback it should work without issue. Try adding this option on the command line:
--temperature_increment_on_fallback None
The crash seems to happen only on Windows.
@jordimas In the meantime, you could slightly change the code to ensure the WhisperModel
instance is still alive when writing the results on disk.
Even using --temperature_increment_on_fallback None
, I am getting zero output (even on the console) if I use the GPU on Windows. I am using a 3090, and I did install the various dependencies as far as I can tell. It would be nice if we got an error message of some kind.
You could load the model once and then use the same model instance to transcribe each file. This should work around the issue and also be more efficient than reloading the model each time.
I followed guillaumekln's tip and modified the code: move the WhisperModel creation into the main function of whisper_ctranslate2.py instead of the inference function, and pass the model to the inference function. You also need to add
from faster_whisper import WhisperModel
to whisper_ctranslate2.py.
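The shape of that refactor can be sketched with stand-ins (DummyModel and the simplified signatures below are illustrative only; the real inference function takes many more parameters):

```python
class DummyModel:
    # Stand-in for faster_whisper.WhisperModel: expensive to load,
    # so it should be created once and reused for every file.
    def transcribe(self, path):
        return {"text": f"transcript of {path}", "segments": [], "language": "ja"}

def inference(model, audio_path):
    # The model is a parameter now, instead of a local variable that is
    # destroyed (together with its native buffers) when inference returns.
    return model.transcribe(audio_path)

def main(audio_files):
    model = DummyModel()  # loaded once, in main()
    results = {}
    for audio_path in audio_files:
        # Results are collected and written while the model is still alive.
        results[audio_path] = inference(model, audio_path)
    # The model is released only after all files are processed and written.
    del model
    return results

print(main(["a.wav", "b.wav"])["b.wav"]["text"])  # transcript of b.wav
```

Besides working around the crash, loading once is also faster, since the model is no longer re-initialized for every input file.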
Hi, this change does not fix the issue according to user reports in SYSTRAN/faster-whisper#71. I have a hard time debugging this issue as I don't typically develop on Windows.
For now I suggest that you update the code to keep the model alive until all transcriptions are complete.
Loaded 0.2.7 and sure enough this fixed the problem for me. I had been forced to use --device cpu for a while now, which is significantly slower than cuda with my 3080. Thank you.
Thanks for reporting this.
If you can, check the following:
- Please make sure that you use version 0.16. If you are not, please update to this version.
- While the tool is running, can you see anything on the terminal? You should see the transcription as it is being done.
- Can you try just "whisper-ctranslate2 [the video file] --model large-v2"? Does it work?
Thanks
Hello. I'm unable to reproduce this problem on my Windows machine.
My only question is whether you have tried doing inference on CPU vs GPU, and whether this makes any difference.
Thanks
Hi, I have the same problem: transcription appears on the screen until the end of the file, but no output files are produced.
The model is only using 50% of VRAM, so it's definitely not running out of memory.
I'm also on Windows, Python 3.9.
This only happens on some files; smaller files or "clearer" files work fine. I think it's looping at the end or something like that.
"--vad_filter True" doesn't seem to do anything.
Do you have any file that you can share so I can try to reproduce it? Thanks
I can confirm that I have the same problem. No output file is created. My command line is (in powershell):
whisper-ctranslate2 $_ --model medium --language 'Japanese' --vad_filter True --device cuda --compute_type int8 --output_format srt --output_dir $directory --task translate --word_timestamps True --verbose True > $directory\$filename.md
Could you try running this on a clip that is only one or two minutes, and see if it works? That seems like it works for me, which may help narrow down a cause if that is a reproducible pattern.
Hmm, here I don't see any text in the cmd terminal window when --cuda is enabled (and there's no text output). When set to CPU it works fine on every file I've given it in English and Japanese. I'm using an NVIDIA RTX 2080 Super with the current studio driver and CUDA SDK also installed (Windows 11).
Thanks for investing time on this @runw99.
Regarding memory: Python uses reference counting, so it should delete the variable when it goes out of scope.
Here is an article that explains how memory works in Python:
https://rushter.com/blog/python-garbage-collector/
You can actually check the reference count it has by doing:

```python
import sys
print(sys.getrefcount(foo))
```

I have no idea why this happens, but I do not believe it is because the variable goes out of scope and is recycled.
Thanks for your reply. The article you mentioned helped me review garbage collection in Python and learn something new.
I also went back and tried some copy.deepcopy(list_segments) operations, but still couldn't solve this bug. So perhaps garbage collection really isn't the cause.
I have never encountered such a bug before, and I am curious about its causes and solutions. Looking forward to the follow-up.
Thank you again for the patient answer; this project really saves me a lot of effort running a big model.
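As a self-contained illustration of the reference-counting point above (Transcriber here is a toy stand-in, not the project's class): a dict returned from a method is kept alive by the caller's reference, so ordinary Python garbage collection would not explain the crash:

```python
import sys

class Transcriber:
    def inference(self):
        # list_segments is a local variable, but returning it inside a dict
        # hands a reference to the caller, so it is not reclaimed on return.
        list_segments = [{"start": 0.0, "end": 1.0, "text": "hello"}]
        return dict(text="hello", segments=list_segments)

result = Transcriber().inference()
print(result["segments"][0]["text"])             # hello
print(sys.getrefcount(result["segments"]) >= 2)  # True: dict slot + getrefcount argument
```

This supports the conclusion that the crash comes from native (C++) memory being freed when the model is unloaded, not from Python-level scoping.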
I ran 355 files, ranging in length from 10 to 120 minutes.
In the output I got 150(*5) files with text.
So I confirm that there is definitely a problem.
The original whisper project works correctly, so it's strange...
@rsmith02ct reported that my standalone build doesn't have this bug. [It doesn't use the CLI from this repo.]
I can confirm that 16 kHz mono conversion works, but a lot of the information is lost and the output is very different from CPU on the original file.
faster-whisper converts to the same audio format using the PyAV library; OpenAI uses ffmpeg.
Strangely, transcription quality and timestamp accuracy suffer significantly on audio converted by ffmpeg.exe. No idea why this happens; I'm too lazy to investigate...
I second @rsmith02ct; I too have noticed that when I convert audio with Audacity, the results are better than with ffmpeg.
Same problem. 1.0 could produce output, but would frequently miss large stretches of dialogue.
Win 11:
C:\Users\emcod>whisper-ctranslate2 --verbose True --temperature_increment_on_fallback None c:\temp\1234.wav
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
C:\Users\emcod>echo %errorlevel%
-1073740791
C:\Users\emcod>whisper-ctranslate2 --verbose True c:\temp\test.wav --compute_type int8
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
C:\Users\emcod>whisper-ctranslate2 --verbose True --temperature_increment_on_fallback None c:\temp\1234.wav
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
C:\Users\emcod>echo %errorlevel%
-1073740791
C:\Users\emcod>whisper-ctranslate2 --temperature_increment_on_fallback None c:\temp\1234.wav
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
C:\Users\emcod>whisper-ctranslate2 --temperature_increment_on_fallback None --language de c:\temp\1234.wav
Detected language 'German' with probability 1.000000
C:\Users\emcod>whisper-ctranslate2 --temperature_increment_on_fallback None --language de c:\temp\test.wav
Detected language 'German' with probability 1.000000
C:\Users\emcod>echo %errorlevel%
-1073740791
Going to try on other OS tomorrow.
Hi,
I think it's the same issue as guillaumekln/faster-whisper#71 which I can now reproduce on Windows.
When the output files are missing, you can verify that the process crashed with a non-zero exit code:
PS > $LASTEXITCODE
-1073740791
The process crashes when the model is unloaded but only when the transcription triggered the temperature fallback. If you disable the temperature fallback it should work without issue. Try adding this option on the command line:
--temperature_increment_on_fallback None
The crash seems to happen only on Windows.
@jordimas In the meantime, you could slightly change the code to ensure the
WhisperModel
instance is still alive when writing the results on disk.
Thank you, this fixes my problem. Yes, I am on Windows.
Unfortunately, that setting was particularly useful, as it prevents the translation from falling into ruts. I will have to make do with a combination of other settings for now.
Even using
--temperature_increment_on_fallback None
, I am getting zero output (even on the console) if I use the GPU on Windows. I am using a 3090, and I did install the various dependencies as far as I can tell. It would be nice if we got an error message of some kind.
It looks like it is linked to general use of Temperature, perhaps? I was under the impression that you can have no temperature increment while still using temperature and best_of, but it looks like I get intermittent missing outputs if I am using any temperature settings at all other than just setting the fallback to None.
@jordimas In the meantime, you could slightly change the code to ensure the
WhisperModel
instance is still alive when writing the results on disk.
Thanks a lot for looking into this issue. I was trying to gather more evidence before reporting it to the CTranslate2 issue tracker, but it's great that you are looking at this.
Based on the feedback in this thread, and the fact that I do not even have a Windows box with CUDA to test on, I do not know if it's worth doing a fix in whisper-ctranslate2 or just waiting for the issue to be fixed in ctranslate2.
Just to see, I made a local change to ensure the model was unloaded only after outputs were written out. This sort of works, in that if it was going to crash, the files are written out before it crashes; but if you passed multiple files to be processed, it still crashes when the model is unloaded. So:
PS X:\to-process> whisper-ctranslate2 --model large-v2 --task translate --vad_filter True --language ja --output_format all --patience 2.0 -o translate-out file1.wav file2.wav file3.wav
Assuming the crash currently occurs with file2.wav: before the change it only output the files for file1.wav; now it outputs the files for file2.wav and then crashes, so file3.wav still isn't processed.
```diff
diff --git a/src/whisper_ctranslate2/transcribe.py b/src/whisper_ctranslate2/transcribe.py
index ca53fac..c422037 100644
--- a/src/whisper_ctranslate2/transcribe.py
+++ b/src/whisper_ctranslate2/transcribe.py
@@ -187,7 +187,7 @@ class Transcribe:
                 last_pos = segment.end
                 pbar.update(increment)
-        return dict(
+        return model, dict(
             text=all_text,
             segments=list_segments,
             language=language_name,
diff --git a/src/whisper_ctranslate2/whisper_ctranslate2.py b/src/whisper_ctranslate2/whisper_ctranslate2.py
index 1ff8335..58862a8 100644
--- a/src/whisper_ctranslate2/whisper_ctranslate2.py
+++ b/src/whisper_ctranslate2/whisper_ctranslate2.py
@@ -514,7 +514,7 @@ def main():
         return
     for audio_path in audio:
-        result = Transcribe().inference(
+        model, result = Transcribe().inference(
             audio_path,
             model_dir,
             cache_directory,
@@ -531,6 +531,7 @@ def main():
         )
         writer = get_writer(output_format, output_dir)
         writer(result, audio_path, writer_args)
+        model = None
         if verbose:
             print(f"Transcription results written to '{output_dir}' directory")
```
So it's not that helpful to try and work around it from whisper-ctranslate2. Hopefully it can be resolved upstream.
Is there a good workaround for this? Not having access to Temperature at all results in substantially worse model results.
Hello @guillaumekln. Do you have a timeline for releasing OpenNMT/CTranslate2#1201? If it's going to take more than a week, I can release a version changing the structure of the code (though my preference is to get this fixed upstream).
Thanks,
Jordi
I will merge https://github.com/Softcatala/whisper-ctranslate2/pull/44/files in the next few hours. This should fix the issue. Feedback is welcome, since I do not have a Windows box handy either. Thanks
Version 0.2.6 should fix this.
I am currently having the same issue on 0.2.7.
C:\Users\igerm\Desktop\whisper〉whisper-ctranslate2 --model large-v2 --language English -f all --verbose True audio.wav
Detected language 'English' with probability 1.000000
And then it exits. CPU works.
Detected language 'English' with probability 1.000000
IMHO that should be fixed; it actually did not "detect" anything, because the user disabled automatic detection by specifying the language.
@iGerman00 See if this works for you: https://github.com/Purfview/whisper-standalone-win
I also have a similar problem; in my case, there is no effective output, and the return code is not 0.
(whisper) PS D:\BaiduNetdiskDownload> pip list
Package Version
------------------- ----------
av 10.0.0
certifi 2023.11.17
cffi 1.16.0
charset-normalizer 3.3.2
colorama 0.4.6
coloredlogs 15.0.1
ctranslate2 3.23.0
faster-whisper 0.10.0
filelock 3.13.1
flatbuffers 23.5.26
fsspec 2023.12.2
huggingface-hub 0.19.4
humanfriendly 10.0
idna 3.6
mpmath 1.3.0
numpy 1.26.2
onnxruntime 1.16.3
packaging 23.2
pip 23.3.1
protobuf 4.25.1
pycparser 2.21
pyreadline3 3.4.1
PyYAML 6.0.1
requests 2.31.0
setuptools 68.2.2
sounddevice 0.4.6
sympy 1.12
tokenizers 0.15.0
tqdm 4.66.1
typing_extensions 4.9.0
urllib3 2.1.0
wheel 0.41.2
whisper-ctranslate2 0.3.4
(whisper) PS D:\BaiduNetdiskDownload> whisper-ctranslate2.exe aaa.mp4 --model small --language zh --verbose True
stream 0, timescale not set
Detected language 'Chinese' with probability 1.000000
(whisper) PS D:\BaiduNetdiskDownload>
Does this problem still exist? I am seeing it, so I think it is...