Whisper command-line client, based on CTranslate2, that is compatible with the original OpenAI client.
It uses CTranslate2 and the faster-whisper Whisper implementation, which is up to 4 times faster than openai/whisper for the same accuracy while using less memory.
Goals of the project:
- Provide an easy way to use the CTranslate2 Whisper implementation
- Ease the migration for people using the OpenAI Whisper CLI
To install, just type:
pip install -U whisper-ctranslate2
Alternatively, the following command will pull and install the latest commit from this repository, along with its Python dependencies:
pip install git+https://github.com/jordimas/whisper-ctranslate2.git
The command line is the same as OpenAI Whisper's.
To transcribe:
whisper-ctranslate2 inaguracio2011.mp3 --model medium
To translate:
whisper-ctranslate2 inaguracio2011.mp3 --model medium --task translate
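When many files need the same treatment, the two commands above can be generated from a script. The following is a minimal Python sketch, not part of this project: the helper names `build_command` and `build_batch` are hypothetical, and only the `--model` and `--task translate` options shown above are used.

```python
import shlex
from pathlib import Path


def build_command(audio_path, model="medium", translate=False):
    """Return the argument list for one whisper-ctranslate2 invocation."""
    cmd = ["whisper-ctranslate2", str(audio_path), "--model", model]
    if translate:
        # --task translate requests a translation instead of a transcription.
        cmd += ["--task", "translate"]
    return cmd


def build_batch(folder, pattern="*.mp3", **kwargs):
    """Build one command per file in `folder` matching `pattern`, sorted by name."""
    return [build_command(p, **kwargs) for p in sorted(Path(folder).glob(pattern))]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even when whisper-ctranslate2 is not installed.
    print(shlex.join(build_command("inaguracio2011.mp3", translate=True)))
```

To execute a generated command, pass the list to `subprocess.run(cmd, check=True)`; this requires whisper-ctranslate2 to be installed and on the `PATH`.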
To list all the supported options with their help text, run:
whisper-ctranslate2 --help
On top of the OpenAI Whisper command-line options, there are some specific options provided by CTranslate2.
--compute_type {default,auto,int8,int8_float16,int16,float16,float32}
Type of quantization to use. On CPU int8 will give the best performance.
--model_directory MODEL_DIRECTORY
Directory in which to find a CTranslate2 Whisper model, for example a fine-tuned Whisper model. The model must be in CTranslate2 format.
--device_index [DEVICE_INDEX ...]
Device IDs on which to place this model.
--vad_filter VAD_FILTER
Enable voice activity detection (VAD) to filter out parts of the audio without speech. This step uses the Silero VAD model: https://github.com/snakers4/silero-vad.
--vad_min_silence_duration_ms VAD_MIN_SILENCE_DURATION_MS
When vad_filter is enabled, audio segments without speech for at least this number of milliseconds will be ignored.
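The CTranslate2-specific options above can be combined in a single invocation. The sketch below assembles them in Python and rejects unknown quantization types before anything is run. It is illustrative only: `ctranslate2_options` is a hypothetical helper, passing `True` as the value of `--vad_filter` is an assumption about how the flag is parsed, and any silence duration you pass is your own choice rather than a recommended value.

```python
# Choices copied from the --compute_type option listed above.
COMPUTE_TYPES = {
    "default", "auto", "int8", "int8_float16", "int16", "float16", "float32",
}


def ctranslate2_options(compute_type="int8", vad_filter=True,
                        vad_min_silence_ms=None, model_directory=None):
    """Return the CTranslate2-specific part of a whisper-ctranslate2 command line."""
    if compute_type not in COMPUTE_TYPES:
        raise ValueError(f"unsupported compute type: {compute_type!r}")
    # int8 is the default here because it performs best on CPU.
    opts = ["--compute_type", compute_type]
    if model_directory is not None:
        opts += ["--model_directory", str(model_directory)]
    if vad_filter:
        # Assumption: the flag accepts the literal value "True".
        opts += ["--vad_filter", "True"]
        # The silence threshold only has an effect when VAD is enabled.
        if vad_min_silence_ms is not None:
            opts += ["--vad_min_silence_duration_ms", str(int(vad_min_silence_ms))]
    return opts


if __name__ == "__main__":
    base = ["whisper-ctranslate2", "inaguracio2011.mp3", "--model", "medium"]
    print(" ".join(base + ctranslate2_options(vad_min_silence_ms=2000)))
```

Validating `--compute_type` up front gives a clear error in the calling script instead of a failure partway through a batch.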
On top of the OpenAI Whisper and CTranslate2 options, whisper-ctranslate2 provides some additional options of its own:
--print_colors PRINT_COLORS
Adding the --print_colors True argument prints the transcribed text using an experimental color-coding strategy, based on whisper.cpp, that highlights words with high or low confidence.
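To give an idea of what confidence-based color coding means, here is a minimal Python sketch that wraps each word in an ANSI color chosen from its probability. It is not the implementation used by whisper-ctranslate2 or whisper.cpp: the thresholds, the three-color palette, and the `color_for` and `colorize` names are all assumptions made for illustration.

```python
RESET = "\033[0m"
GREEN = "\033[32m"
YELLOW = "\033[33m"
RED = "\033[31m"


def color_for(probability):
    """Pick an ANSI color for a word probability (hypothetical thresholds)."""
    if probability >= 0.8:
        return GREEN   # high confidence
    if probability >= 0.5:
        return YELLOW  # medium confidence
    return RED         # low confidence


def colorize(words):
    """Join (word, probability) pairs into one color-coded string."""
    return " ".join(f"{color_for(p)}{w}{RESET}" for w, p in words)


if __name__ == "__main__":
    # Made-up probabilities, for demonstration only.
    print(colorize([("bon", 0.95), ("dia", 0.62), ("a", 0.31)]))
```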
Jordi Mas