Transcribe video and add subtitles using Whisper AI.
- Split videos longer than 20 minutes into segments.
- Extract audio from video files.
- Transcribe audio using Whisper AI.
- Generate and burn subtitles into videos.
- Handle various video formats, including `.mkv` conversion to `.mp4`.
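The splitting step above can be sketched as a small helper that computes chunk boundaries suitable for FFmpeg's `-ss`/`-t` options. This is an illustrative sketch, not the package's actual implementation; the function name and the 20-minute default are assumptions:

```python
# Illustrative sketch only: compute (start, length) pairs so that a long
# video can be cut into segments of at most `max_len_s` seconds each.
def segment_bounds(duration_s: float, max_len_s: float = 20 * 60) -> list[tuple[float, float]]:
    bounds = []
    start = 0.0
    while start < duration_s:
        length = min(max_len_s, duration_s - start)
        bounds.append((start, length))
        start += length
    return bounds
```

Each `(start, length)` pair would then map onto one `ffmpeg -ss START -t LENGTH ...` invocation.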
See `requirements.txt` for a list of dependencies:

```
torch
torch-audiomentations
torch-pitch-shift
torchaudio
torchmetrics
torchvision
openai-whisper
colorama
ffutils
ffmpeg
```
Note: Additionally, you'll need to install the following external dependencies:

- FFmpeg: Download and install FFmpeg separately. After downloading, extract the contents of the archive and add the FFmpeg binaries to your system PATH.
- CUDA Toolkit (optional, for users with dedicated GPUs): If you have a dedicated NVIDIA GPU and wish to enable GPU acceleration, install the CUDA Toolkit and follow the installation instructions provided by NVIDIA.
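Since FFmpeg is located through your system PATH, a quick way to sanity-check the setup before running the tool is to look the binary up with the standard library. This is a generic check, not part of this package:

```python
import shutil

def ffmpeg_on_path() -> bool:
    """Return True if an `ffmpeg` executable is reachable on PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which("ffmpeg") is not None
```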
It's recommended to use a virtual environment to manage dependencies for this project. Follow these steps to create and activate a virtual environment:
- Install `virtualenv` if you haven't already:

  ```
  pip install virtualenv
  ```

- Navigate to the project directory:

  ```
  cd /path/to/your/directory
  ```

- Create a virtual environment:

  ```
  python -m venv my_env
  ```
- Activate the virtual environment:
  - On Windows:

    ```
    my_env\Scripts\activate
    ```

  - On macOS and Linux:

    ```
    source my_env/bin/activate
    ```

Once activated, you can install the project dependencies within the virtual environment without affecting your system-wide Python installation.
To install the package, clone the repository and install it using pip:

```
git clone https://github.com/jeromearellano/transcribe.git
cd transcribe
pip install . --extra-index-url https://download.pytorch.org/whl/cu118
```
To use the CLI tool, run the following command:

```
transcribe -i /path/to/video.mp4 --model small
```

- `-i, --input`: Path to the input video file (required).
- `--model`: Whisper model size to use (`tiny`, `base`, `small`, `medium`, `large`). Default is `small`.
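Under the hood, subtitle generation has to turn Whisper's per-segment timestamps (plain seconds) into SRT time codes. A minimal sketch of that conversion, with a hypothetical helper name not taken from this package:

```python
def srt_timestamp(seconds: float) -> str:
    """Format a duration in seconds as an SRT time code: HH:MM:SS,mmm."""
    # SRT uses a comma (not a period) before the milliseconds field.
    ms = round(seconds * 1000)
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1_000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"
```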
Contributions are welcome! Please fork the repository and submit a pull request.
This project is licensed under the MIT License.
- OpenAI Whisper for the transcription model.
- FFmpeg for video processing.
- Colorama for colored terminal output.
- Torch for the deep learning framework.
- ffutils for FFmpeg utilities.