Transcribe

Transcribe video and add subtitles using Whisper AI.

Features

  • Split videos longer than 20 minutes into segments.
  • Extract audio from video files.
  • Transcribe audio using Whisper AI.
  • Generate and burn subtitles into videos.
  • Handle various video formats, including .mkv conversion to .mp4.
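As a rough illustration of two of the features above, the sketch below computes segment boundaries for the 20-minute split and renders transcription segments as SubRip (SRT) subtitle text. It is not the project's actual implementation. The function names (`segment_bounds`, `srt_timestamp`, `to_srt`) are hypothetical. The segment dictionaries are assumed to carry `start`, `end`, and `text` keys, which is the shape Whisper's `transcribe` result uses for its `segments` entries.

```python
from typing import Dict, List, Tuple

# The 20-minute limit named in the feature list, in seconds.
MAX_SEGMENT_SECONDS = 20 * 60


def segment_bounds(duration: float,
                   max_len: float = MAX_SEGMENT_SECONDS) -> List[Tuple[float, float]]:
    """Return (start, end) pairs in seconds that cover [0, duration]."""
    bounds = []
    start = 0.0
    while start < duration:
        end = min(start + max_len, duration)
        bounds.append((start, end))
        start = end
    return bounds


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Dict]) -> str:
    """Render segments with 'start', 'end', and 'text' keys as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

With these definitions, a 50-minute video (3000 seconds) would be split into three segments, the last one shorter than the others.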

Requirements

See requirements.txt for a list of dependencies.

  • torch
  • torch-audiomentations
  • torch-pitch-shift
  • torchaudio
  • torchmetrics
  • torchvision
  • openai-whisper
  • colorama
  • ffutils
  • ffmpeg

Note: You also need to install the following external dependencies:

  • FFmpeg: Download and install FFmpeg separately from the FFmpeg project's download page.

    After downloading, extract the archive and add the FFmpeg binaries to your system PATH.

  • CUDA Toolkit (optional, for NVIDIA GPUs): To enable GPU acceleration on a dedicated NVIDIA GPU, install the CUDA Toolkit from NVIDIA's website and follow NVIDIA's installation instructions.
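One way to confirm that the external dependencies are visible is a small check like the following. This is a sketch, not part of the project: `check_environment` is a hypothetical helper name. It uses `shutil.which` to look for `ffmpeg` on PATH and `torch.cuda.is_available()` to ask PyTorch whether it can see a CUDA GPU. If PyTorch is not installed yet, the CUDA entry is reported as `False`.

```python
import shutil


def check_environment() -> dict:
    """Report whether FFmpeg is on PATH and whether PyTorch sees a CUDA GPU."""
    report = {"ffmpeg": shutil.which("ffmpeg") is not None, "cuda": False}
    try:
        # torch is only importable once the Python dependencies are installed.
        import torch
        report["cuda"] = bool(torch.cuda.is_available())
    except ImportError:
        pass
    return report
```

Calling `print(check_environment())` prints a dictionary with two boolean entries. A `False` value for `ffmpeg` usually means the binaries are not on PATH yet.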

Setting up a Virtual Environment

It's recommended to use a virtual environment to manage dependencies for this project. Follow these steps to create and activate a virtual environment:

  1. Install virtualenv if you haven't already (optional: the command in step 3 uses Python's built-in venv module):
pip install virtualenv
  2. Navigate to the project directory:
cd /path/to/your/directory
  3. Create a virtual environment:
python -m venv my_env
  4. Activate the virtual environment:
    • On Windows:

      my_env\Scripts\activate
    • On macOS and Linux:

      source my_env/bin/activate

Once activated, you can install the project dependencies within the virtual environment without affecting your system-wide Python installation.
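To verify that the environment is active before installing anything, a quick check along these lines may help. It is a minimal sketch with a hypothetical function name. It relies on the fact that inside an environment created by `venv`, `sys.prefix` points at the environment while `sys.base_prefix` points at the base Python installation.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Outside a virtual environment the two prefixes are identical.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)
```

Running `python -c "import sys; print(sys.prefix)"` after activation should print a path inside `my_env`.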

Installation

To install the package, clone the repository and install using pip:

git clone https://github.com/jeromearellano/transcribe.git
cd transcribe
pip install . --extra-index-url https://download.pytorch.org/whl/cu118

Usage

To use the CLI tool, run the following command:

transcribe -i /path/to/video.mp4 --model small

Arguments

  • -i, --input: Path to the input video file. (required)
  • --model: Whisper model size to use (tiny, base, small, medium, large). Default is small.
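The arguments above map naturally onto Python's `argparse`. The sketch below shows one way such a parser could be declared. It is an illustration under that assumption, not the project's actual parser, and `build_parser` and `MODEL_SIZES` are hypothetical names.

```python
import argparse

# Model sizes listed in the Arguments section.
MODEL_SIZES = ["tiny", "base", "small", "medium", "large"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented CLI arguments."""
    parser = argparse.ArgumentParser(
        prog="transcribe",
        description="Transcribe video and add subtitles using Whisper AI.",
    )
    parser.add_argument(
        "-i", "--input",
        required=True,
        help="Path to the input video file.",
    )
    parser.add_argument(
        "--model",
        choices=MODEL_SIZES,
        default="small",
        help="Whisper model size to use (default: small).",
    )
    return parser
```

For example, `build_parser().parse_args(["-i", "video.mp4"])` yields a namespace with `input` set to `video.mp4` and `model` set to `small`. Restricting `--model` with `choices` makes a typo in the model name fail at parse time rather than later.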

Contributing

Contributions are welcome! Please fork the repository and submit a pull request.

License

This project is licensed under the MIT License.

Acknowledgements
