Video Swear Jar - The AI-Powered Profanity Filter

Introducing Video Swear Jar, your AI-powered solution for creating clean video content! This project offers a Docker container with all necessary tools to process videos, transcribe the audio, detect profanity, and remove inappropriate language, delivering a new, family-friendly video file.

Project Goals

Simplify the process of removing profanity from video files. The current process is quite technical, and we aim to make it more user-friendly.
Enable local processing without relying on an internet connection.
Minimize cost for users by keeping the solution as affordable as possible.

Process Overview

Transcribe the video file using OpenAI Whisper.
Process the transcription file:
- Detect profanity based on the predefined swear-words.json file.
- Generate an FFmpeg video cut file.
Use FFmpeg to cut the video file at the specified times and create a new, edited video file.
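
The detection step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes transcription segments shaped like Whisper's JSON output (`start`, `end`, `text`) and assumes swear-words.json holds a flat list of words; the function name `find_profane_ranges` is hypothetical.

```python
import re

def find_profane_ranges(segments, swear_words):
    """Return (start, end) pairs for segments containing any listed word."""
    banned = {w.lower() for w in swear_words}
    ranges = []
    for seg in segments:
        # Split into lowercase word tokens so punctuation doesn't hide a match.
        words = re.findall(r"[a-z']+", seg["text"].lower())
        if any(w in banned for w in words):
            ranges.append((seg["start"], seg["end"]))
    return ranges

# Assumed segment shape, modeled on Whisper's JSON output.
segments = [
    {"start": 0.0, "end": 2.5, "text": "Hello there."},
    {"start": 2.5, "end": 4.0, "text": "Well, darn it!"},
    {"start": 4.0, "end": 6.0, "text": "Goodbye."},
]
print(find_profane_ranges(segments, ["darn"]))  # [(2.5, 4.0)]
```

Matching whole word tokens rather than substrings avoids flagging innocent words that merely contain a listed word.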

Requirements

Docker installed

Usage

# create new video file with profanity removed
docker run --rm -it \
  -v $(pwd):/data jveldboom/video-swear-jar:v1 \
  clean --input video.mkv --model tiny.en --language en

# recommended: mount a ".whisper" directory to cache the downloaded Whisper models locally
docker run --rm -it \
  -v $(pwd):/data \
  -v $(pwd)/.whisper:/app/.whisper jveldboom/video-swear-jar:v1 \
  clean --input video.mkv --model tiny.en --language en

Arguments

--input - path to video file
--model - Whisper model name: tiny, tiny.en (default), base, base.en, small, small.en, medium, medium.en, large. See the official Whisper docs for a breakdown of model size and performance.
--language - language code. Setting the language typically improves transcription compared with letting Whisper auto-detect it.
--engine - transcription engine: whisper-ctranslate2 (default) or whisper. The whisper engine is likely to be removed in the near future if whisper-ctranslate2 proves to be just as good while being 4x faster.

Known Issues

Error: Command "whisper" exited with code null - this is likely caused by the container needing more allocated memory. Allocating at least 4 GB of memory for the small.en model usually resolves the issue, but your mileage may vary.

Utility Commands

The Docker container also includes a handful of utility commands that I find useful when editing a video.

`cut-video`

Allows you to manually create a list of timestamps to cut the video.

Usage:

docker run --rm -it -v $(pwd):/data jveldboom/video-swear-jar:v1 \
  cut-video --timestamp timestamps.txt --video video.mkv

--timestamp - path to the file with timestamps. Each timestamp range must be on its own line in HH:MM:SS - HH:MM:SS format
--video - path to the video file to cut
--cut-video - optional boolean; use it to skip cutting the video and only output the cut file
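
To make the timestamp format concrete, here is a small sketch that parses such lines and works out which parts of the video would remain. It is only an illustration: the helper names (`to_seconds`, `parse_cuts`, `keep_ranges`) are hypothetical, and how the tool itself turns timestamps into FFmpeg instructions is not shown in this README.

```python
def to_seconds(stamp):
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds

def parse_cuts(lines):
    """Parse 'HH:MM:SS - HH:MM:SS' lines into sorted (start, end) second pairs."""
    cuts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        start, end = line.split(" - ")
        cuts.append((to_seconds(start), to_seconds(end)))
    return sorted(cuts)

def keep_ranges(cuts, duration):
    """Invert the cut list: return the spans of the video that remain."""
    keep, cursor = [], 0
    for start, end in cuts:
        if start > cursor:
            keep.append((cursor, start))
        cursor = max(cursor, end)  # also handles overlapping cuts
    if cursor < duration:
        keep.append((cursor, duration))
    return keep

cuts = parse_cuts(["00:00:10 - 00:00:12", "00:01:00 - 00:01:05"])
print(cuts)                    # [(10, 12), (60, 65)]
print(keep_ranges(cuts, 120))  # [(0, 10), (12, 60), (65, 120)]
```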

`read-subtitles`

Reads a subtitle file and prints out the lines with swear words. Useful if you want to cut the video manually.

Usage:

docker run --rm -it -v $(pwd):/data jveldboom/video-swear-jar:v1 \
  read-subtitles --subtitles path/to/subtitle.srt

--subtitles - path to the subtitle file
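
As a rough idea of what scanning a subtitle file involves, the sketch below walks through SRT cues and reports those containing a listed word. It is not the command's actual implementation; `profane_cues` is a hypothetical name and the word list is passed in directly for illustration.

```python
import re

def profane_cues(srt_text, swear_words):
    """Yield (timing, text) for each SRT cue containing a listed word."""
    banned = {w.lower() for w in swear_words}
    # Cues in an SRT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # not a complete cue: index, timing, text
        timing, text = lines[1], " ".join(lines[2:])
        words = re.findall(r"[a-z']+", text.lower())
        if any(w in banned for w in words):
            yield timing, text

sample = """1
00:00:01,000 --> 00:00:03,000
Nice weather today.

2
00:00:03,500 --> 00:00:05,000
Oh, darn!
"""
for timing, text in profane_cues(sample, ["darn"]):
    print(timing, text)
```

The printed timing lines can then be copied into a timestamps file for manual cutting.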

`whisper`

The whisper CLI is available if you need to customize the command further. Visit https://github.com/openai/whisper for full details.

Usage:

docker run --rm -it -v $(pwd):/data jveldboom/video-swear-jar:v1 \
  whisper my-video.mp4 \
    --model tiny.en \
    --language en \
    --output_format json \
    --output_dir data
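
With `--output_format json`, Whisper writes a JSON file containing the full text plus a list of timed segments. The sketch below shows one way to read those segments. The sample is trimmed to the fields used here (real output carries additional per-segment fields), and `list_segments` is a hypothetical helper.

```python
import json

# Trimmed example of Whisper's JSON output, inlined so the sketch is self-contained.
raw = '''{
  "text": " Hello there. Goodbye.",
  "language": "en",
  "segments": [
    {"id": 0, "start": 0.0, "end": 2.5, "text": " Hello there."},
    {"id": 1, "start": 2.5, "end": 4.0, "text": " Goodbye."}
  ]
}'''

def list_segments(whisper_json):
    """Return (start, end, text) tuples from a Whisper JSON string."""
    data = json.loads(whisper_json)
    return [(s["start"], s["end"], s["text"].strip()) for s in data["segments"]]

for start, end, text in list_segments(raw):
    print(f"{start:7.2f} - {end:7.2f}  {text}")
```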

`ffmpeg`

Usage:

docker run --rm -it -v $(pwd):/data jveldboom/video-swear-jar:v1 \
  ffmpeg -i input.mp4 output.avi

Roadmap

Notes

Alternatives

Below is a small list of possible alternatives to video-swear-jar.

AWS Transcribe - in my testing this provides very good results, but it is relatively expensive for small projects and requires uploading your video files to the cloud.
OpenAI Whisper API - requires uploading your video to the cloud

Tools

A small list of tools I use in my workflow

MakeMKV - "convert videos (DVD/Blu-ray) that you own into free and patents-unencumbered format that can be played everywhere"
