m1guelpf / yt-whisper

Using OpenAI's Whisper to automatically generate YouTube subtitles

License: MIT License

Python 100.00%
ffmpeg openai openai-whisper whisper youtube youtube-dl subtitles subtitles-generated transcribe

yt-whisper's Issues

youtube-dl is Heavily Throttled, Consider supporting yt-dlp

youtube-dl is no longer maintained (its last update was in 2021), and I'm finding that my downloads are being heavily throttled.

Could you please consider switching or supporting yt-dlp? It is a fork that is actively supported, has more features, and is faster.

Error: no matches found

Steps to reproduce

  1. After installation, use example command: yt_whisper https://www.youtube.com/watch?v=dQw4w9WgXcQ

Error

zsh: no matches found: https://www.youtube.com/watch?v=dQw4w9WgXcQ
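
A likely cause is that zsh treats the unquoted `?` in the URL as a filename wildcard and aborts when no file matches, so the URL needs to be quoted on the command line. As a minimal sketch (not part of yt-whisper), Python's standard `shlex` module shows what a safely quoted command looks like:

```python
import shlex

url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

# shlex.quote wraps the argument in single quotes because "?" is a
# shell metacharacter (a glob wildcard in zsh and bash).
quoted = shlex.quote(url)

# shlex.join builds a full command line with every argument quoted as needed.
command = shlex.join(["yt_whisper", url])
print(command)
```

Typing the printed command, with the URL inside quotes, should avoid the "no matches found" error.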

Can't install from prebuilt wheel

First, I'm not sure whether this is a user error or a project issue, but I couldn't find anywhere else to ask for help, so I'm asking here.

The installation instructions don't mention a Rust compiler. However, when I run the installation, it says:

If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.

Should I try installing a Rust compiler? It sounds a bit complicated, so it hasn't been my first course of action if a prebuilt wheel is available.

Run with docker

First of all, thanks for making yt-whisper.

I have almost zero knowledge of Docker, but I tried the following Dockerfile:

ARG BASE=python:3.10
FROM ${BASE}
RUN mkdir -p /output
RUN apt-get update && apt-get install -y ffmpeg git
RUN pip install git+https://github.com/m1guelpf/yt-whisper.git

ENTRYPOINT ["yt_whisper"]

For some reason, the process ends with Killed right after downloading the large model. I might be doing something wrong but I don't know.

It would be nice to have docker support out of the box.

`winget` support on Windows

winget is now the official package manager for Windows. I see you already support choco; would you be able to add yt-whisper to winget as well?

How to run on local files?

Hi, nice project!!
Do you know how to run this on local .mp4 files? Is there a command I could use? Nothing was mentioned in your readme.

Thanks in advance! :)

Could not find a version that satisfies the requirement whisper

WARNING: Generating metadata for package whisper produced metadata for project name openai-whisper. Fix your #egg=whisper fragments.
Discarding git+https://github.com/openai/whisper.git@main#egg=whisper: Requested openai-whisper from git+https://github.com/openai/whisper.git@main#egg=whisper (from yt-whisper==1.0) has inconsistent name: expected 'whisper', but metadata has 'openai-whisper'
Collecting yt-dlp
Using cached yt_dlp-2023.1.6-py2.py3-none-any.whl (2.8 MB)
ERROR: Could not find a version that satisfies the requirement whisper (unavailable) (from yt-whisper) (from versions: 0.9.5, 0.9.6, 0.9.7, 0.9.8, 0.9.9, 0.9.10, 0.9.11, 0.9.12, 0.9.13, 0.9.14, 0.9.15, 0.9.16, 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.7, 1.1.8, 1.1.9, 1.1.10)
ERROR: No matching distribution found for whisper (unavailable)

Add line breaks into long lines

When I tried using Whisper to generate subtitles for one of my own talks earlier today, one adjustment I needed to make is to break up long lines. For example, instead of

A friend of mine was of the opinion that it was missing an exclamation mark in the string.

it is preferable to have

A friend of mine was of the opinion
that it was missing an exclamation mark in the string.

Netflix, along with other guidelines, recommends:

Prefer a bottom-heavy pyramid shape for subtitles when multiple line break options present themselves, but avoid having just one or two words on the top line.

I think it would be really cool if we could also automate that process of breaking the text into lines! :) Cheers for your really cool work here! \o/
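
One way the line breaking could be automated is sketched below. This is a minimal, purely length-based illustration and not part of yt-whisper: the function name `break_subtitle`, the 42-character threshold, and the three-word minimum for the top line are all assumptions chosen for the example. It picks the most balanced break in which the top line is no longer than the bottom line (the "bottom-heavy pyramid"), and it does not look for grammatical break points, so it will not necessarily break after "opinion" as in the example above.

```python
def break_subtitle(text: str, max_chars: int = 42, min_top_words: int = 3) -> str:
    """Split a long subtitle into two lines, keeping the top line the shorter one.

    max_chars is the length above which a line gets broken (an assumed default);
    min_top_words avoids leaving only one or two words on the top line.
    """
    words = text.split()
    if len(text) <= max_chars or len(words) <= min_top_words:
        return text

    best = None
    for i in range(min_top_words, len(words)):
        top = " ".join(words[:i])
        bottom = " ".join(words[i:])
        if len(top) > len(bottom):
            break  # the top line only grows from here, so stop searching
        best = (top, bottom)  # latest valid break is the most balanced one

    return text if best is None else "\n".join(best)


print(break_subtitle(
    "A friend of mine was of the opinion that it was missing "
    "an exclamation mark in the string."
))
```

A more faithful implementation would also prefer breaks after punctuation or before conjunctions, as subtitle style guides recommend.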

Please help me! ERROR

PS F:\yongqi\project> pip install git+https://github.com/m1guelpf/yt-whisper.git
Looking in indexes: http://pypi.douban.com/simple
Collecting git+https://github.com/m1guelpf/yt-whisper.git
Cloning https://github.com/m1guelpf/yt-whisper.git to c:\users\anbaobu\appdata\local\temp\pip-req-build-gh43evvc
Running command git clone --filter=blob:none --quiet https://github.com/m1guelpf/yt-whisper.git 'C:\Users\anbaobu\AppData\Local\Temp\pip-req-build-gh43evvc'
Resolved https://github.com/m1guelpf/yt-whisper.git to commit 0190e7e
Preparing metadata (setup.py) ... done
Collecting whisper@ git+https://github.com/openai/whisper.git@main#egg=whisper
Cloning https://github.com/openai/whisper.git (to revision main) to c:\users\anbaobu\appdata\local\temp\pip-install-oww449jx\whisper_ce653dcda98444e1baac0baa96e7d960
Running command git clone --filter=blob:none --quiet https://github.com/openai/whisper.git 'C:\Users\anbaobu\AppData\Local\Temp\pip-install-oww449jx\whisper_ce653dcda98444e1baac0baa96e7d960'
Resolved https://github.com/openai/whisper.git to commit 7858aa9c08d98f75575035ecd6481f462d66ca27
Preparing metadata (setup.py) ... done
WARNING: Generating metadata for package whisper produced metadata for project name openai-whisper. Fix your #egg=whisper fragments.
Discarding git+https://github.com/openai/whisper.git@main#egg=whisper: Requested openai-whisper from git+https://github.com/openai/whisper.git@main#egg=whisper (from yt-whisper==1.0) has inconsistent name: expected 'whisper', but metadata has 'openai-whisper'
Requirement already satisfied: yt-dlp in c:\python311\lib\site-packages (from yt-whisper==1.0) (2023.1.6)
ERROR: Could not find a version that satisfies the requirement whisper (unavailable) (from yt-whisper) (from versions: 0.9.5, 0.9.6, 0.9.7, 0.9.8, 0.9.9, 0.9.10, 0.9.11, 0.9.12, 0.9.13, 0.9.14, 0.9.15, 0.9.16, 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.7, 1.1.8, 1.1.9, 1.1.10)
ERROR: No matching distribution found for whisper (unavailable)

Command not runnable after install

After running

pip install git+https://github.com/m1guelpf/yt-whisper.git

and

yt_whisper "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

I get

bash: yt_whisper: command not found

Does python have a directory I need to add to my path?
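
pip installs command-line launchers such as `yt_whisper` into a "scripts" directory, which is not always on `PATH`. The sketch below, using only the standard library, prints the likely locations for the Python interpreter that runs it. It assumes a Linux-style layout for per-user installs (a `bin` directory under the user base), which may differ on other platforms.

```python
import os
import shutil
import site
import sysconfig


def candidate_script_dirs() -> list[str]:
    """Directories where pip may have placed the yt_whisper launcher."""
    dirs = [sysconfig.get_path("scripts")]
    # Per-user installs ("pip install --user") use a separate base directory;
    # on Linux the launchers live in its "bin" subdirectory (assumed layout).
    dirs.append(os.path.join(site.getuserbase(), "bin"))
    return dirs


def on_path(directory: str) -> bool:
    """Return True if the directory is listed in the PATH environment variable."""
    return directory in os.environ.get("PATH", "").split(os.pathsep)


for directory in candidate_script_dirs():
    status = "on PATH" if on_path(directory) else "NOT on PATH"
    print(f"{directory}: {status}")

# shutil.which returns None when the command cannot be found on PATH.
print("yt_whisper found at:", shutil.which("yt_whisper"))
```

If one of the printed directories contains `yt_whisper` but is reported as not on `PATH`, adding it to `PATH` in the shell profile should make the command available.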

I keep getting this error.

line 8, in <module>
sys.exit(main())
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/yt_whisper/cli.py", line 49, in main
result = model.transcribe(audio_path, **args)
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/whisper/transcribe.py", line 84, in transcribe
mel = log_mel_spectrogram(audio)
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/whisper/audio.py", line 111, in log_mel_spectrogram
audio = load_audio(audio)
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/whisper/audio.py", line 46, in load_audio
except ffmpeg.Error as e:
AttributeError: module 'ffmpeg' has no attribute 'Error'
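
One common cause of this error is that a different PyPI package that also installs a module named `ffmpeg` is shadowing `ffmpeg-python`, which is the package that provides `ffmpeg.Error`. The check below is a small diagnostic sketch under that assumption; the helper name `has_ffmpeg_python` is made up for this example.

```python
import importlib


def has_ffmpeg_python() -> bool:
    """Return True if the importable `ffmpeg` module looks like ffmpeg-python."""
    try:
        module = importlib.import_module("ffmpeg")
    except ImportError:
        return False
    # ffmpeg-python exposes both an `Error` exception and an `input` function.
    return hasattr(module, "Error") and hasattr(module, "input")


if has_ffmpeg_python():
    print("The ffmpeg module provides ffmpeg.Error (looks like ffmpeg-python).")
else:
    print("The ffmpeg module is missing or is not ffmpeg-python.")
```

If the check fails, uninstalling the conflicting `ffmpeg` package and reinstalling `ffmpeg-python` is worth trying.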

[M1 Mac] fft: ATen not compiled with MKL support

base ❯ yt_whisper 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'
100%|███████████████████████████████████████| 461M/461M [00:26<00:00, 18.2MiB/s]
Downloading video: 0.0%
Downloading video: 0.1%
Downloading video: 0.2%
Downloading video: 0.4%
Downloading video: 0.9%
Downloading video: 1.9%
Downloading video: 3.8%
Downloading video: 7.6%
Downloading video: 15.2%
Downloading video: 30.5%
Downloading video: 61.0%
Downloading video: 100.0%
Downloaded video "Rick Astley - Never Gonna Give You Up (Official Music Video)". Generating subtitles...
Traceback (most recent call last):
  File "/opt/homebrew/bin/yt_whisper", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/lib/python3.9/site-packages/yt_whisper/cli.py", line 40, in main
    result = model.transcribe(audio_path, **args)
  File "/opt/homebrew/lib/python3.9/site-packages/whisper/transcribe.py", line 84, in transcribe
    mel = log_mel_spectrogram(audio)
  File "/opt/homebrew/lib/python3.9/site-packages/whisper/audio.py", line 115, in log_mel_spectrogram
    stft = torch.stft(audio, N_FFT, HOP_LENGTH, window=window, return_complex=True)
  File "/opt/homebrew/lib/python3.9/site-packages/torch/functional.py", line 471, in stft
    return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
RuntimeError: fft: ATen not compiled with MKL support

Installation fails

The installation fails with the following error:

janwe@DESKTOP:~$ pip install git+https://github.com/m1guelpf/yt-whisper.git
Defaulting to user installation because normal site-packages is not writeable
Collecting git+https://github.com/m1guelpf/yt-whisper.git
  Cloning https://github.com/m1guelpf/yt-whisper.git to /tmp/pip-req-build-z4rcqme6
  Running command git clone --filter=blob:none --quiet https://github.com/m1guelpf/yt-whisper.git /tmp/pip-req-build-z4rcqme6
  Resolved https://github.com/m1guelpf/yt-whisper.git to commit 4e49b7851d8ee2f389f770b3e0967d4b1ac05a7d
  Preparing metadata (setup.py) ... done
Collecting whisper@ git+ssh://git@github.com/openai/whisper@main#egg=whisper
  Cloning ssh://****@github.com/openai/whisper (to revision main) to /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428
  Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/openai/whisper' /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428
The authenticity of host 'github.com (140.82.121.3)' can't be established.
ED25519 key fingerprint is SHA256:+DiY3wvvV6TuJJhbpZisF/.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
  Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
  git@github.com: Permission denied (publickey).
  fatal: Could not read from remote repository.

  Please make sure you have the correct access rights
  and the repository exists.
  error: subprocess-exited-with-error

  × git clone --filter=blob:none --quiet 'ssh://****@github.com/openai/whisper' /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428 did not run successfully.
  │ exit code: 128
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet 'ssh://****@github.com/openai/whisper' /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428 did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Support for local files?

Does this tool support loading local video files, or is it limited to videos hosted on YouTube?

Files Over 1 hour aren't timestamped correctly

YouTube apparently rejects VTT files longer than one hour unless the timestamps are adjusted. After I went through and fixed the timestamps with a VS Code plugin, the file was accepted. I don't know whether the fix was the leading zeroes on the sub-one-hour timestamps or the zero in front of the one-hour timestamps, but it did the trick.

Here are the files before and after fixing:
Untold_Realms__The_Rise_of_Vestia___Session_6.txt
Untold_Realms__The_Rise_of_Vestia___Session_6-FIXED.txt
This is the video in question: https://www.youtube.com/watch?v=Cnk_7vk_4To&t=2s

Thank you for this awesome tool - I'll subtitle a bunch more videos in the future with it! :)
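
The fix described above amounts to writing every timestamp in the same zero-padded `HH:MM:SS.mmm` form, whether or not the cue is past the one-hour mark. A minimal sketch of such a formatter follows; the name `format_timestamp` is chosen for illustration and is not claimed to match yt-whisper's own code.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as a zero-padded HH:MM:SS.mmm string."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    # Always emit the hours field so cues before and after one hour look alike.
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


print(format_timestamp(75.25))   # a cue before the one-hour mark
print(format_timestamp(3725.5))  # a cue after the one-hour mark
```

Using one fixed format for the whole file avoids mixing `MM:SS.mmm` and `H:MM:SS.mmm` cues, which is the kind of inconsistency the manual fix removed.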

Language support

Hi,

Is there any guidance, or could the docs be extended, on how to use it with the Polish language?

Thanks

New yt-whisper

Hello everyone,

We at the EasyBooks initiative are working on a Python package that accepts a YouTube video or playlist link and transcribes it using Whisper.

The package supports all features of yt-whisper and adds new features to it. You can find the package here: https://github.com/ieasybooks/tafrigh.

You can contribute by adding an English README.md file, or try it with our easy-to-use Google Colab notebook: https://colab.research.google.com/github/ieasybooks/tafrigh/blob/main/colab_notebook.ipynb.

Timing in the .vtt file is way off

First off, thanks for this very nice and useful project. I ran yt-whisper on a video just to check things out. I found that the timings in the VTT file varied between spot-on and several subtitles ahead. At first I thought the titles were gradually getting ever-further ahead but then they would periodically sync up with the audio and then start gradually getting ahead again and the cycle would repeat throughout the video.

How do I proceed next to get the text? [I am a noob]

If the following is the code for whisper (it is for local video), what will be the code for yt-whisper for a youtube video?

import whisper
model = whisper.load_model("base")
result = model.transcribe("localaudio.mp3")
print(result["text"])
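
For a YouTube video, the same Whisper calls can be used once the audio has been downloaded. The sketch below combines the snippet above with the `yt-dlp` Python API. It is an illustration of the idea rather than yt-whisper's actual implementation; the function name `transcribe_youtube` and the output template are assumptions, and both `yt-dlp` and `openai-whisper` (plus ffmpeg) must be installed.

```python
def transcribe_youtube(url: str, model_name: str = "base") -> str:
    """Download the audio track of a YouTube video and return Whisper's transcript."""
    # Imported inside the function because both are third-party packages
    # (yt-dlp and openai-whisper) that may not be installed.
    import whisper
    import yt_dlp

    options = {
        "format": "bestaudio/best",   # audio is enough for transcription
        "outtmpl": "%(id)s.%(ext)s",  # save as <video id>.<extension>
        "quiet": True,
    }
    with yt_dlp.YoutubeDL(options) as ydl:
        info = ydl.extract_info(url, download=True)
        audio_path = ydl.prepare_filename(info)

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return result["text"]
```

Calling `print(transcribe_youtube("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))` would then print the transcript, much like the local-file example.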

Length of the subtitles

Thank you very much for this program. It saves me a lot of time!

The subtitles produced are superior to the ones generated by YouTube in every way except one: they are too long, on occasion even ridiculously long.

I wonder if it would be possible to define a maximum number of characters that appear on screen at the same time? Or at least make a mandatory break after a period (.) if the sentence is above a certain number of characters?
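
Both suggestions can be sketched with the standard library: split the text after sentence-ending punctuation first, then wrap any sentence that is still longer than the limit. The function name `split_cue_text` and the default limit of 84 characters are assumptions made for this example. The sketch handles only the text, and redistributing the cue's start and end times across the pieces is left out.

```python
import re
import textwrap


def split_cue_text(text: str, max_chars: int = 84) -> list[str]:
    """Split subtitle text into chunks of at most max_chars characters.

    Sentences are separated first; sentences longer than max_chars are
    wrapped at word boundaries.
    """
    chunks = []
    # Split after ".", "!" or "?" when followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        if len(sentence) <= max_chars:
            chunks.append(sentence)
        else:
            chunks.extend(textwrap.wrap(sentence, width=max_chars))
    return chunks


for chunk in split_cue_text("Short one. " + "word " * 30, max_chars=40):
    print(chunk)
```

A single word longer than `max_chars` would be broken mid-word by `textwrap.wrap`, which is its default behaviour.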

large-v2

OpenAI released large-v2. Could you add support for it here and in your Replicate model?

Support for manual language selection

It would be useful to be able to pass a language argument to Whisper, since automatic language detection doesn't always work reliably. Currently, trying to specify a language produces the error: "yt_whisper: error: unrecognized arguments: --language"
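
A sketch of how such an option could be wired up with `argparse` is shown below. The parser here is hypothetical and does not reproduce yt-whisper's real command-line interface; it only illustrates collecting a `--language` value and turning it into a keyword argument that could be forwarded to Whisper's `model.transcribe(audio_path, **options)`, which accepts a `language` option.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical yt_whisper-style parser with a --language option."""
    parser = argparse.ArgumentParser(prog="yt_whisper")
    parser.add_argument("video", nargs="+", help="video URLs to transcribe")
    parser.add_argument(
        "--language",
        default=None,
        help="language spoken in the video, e.g. 'pl'; omit to auto-detect",
    )
    return parser


def transcribe_options(argv: list[str]) -> dict:
    """Turn command-line arguments into keyword arguments for transcribe()."""
    args = build_parser().parse_args(argv)
    options = {}
    if args.language is not None:
        options["language"] = args.language
    return options


print(transcribe_options(["--language", "pl", "https://www.youtube.com/watch?v=dQw4w9WgXcQ"]))
```

Leaving the option out keeps the dictionary empty, so Whisper falls back to automatic language detection.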
