m1guelpf / yt-whisper

1.3K 17.0 134.0 16 KB

Using OpenAI's Whisper to automatically generate YouTube subtitles

License: MIT License

Python 100.00%

ffmpeg openai openai-whisper whisper youtube youtube-dl subtitles subtitles-generated transcribe

yt-whisper's Issues

youtube-dl is Heavily Throttled, Consider supporting yt-dlp

youtube-dl is no longer being maintained (its last update was in 2021), and I'm finding that my downloads are heavily throttled.

Could you please consider switching to yt-dlp, or supporting it as an alternative? It is an actively maintained fork that has more features and is faster.

Steps to reproduce

After installation, use example command: yt_whisper https://www.youtube.com/watch?v=dQw4w9WgXcQ

Error

zsh: no matches found: https://www.youtube.com/watch?v=dQw4w9WgXcQ
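
A note on the error above: zsh treats the `?` in the URL as a glob character, so the unquoted URL is expanded as a filename pattern and fails with "no matches found". Quoting the URL avoids this. The sketch below only illustrates the quoting rule using Python's standard `shlex.quote`; the variable names are illustrative and not part of yt-whisper.

```python
import shlex

# The video URL from the report above; "?" is a glob character in zsh.
url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

# shlex.quote wraps the string in single quotes when it contains
# characters that a POSIX-style shell would otherwise interpret.
command = "yt_whisper " + shlex.quote(url)
print(command)
```

Typing the URL inside single or double quotes by hand has the same effect.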

Can I translate English videos to Korean?

What should I do?

Can't install from prebuilt wheel

First, I'm not sure whether this is a user error or a project issue, but I couldn't find anywhere else to ask for help, so I figured I'd ask here.

The installation instructions don't mention a Rust compiler; however, when I run the installation, it says:

If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.

Should I try installing a Rust compiler? It sounds a bit complicated, so it hasn't been my first course of action if a prebuilt wheel is available.

Run with docker

First of all, thanks for making yt-whisper.

I have almost zero knowledge of Docker, but I tried the following Dockerfile:

ARG BASE=python:3.10
FROM ${BASE}
RUN mkdir -p /output
RUN apt-get update && apt-get install -y ffmpeg git
RUN pip install git+https://github.com/m1guelpf/yt-whisper.git

ENTRYPOINT ["yt_whisper"]

For some reason, the process ends with Killed right after downloading the large model. I might be doing something wrong, but I don't know what.

It would be nice to have docker support out of the box.

`winget` support on Windows

winget is now the official package manager for Windows. I see you support choco; would you be able to add yt-whisper to winget as well?

How to run on local files?

Hi, nice project!!
Do you know how to run this on local .mp4 files? Is there a command I could use? Nothing was mentioned in your readme.

Thanks in advance! :)

Could not find a version that satisfies the requirement whisper

WARNING: Generating metadata for package whisper produced metadata for project name openai-whisper. Fix your #egg=whisper fragments.
Discarding git+https://github.com/openai/whisper.git@main#egg=whisper: Requested openai-whisper from git+https://github.com/openai/whisper.git@main#egg=whisper (from yt-whisper==1.0) has inconsistent name: expected 'whisper', but metadata has 'openai-whisper'
Collecting yt-dlp
Using cached yt_dlp-2023.1.6-py2.py3-none-any.whl (2.8 MB)
ERROR: Could not find a version that satisfies the requirement whisper (unavailable) (from yt-whisper) (from versions: 0.9.5, 0.9.6, 0.9.7, 0.9.8, 0.9.9, 0.9.10, 0.9.11, 0.9.12, 0.9.13, 0.9.14, 0.9.15, 0.9.16, 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.7, 1.1.8, 1.1.9, 1.1.10)
ERROR: No matching distribution found for whisper (unavailable)
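
The log above says pip expected a project named `whisper` (from the `#egg=whisper` fragment) but the cloned repository declares itself as `openai-whisper`, so the requirement is discarded. One way to make the names agree is a direct reference whose name matches the package metadata. This is a minimal sketch; the `INSTALL_REQUIRES` list and the `declared_name` helper are hypothetical and are not taken from the repository's actual setup.py.

```python
# Hypothetical dependency list for a setup.py (an assumption, not the
# repository's real file). The name before "@" must match the name in the
# dependency's own metadata, which the log reports as "openai-whisper".
INSTALL_REQUIRES = [
    "yt-dlp",
    "openai-whisper @ git+https://github.com/openai/whisper.git@main",
]


def declared_name(requirement: str) -> str:
    """Return the project name that precedes the '@' in a direct reference."""
    return requirement.split("@", 1)[0].strip()


for requirement in INSTALL_REQUIRES:
    print(declared_name(requirement))
```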

Add line breaks into long lines

When I tried using Whisper to generate subtitles for one of my own talks earlier today, one adjustment I needed to make was to break up long lines. For example, instead of

A friend of mine was of the opinion that it was missing an exclamation mark in the string.

it is preferable to have

A friend of mine was of the opinion
that it was missing an exclamation mark in the string.

Netflix, along with other guidelines, recommends:

Prefer a bottom-heavy pyramid shape for subtitles when multiple line break options present themselves, but avoid having just one or two words on the top line.

I think it would be really cool if we could also automate that process of breaking the text into lines! :) Cheers for your really cool work here! \o/
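
The request above could be sketched roughly as follows. This is a minimal sketch, not yt-whisper's behaviour: the function name `break_subtitle_line` and the `max_chars` default are assumptions, and the rule is a simple reading of the quoted guideline (top line no longer than the bottom line, and never just one or two words on top). It splits purely by length, so it will not necessarily pick the clause boundary used in the example above.

```python
def break_subtitle_line(text: str, max_chars: int = 42) -> str:
    """Split a long subtitle into two lines, keeping the top line shorter."""
    if len(text) <= max_chars:
        return text  # short enough to stay on one line

    words = text.split()
    best = None
    # Start at 3 so the top line never has only one or two words.
    for i in range(3, len(words)):
        top = " ".join(words[:i])
        bottom = " ".join(words[i:])
        # Bottom-heavy: stop once the top line would exceed the bottom line.
        if len(top) > len(bottom):
            break
        best = (top, bottom)

    if best is None:
        return text  # no acceptable split point was found
    return best[0] + "\n" + best[1]


line = ("A friend of mine was of the opinion that it was missing "
        "an exclamation mark in the string.")
print(break_subtitle_line(line))
```

The sketch keeps the latest split that is still bottom-heavy, which gives the most balanced pair of lines. A fuller version would also prefer breaks after punctuation or before conjunctions.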

Please help me! ERROR

PS F:\yongqi\project> pip install git+https://github.com/m1guelpf/yt-whisper.git
Looking in indexes: http://pypi.douban.com/simple
Collecting git+https://github.com/m1guelpf/yt-whisper.git
Cloning https://github.com/m1guelpf/yt-whisper.git to c:\users\anbaobu\appdata\local\temp\pip-req-build-gh43evvc
Running command git clone --filter=blob:none --quiet https://github.com/m1guelpf/yt-whisper.git 'C:\Users\anbaobu\AppData\Local\Temp\pip-req-build-gh43evvc'
Resolved https://github.com/m1guelpf/yt-whisper.git to commit 0190e7e
Preparing metadata (setup.py) ... done
Collecting whisper@ git+https://github.com/openai/whisper.git@main#egg=whisper
Cloning https://github.com/openai/whisper.git (to revision main) to c:\users\anbaobu\appdata\local\temp\pip-install-oww449jx\whisper_ce653dcda98444e1baac0baa96e7d960
Running command git clone --filter=blob:none --quiet https://github.com/openai/whisper.git 'C:\Users\anbaobu\AppData\Local\Temp\pip-install-oww449jx\whisper_ce653dcda98444e1baac0baa96e7d960'
Resolved https://github.com/openai/whisper.git to commit 7858aa9c08d98f75575035ecd6481f462d66ca27
Preparing metadata (setup.py) ... done
WARNING: Generating metadata for package whisper produced metadata for project name openai-whisper. Fix your #egg=whisper fragments.
Discarding git+https://github.com/openai/whisper.git@main#egg=whisper: Requested openai-whisper from git+https://github.com/openai/whisper.git@main#egg=whisper (from yt-whisper==1.0) has inconsistent name: expected 'whisper', but metadata has 'openai-whisper'
Requirement already satisfied: yt-dlp in c:\python311\lib\site-packages (from yt-whisper==1.0) (2023.1.6)
ERROR: Could not find a version that satisfies the requirement whisper (unavailable) (from yt-whisper) (from versions: 0.9.5, 0.9.6, 0.9.7, 0.9.8, 0.9.9, 0.9.10, 0.9.11, 0.9.12, 0.9.13, 0.9.14, 0.9.15, 0.9.16, 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.7, 1.1.8, 1.1.9, 1.1.10)
ERROR: No matching distribution found for whisper (unavailable)

If you are getting this error: ERROR: [youtube] dQw4w9WgXcQ:

It's a simple fix.

Update the yt_dlp package by following these instructions:
https://github.com/yt-dlp/yt-dlp/wiki/Installation#with-pip

Command not runnable after install

After running

pip install git+https://github.com/m1guelpf/yt-whisper.git

and

yt_whisper "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

I get

bash: yt_whisper: command not found

Does python have a directory I need to add to my path?
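
To help answer the question above: pip places console scripts such as `yt_whisper` in the interpreter's scripts directory, and the shell only finds them if that directory is on PATH. The sketch below uses the standard `sysconfig` module to print that directory and check PATH; the helper name `scripts_dir_on_path` is illustrative. It assumes a normal (non `--user`) install. For a `--user` install the scripts go under the user base directory instead, which `python -m site --user-base` prints.

```python
import os
import sysconfig


def scripts_dir_on_path() -> bool:
    """Report whether this interpreter's scripts directory is on PATH."""
    scripts_dir = sysconfig.get_path("scripts")
    path_entries = os.environ.get("PATH", "").split(os.pathsep)
    return scripts_dir in path_entries


print("Scripts directory:", sysconfig.get_path("scripts"))
print("On PATH:", scripts_dir_on_path())
```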

I keep getting this error.

line 8, in <module>
sys.exit(main())
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/yt_whisper/cli.py", line 49, in main
result = model.transcribe(audio_path, **args)
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/whisper/transcribe.py", line 84, in transcribe
mel = log_mel_spectrogram(audio)
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/whisper/audio.py", line 111, in log_mel_spectrogram
audio = load_audio(audio)
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/whisper/audio.py", line 46, in load_audio
except ffmpeg.Error as e:
AttributeError: module 'ffmpeg' has no attribute 'Error'

No issue just thank you

Arabic subtitles worked flawlessly
Thank you very much for your hard work and efforts

[M1 Mac] fft: ATen not compiled with MKL support

base ❯ yt_whisper 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'
100%|███████████████████████████████████████| 461M/461M [00:26<00:00, 18.2MiB/s]
Downloading video: 0.0%
Downloading video: 0.1%
Downloading video: 0.2%
Downloading video: 0.4%
Downloading video: 0.9%
Downloading video: 1.9%
Downloading video: 3.8%
Downloading video: 7.6%
Downloading video: 15.2%
Downloading video: 30.5%
Downloading video: 61.0%
Downloading video: 100.0%
Downloaded video "Rick Astley - Never Gonna Give You Up (Official Music Video)". Generating subtitles...
Traceback (most recent call last):
  File "/opt/homebrew/bin/yt_whisper", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/lib/python3.9/site-packages/yt_whisper/cli.py", line 40, in main
    result = model.transcribe(audio_path, **args)
  File "/opt/homebrew/lib/python3.9/site-packages/whisper/transcribe.py", line 84, in transcribe
    mel = log_mel_spectrogram(audio)
  File "/opt/homebrew/lib/python3.9/site-packages/whisper/audio.py", line 115, in log_mel_spectrogram
    stft = torch.stft(audio, N_FFT, HOP_LENGTH, window=window, return_complex=True)
  File "/opt/homebrew/lib/python3.9/site-packages/torch/functional.py", line 471, in stft
    return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
RuntimeError: fft: ATen not compiled with MKL support

Installation fails

The installation fails with the following error:

janwe@DESKTOP:~$ pip install git+https://github.com/m1guelpf/yt-whisper.git
Defaulting to user installation because normal site-packages is not writeable
Collecting git+https://github.com/m1guelpf/yt-whisper.git
  Cloning https://github.com/m1guelpf/yt-whisper.git to /tmp/pip-req-build-z4rcqme6
  Running command git clone --filter=blob:none --quiet https://github.com/m1guelpf/yt-whisper.git /tmp/pip-req-build-z4rcqme6
  Resolved https://github.com/m1guelpf/yt-whisper.git to commit 4e49b7851d8ee2f389f770b3e0967d4b1ac05a7d
  Preparing metadata (setup.py) ... done
Collecting whisper@ git+ssh://git@github.com/openai/whisper@main#egg=whisper
  Cloning ssh://****@github.com/openai/whisper (to revision main) to /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428
  Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/openai/whisper' /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428
The authenticity of host 'github.com (140.82.121.3)' can't be established.
ED25519 key fingerprint is SHA256:+DiY3wvvV6TuJJhbpZisF/.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
  Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
  git@github.com: Permission denied (publickey).
  fatal: Could not read from remote repository.

  Please make sure you have the correct access rights
  and the repository exists.
  error: subprocess-exited-with-error

  × git clone --filter=blob:none --quiet 'ssh://****@github.com/openai/whisper' /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428 did not run successfully.
  │ exit code: 128
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet 'ssh://****@github.com/openai/whisper' /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428 did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Support for local files?

Does this tool support loading local video files, or is it limited to videos hosted on YouTube?

Files Over 1 hour aren't timestamped correctly

YouTube apparently doesn't accept VTT files longer than 1 hour unless the timestamps are adjusted for it. When I went through and fixed the timestamps with this plugin for VS Code, the files were accepted. I don't know whether it was the leading zeroes on the sub-1-hour timestamps or the zero in front of the 1-hour timestamps, but that did the trick somehow.

Here are the files before and after fixing:
Untold_Realms__The_Rise_of_Vestia___Session_6.txt
Untold_Realms__The_Rise_of_Vestia___Session_6-FIXED.txt
This is the video in question: https://www.youtube.com/watch?v=Cnk_7vk_4To&t=2s

Thank you for this awesome tool - I'll subtitle a bunch more videos in the future with it! :)
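
A minimal sketch of one way to avoid the mismatch described above: format every cue time with a zero-padded hours field, so timestamps before and after the one-hour mark have the same shape. Treating this consistent `HH:MM:SS.mmm` form as what YouTube accepts is an assumption based on the report, and `format_vtt_timestamp` is an illustrative name, not a function from yt-whisper.

```python
def format_vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS.mmm, always including hours."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


# A cue that crosses the one-hour mark keeps the same format on both sides.
print(format_vtt_timestamp(3599.5), "-->", format_vtt_timestamp(3661.25))
```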

Language support

Hi,

Any idea, or any documentation, on how to use it with the Polish language?

Thanks

New yt-whisper

Hello everyone,

We, at the EasyBooks initiative, are working on a Python package that accepts a YouTube video/playlist link and transcribes it using Whisper.

The package supports all features of yt-whisper and adds new features to it. You can find the package here: https://github.com/ieasybooks/tafrigh.

You can contribute by adding an English README.md file, or try it with our easy-to-use Google Colab notebook: https://colab.research.google.com/github/ieasybooks/tafrigh/blob/main/colab_notebook.ipynb.

Timing in the .vtt file is way off

First off, thanks for this very nice and useful project. I ran yt-whisper on a video just to check things out. I found that the timings in the VTT file varied between spot-on and several subtitles ahead. At first I thought the titles were gradually getting ever-further ahead but then they would periodically sync up with the audio and then start gradually getting ahead again and the cycle would repeat throughout the video.

How do I proceed next to get the text? [I am a noob]

If the following is the code for whisper (it is for local video), what will be the code for yt-whisper for a youtube video?

import whisper
model = whisper.load_model("base")
result = model.transcribe("localaudio.mp3")
print(result["text"])
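
From the command line, other reports on this page run the tool as yt_whisper followed by the quoted video URL. For the Python equivalent of the snippet above, a rough sketch is to download the audio with yt-dlp and pass the file to Whisper. This is an assumption about how the two libraries can be combined, not yt-whisper's actual code; `build_ydl_opts` and `transcribe_youtube` are hypothetical names, and ffmpeg must be installed for Whisper to read the audio.

```python
import os
import tempfile


def build_ydl_opts(out_dir: str) -> dict:
    """yt-dlp options: fetch the best audio stream into out_dir."""
    return {
        "format": "bestaudio/best",
        "outtmpl": os.path.join(out_dir, "%(id)s.%(ext)s"),
        "quiet": True,
    }


def transcribe_youtube(url: str, model_name: str = "base") -> str:
    """Download the audio of a YouTube video and return Whisper's transcript."""
    # Imported lazily so build_ydl_opts works without these packages installed.
    import whisper
    import yt_dlp

    with tempfile.TemporaryDirectory() as tmp:
        with yt_dlp.YoutubeDL(build_ydl_opts(tmp)) as ydl:
            info = ydl.extract_info(url, download=True)
            audio_path = ydl.prepare_filename(info)
        model = whisper.load_model(model_name)
        result = model.transcribe(audio_path)
    return result["text"]
```

Calling `transcribe_youtube` with a video URL returns the same kind of text as `result["text"]` in the local-file snippet.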

Support running the model without GPU

There should be an option for the user to run the model on CPU
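
When using the Whisper library directly, the model can be placed on the CPU through the `device` argument of `whisper.load_model`, and half precision can be turned off with `fp16=False` when transcribing. Whether yt_whisper exposes such an option is not stated in this thread, so the sketch below is only a library-level illustration; `load_on_cpu` and `CPU_TRANSCRIBE_ARGS` are illustrative names.

```python
# Half precision is a GPU feature, so request FP32 explicitly on the CPU.
CPU_TRANSCRIBE_ARGS = {"fp16": False}


def load_on_cpu(model_name: str = "base"):
    """Load a Whisper model on the CPU regardless of GPU availability."""
    import whisper  # imported lazily; requires the openai-whisper package

    return whisper.load_model(model_name, device="cpu")


# Example use (downloads the model on first run):
#   model = load_on_cpu("base")
#   result = model.transcribe("localaudio.mp3", **CPU_TRANSCRIBE_ARGS)
```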

Length of the subtitles

Thank you very much for this program. It saves me a lot of time!

The subtitles produced are superior to the ones generated by YouTube in every way except one: they are too long, on occasion even ridiculously long.

I wonder if it would be possible to define a maximum number of characters that appear on screen at the same time? Or at least make a mandatory break after a period (.) if the sentence is above a certain number of characters?
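
Both ideas above (a character limit, and a break after a period) can be sketched together. This is a minimal sketch under stated assumptions: `split_caption` and its `max_chars` default are hypothetical, and it only splits the text. Dividing the original cue's time range between the resulting pieces is not handled here.

```python
import re
import textwrap


def split_caption(text: str, max_chars: int = 80) -> list[str]:
    """Split text into pieces of at most max_chars, breaking after sentences first."""
    # Break after ".", "!" or "?" when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    for sentence in sentences:
        if not sentence:
            continue  # skip empty input
        if len(sentence) <= max_chars:
            chunks.append(sentence)
        else:
            # Fall back to word wrapping for sentences that are still too long.
            chunks.extend(textwrap.wrap(sentence, width=max_chars))
    return chunks


sample = "Short one. " + " ".join(["word"] * 30)
for piece in split_caption(sample):
    print(piece)
```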