m1guelpf / yt-whisper
Using OpenAI's Whisper to automatically generate YouTube subtitles
License: MIT License
youtube-dl isn't being supported anymore (its last update was in 2021), and I'm finding that my downloads are being throttled heavily.
Could you please consider switching or supporting yt-dlp? It is a fork that is actively supported, has more features, and is faster.
yt_whisper https://www.youtube.com/watch?v=dQw4w9WgXcQ
zsh: no matches found: https://www.youtube.com/watch?v=dQw4w9WgXcQ
What should I do?
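A likely cause: zsh treats `?` as a filename wildcard, so an unquoted URL containing `?` is expanded as a glob pattern and fails with "no matches found". Quoting the URL avoids this. The sketch below uses Python's standard `shlex.quote` only to show what a safely quoted command line looks like; it is an illustration, not part of yt_whisper.

```python
import shlex

url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

# shlex.quote wraps the string in single quotes when it contains
# characters the shell treats specially, such as "?".
quoted = shlex.quote(url)
command = f"yt_whisper {quoted}"
print(command)  # yt_whisper 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'
```

Typing the printed command into zsh (with the quotes) passes the URL through unchanged.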
First, I'm not sure whether this is a user error or a project issue, but I can't find anywhere else to ask for assistance, so I figured I'd do it here.
The installation instructions don't mention a rust compiler, however, when the installation is run, it says
If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.
Should I try installing a Rust compiler? It sounds a bit complicated, so that hasn't been my first course of action if there is a prebuilt wheel.
First of all, thanks for making yt-whisper.
I have almost zero knowledge of Docker, but I tried the following Dockerfile:
ARG BASE=python:3.10
FROM ${BASE}
RUN mkdir -p /output
RUN apt-get update && apt-get install -y ffmpeg git
RUN pip install git+https://github.com/m1guelpf/yt-whisper.git
ENTRYPOINT ["yt_whisper"]
For some reason, the process ends with "Killed" right after downloading the large model. I might be doing something wrong, but I don't know what.
It would be nice to have docker support out of the box.
winget is the official package manager for Windows now. I see you have choco; would you be able to add it to winget as well?
Hi, nice project!!
Do you know how to run this on local .mp4 files? Is there a command I could use? Nothing was mentioned in your readme.
Thanks in advance! :)
WARNING: Generating metadata for package whisper produced metadata for project name openai-whisper. Fix your #egg=whisper fragments.
Discarding git+https://github.com/openai/whisper.git@main#egg=whisper: Requested openai-whisper from git+https://github.com/openai/whisper.git@main#egg=whisper (from yt-whisper==1.0) has inconsistent name: expected 'whisper', but metadata has 'openai-whisper'
Collecting yt-dlp
Using cached yt_dlp-2023.1.6-py2.py3-none-any.whl (2.8 MB)
ERROR: Could not find a version that satisfies the requirement whisper (unavailable) (from yt-whisper) (from versions: 0.9.5, 0.9.6, 0.9.7, 0.9.8, 0.9.9, 0.9.10, 0.9.11, 0.9.12, 0.9.13, 0.9.14, 0.9.15, 0.9.16, 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.7, 1.1.8, 1.1.9, 1.1.10)
ERROR: No matching distribution found for whisper (unavailable)
When I tried using Whisper to generate subtitles for one of my own talks earlier today, one adjustment I needed to make is to break up long lines. For example, instead of
A friend of mine was of the opinion that it was missing an exclamation mark in the string.
it is preferable to have
A friend of mine was of the opinion
that it was missing an exclamation mark in the string.
Netflix, along with other guidelines, recommends:
Prefer a bottom-heavy pyramid shape for subtitles when multiple line break options present themselves, but avoid having just one or two words on the top line.
I think it would be really cool if we could also automate that process of breaking the text into lines! :) Cheers for your really cool work here! \o/
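One possible way to automate the break is sketched below. It only looks at line lengths: among all two-line splits where the top line is no longer than the bottom line, it picks the most balanced one, and it never leaves fewer than `min_top_words` words on the top line. The function name, `max_len`, and `min_top_words` are assumptions made for this illustration, and the sketch does not try to break at clause boundaries the way a human subtitler would.

```python
def break_subtitle_line(text: str, max_len: int = 42, min_top_words: int = 3) -> str:
    """Split text into at most two lines, preferring a bottom-heavy shape."""
    words = text.split()
    # Short lines, or lines with too few words to split, are left alone.
    if len(text) <= max_len or len(words) <= min_top_words:
        return text

    best = None  # (length difference, top line, bottom line)
    for i in range(min_top_words, len(words)):
        top = " ".join(words[:i])
        bottom = " ".join(words[i:])
        if len(top) > len(bottom):
            break  # the top line only grows from here, so stop searching
        diff = len(bottom) - len(top)
        if best is None or diff < best[0]:
            best = (diff, top, bottom)

    if best is None:
        return text
    return best[1] + "\n" + best[2]


if __name__ == "__main__":
    print(break_subtitle_line(
        "A friend of mine was of the opinion that it was missing "
        "an exclamation mark in the string."
    ))
```

A fuller version might also prefer breaks before conjunctions or after punctuation, which is what the manually broken example above does.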
PS F:\yongqi\project> pip install git+https://github.com/m1guelpf/yt-whisper.git
Looking in indexes: http://pypi.douban.com/simple
Collecting git+https://github.com/m1guelpf/yt-whisper.git
Cloning https://github.com/m1guelpf/yt-whisper.git to c:\users\anbaobu\appdata\local\temp\pip-req-build-gh43evvc
Running command git clone --filter=blob:none --quiet https://github.com/m1guelpf/yt-whisper.git 'C:\Users\anbaobu\AppData\Local\Temp\pip-req-build-gh43evvc'
Resolved https://github.com/m1guelpf/yt-whisper.git to commit 0190e7e
Preparing metadata (setup.py) ... done
Collecting whisper@ git+https://github.com/openai/whisper.git@main#egg=whisper
Cloning https://github.com/openai/whisper.git (to revision main) to c:\users\anbaobu\appdata\local\temp\pip-install-oww449jx\whisper_ce653dcda98444e1baac0baa96e7d960
Running command git clone --filter=blob:none --quiet https://github.com/openai/whisper.git 'C:\Users\anbaobu\AppData\Local\Temp\pip-install-oww449jx\whisper_ce653dcda98444e1baac0baa96e7d960'
Resolved https://github.com/openai/whisper.git to commit 7858aa9c08d98f75575035ecd6481f462d66ca27
Preparing metadata (setup.py) ... done
WARNING: Generating metadata for package whisper produced metadata for project name openai-whisper. Fix your #egg=whisper fragments.
Discarding git+https://github.com/openai/whisper.git@main#egg=whisper: Requested openai-whisper from git+https://github.com/openai/whisper.git@main#egg=whisper (from yt-whisper==1.0) has inconsistent name: expected 'whisper', but metadata has 'openai-whisper'
Requirement already satisfied: yt-dlp in c:\python311\lib\site-packages (from yt-whisper==1.0) (2023.1.6)
ERROR: Could not find a version that satisfies the requirement whisper (unavailable) (from yt-whisper) (from versions: 0.9.5, 0.9.6, 0.9.7, 0.9.8, 0.9.9, 0.9.10, 0.9.11, 0.9.12, 0.9.13, 0.9.14, 0.9.15, 0.9.16, 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.7, 1.1.8, 1.1.9, 1.1.10)
ERROR: No matching distribution found for whisper (unavailable)
It's a simple fix: use this to update the yt-dlp package:
https://github.com/yt-dlp/yt-dlp/wiki/Installation#with-pip
After running
pip install git+https://github.com/m1guelpf/yt-whisper.git
and
yt_whisper "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
I get
bash: yt_whisper: command not found
Does python have a directory I need to add to my path?
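pip installs console scripts such as yt_whisper into a "scripts" directory that is not always on PATH. The sketch below lists two common candidates and reports which are missing from PATH. It assumes a POSIX-style layout for `--user` installs (`<userbase>/bin`); on Windows the user scripts directory is named differently, so treat the second entry as an assumption.

```python
import os
import site
import sysconfig


def candidate_script_dirs() -> list[str]:
    """Return directories where pip commonly places console scripts."""
    dirs = [sysconfig.get_path("scripts")]
    # Assumption: a POSIX-style "--user" install, which uses <userbase>/bin.
    dirs.append(os.path.join(site.getuserbase(), "bin"))
    return dirs


def dirs_missing_from_path(dirs: list[str]) -> list[str]:
    """Return the given directories that do not appear in PATH."""
    path_entries = os.environ.get("PATH", "").split(os.pathsep)
    return [d for d in dirs if d not in path_entries]


if __name__ == "__main__":
    for d in dirs_missing_from_path(candidate_script_dirs()):
        print(f"Not on PATH: {d}")
```

If one of the printed directories contains a `yt_whisper` file, adding that directory to PATH in the shell profile should make the command available.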
line 8, in
sys.exit(main())
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/yt_whisper/cli.py", line 49, in main
result = model.transcribe(audio_path, **args)
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/whisper/transcribe.py", line 84, in transcribe
mel = log_mel_spectrogram(audio)
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/whisper/audio.py", line 111, in log_mel_spectrogram
audio = load_audio(audio)
File "/Users/wesley/miniconda3/lib/python3.9/site-packages/whisper/audio.py", line 46, in load_audio
except ffmpeg.Error as e:
AttributeError: module 'ffmpeg' has no attribute 'Error'
Arabic subtitles worked flawlessly
Thank you very much for your hard work and efforts
base ❯ yt_whisper 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'
100%|███████████████████████████████████████| 461M/461M [00:26<00:00, 18.2MiB/s]
Downloading video: 0.0%
Downloading video: 0.1%
Downloading video: 0.2%
Downloading video: 0.4%
Downloading video: 0.9%
Downloading video: 1.9%
Downloading video: 3.8%
Downloading video: 7.6%
Downloading video: 15.2%
Downloading video: 30.5%
Downloading video: 61.0%
Downloading video: 100.0%
Downloaded video "Rick Astley - Never Gonna Give You Up (Official Music Video)". Generating subtitles...
Traceback (most recent call last):
File "/opt/homebrew/bin/yt_whisper", line 8, in <module>
sys.exit(main())
File "/opt/homebrew/lib/python3.9/site-packages/yt_whisper/cli.py", line 40, in main
result = model.transcribe(audio_path, **args)
File "/opt/homebrew/lib/python3.9/site-packages/whisper/transcribe.py", line 84, in transcribe
mel = log_mel_spectrogram(audio)
File "/opt/homebrew/lib/python3.9/site-packages/whisper/audio.py", line 115, in log_mel_spectrogram
stft = torch.stft(audio, N_FFT, HOP_LENGTH, window=window, return_complex=True)
File "/opt/homebrew/lib/python3.9/site-packages/torch/functional.py", line 471, in stft
return _VF.stft(input, n_fft, hop_length, win_length, window, # type: ignore[attr-defined]
RuntimeError: fft: ATen not compiled with MKL support
The installation fails with the following error:
janwe@DESKTOP:~$ pip install git+https://github.com/m1guelpf/yt-whisper.git
Defaulting to user installation because normal site-packages is not writeable
Collecting git+https://github.com/m1guelpf/yt-whisper.git
Cloning https://github.com/m1guelpf/yt-whisper.git to /tmp/pip-req-build-z4rcqme6
Running command git clone --filter=blob:none --quiet https://github.com/m1guelpf/yt-whisper.git /tmp/pip-req-build-z4rcqme6
Resolved https://github.com/m1guelpf/yt-whisper.git to commit 4e49b7851d8ee2f389f770b3e0967d4b1ac05a7d
Preparing metadata (setup.py) ... done
Collecting whisper@ git+ssh://git@github.com/openai/whisper@main#egg=whisper
Cloning ssh://****@github.com/openai/whisper (to revision main) to /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428
Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/openai/whisper' /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428
The authenticity of host 'github.com (140.82.121.3)' can't be established.
ED25519 key fingerprint is SHA256:+DiY3wvvV6TuJJhbpZisF/.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
error: subprocess-exited-with-error
× git clone --filter=blob:none --quiet 'ssh://****@github.com/openai/whisper' /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428 did not run successfully.
│ exit code: 128
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× git clone --filter=blob:none --quiet 'ssh://****@github.com/openai/whisper' /tmp/pip-install-4woekdls/whisper_6de07a42a4564342bb35df700a95c428 did not run successfully.
│ exit code: 128
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Does this tool support loading local video files, or is it limited to videos hosted on YouTube?
YouTube apparently doesn't accept VTT files longer than 1 hour unless the timestamps are adjusted for it. When I went through and fixed the timestamps with this plugin for VS Code, the file was accepted. I don't know whether it was the leading zeroes on the sub-1h timestamps or the zero in front of the 1h timestamps, but that did the trick somehow.
Here are the files before and after fixing:
Untold_Realms__The_Rise_of_Vestia___Session_6.txt
Untold_Realms__The_Rise_of_Vestia___Session_6-FIXED.txt
This is the video in question: https://www.youtube.com/watch?v=Cnk_7vk_4To&t=2s
Thank you for this awesome tool - I'll subtitle a bunch more videos in the future with it! :)
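The report does not establish which timestamp form YouTube actually requires. As one hedged workaround, the sketch below rewrites every timestamp on a cue timing line to a uniform `HH:MM:SS.mmm` form, so sub-1h and over-1h cues look the same. The function names are made up for this illustration.

```python
import re

# Matches WebVTT timestamps with or without an hours field,
# e.g. "59:58.000" or "1:00:02.500".
_TIMESTAMP = re.compile(r"(?<![\d:])(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})(?!\d)")


def _pad(match: re.Match) -> str:
    """Format one matched timestamp as HH:MM:SS.mmm."""
    hours = int(match.group(1) or 0)
    return f"{hours:02d}:{match.group(2)}:{match.group(3)}.{match.group(4)}"


def normalize_vtt(text: str) -> str:
    """Rewrite timestamps on cue timing lines; leave all other lines as-is."""
    out = []
    for line in text.splitlines():
        if "-->" in line:  # only cue timing lines contain the arrow
            line = _TIMESTAMP.sub(_pad, line)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "WEBVTT\n\n59:58.000 --> 1:00:02.500\nHello"
    print(normalize_vtt(sample))
```

For the sample above, the timing line becomes `00:59:58.000 --> 01:00:02.500`.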
Hi,
Any idea, or any documentation, on how to use it with the Polish language?
Thanks
Hello everyone,
We, at the EasyBooks initiative, are working on a Python package that accepts a YouTube video or playlist link and transcribes it using Whisper.
The package supports all features of yt-whisper and adds new features to it. You can find the package here: https://github.com/ieasybooks/tafrigh.
You can contribute by adding an English README.md file, or try it with our easy-to-use Google Colab notebook: https://colab.research.google.com/github/ieasybooks/tafrigh/blob/main/colab_notebook.ipynb.
First off, thanks for this very nice and useful project. I ran yt-whisper on a video just to check things out. I found that the timings in the VTT file varied between spot-on and several subtitles ahead. At first I thought the titles were gradually getting ever-further ahead but then they would periodically sync up with the audio and then start gradually getting ahead again and the cycle would repeat throughout the video.
If the following is the code for whisper (it is for local video), what will be the code for yt-whisper for a youtube video?
import whisper
model = whisper.load_model("base")
result = model.transcribe("localaudio.mp3")
print(result["text"])
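A rough equivalent for a YouTube video could first download the audio with yt-dlp and then pass the file to Whisper, as sketched below. This is an illustration under assumptions, not yt-whisper's actual implementation: the helper names `build_ydl_opts` and `transcribe_youtube` are invented here, and the `yt_dlp` and `whisper` packages plus ffmpeg must be installed.

```python
import os
import tempfile


def build_ydl_opts(out_dir: str) -> dict:
    """Options for yt-dlp: fetch the best available audio stream into out_dir."""
    return {
        "format": "bestaudio/best",
        "outtmpl": os.path.join(out_dir, "%(id)s.%(ext)s"),
        "quiet": True,
    }


def transcribe_youtube(url: str, model_name: str = "base") -> str:
    """Download the audio for url and return Whisper's transcript text."""
    # Imported lazily so build_ydl_opts works without these packages installed.
    import whisper
    import yt_dlp

    with tempfile.TemporaryDirectory() as tmp:
        with yt_dlp.YoutubeDL(build_ydl_opts(tmp)) as ydl:
            info = ydl.extract_info(url, download=True)
            audio_path = ydl.prepare_filename(info)
        model = whisper.load_model(model_name)
        result = model.transcribe(audio_path)
    return result["text"]
```

Calling `print(transcribe_youtube("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))` would then mirror the local-file snippet above.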
There should be an option for the user to run the model on CPU
Thank you very much for this program. It saves me a lot of time!
The subtitles produced are superior to the ones generated by YouTube in every way except one: they are too long, on occasion even ridiculously long.
I wonder if it would be possible to define a maximum number of characters that appear on screen at the same time? Or at least make a mandatory break after a period (.) if the sentence is above a certain number of characters?
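A post-processing step along these lines is sketched below: a cue longer than `max_chars` is split after sentence-ending punctuation, and the cue's time span is shared out in proportion to each piece's character count. The function name, the `max_chars` default, and the proportional timing are all assumptions for illustration; Whisper's own segment timings are not consulted, and a single sentence longer than `max_chars` is left unsplit.

```python
import re


def split_cue(start: float, end: float, text: str, max_chars: int = 80):
    """Split one cue into a list of (start, end, text) tuples."""
    if len(text) <= max_chars:
        return [(start, end, text)]

    # Break after ".", "!" or "?" when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    # Greedily pack whole sentences into pieces of at most max_chars.
    pieces, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            pieces.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        pieces.append(current)

    # Share the original duration out by character count.
    total = sum(len(p) for p in pieces)
    cues, t = [], start
    for piece in pieces:
        duration = (end - start) * len(piece) / total
        cues.append((t, t + duration, piece))
        t += duration
    return cues


if __name__ == "__main__":
    for cue in split_cue(0.0, 10.0,
                         "One two three. Four five six. Seven eight nine.",
                         max_chars=30):
        print(cue)
```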
OpenAI released large-v2, could you add support for it here and in your replicate model?
I don't find a command for doing this? Can we have this as an option?
Are there ways or plans to implement this as a yt-dlp plugin or postprocessor?
This was deemed out of scope on yt-dlp's side (yt-dlp/yt-dlp#5656), but I think that feature would be loved by many.
It would be useful to be able to pass a language argument to Whisper, since automatic language detection doesn't always work reliably. Currently, trying to specify a language produces the error: "yt_whisper: error: unrecognized arguments: --language"
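A minimal sketch of how such an option could be wired up with `argparse` follows. The `--language` flag here is hypothetical (the report above shows yt_whisper rejecting it), and `build_parser` and `transcribe_kwargs` are names invented for this example. Whisper's `model.transcribe()` does accept a `language` keyword, which is what the returned dictionary is meant to be forwarded to.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="yt_whisper")
    parser.add_argument("video", nargs="+", help="video URLs to transcribe")
    # Hypothetical flag: rejected by yt_whisper according to the report above.
    parser.add_argument("--language", default=None,
                        help="spoken language; omit to auto-detect")
    return parser


def transcribe_kwargs(args: argparse.Namespace) -> dict:
    """Keyword arguments to forward to whisper's model.transcribe()."""
    return {"language": args.language} if args.language else {}


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["https://www.youtube.com/watch?v=dQw4w9WgXcQ", "--language", "pl"]
    )
    print(transcribe_kwargs(args))  # {'language': 'pl'}
```

Leaving the flag out yields an empty dictionary, so Whisper falls back to automatic language detection.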