patrickenfuego / chapterize-audiobooks
Split a single, monolithic mp3 audiobook file into chapters using Machine Learning and ffmpeg.
License: Apache License 2.0
Some audiobooks don't use normal keywords that can help identify the start of a chapter. For example, some don't say "chapter" before the identifier, but instead just say "One".
My goal is to help identify these section separators using the surrounding context, allowing for more accurate chapter breaks.
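One way to approach this, as a minimal sketch rather than the project's actual method: treat a transcribed line as a chapter candidate when it consists only of a number word and follows a pause. The `Cue` tuple, the `NUMBER_WORDS` set, and the two-second `min_gap` are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Cue(NamedTuple):
    """One transcribed subtitle line (hypothetical structure)."""
    start: float  # seconds
    end: float    # seconds
    text: str


# Assumed, deliberately small vocabulary; a real list would be per-language.
NUMBER_WORDS = {"one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"}


def find_bare_number_breaks(cues: List[Cue], min_gap: float = 2.0) -> List[Cue]:
    """Return cues that are only a number word and follow a pause of min_gap seconds."""
    candidates = []
    prev_end = None
    for cue in cues:
        word = cue.text.strip().lower().rstrip(".")
        # The first cue has no predecessor, so it always passes the gap check.
        gap_ok = prev_end is None or (cue.start - prev_end) >= min_gap
        if word in NUMBER_WORDS and gap_ok:
            candidates.append(cue)
        prev_end = cue.end
    return candidates
```

The pause requirement is the "surrounding context" here: a bare "two" in the middle of continuous narration is skipped, while the same word after a silence is kept.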
I found your script and I really liked the idea. But I tried to run it and I get stuck all the time!
At first I got the following error:
File "C:\FFOutput\Chapterize-Audiobooks-0.6.0\chapterize_ab.py", line 316, in parse_args
args.audiobook.with_suffix('.cue').exists()
AttributeError: 'NoneType' object has no attribute 'with_suffix'
So I went to line 316 and deleted this part:
or args.audiobook.with_suffix('.cue').exists()
Now the script started working. And I got the message:
ERROR: The script only works with .mp3 files (for now)
I tried different lines and got the same error.
I tried at first from the Windows command line, and then also from IDLE (3.11), but without success.
I tried to run older versions of your script, and I got the first error in version 0.5 as well, and the second error in all your versions...
Would appreciate help.
P.S. I don't understand Python that well, so it is quite possible that I skipped a step that is obvious to you, simply due to lack of knowledge.
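The traceback says `args.audiobook` was `None` when `with_suffix` was called, so deleting the `.cue` check only moves the failure further along. A minimal sketch of a guard, assuming the audiobook path is an optional positional argument (the real parser is not shown in this report, so the parser below and the `has_cue` attribute are hypothetical):

```python
import argparse
from pathlib import Path
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Hypothetical stand-in for the script's parser, with an explicit None guard."""
    parser = argparse.ArgumentParser(prog="chapterize_ab.py")
    # nargs="?" makes the positional optional, so it is None when omitted.
    parser.add_argument("audiobook", nargs="?", type=Path, default=None)
    args = parser.parse_args(argv)

    # Fail with a readable message instead of an AttributeError on None.
    if args.audiobook is None:
        parser.error("no audiobook file was given")
    if args.audiobook.suffix.lower() != ".mp3":
        parser.error("the script only works with .mp3 files (for now)")

    # Only now is it safe to derive the .cue path from the audiobook path.
    args.has_cue = args.audiobook.with_suffix(".cue").exists()
    return args
```

With a guard like this, running the script without a file path would exit with a usage message rather than a traceback.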
In a previous release, I modularized the project so it can leverage multiple different languages dynamically. I need help from people who speak those languages to fill out the excluded phrases and chapter separators so more people can use this tool.
I love that the script can detect chapters and splice a big file, but it would be nice if it also supported mp4a-encoded audio consistently. The script is able to analyze an mp4a file and generate an SRT file from it, but it cannot splice the file. It would be nice if the script could detect that the source was encoded using mp4a and either automatically convert it to a temporary mp3 file so it can splice it, or let the user know before processing starts that the encoding is not supported.
I manually converted the mp4a file to mp3 to confirm that the reported error was due to the mp4a encoding and not another cause, and it worked as expected.
ffmpeg_log.txt
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55e69758ce40] Discarding ID3 tags because more suitable tags were found.
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/mnt/Podcast/Vaughn_Heppner/Star_Raider/Star_Raider.mp3':
Metadata:
major_brand : dash
minor_version : 0
compatible_brands: iso6mp41
creation_time : 2023-11-20T06:27:35.000000Z
Duration: 12:18:42.06, start: 0.000000, bitrate: 129 kb/s
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s (default)
Metadata:
creation_time : 2023-11-20T06:27:35.000000Z
handler_name : ISO Media file produced by Google Inc.
vendor_id : [0][0][0][0]
Input #1, image2, from '/mnt/Podcast/Vaughn_Heppner/Star_Raider/star_raider.jpg':
Duration: 00:00:00.04, start: 0.000000, bitrate: 14626 kb/s
Stream #1:0: Video: mjpeg (Progressive), yuvj444p(pc, bt470bg/unknown/unknown), 362x342 [SAR 300:300 DAR 181:171], 25 fps, 25 tbr, 25 tbn, 25 tbc
[mp3 @ 0x55e6975e6680] Invalid audio stream. Exactly one MP3 audio stream is required.
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:1 --
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #1:0 -> #0:1 (copy)
Last message repeated 1 times
----------------------------------------------------
********************************************************
NEW LOG START
********************************************************
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x5641cafe2e80] Discarding ID3 tags because more suitable tags were found.
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/mnt/Podcast/Vaughn_Heppner/Star_Raider/Star_Raider.mp3':
Metadata:
major_brand : dash
minor_version : 0
compatible_brands: iso6mp41
creation_time : 2023-11-20T06:27:35.000000Z
Duration: 12:18:42.06, start: 0.000000, bitrate: 129 kb/s
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s (default)
Metadata:
creation_time : 2023-11-20T06:27:35.000000Z
handler_name : ISO Media file produced by Google Inc.
vendor_id : [0][0][0][0]
Input #1, image2, from '/mnt/Podcast/Vaughn_Heppner/Star_Raider/star_raider.jpg':
Duration: 00:00:00.04, start: 0.000000, bitrate: 14626 kb/s
Stream #1:0: Video: mjpeg (Progressive), yuvj444p(pc, bt470bg/unknown/unknown), 362x342 [SAR 300:300 DAR 181:171], 25 fps, 25 tbr, 25 tbn, 25 tbc
[mp3 @ 0x5641cb03c180] Invalid audio stream. Exactly one MP3 audio stream is required.
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:1 --
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #1:0 -> #0:1 (copy)
Last message repeated 1 times
----------------------------------------------------
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x556de9d23e80] Discarding ID3 tags because more suitable tags were found.
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/mnt/Podcast/Vaughn_Heppner/Star_Raider/Star_Raider.mp3':
Metadata:
major_brand : dash
minor_version : 0
compatible_brands: iso6mp41
creation_time : 2023-11-20T06:27:35.000000Z
Duration: 12:18:42.06, start: 0.000000, bitrate: 129 kb/s
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s (default)
Metadata:
creation_time : 2023-11-20T06:27:35.000000Z
handler_name : ISO Media file produced by Google Inc.
vendor_id : [0][0][0][0]
Input #1, image2, from '/mnt/Podcast/Vaughn_Heppner/Star_Raider/star_raider.jpg':
Duration: 00:00:00.04, start: 0.000000, bitrate: 14626 kb/s
Stream #1:0: Video: mjpeg (Progressive), yuvj444p(pc, bt470bg/unknown/unknown), 362x342 [SAR 300:300 DAR 181:171], 25 fps, 25 tbr, 25 tbn, 25 tbc
[mp3 @ 0x556de9d7d180] Invalid audio stream. Exactly one MP3 audio stream is required.
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:1 --
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #1:0 -> #0:1 (copy)
Last message repeated 1 times
----------------------------------------------------
...
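The log above shows a file with an .mp3 extension whose audio stream is actually `aac (LC) (mp4a ...)`, which the mp3 muxer rejects when the stream is copied. A minimal sketch of checking the codec with ffprobe instead of trusting the extension. The function names are hypothetical and this is not how the script currently works; it only illustrates the detection step requested here.

```python
import json
import subprocess
from pathlib import Path
from typing import List


def ffprobe_command(path: Path) -> List[str]:
    """Build an ffprobe call that reports the first audio stream's codec as JSON."""
    return ["ffprobe", "-v", "error", "-select_streams", "a:0",
            "-show_entries", "stream=codec_name", "-of", "json", str(path)]


def codec_from_ffprobe_json(output: str) -> str:
    """Extract codec_name from ffprobe's JSON output; empty string if no stream."""
    streams = json.loads(output).get("streams", [])
    return streams[0].get("codec_name", "") if streams else ""


def audio_codec(path: Path) -> str:
    """Run ffprobe (must be on PATH) and return the audio codec name."""
    result = subprocess.run(ffprobe_command(path),
                            capture_output=True, text=True, check=True)
    return codec_from_ffprobe_json(result.stdout)


def needs_transcode(codec: str) -> bool:
    """True when the stream cannot simply be copied into an mp3 output."""
    return codec != "mp3"
```

For the file in the log, `audio_codec` would be expected to report `aac`, at which point the script could either transcode to a temporary mp3 or stop early with a clear message.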
Add additional chapter separators:
Initially these will not be used but can be enabled via a CLI switch until thorough testing is performed.
I got the error:
Traceback (most recent call last):
File "/home/savant/Projects/Chapterize-Audiobooks/chapterize_ab.py", line 1078, in <module>
main()
File "/home/savant/Projects/Chapterize-Audiobooks/chapterize_ab.py", line 970, in main
audiobook_file, in_metadata, lang, model_name, model_type, cue_file = parse_args()
File "/home/savant/Projects/Chapterize-Audiobooks/chapterize_ab.py", line 316, in parse_args
args.audiobook.with_suffix('.cue').exists()
AttributeError: 'NoneType' object has no attribute 'with_suffix'
Command: python3.10 chapterize_ab.py -dm -l pl
Executed inside venv
OS: debian 11
Installed dependencies using pip3.10 install -r requirements.txt
pip log:
Requirement already satisfied: rich>=12.6.0 in ./lib/python3.10/site-packages (from -r requirements.txt (line 1)) (13.5.3)
Requirement already satisfied: vosk>=0.3.44 in ./lib/python3.10/site-packages (from -r requirements.txt (line 2)) (0.3.45)
Requirement already satisfied: requests>=2.28.0 in ./lib/python3.10/site-packages (from -r requirements.txt (line 3)) (2.31.0)
Requirement already satisfied: markdown-it-py>=2.2.0 in ./lib/python3.10/site-packages (from rich>=12.6.0->-r requirements.txt (line 1)) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in ./lib/python3.10/site-packages (from rich>=12.6.0->-r requirements.txt (line 1)) (2.16.1)
Requirement already satisfied: cffi>=1.0 in ./lib/python3.10/site-packages (from vosk>=0.3.44->-r requirements.txt (line 2)) (1.16.0)
Requirement already satisfied: tqdm in ./lib/python3.10/site-packages (from vosk>=0.3.44->-r requirements.txt (line 2)) (4.66.1)
Requirement already satisfied: srt in ./lib/python3.10/site-packages (from vosk>=0.3.44->-r requirements.txt (line 2)) (3.5.3)
Requirement already satisfied: websockets in ./lib/python3.10/site-packages (from vosk>=0.3.44->-r requirements.txt (line 2)) (11.0.3)
Requirement already satisfied: certifi>=2017.4.17 in ./lib/python3.10/site-packages (from requests>=2.28.0->-r requirements.txt (line 3)) (2023.7.22)
Requirement already satisfied: idna<4,>=2.5 in ./lib/python3.10/site-packages (from requests>=2.28.0->-r requirements.txt (line 3)) (3.4)
Requirement already satisfied: charset-normalizer<4,>=2 in ./lib/python3.10/site-packages (from requests>=2.28.0->-r requirements.txt (line 3)) (3.2.0)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./lib/python3.10/site-packages (from requests>=2.28.0->-r requirements.txt (line 3)) (2.0.5)
Requirement already satisfied: pycparser in ./lib/python3.10/site-packages (from cffi>=1.0->vosk>=0.3.44->-r requirements.txt (line 2)) (2.21)
Requirement already satisfied: mdurl~=0.1 in ./lib/python3.10/site-packages (from markdown-it-py>=2.2.0->rich>=12.6.0->-r requirements.txt (line 1)) (0.1.2)
WARNING: You are using pip version 21.2.3; however, version 23.2.1 is available.
You should consider upgrading via the '/home/savant/Projects/Chapterize-Audiobooks/bin/python3.10 -m pip install --upgrade pip' command.
After parsing, generate a file which can be used to edit chapter markers in situations where the split points are inaccurate.
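A cue sheet is one plain-text format that fits this purpose, since each chapter start is a single editable line. A minimal sketch, assuming chapters are available as `(title, start_seconds)` pairs; the function names and that input shape are assumptions, and the layout the script itself would write may differ.

```python
from typing import List, Tuple


def cue_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS:FF, where FF counts frames at 75 per second."""
    total_frames = round(seconds * 75)
    minutes, rem = divmod(total_frames, 75 * 60)
    secs, frames = divmod(rem, 75)
    # Minutes are not wrapped at 60; long audiobooks simply exceed 59.
    return f"{minutes:02d}:{secs:02d}:{frames:02d}"


def build_cue(audio_name: str, chapters: List[Tuple[str, float]]) -> str:
    """Build cue sheet text with one TRACK entry per chapter."""
    lines = [f'FILE "{audio_name}" MP3']
    for number, (title, start) in enumerate(chapters, start=1):
        lines.append(f"  TRACK {number:02d} AUDIO")
        lines.append(f'    TITLE "{title}"')
        lines.append(f"    INDEX 01 {cue_timestamp(start)}")
    return "\n".join(lines) + "\n"
```

Fixing an inaccurate split point then means editing one `INDEX 01` line and re-running the split from the cue file.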
It would be nice if there was some indicator that the ffmpeg subprocess was working (maybe a tail of the SRT file) so as a user we can see it's still working through the file and not that the process is hung.
I know we could modify chapterize_ab.py#760 and remove the -loglevel quiet arg to see that it's working, but a prettier option would be nice.
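One possible approach, sketched under assumptions: ffmpeg's `-progress pipe:1` option writes machine-readable `key=value` lines to stdout, including `out_time=HH:MM:SS.micro`. Parsing that line gives a percentage that could drive a progress bar. How the script actually invokes ffmpeg is not shown here, so the function names and the idea of knowing the total duration up front are hypothetical.

```python
import re
from typing import Optional

# Matches lines such as "out_time=00:01:30.000000" from ffmpeg -progress output.
_OUT_TIME = re.compile(r"^out_time=(\d+):(\d{2}):(\d{2}(?:\.\d+)?)$")


def parse_out_time(line: str) -> Optional[float]:
    """Return the elapsed output time in seconds, or None for any other line."""
    match = _OUT_TIME.match(line.strip())
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def percent_done(line: str, total_seconds: float) -> Optional[float]:
    """Convert an out_time line into a percentage of the known total duration."""
    elapsed = parse_out_time(line)
    if elapsed is None or total_seconds <= 0:
        return None
    return min(100.0, 100.0 * elapsed / total_seconds)
```

Lines that carry no usable time, such as `progress=continue` or `out_time=N/A`, return `None` and can be ignored by the caller.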
Installed on my Windows 10 machine, with python3 and ffmpeg etc up to date. When I run any command involving chapterize_ab.py (including "-h"), I get the following error:
File "C:\dev\chapterize-audiobooks-main\chapterize_ab.py", line 40
vosk_link = f"[link={vosk_url}]this link[/link]"
^
SyntaxError: invalid syntax
Any ideas why?
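The report does not say which interpreter actually ran the script, so this is only a guess: line 40 is an f-string, and f-strings need Python 3.6 or later, so an older interpreter picked up by the `python` command would stop with a SyntaxError at that literal. A small check, written without f-strings so it runs on old interpreters too (the function name is hypothetical):

```python
import sys

# f-strings were introduced in Python 3.6.
MIN_FSTRING_VERSION = (3, 6)


def supports_fstrings(version):
    """Return True if a (major, minor) version tuple can parse f-strings."""
    return tuple(version) >= MIN_FSTRING_VERSION


if __name__ == "__main__":
    current = sys.version_info[:2]
    # Shows which interpreter the command really resolves to.
    print(sys.executable)
    print("Python %d.%d, f-strings supported: %s"
          % (current[0], current[1], supports_fstrings(current)))
```

Running this with the same command used to launch chapterize_ab.py shows whether that command resolves to the expected Python 3 installation.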
Add an additional, simple GUI interface for users who are not as comfortable using the command line.
Option to convert an mp3 file to m4b with embedded chapter metadata.
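For embedding chapters, ffmpeg reads a plain-text metadata file that starts with `;FFMETADATA1` and holds one `[CHAPTER]` section per chapter. A minimal sketch of generating that text, assuming chapters come as `(title, start_seconds)` pairs and the total duration is known; the function name and input shape are assumptions, not part of the project.

```python
from typing import List, Tuple


def build_ffmetadata(chapters: List[Tuple[str, float]], total_seconds: float) -> str:
    """Build ffmpeg metadata text with one [CHAPTER] section per chapter."""
    lines = [";FFMETADATA1"]
    for i, (title, start) in enumerate(chapters):
        # A chapter ends where the next one starts; the last ends with the file.
        end = chapters[i + 1][1] if i + 1 < len(chapters) else total_seconds
        lines += [
            "[CHAPTER]",
            "TIMEBASE=1/1000",  # START and END are in milliseconds
            f"START={round(start * 1000)}",
            f"END={round(end * 1000)}",
            # Not handled here: titles containing =, ;, #, \ or newlines
            # must be backslash-escaped in this file format.
            f"title={title}",
        ]
    return "\n".join(lines) + "\n"
```

The resulting file can be given to ffmpeg as a second input and applied with `-map_metadata 1` while the audio is re-encoded for the m4b container.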
Is it possible to just write the CUE file and skip writing the mp3s?