
[nicovideo.jp] Extracting the audio from nicovideos in the "aac" or "m4a" file format causes the length to be missing, but NOT "alac" (yt-dlp, open issue)

Synth-ix avatar Synth-ix commented on August 20, 2024
[nicovideo.jp] Extracting the audio from nicovideos in the "aac" or "m4a" file format causes the length to be missing, but NOT "alac"

from yt-dlp.

Comments (6)

bashonly avatar bashonly commented on August 20, 2024

CXwudi wrote: "Niconico is known to offer only the MP4 format (according to two Q&A pages on the official niconico site), which is why the extension was hard-coded to m4a."

Right, but such cases are typically handled by passing mp4 as the ext param to _extract_m3u8_formats(), which the proposed patch does.


bashonly avatar bashonly commented on August 20, 2024

from the m4a log:

[ExtractAudio] Not converting audio 【マリオペイント】お猿のおもちゃのBGMをリミックスしてみた [nm10550338].m4a; the file is already in a common audio format

Here, the only thing that -x is doing is implicitly selecting the bestaudio format. yt-dlp detects that the download is already in an audio-only container (m4a) and does not extract or remux anything. So you end up with the raw product of the m3u8 playlist; the native HLS downloader just concatenates all of the AAC audio stream segments. The resulting container is not playing nice with Windows' metadata parsing.

When you pass --audio-format alac (or any format other than m4a or aac), then yt-dlp will remux or re-encode the download, which effectively fixes the issue.
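For anyone embedding yt-dlp in Python rather than using the command line, the behaviour described above can be sketched roughly as follows. This is a minimal sketch, not code from the thread: the helper names `build_extract_audio_opts` and `download_audio` are hypothetical, and it assumes that `-x --audio-format alac` corresponds to the `bestaudio/best` format selector plus an `FFmpegExtractAudio` postprocessor with `preferredcodec` set to `alac`, as in yt-dlp's documented embedding examples.

```python
def build_extract_audio_opts(audio_format="alac"):
    """Build YoutubeDL options roughly equivalent to `-x --audio-format <audio_format>`.

    Assumption: `-x` selects `bestaudio/best` and adds the FFmpegExtractAudio
    postprocessor; `preferredcodec` carries the requested audio format.
    """
    return {
        "format": "bestaudio/best",
        "postprocessors": [
            {
                "key": "FFmpegExtractAudio",
                "preferredcodec": audio_format,
            }
        ],
    }


def download_audio(url, audio_format="alac"):
    """Download `url` and let ffmpeg convert the audio (hypothetical helper)."""
    import yt_dlp  # third-party package; imported lazily so the builder works without it

    with yt_dlp.YoutubeDL(build_extract_audio_opts(audio_format)) as ydl:
        return ydl.download([url])
```

Requesting a format other than m4a or aac here is what makes yt-dlp run the file through ffmpeg instead of leaving the concatenated HLS segments as they are.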

Possible solutions/workarounds for getting a fixed m4a container:

  1. use -x -f bv+ba/b to download the video and the audio, merge them, and then extract the audio
  2. use -x --parse-metadata " m4a_dash: %(container)s" to download only audio and force remuxing after download
  3. use ffmpeg as the downloader: --downloader ffmpeg
  4. manually remux with ffmpeg after downloading, e.g.: ffmpeg -i "input.m4a" -c copy "output.m4a"
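Workaround 4 can also be scripted. The following is a minimal sketch that wraps the same `ffmpeg -i ... -c copy ...` command from the list above; the function names `build_remux_command` and `remux` are hypothetical, and it assumes `ffmpeg` is available on the PATH.

```python
import subprocess
from pathlib import Path


def build_remux_command(src, dst):
    """Return the ffmpeg argument list for a stream-copy remux of src into dst."""
    src, dst = Path(src), Path(dst)
    # ffmpeg cannot safely read from and write to the same file.
    if src.resolve() == dst.resolve():
        raise ValueError("input and output paths must differ")
    # -c copy rewrites the container without re-encoding the audio stream.
    return ["ffmpeg", "-i", str(src), "-c", "copy", str(dst)]


def remux(src, dst):
    """Run the remux; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_remux_command(src, dst), check=True)
```

Calling `remux("input.m4a", "output.m4a")` runs the same command as item 4 and leaves the original download untouched.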


Synth-ix avatar Synth-ix commented on August 20, 2024

Those solutions worked just perfectly!

Thank you for the explanation & inputs on fixing the issue, seriously. ♥


bashonly avatar bashonly commented on August 20, 2024

Looking at the extractor code though, I'm noticing we manually set the audio format extension to m4a. I'm wondering if this should be fixed on the extractor level like this:

diff --git a/yt_dlp/extractor/niconico.py b/yt_dlp/extractor/niconico.py
index 179e7a9b1..e06740d62 100644
--- a/yt_dlp/extractor/niconico.py
+++ b/yt_dlp/extractor/niconico.py
@@ -420,7 +420,7 @@ def _yield_dms_formats(self, api_data, video_id):
                 'x-request-with': 'https://www.nicovideo.jp',
             })['data']['contentUrl']
         # Getting all audio formats results in duplicate video formats which we filter out later
-        dms_fmts = self._extract_m3u8_formats(dms_m3u8_url, video_id)
+        dms_fmts = self._extract_m3u8_formats(dms_m3u8_url, video_id, 'mp4')
 
         # m3u8 extraction does not provide audio bitrates, so extract from the API data and fix
         for audio_fmt in traverse_obj(dms_fmts, lambda _, v: v['vcodec'] == 'none'):
@@ -432,7 +432,6 @@ def _yield_dms_formats(self, api_data, video_id):
                     'asr': ('samplingRate', {int_or_none}),
                 }), get_all=False),
                 'acodec': 'aac',
-                'ext': 'm4a',
             }
 
         # Sort before removing dupes to keep the format dicts with the lowest tbr


Synth-ix avatar Synth-ix commented on August 20, 2024

Definitely send me a reply if and when you release a build that includes this patch. I'd be willing to test it before release, if that's needed for any reason.
Glad I was able to bring this to light and that you found a possible way to troubleshoot it!


CXwudi avatar CXwudi commented on August 20, 2024

Niconico is known to offer only the MP4 format (according to two Q&A pages on the official niconico site), which is why the extension was hard-coded to m4a.
