
Comments (14)

gkatsev avatar gkatsev commented on August 22, 2024 3

I think there are two things to consider here:

  1. what's the easiest and best API to have for something like media-chrome
  2. what's the best API that we can propose to the W3C/WHATWG to get it into the standards.

For an API we can use today, not having to extend Audio/Video Tracks is definitely nice, but I think such an API is less likely to get accepted into the relevant specs.
In addition, I don't think it really matters if a rendition is muxed content. From a user's perspective, it doesn't matter if the audio is available in the same segment as the video or if it was downloaded from a separate segment.

I think that adding a RenditionList to Audio and Video Tracks, similar to what @littlespex proposed above, is better than a combined RenditionList. In the majority case, since alternative video tracks aren't very common, you'd end up with a single Video Track, which has the specified renditions on it. Additionally, you'd have one or more AudioTracks, potentially with their own renditions. In the case of muxed content, you'd have a track show up under both AudioTrack and VideoTrack.
This is how Safari currently implements things: for media, including mp4, you get video.videoTracks[0] pointing at the video portion and video.audioTracks[0] pointing at the audio portion.
You can then separately turn off audio and video with video.audioTracks[0].enabled = false and video.videoTracks[0].selected = false.
The way AudioTracks and VideoTracks are defined is that you could theoretically enable multiple audio tracks at the same time, but not multiple video tracks. This is why videojs-contrib-quality-levels uses enabled on the renditions list, so that you could have multiple enabled, rather than only selecting one.
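
As a rough illustration of the enabled/selected distinction described above, here is a minimal TypeScript sketch. The AudioTrackLike and VideoTrackLike shapes and both helper functions are hypothetical stand-ins made up for this example. They model only the enabled and selected flags and are not the browser's actual AudioTrackList/VideoTrackList implementation.

```typescript
// Hypothetical minimal shapes; only the flags discussed above are modeled.
interface AudioTrackLike {
  id: string;
  enabled: boolean;
}

interface VideoTrackLike {
  id: string;
  selected: boolean;
}

// Audio: any number of tracks may be enabled at the same time.
function enableAudioTracks(tracks: AudioTrackLike[], ids: string[]): void {
  for (const track of tracks) {
    track.enabled = ids.indexOf(track.id) !== -1;
  }
}

// Video: at most one track is selected, so selecting one deselects the rest.
// Passing null deselects every track.
function selectVideoTrack(tracks: VideoTrackLike[], id: string | null): void {
  for (const track of tracks) {
    track.selected = id !== null && track.id === id;
  }
}
```

A renditions list that uses enabled, as videojs-contrib-quality-levels does, would follow the first helper's semantics rather than the second's.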

The tricky part of a renditions API is likely supporting everything that DASH allows. DASH is tricky here because you can have different renditions per period and potentially different audio tracks per video track. Maybe supporting all the permutations that DASH allows should be a non-goal. HLS is simpler because it doesn't allow multiple renditions per audio track.

from media-ui-extensions.

littlespex avatar littlespex commented on August 22, 2024 2

Amending the VideoTrack API seems promising as it allows for each track to have multiple renditions. The two new functions would allow for a basic selection menu.

With the wide variety of bitrate ladders out in the wild, having getAvailableRenditions return a list of strings may be limiting. It may be necessary to return a Rendition object similar to videojs-contrib-quality-levels so that dimensions, bitrates and codecs can be used to generate the list of menu items.
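
To make the point about menu items concrete, here is a small TypeScript sketch that builds labels from rendition objects instead of plain strings. The RenditionInfo shape, the assumption that bitrate is in bits per second, and the renditionLabel and buildMenu helpers are all hypothetical. They are not taken from videojs-contrib-quality-levels or any proposed spec.

```typescript
// Hypothetical rendition shape, loosely following the fields named above.
interface RenditionInfo {
  id: string;
  width?: number;
  height?: number;
  bitrate?: number; // assumed to be bits per second
  codec?: string;
}

// Build a label from whatever fields are present; fall back to the id.
function renditionLabel(r: RenditionInfo): string {
  const parts: string[] = [];
  if (r.height !== undefined) parts.push(`${r.height}p`);
  if (r.bitrate !== undefined) parts.push(`${(r.bitrate / 1e6).toFixed(1)} Mbps`);
  if (r.codec !== undefined) parts.push(r.codec);
  return parts.length > 0 ? parts.join(", ") : r.id;
}

// Order menu items by height, then by bitrate, highest first.
function buildMenu(renditions: RenditionInfo[]): string[] {
  return renditions
    .slice()
    .sort(
      (a, b) =>
        (b.height ?? 0) - (a.height ?? 0) || (b.bitrate ?? 0) - (a.bitrate ?? 0)
    )
    .map(renditionLabel);
}
```

With only a list of strings, a UI could not sort or disambiguate two renditions that share a height but differ in bitrate or codec.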

For more advanced use cases, it may also be necessary to dispatch events so that changes made to the renditions list can be reflected in the UI:

  • A change event to catch changes made by things like the streaming library's ABR algorithm. For example, some menus have Auto checked, and a separate indicator showing which rendition is currently being rendered.
  • add/remove events. Multi-period DASH manifests are allowed to have different numbers of Representations per Period.

One possible alternative, though certainly not as simple, would be something similar to the existing text/audio/video track APIs:

partial interface VideoTrack {
  readonly attribute VideoRenditionList renditions;
};

interface VideoRenditionList : EventTarget {
  readonly attribute unsigned long length;
  getter VideoRendition (unsigned long index);
  VideoRendition? getRenditionById(DOMString id);
  readonly attribute long selectedIndex;

  attribute EventHandler onchange;
  attribute EventHandler onaddrendition;
  attribute EventHandler onremoverendition;
};

interface VideoRendition {
  readonly attribute DOMString id;
  readonly attribute unsigned long width;
  readonly attribute unsigned long height;
  readonly attribute unsigned long bitrate;
  readonly attribute DOMString codec;
  attribute boolean selected;
};
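
For illustration only, the list behavior in the IDL above could be modeled in TypeScript roughly as follows. This is a sketch under simplifying assumptions: plain callbacks stand in for EventTarget, selection goes through a hypothetical select(id) method instead of the writable selected attribute, and the add and remove methods exist only so the example can emit the add/remove notifications. All class, type, and method names ending in Sketch or introduced here are made up.

```typescript
type RenditionEventType = "change" | "addrendition" | "removerendition";
type RenditionListener = (type: RenditionEventType) => void;

interface VideoRenditionSketch {
  id: string;
  width: number;
  height: number;
  bitrate: number;
  codec: string;
  selected: boolean;
}

class VideoRenditionListSketch {
  private items: VideoRenditionSketch[] = [];
  private listeners: RenditionListener[] = [];

  get length(): number {
    return this.items.length;
  }

  // Index of the selected rendition, or -1 when none is selected.
  get selectedIndex(): number {
    let index = 0;
    for (const item of this.items) {
      if (item.selected) return index;
      index++;
    }
    return -1;
  }

  getRenditionById(id: string): VideoRenditionSketch | null {
    for (const item of this.items) {
      if (item.id === id) return item;
    }
    return null;
  }

  addListener(listener: RenditionListener): void {
    this.listeners.push(listener);
  }

  add(rendition: VideoRenditionSketch): void {
    this.items.push(rendition);
    this.emit("addrendition");
  }

  remove(id: string): void {
    const before = this.items.length;
    this.items = this.items.filter((item) => item.id !== id);
    if (this.items.length !== before) this.emit("removerendition");
  }

  // Select exactly one rendition and emit "change" only if the selected
  // index moved. Selecting an unknown id clears the selection.
  select(id: string): void {
    const previous = this.selectedIndex;
    for (const item of this.items) {
      item.selected = item.id === id;
    }
    if (this.selectedIndex !== previous) this.emit("change");
  }

  private emit(type: RenditionEventType): void {
    for (const listener of this.listeners) listener(type);
  }
}
```

A menu would subscribe once and re-render on each notification, which covers both ABR-driven changes and renditions appearing or disappearing between DASH periods.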


heff avatar heff commented on August 22, 2024 2

We're stepping into this with media-chrome, and it looks like @luwes has already done work on a version.

How should mixed audio/video renditions (.ts HLS) be handled in an API like this? Should the assumption be that if there are no audio renditions then there are only mixed media renditions? Or should the rendition list not be media type specific, with a rendition type field that can be video/audio/mixed? I think I remember a proposal from @wilaw somewhere with those options.


gkatsev avatar gkatsev commented on August 22, 2024 1

Yeah, maybe it selects one from the available options and sticks with it. Either way, it seems simplified compared to what you can do in DASH.


gkatsev avatar gkatsev commented on August 22, 2024

Also wanted to mention https://github.com/videojs/videojs-contrib-quality-levels which we wrote with the intent that it could later be updated to match a spec.


luwes avatar luwes commented on August 22, 2024

Yes, good food for thought. @cjpillsbury also brought this up when we discussed my draft implementation.

Maybe it'd be easier to not have to patch the Video/AudioTrack APIs for browsers other than Safari.

It would be more like:

partial interface HTMLMediaElement {
  readonly attribute RenditionList renditions;
};

interface RenditionList : EventTarget {
  readonly attribute unsigned long length;
  getter Rendition (unsigned long index);
  Rendition? getRenditionById(DOMString id);
  readonly attribute long selectedIndex;

  attribute EventHandler onchange;
  attribute EventHandler onaddrendition;
  attribute EventHandler onremoverendition;
};

enum RenditionType { "video", "audio", "mixed" };

interface Rendition {
  readonly attribute DOMString trackId;
  readonly attribute RenditionType type;

  readonly attribute DOMString id;
  readonly attribute unsigned long width;
  readonly attribute unsigned long height;
  readonly attribute unsigned long bitrate;
  readonly attribute DOMString codec;
  attribute boolean selected;
};
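
With a flat list like this, a UI that wants per-track menus would have to regroup renditions by trackId itself. A minimal TypeScript sketch of that step follows. The FlatRendition shape mirrors only a few of the fields above, and groupByTrack is a hypothetical helper, not part of any draft implementation.

```typescript
type FlatRenditionType = "video" | "audio" | "mixed";

// Hypothetical subset of the Rendition fields sketched above.
interface FlatRendition {
  id: string;
  trackId: string;
  type: FlatRenditionType;
  bitrate: number;
}

// Group a flat rendition list by the track each rendition belongs to.
// Insertion order within each group follows the input order.
function groupByTrack(
  renditions: FlatRendition[]
): Record<string, FlatRendition[]> {
  const groups: Record<string, FlatRendition[]> = {};
  for (const rendition of renditions) {
    const group = groups[rendition.trackId] || [];
    group.push(rendition);
    groups[rendition.trackId] = group;
  }
  return groups;
}
```

The grouping is cheap, but it shows the trade-off: the combined list avoids patching the track APIs at the cost of pushing track association onto every consumer.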


heff avatar heff commented on August 22, 2024

The multiple video tracks use case is one to consider here. On one hand, it means you'll still end up identifying which video track a rendition belongs to. On the other hand, I question how much we can rely on the native VideoTracks list to actually represent the multiple video tracks in an adaptive manifest. If it doesn't, then that makes it more complicated to extend the native VideoTracks in the (maybe rare) use case of multiple video tracks with multiple renditions each. Anybody have experience with that or want to test it?


cjpillsbury avatar cjpillsbury commented on August 22, 2024

There is poor (none that I know of) support of "alternate video" in native playback for browser/browser-like envs (and players generally). However, there is decent support for "alternate audio".


cjpillsbury avatar cjpillsbury commented on August 22, 2024

@gkatsev

> HLS is simpler because it doesn't allow you multiple renditions per audio track.

I'd be careful here. Folks definitely use EXT-X-MEDIA:TYPE=AUDIO to provide multiple encodings/"renditions" of "the same" audio content, and Safari will represent them as a single AudioTrack. For example, Apple's official test stream https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_adv_example_hevc/master.m3u8 includes:

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="a1",NAME="English",LANGUAGE="en-US",AUTOSELECT=YES,DEFAULT=YES,CHANNELS="2",URI="a1/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="a2",NAME="English",LANGUAGE="en-US",AUTOSELECT=YES,DEFAULT=YES,CHANNELS="6",URI="a2/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="a3",NAME="English",LANGUAGE="en-US",AUTOSELECT=YES,DEFAULT=YES,CHANNELS="6",URI="a3/prog_index.m3u8"

(note the shared NAME and LANGUAGE but the differences in e.g. CHANNELS, and also in the encoded content itself)

and when playing in Safari, you'll get:
[Screenshot: Screen Shot 2022-09-01 at 8 20 50 AM]

(aka a single AudioTrack).

Additionally, no "Languages" control menu is added to the controls, since there is only one "track".

Compare to this example https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/hls.m3u8 which includes:

#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-eng-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="en",NAME="stream_5",DEFAULT=YES,AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-deu-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="de",NAME="stream_4",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-ita-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="it",NAME="stream_8",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-fra-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="fr",NAME="stream_7",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-spa-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="es",NAME="stream_9",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-eng-0384k-aac-6c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="en",NAME="stream_6",CHANNELS="6"

Note that they all share the same GROUP-ID but vary with e.g. LANGUAGE and NAME (though there are two en playlists, which still have different NAMEs). Here's what you get when playing in Safari:
[Screenshot: Screen Shot 2022-09-01 at 8 29 29 AM]

And here's what shows up in the automatically added "Language" control menu:
[Screenshot: Screen Shot 2022-09-01 at 8 49 48 AM]

Finally, here's what happens when I create a local version of the multivariant playlist where the two English EXT-X-MEDIA playlists share the same NAME. Playlist tags:

#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-eng-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="en",NAME="English",DEFAULT=YES,AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-deu-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="de",NAME="German",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-ita-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="it",NAME="Italian",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-fra-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="fr",NAME="French",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-spa-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="es",NAME="Spanish",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-eng-0384k-aac-6c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="en",NAME="English",CHANNELS="6"

Safari's audioTracks:
[Screenshot: Screen Shot 2022-09-01 at 8 53 11 AM]

(note there's only one en AudioTrack now)

"Languages" control menu:
[Screenshot: Screen Shot 2022-09-01 at 8 54 45 AM]

All this is to say that Safari will treat multiple audio playlists as different tracks or as the same track depending on details in their attributes.
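
One way to approximate the behavior observed above is to group EXT-X-MEDIA audio entries by LANGUAGE and NAME. The TypeScript sketch below does that. It is only a heuristic that happens to match the three experiments in this comment, and it is an assumption, not Safari's or AVFoundation's documented algorithm. The parseMediaTag and groupAudioByLanguageAndName helpers are hypothetical names.

```typescript
// Parse the attribute list of one EXT-X-MEDIA tag into a key/value record.
// Quoted values have their surrounding quotes removed.
function parseMediaTag(line: string): Record<string, string> {
  const attrs: Record<string, string> = {};
  const body = line.slice(line.indexOf(":") + 1);
  const pattern = /([A-Z0-9-]+)=("[^"]*"|[^",]*)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(body)) !== null) {
    attrs[match[1]] = match[2].replace(/^"|"$/g, "");
  }
  return attrs;
}

// Group audio playlist URIs by "LANGUAGE|NAME". Under the assumption stated
// above, each key would correspond to one exposed AudioTrack.
function groupAudioByLanguageAndName(lines: string[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const line of lines) {
    if (line.indexOf("#EXT-X-MEDIA:") !== 0) continue;
    const attrs = parseMediaTag(line);
    if (attrs["TYPE"] !== "AUDIO") continue;
    const key = `${attrs["LANGUAGE"] || ""}|${attrs["NAME"] || ""}`;
    const uris = groups[key] || [];
    uris.push(attrs["URI"] || "");
    groups[key] = uris;
  }
  return groups;
}
```

Run against the locally modified playlist, the two English entries fall under one key, while the original playlist with distinct NAME values yields one key per entry.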


gkatsev avatar gkatsev commented on August 22, 2024

> I'd be careful here. Folks definitely use EXT-X-MEDIA:TYPE=AUDIO to provide multiple encodings/"renditions" of "the same" audio content, and Safari will represent them as a single AudioTrack. For example, Apple's official test stream https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_adv_example_hevc/master.m3u8 includes:

However, this will still only match a specific audio track to a specific set of video renditions. The audio renditions won't be switching independently of the video renditions here, which is specifically what I was calling out; maybe that wasn't clear enough.

I tested locally and as far as I can tell, Safari is ignoring the second English track (I just named all the tracks English).


cjpillsbury avatar cjpillsbury commented on August 22, 2024

"ignoring" may be wrong here. iirc AVFoundation/AVPlayer (which Safari HLS playback is built on top of) will do some filtering based on support (6 channels being relevant here) but will also use ABR switching, similar to video playlists, for multiple audio playlists with "similar relevant features". It just isn't exposed in the browser.


gkatsev avatar gkatsev commented on August 22, 2024

If it does ABR the audio renditions, I couldn't get it to happen. But maybe my test wasn't great.


cjpillsbury avatar cjpillsbury commented on August 22, 2024

We don't have a good test stream. We'd want all the same container format & codec & channels with matching names & languages but notably different bitrates (including a stupidly large bitrate). We'd also likely want only one EXT-X-STREAM-INF to avoid the dance of video ABR switching vs. (potential) audio ABR switching.


cjpillsbury avatar cjpillsbury commented on August 22, 2024

Or maybe someone with more knowledge of how this works under the hood will chime in 🤞

