w3c / mediacapture-output

API to manage the rendering of audio on any audio output device

Home Page: https://w3c.github.io/mediacapture-output/

License: Other

CSS 9.11% JavaScript 8.55% HTML 82.34%

webrtc

mediacapture-output's Introduction

Specification 'mediacapture-output'

This is the repository for mediacapture-output. You're welcome to contribute! Let's make the Web rock our socks off!

mediacapture-output's People

Contributors

Stargazers

Watchers

Forkers

juberti victoria kleopatra999 dontcallmedom opendawn guidou samphillips1879 foolip jan-ivar bhanditz youennf autokagami iamthatiam777 isabella232 miketaylr igalia

mediacapture-output's Issues

sinkId definition appears broken

It just says

sinkId of type DOMString, readonly
MediaDeviceInfo.deviceIdMediaDevices.enumerateDevices()GETUSERMEDIA

Add explainer document

From TAG review:

we would really like to see an explainer document with some sample code examples of use

Default sinkId value

From TAG review:

Section 2.1 (describing the attribute sinkId) should have some prose describing the value that is returned (at a minimum in the default case, which I think is the empty string).

Introduce audioDeviceType as an extension to HTMLMediaElement

As we discussed at TPAC, it'd be beneficial to introduce virtual device types so the audio stream can be routed to the default audio end point that maps to the device type. This will give flexibility to apps to choose the audio device type and keep seamless device switching when audio end points (e.g. headphones) are plugged-in or unplugged during run time.

a starting point

partial interface HTMLMediaElement {
    attribute DOMString audioDeviceType;
}

property values

"Console": Specifies that the audio output will be sent to the console device.
"Multimedia": Specifies that the audio output will be sent to the multimedia device. This should be the default value if the audioDeviceType is not set explicitly.
"Communications": Specifies that the audio output will be sent to the communications device. The User Agent is encouraged to use a media playback code path optimized for real-time communications.

remarks

Dynamic change of audioDeviceType during playback is not supported. A new value of audioDeviceType must be set before setting the src or srcObject attribute in order for the value to be effective.

partial interface HTMLMediaElement should not include ": EventTarget"

http://w3c.github.io/mediacapture-output/#htmlmediaelement-extensions

The sinkId argument and attribute are strings, not objects

https://w3c.github.io/mediacapture-output/#widl-HTMLMediaElement-setSinkId-Promise-void--DOMString-sinkId

If sinkId and the sinkId attribute are the same object, return a resolved promise.

This should say something of the form "If x is equal to y".

Delay the SecurityError to the async part of setSinkId()

https://w3c.github.io/mediacapture-output/#widl-HTMLMediaElement-setSinkId-Promise-void--DOMString-sinkId

If application is not authorized to play audio through the device identified by the given sinkId, return a promise rejected with a new DOMException whose name is SecurityError.

Unless it is guaranteed that this information will always be available synchronously in the same process, doing this in the async part would be a good idea.

Reserved words for device identifiers don't need 'id-'

...prefix. I believe that the intent of this prefix is to carve out some of the device identifier space for use in this specification. That's unnecessary, though it might make sense to ensure that you only use reserved values that are shorter than the minimum size of an identifier.

As specified, this doesn't avoid the potential for collision. Assuming that an implementation is assigning identifiers while ignorant of this specification, then an implementation might assign one of these identifiers if they use the URL and filename safe variant of base64 to encode identifiers. At 17 characters in length, id-communications is long enough to collide with a 102/104-bit value, which is a plausible amount of entropy, if a mite odd.
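
The arithmetic behind the collision argument can be checked with a short sketch. It assumes identifiers are unpadded strings in the URL and filename safe base64 alphabet, where each character carries 6 bits; the helper names are hypothetical and not part of any spec.

```javascript
// Alphabet of the URL- and filename-safe base64 variant (RFC 4648, section 5).
const BASE64URL_PATTERN = /^[A-Za-z0-9_-]+$/;

// Returns true if `id` could have been produced by an unpadded base64url encoder.
function looksLikeBase64Url(id) {
  return BASE64URL_PATTERN.test(id);
}

// Each base64url character carries 6 bits of the encoded value.
function bitsCarried(id) {
  return id.length * 6;
}

const reserved = "id-communications";
console.log(reserved.length);              // 17
console.log(looksLikeBase64Url(reserved)); // true
console.log(bitsCarried(reserved));        // 102
```

So the reserved word is itself a syntactically valid 17-character base64url string, which is why the prefix alone does not rule out a clash.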

Avoid Interface.attributeName notation

E.g. in the sentences

If the HTMLMediaElement.sinkId is no longer present in the list of MediaDeviceInfo.deviceIds returned

This is confusing since sinkId is an instance property, not a static property. E.g. if you type HTMLMediaElement.sinkId in your browser console, the result will be undefined.

Instead say something like "the HTMLMediaElement's sinkId".

Enumeration of output devices and permission model

I have recently gotten an inquiry from a developer attempting to utilize the Audio Devices API to select an output audio device in a situation where there are no input devices (e.g. no microphone or camera).

To enumerate the output devices, they are calling MediaDevices.enumerateDevices to obtain MediaDeviceInfo relating to the devices and then once the user has selected an output device, HTMLMediaElement.setSinkId(deviceId) to set the output device.
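
The flow described above might look roughly like the following sketch. The function names are hypothetical, and the `mediaDevices` argument is assumed to behave like `navigator.mediaDevices`; it is passed in only so the sketch does not depend on a browser global.

```javascript
// Lists audio output devices. `mediaDevices` is assumed to behave like
// navigator.mediaDevices (passed in as a parameter for this sketch).
async function listAudioOutputs(mediaDevices) {
  const devices = await mediaDevices.enumerateDevices();
  return devices.filter((device) => device.kind === "audiooutput");
}

// Routes `element` (an audio or video element) to the chosen device and
// returns the sinkId the element reports afterwards.
async function routeToDevice(element, deviceId) {
  await element.setSinkId(deviceId);
  return element.sinkId;
}
```

In a page this could be called as `listAudioOutputs(navigator.mediaDevices)`, with the user's pick passed to `routeToDevice(videoElement, chosen.deviceId)`.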

Here is the catch. To provide the user with an intelligible list of output devices to select, the label attribute is needed. As noted in Section 9.2.1 of Media Capture and Streams:

 The algorithm described above means that the access to media device information depends on whether or not permission has been granted to the page's origin.

 If no such access has been granted, the MediaDeviceInfo dictionary will contain the deviceId, kind, and groupId.

 If access has been granted for a media device, the MediaDeviceInfo dictionary will contain the deviceId, kind, label, and groupId.

So the label attribute can only be obtained if permission has been granted. However, in a situation where there are no input devices, requesting access to the user's (non-existent) microphone and camera is problematic.

Was this interaction intentional?

Point to public-webrtc

Intro currently says comments should be brought to the (to be discontinued) public-media-capture list.

Should setSinkId be functional in SecureContext only?

Following on w3c/webrtc-pc#1945, would it make sense to make setSinkId [SecureContext]? I.e.

console.log("setSinkId" in document.createElement('video')); // false in http

We're looking for guidance on implementing it in Firefox.

Currently, in Chrome this method exists in http, but is pretty much useless (SecurityError).

As to the read-only sinkId attribute, we could leave it ("") or make it SecureContext as well (undefined). With no way to affect the attribute, it serves little purpose.
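
Whichever way this is decided, page code can guard against both outcomes (method absent, or method present but rejecting). A minimal sketch, with hypothetical helper names:

```javascript
// True when `element` exposes a callable setSinkId (it would not under a
// [SecureContext] restriction on an http page).
function supportsSetSinkId(element) {
  return typeof element.setSinkId === "function";
}

// Attempts to switch the output; resolves to false instead of throwing when
// the method is missing or the user agent rejects the request.
async function trySetSinkId(element, sinkId) {
  if (!supportsSetSinkId(element)) return false;
  try {
    await element.setSinkId(sinkId);
    return true;
  } catch (error) {
    return false;
  }
}
```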

"For each sinkId whose value is equal to sinkId:"

What collection exactly is being iterated over here? There is no such thing as "For each sinkId".

My best guess is that this is attempting to iterate over all HTMLMediaElements. But which ones? The ones in a document tree? The ones connected to a document? Which documents---all of the documents in the browser? Active documents only?

WG CR review for mediacapture-output

This issue serves to track internal WG review (in the WGs making up the mediacapture TF) of the mediacapture-output document for CR publication.
If you have reviewed the document, please add a “thumbs up” reaction to the issue.
If you find issues that you think need addressing before CR publication, please file the issue in Github and mention it in the comments.

The review lasts until Thursday, October 12, 2016. At that time, the chairs will decide whether or not the review result warrants asking the wider community to review the document for CR; if we do ask the wider community, a similar issue will be filed for tracking that review.

Specify action of id-communications when there is no communications device

Copied from TAG review w3ctag/design-reviews#91

If there is no comms device, behavior should be predictable. I suggest that sound should go to the default device (same as "" or "id-multimedia").

Web Audio design is unlikely

The most likely way that Web Audio will be piped to non-default devices is via a constraints-style object on construction of the AudioContext object:

var audioContext = new AudioContext( { sinkId: requestedSinkId } );

It's highly unlikely we would support output to multiple devices from a single AudioContext simultaneously as a direct case; the resampling and jitter work necessary is kinda goofy from a practical perspective. The output device is the master clock owner, and that becomes more complex in a multiple-output-device-per-Context case. Similarly, changing the single sink live is a bit unstable.

Tracking issues in Web Audio spec: https://github.com/WebAudio/web-audio-api/issues/445, WebAudio/web-audio-api#359.

Why is MediaDeviceId an enum?

No IDL in this spec uses it. Will IDL in other specs?

It seems more likely this should just be a list of strings described in a table, without trying to bring Web IDL into this.

Go back to the default output

As long as the sinkId is not set, the sink is selected by the user agent.
There does not seem to be any way for a web page to go back to this original state.
Should we allow setSinkId('')?
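
If the empty string were accepted this way, going back to the default would be a one-liner. The sketch below assumes exactly that behaviour, which is the open question of this issue; the helper name is hypothetical.

```javascript
// Asks the user agent to route `element` back to its default output.
// Assumption: setSinkId("") is accepted and means "use the default device".
async function resetToDefaultOutput(element) {
  await element.setSinkId("");
  // Report whether the element now shows the default (empty) sinkId.
  return element.sinkId === "";
}
```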

Setting the audio output for a whole context or page

Currently, we are only able to control individual HTMLMediaElement audio output with setSinkId.
We probably need to add something to WebAudio as well.

In most cases, it seems that pages will want to output all their audio to a single device.
Adding such an API to do so would be convenient and would complement setSinkId which would be used to override this default value.

Something at navigator.mediaDevices level might make sense.
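
For comparison, this is roughly what a page has to do today to approximate a page-wide setting. It is only a sketch: `setSinkForAll` is a hypothetical helper, and it covers neither Web Audio nor media elements created after the call.

```javascript
// Applies one sink to every audio/video element under `root`.
// `root` is anything with querySelectorAll, for example `document`.
async function setSinkForAll(root, sinkId) {
  const elements = Array.from(root.querySelectorAll("audio, video"));
  const results = await Promise.allSettled(
    elements.map((element) => element.setSinkId(sinkId))
  );
  // Return how many elements could not be re-routed.
  return results.filter((result) => result.status === "rejected").length;
}
```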

Mismatch between sinkId and deviceId names

From TAG review:

sinkId name doesn't seem to match-up with the property used to identify the output devices (deviceId) from MediaDeviceInfo

Add issues link to the spec

Most W3C specs have a 'participation' section at the top that includes things like the GitHub repo and 'file a bug link' (eg. see UI Events). Please consider adding such a section to this spec.

setSinkId is inconsistent in handling of default value.

sinkId defaults to "" ("...the empty string if output is delivered through the user-agent default device")

But [[SinkId]] defaults to null ("Let the element have a [[SinkId]] internal slot, initialized to null.")

This messes up setSinkId in observable ways:

"3. If sinkId is equal to element's [[SinkId]], return a promise resolved with undefined."

await element.setSinkId(element.sinkId); // NotFoundError

Instead, this should succeed, like it does in Chrome.
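
The mismatch can be reproduced with a toy model of the quoted steps. This is not the spec algorithm itself: `makeElement` and `knownIds` are hypothetical, and a plain Error with a `name` stands in for a DOMException.

```javascript
// Toy model of the quoted algorithm. `initialSlot` is the initial [[SinkId]]
// value; `knownIds` stands in for the ids reported by enumerateDevices().
function makeElement(initialSlot, knownIds) {
  let slot = initialSlot;
  return {
    // The attribute reports "" while output goes to the default device.
    get sinkId() {
      return slot === null ? "" : slot;
    },
    setSinkId(sinkId) {
      // Step 3: same value as the slot, so resolve immediately.
      if (sinkId === slot) return Promise.resolve(undefined);
      if (!knownIds.includes(sinkId)) {
        const error = new Error("no such device");
        error.name = "NotFoundError"; // stand-in for a DOMException
        return Promise.reject(error);
      }
      slot = sinkId;
      return Promise.resolve(undefined);
    },
  };
}
```

With `makeElement(null, ids)`, calling `setSinkId(element.sinkId)` rejects with NotFoundError because "" is not equal to null; with `makeElement("", ids)` the same call resolves.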

Don't restrict this spec to HTMLMediaElement in the introduction

This proposal allows JavaScript to direct the audio output of a media element

should be:

This proposal allows JavaScript to direct the audio output of a media element or an AudioContext

Use deviceId instead of sinkId

Using deviceId in all the different places would make this consistent with the getUserMedia spec and make it more obvious that these are the same thing that we're talking about.

Behaviour when the system default audio output device changes

What happens when the following happens:

We're playing an HTMLMediaElement through the default device, with an empty sinkId.
The default device changes (because the user changes it in the control panel of the operating system, or a new device has been plugged in, and the system sets it as the default).

Is the HTMLMediaElement silently re-routed to the new default device? Does it stop playing? The former is what happens right now in UAs, the latter is probably not web-compatible.

More broadly, the concept of a "default device" is too overloaded, and needs to be defined. Is it the device that was the default at the time the audio playback started? Is it a kind of "virtual device" that always outputs to whatever the default device is (i.e. the audio output follows the system default, even if the system default changes)?

Handling connect/disconnect for AudioContext

From TAG review:

For AudioContext, no mention of how to handle the various update/devices disconnected and reconnected scenarios mentioned in section 2.3

Problems with authorization for the output-only case

from Philippe Joseph Cohen:

The issue in the current proposal is that enumerateDevices returns opaque information before user authorisation, as currently drafted in the enumerateDevices spec. So with this opaque information and without the more meaningful properties (currently the ‘label’ one), apps won’t be in a position to decide whether to prompt the user for authorisation to use a specific device based on just the opaque deviceId.

This implies that any authorization prompt must be for all devices (i.e. "web site X wants to enumerate your audio devices", as opposed to "web site X wants to access device Y").

Specify with what the setSinkId() promise should be rejected

https://w3c.github.io/mediacapture-output/#widl-HTMLMediaElement-setSinkId-Promise-void--DOMString-sinkId

In step 5:

Associate the audio output device represented by sinkId with this object for playout.
If the preceding step failed, return a rejected promise.

Here it needs to be specified with what kind of exception the promise is rejected.

Normative reference to HTML 5.2 should be changed to HTML LS

The normative references lists HTML 5.x.

Since W3C and WHAT WG signed the MOU, the HTML 5.2 reference should be changed to HTML LS in the bikeshed source.

Touching sinkId from background thread

In step 3.5 of the Methods paragraph of section 2, sinkId appears to be updated from a background thread. Since it is a JavaScript-visible entity, I would expect a task to be queued in order to update it.

some text got garbled in section 2.1 Attributes

Section 2.1, Attributes currently says:

sinkId of type DOMString, readonly
MediaDeviceInfo.deviceIdMediaDevices.enumerateDevices()GETUSERMEDIA

The second line looks like it somehow got garbled. It's just two links to properties/functions and a reference strung together with no spacing or prose.

s/SecurityError/NotAllowedError/

When we switched from SecurityError to NotAllowedError two years ago, we forgot this spec.

We should fix that for consistency with getUserMedia.
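
Until implementations and the spec agree, callers that care about the distinction may want to accept both names. A small sketch, with a hypothetical helper name:

```javascript
// Classifies a setSinkId() rejection by its `name`, treating the legacy
// SecurityError and the newer NotAllowedError as the same condition.
function classifySetSinkIdError(error) {
  if (error.name === "NotAllowedError" || error.name === "SecurityError") {
    return "permission";
  }
  if (error.name === "NotFoundError") {
    return "not-found";
  }
  return "other";
}
```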

Also, we should tidy up the terminology by s/authorization/permission/ at the same time.

Default value alias

From TAG review:

Section 4's pre-defined Identifiers adds an alias to the empty string value with "id_multimedia". I wonder for clarity if this should just be the default value.

(probably overtaken by events)

CD publication of specification

At the Lisbon TPAC meeting, we will raise the question of whether this spec is ready for publication as CD. Please add to this issue references to the issues that you think have to be resolved before CD publication.

Once all these are resolved, and we think review is adequate, we will ask for CD publication.

Harald, for the chairs.

It is unclear why https://w3c.github.io/mediacapture-output/#dom-htmlmediaelement-setsinkid 5.5 needs to queue a task

https://w3c.github.io/mediacapture-output/#dom-htmlmediaelement-setsinkid
5.5 queues a task to set [[SinkId]] . [[SinkId]] is just a slot with initial value null.
5.3. has already changed element's underlying output device.

So, there is time between 5.3 and 5.5 when element's [[SinkId]] doesn't tell what the output device actually is.

Controlling 3rd party iframe audio output on a page?

(forgive me, this is my first time attempting to contribute)

We've built an application to allow for video streaming, and we essentially render the video using a browser, then output the framebuffer to another encoding process and send it along to an RTMP destination. (In our case, it's usually FB Live.)

We'd like to enable our users to easily embed other media in their productions.

Today, we capture the audio by changing a machine's default audio device to one we custom wrote, then reroute the audio to our encoder process. This works great generally, but has the major drawback of disabling audio for the rest of the computer.

As the web audio spec has started shipping in Chrome specifically, we've started experimenting with using the web audio output api to redirect the audio properly. Basically, we use the enumerate devices api to find our driver, and if a confluence of things are correct, we direct our audio to go out to that spot explicitly using the setSinkId of audio and video elements.

The issue is that if we'd like to embed other external media, like an iframe from YouTube as a simple example, we'd need YouTube to explicitly support switching audio destinations in their postMessage API. We view this as unlikely, given that our use case is an edge case for their business. We think the top-most context for a page should be in charge of where audio ends up, if inner iframes haven't changed their sound settings from 'default'. Ideally, the top-most context would be in charge of all audio routing.

I'd propose a setSinkId api on an iframe, just like we have on audio / video elements. If this has been done before, I apologize, I wasn't able to find any data on this pretty much anywhere on the web.

I think there are likely some technical challenges here, but I think for advanced audio / video (what I'm obsessed with) it'll help a lot with what the web is great at: linking and embedding resources.

setSinkId does not abort substeps

Per the algorithm given, "If sinkId does not match any audio output device identified by enumerateDevices()," then the promise will be rejected, but then the steps will proceed. You should add "and abort these substeps" to all such cases where you don't want to proceed.
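
The effect is easy to demonstrate outside the spec: rejecting a promise does not stop the code that follows. The sketch below is a hypothetical model of the sub-steps (plain Error instead of a DOMException), with a flag that toggles the suggested "abort" wording.

```javascript
// Runs a model of the sub-steps for `sinkId` and returns a log of what ran.
// `abortOnReject` toggles the suggested "and abort these substeps" wording.
function runSubsteps(sinkId, knownIds, abortOnReject) {
  const log = [];
  let reject;
  const promise = new Promise((_, rej) => { reject = rej; });
  promise.catch(() => {}); // keep the demo free of unhandled rejections

  if (!knownIds.includes(sinkId)) {
    reject(new Error("NotFoundError"));
    log.push("rejected");
    if (abortOnReject) return log;
  }
  // Without an explicit abort, execution falls through to this step.
  log.push("switched output device");
  return log;
}
```

Calling `runSubsteps("unknown", [], false)` logs both "rejected" and "switched output device", while passing `true` stops after "rejected".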

Sample rate can change when an `AudioContext` is running

Requiring the sink ID to be set at construction time simplifies the implementation, since the output sample rate is fixed.

Changing the sample-rate of a device while it is in use is something that can be done on all major desktop operating systems. This sentence implies that this is not the case.

I agree that it's hard to change the sample-rate of an AudioContext, but it's not hard to switch the AudioContext from one device to another, and that should be supported.

Of course, in practice and if necessary, UAs will insert a resampler somewhere in the rendering pipeline so that the AudioContext can still render the audio at the original sample-rate, and everything will continue running smoothly.

Ask for user gesture to call setSinkId

Playing audio is nowadays gated by a user gesture.
Changing the output of the audio could be restricted similarly.

HTMLMediaElement.setSinkId() when media element is paused doesn't change output device?

http://w3c.github.io/mediacapture-output/#widl-HTMLMediaElement-setSinkId-Promise-void--DOMString-sinkId

The "stop playing this object's audio out of the device represented by the sinkId attribute" and "start playing this object's audio out of the device represented by sinkId" steps are predicated on the media element being playing.

A literal reading of this would be that calling setSinkId() while the media element is paused would have no effect other than changing the sinkId attribute, so that when playback continues it's using the original device.

Making these steps unconditional ought to fix the problem.

Reserved values for ID don't seem like the right solution

After discussion on the webrtc list, it seems that the conclusion is that reserved values are a bad idea.
(see #mediacapture-main:173)

Can we find a different solution?

"Set the sinkId attribute to sinkId."

sinkId does not have a setter, so this is not possible.

Probably you mean to store the current sinkId in some "internal slot" or "associated sink id", and then set that. Then the getter is defined as returning the value of this HTMLMediaElement's associated sink id.
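
In JavaScript terms the suggested shape is a private field plus a getter with no setter. The class below is only an illustrative model (its name is hypothetical), not how any user agent implements the attribute.

```javascript
// Model of a media element whose sink id lives in an internal slot.
class MediaElementModel {
  // Stands in for the "internal slot" / "associated sink id".
  #sinkId = "";

  // Read-only attribute: a getter that returns the slot, with no setter.
  get sinkId() {
    return this.#sinkId;
  }

  // Only the method's algorithm is allowed to write the slot.
  async setSinkId(sinkId) {
    this.#sinkId = String(sinkId);
  }
}
```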

Headphone connected flag

For certain media player applications, it would be useful to know when headphones are connected (provided the operating system has this information) so that headphone-specific audio can be delivered (e.g. virtual surround). For example, Android has the ACTION_HEADSET_PLUG in the AudioManager class. Could a similar flag be made available within the mediacapture-output API? It must be clear that this flag will not always be set, it would depend on the underlying system. But this would be a useful signal and one that many systems, especially mobile, do track.

Could there perhaps be a headset plugged property on each audio device and an event triggered when this status changes? Also an interface to query the current headphone plugged state on a given sinkId?

Predefined IDs conflict with future direction of device selection

I am concerned that the use of predefined IDs with somewhat vague meanings (see https://cdn.rawgit.com/w3c/mediacapture-output/master/index.html#mediadeviceid-enum) sets up a recipe for confusion that will not be easy to get rid of in the future. While these appear to map to some existing MS Windows concepts, that does not provide applications with the ability to offer a meaningful choice of output device to end users (imagine offering users the choice of "playback" or "communications" in a music synthesizer app, for example). I suggest that these IDs be omitted from the spec altogether at this stage.

My belief is that most developers need to be able to discover devices that satisfy various context-dependent constraints, e.g. "is a headphone", "is a speaker", "can support 44K sample rate", "can support a 5.1 channel layout". I am aware that this work is being deferred to a later phase of enhancing enumerateDevices(), and so on. But I think that's the best place to focus effort on selecting devices, rather than on a stopgap distinction between playback and communications.

Checking validity and access of SinkId should be asynchronous

In the algorithm of section 2.2 for setSinkId, the following steps are in the synchronous section:

2. If the sinkId does not match any audio output device identified by enumerateDevices(), return a promise rejected with a new DOMException whose name is NotFoundError.
3. If application is not authorized to play audio through the device identified by the given sinkId, return a promise rejected with a new DOMException whose name is SecurityError.

It turns out that checking 2) (which is a prerequisite for 3) requires enumerating the output devices, which can be a slow operation. We suggest that this is better done in the asynchronous section.
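
A sketch of the suggested ordering follows: the promise is returned immediately and the device lookup runs in the asynchronous part. `setSinkIdModel`, the `currentSink` field and the injected `enumerateDevices` function are assumptions made for illustration, and a plain Error stands in for a DOMException.

```javascript
// Returns a promise right away; the (possibly slow) device enumeration and
// the validity check happen inside the asynchronous part.
function setSinkIdModel(element, sinkId, enumerateDevices) {
  return (async () => {
    const devices = await enumerateDevices();
    const found = devices.some(
      (device) => device.kind === "audiooutput" && device.deviceId === sinkId
    );
    if (!found) {
      const error = new Error("no matching audio output device");
      error.name = "NotFoundError"; // stand-in for a DOMException
      throw error;
    }
    // Hypothetical field recording the device the element now plays through.
    element.currentSink = sinkId;
  })();
}
```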

"If sinkId and the sinkId attribute are the same object, return a resolved promise."

A couple minor issues:

Neither sinkId nor the sinkId attribute is an object. sinkId is a string; the sinkId attribute is an attribute, but the sinkId attribute's value is a string.
Resolved with what? "a promise resolved with undefined" is probably what you want here.

sinkId definition is still broken

#39 was closed but was not fixed correctly. See https://w3c.github.io/mediacapture-output/#attributes

Web Audio API Extensions monkeypatch

While this is good stuff, it's not appropriate for a final spec. See https://annevankesteren.nl/2014/02/monkey-patch. It's important to track getting these changes properly upstreamed to the Web Audio specification before this spec proceeds much further.

"Run the following steps asynchronously:"

"asynchronously" is not defined. It should be in parallel.

Selecting audio output in case device info permission is not granted

Currently, selecting audio output is difficult to implement if camera/microphone access is not granted. This is a limitation that we should try to overcome.
This might be solvable by investigating ways to discover output device IDs outside of enumerateDevices.
