API for containers? (webcodecs, 28 comments, closed)

w3c commented on July 4, 2024
API for containers?


Comments (28)

chrisn commented on July 4, 2024

I've been following the general discussion around WebCodecs, and the need for a media container API seems to be recognised, but I thought I'd add my own use case as an example.

I maintain a library waveform-data.js that produces data for waveform visualisation from audio.

This uses Web Audio decodeAudioData(), which has well-known problems: it runs on the main thread so UI updates stall during decoding, it requires the entire encoded audio to be held in memory, there's no indication of progress so I can't tell how long decoding will take to complete, and there's no way to cancel the decode.

For this use case, the simplest solution would be to allow decodeAudioData() to run from a worker context, with an extended API to allow progress notifications and cancellation.

WebCodecs also solves these issues, but introduces a new one. Because the library is generic, it will accept any audio format that decodeAudioData supports. So in order to use Web Codecs the library would have to include code to parse all the container formats, or define an API that moves container parsing to users of the library. Both options increase the amount of JavaScript that needs to be delivered, and unnecessarily so because parsing the container is a capability the browser already has. Also, leaving container parsing to library users would make the library much harder for people to use.
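
To make the trade-off concrete, here is a rough sketch of the two paths (assuming an async context; encodedFileArrayBuffer and demuxedPackets are placeholders for the fetched file and for the output of whatever JS/wasm demuxer is used). decodeAudioData() accepts the complete containered file, whereas WebCodecs' AudioDecoder only accepts already-demuxed packets plus codec parameters, so the container parsing has to happen somewhere else:

// Today: the browser demuxes and decodes in one step.
const audioContext = new AudioContext();
const audioBuffer = await audioContext.decodeAudioData(encodedFileArrayBuffer);

// With WebCodecs: the decoder only understands raw codec packets, so the
// container (WebM, MP4, Ogg, ...) must be parsed by the application first.
const decoder = new AudioDecoder({
  output: (audioData) => { /* consume PCM, e.g. compute waveform points */ },
  error: (e) => console.error(e),
});
decoder.configure({
  codec: "opus",        // placeholder; must match what the demuxer found
  sampleRate: 48000,    // placeholder values normally read from container metadata
  numberOfChannels: 2,
});
for (const packet of demuxedPackets) {
  decoder.decode(new EncodedAudioChunk({
    type: "key",        // audio packets are typically all key frames
    timestamp: packet.timestamp,
    data: packet.data,
  }));
}
await decoder.flush();

That second half is exactly the container-parsing burden a generic library would have to take on or push onto its users.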

steveanton commented on July 4, 2024

The concern I have with containers is that it exposes a lot of API and specification surface but does not unlock anything new that can't already be done (pretty efficiently even) today.

Following the principles of the Extensible Web Manifesto (https://extensiblewebmanifesto.org/), we should focus on delivering low-level codecs first. Only after that's done (or in parallel, by a different group) should we consider tackling containers.

guest271314 commented on July 4, 2024

@pthatcherg

It's something that can be done in JS/wasm

While there does exist code which can take input images and output a WebM file (https://github.com/GoogleChromeLabs/webm-wasm; https://github.com/thenickdude/webm-writer-js), neither of the repository authors is interested in implementing writing audio to the same output file:

thenickdude/webm-writer-js#8 (comment)

I don't have any plans to work on adding audio, sorry, and I'm not sure where best to begin either (it probably depends on what format you can capture the audio in and what environment you expect to run in, Chrome, arbitrary browser, Electron, etc).

GoogleChromeLabs/webm-wasm#12

It seems this project doesn't support encoding audio+video yet, just video-only? Is this feasible, or would it be better to just use the more heavyweight ffmpeg.js project for this?

GoogleChromeLabs/webm-wasm#12 (comment)

Yeah, there’s no support for audio and I don’t have any plans to add it. This project was born out of the lackluster capabilities of MediaStreamRecorder.

ffmpeg.js is definitely one choice. But if you already have an encoded audio and video stream, an mkv muxer might do. That would be a lot faster and smaller. Hope this helps!

There is an implementation which is capable of writing audio as Opus to a WebM container https://github.com/kbumsik/opus-media-recorder/.

Meaning this repository can reach maturity without a corresponding container writer being available at a comparable level of maturity to write the output of WebCodecs to a file.

Thus, simply because "JS/wasm" exists does not mean that implementations exist to meet the requirements described at Key use-cases, particularly

  • Non-realtime encoding/decoding/transcoding, such as for local file editing.

guillaumebrunerie commented on July 4, 2024

I managed to make MP4Box work with WebCodecs! See code below and a working example at https://codepen.io/Latcarf/pen/NWBmJVw.

The main thing I am still very confused about is the codec string. I couldn't find a single example of a valid H.264 codec string on MDN, and after some trial and error I settled on avc1.64003d (found somewhere online), which seems to mostly work, but I have very little understanding of what it means (even after trying to read everything I can find about profiles and levels). It also doesn't always work; for instance, if you change the size of the video to 200×200, it fails with a rather cryptic DOMException: Encoding error. (without any further explanation).

It would be great if either there was for instance a catch-all codec string h264 (or avc1, or mp4) which would mean "Choose whatever avc1.xxxxxx codec string that you believe is most appropriate", or at the very least some examples on MDN, like "If you want H264 HD video choose this, if you want a small H264 video choose that". I guess the MediaRecorder API already chooses an appropriate codec string on behalf of the user based on the size of the canvas, so it would be great if WebCodecs could do the same.
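
For what it's worth, one workaround sketch that avoids hard-coding a single string: probe a few candidate avc1 strings with VideoEncoder.isConfigSupported() and use the first one the browser accepts (the candidate list below is just illustrative):

const pickAvcCodecString = async (width, height) => {
  // A few common profile/level combinations, roughly from low to high capability.
  const candidates = ["avc1.42001f", "avc1.4d001f", "avc1.640028", "avc1.64003d"];
  for (const codec of candidates) {
    const { supported } = await VideoEncoder.isConfigSupported({ codec, width, height });
    if (supported) {
      return codec;
    }
  }
  throw new Error("No supported H.264 encoder configuration found");
};

// const codec = await pickAvcCodecString(200, 200);
// encoder.configure({ codec, width: 200, height: 200 });

This only validates the configuration, so an encode can still fail later for other reasons, but it at least avoids guessing profile and level bytes by hand.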

It also seems like we cannot create the track upfront because it needs the metadata.decoderConfig.description (which I have no idea what it contains). That’s not a big issue, but it is a bit hard to guess.

Here is my function doing the encoding:

const encodeFramesToMP4 = async ({width, height, fps, frames, renderFrame}) => {
	const f = MP4Box.createFile();
	let track = null;
	const frameDuration = 1_000_000/fps;

	const encoder = new VideoEncoder({
		output: (chunk, metadata) => {
			if (track === null) {
				// The first encoded chunk's metadata carries the decoder configuration;
				// for H.264 the description is the avcC record that MP4Box expects.
				track = f.addTrack({
					timescale: 1_000_000,
					width,
					height,
					avcDecoderConfigRecord: metadata.decoderConfig?.description,
				});
			}

			// Copy the encoded bytes out of the chunk and append them as a sample.
			const buffer = new ArrayBuffer(chunk.byteLength);
			chunk.copyTo(buffer);
			f.addSample(track, buffer, {
				duration: frameDuration,
			});
		},
		error: (error) => {
			throw error;
		}
	});
	encoder.configure({
		codec: "avc1.64003d",
		width,
		height,
	});

	for (let i = 0; i < frames; i++) {
		const frame = new VideoFrame(
			renderFrame(i),
			{timestamp: i * frameDuration},
		);
		encoder.encode(frame);
		frame.close();
	}
	await encoder.flush();
	encoder.close();
	return f;
}

And here is how it is used. We simply create an OffscreenCanvas and provide a function that can draw a given frame.

const renderExampleVideo = async () => {
	const width = 600;
	const height = 600;
	const canvas = new OffscreenCanvas(width, height);

	const file = await encodeFramesToMP4({
		width,
		height,
		fps: 30,
		frames: 30, // duration in frames
		renderFrame: i => {
			// ...
			// draw frame #i on the canvas
			// ...
			return canvas
		}
	})
	file.save("Example.mp4");
}

Feel free to let me know if there is any issue in my code or anything that could be improved.

dalecurtis commented on July 4, 2024

For all containers you'd definitely need something like ffmpeg.wasm; #549 shows how this might work. Even if browsers had a containers API, it'd likely only support the formats they already parse (mp4, webm, ogg, etc.).

pthatcherg commented on July 4, 2024

I agree completely. That's a great way to put it.

guest271314 commented on July 4, 2024

Key use-cases

  • Non-realtime encoding/decoding/transcoding, such as for local file editing
  • Decoding and encoding images
  • Reencoding multiple input media streams in order to merge many encoded media streams into one encoded media stream.

and potentially

  • Live stream uploading

each could be considered an API that, at least in part, performs some form of file editing. A file can be considered a "container" when technically compared to

  • Extremely low latency live streaming (<3s delay)
  • Cloud gaming
  • Advanced Real-time Communications:
    -- e2e encryption
    -- control over buffer behavior
    -- spatial and temporal scalability

and potentially

  • Live stream uploading

where any and all of the above is encompassed within the use case of recording media into a container, whether that "container" is an array of images with an index element for "metadata" (width, height, frame duration, or other adjustments made "mid-stream" or in post-production; "codec" or instruction) written to a .json (or, if preferred, Matroska or WebM) "container", for download of both the entire output of the WebCodecs procedure and specific time slices into a single "container" (file structure).

Since the topic is at hand and the maintainers of this repository span a wide range of topics, it might be helpful to create a glossary that pins down exactly what you (this repository) mean by the terms you are using.

Non-goal

  • Direct APIs for media containers (muxers/demuxers)

indicates the technical proximity of "codecs" to "containers".

Whether or not WebCodecs includes the reading, writing, editing, etc. of both "codecs" and "containers", that internal decision will not prevent nor preclude the actual use case for a single API capable of both codec and container creation, extension, modification, etc., one that avoids having to piece together what is available in different specifications that were not initially conceived as being interoperable with other APIs, both existing and proposed.

One recent argument for not omitting "container" support alongside "codec" support is that the two are symbiotic: a single API designed and maintained with that consideration at the forefront has the potential to solve more than one existing issue. Separate portions of code in the same domain (media, for example) can produce very different output due to different authors' intent at the time they wrote and merged the code; time passes, and new technologies that would resolve the issue become an issue themselves because implementers may or may not be in accord across the various branches of "media". A single API from creation to editing to production of media streams and files is the explicit use case.

guest271314 commented on July 4, 2024

Example use case: write audio to a WebM file https://plnkr.co/edit/Inb676?p=preview. Ideally, from a front-end perspective, this single API should be able to encode VP8 and Opus to a file, or if Opus is missing from an existing file, write the audio to the file.
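
A rough sketch of where WebCodecs alone stops today (capturedAudio stands in for a source of AudioData objects, and writeToWebM is a placeholder for a JS/wasm muxer; neither is a real API): AudioEncoder can produce the Opus packets, but writing them into a WebM file still needs a separate container writer.

const opusChunks = [];
const audioEncoder = new AudioEncoder({
  output: (chunk, metadata) => opusChunks.push({ chunk, metadata }),
  error: (e) => console.error(e),
});
audioEncoder.configure({
  codec: "opus",
  sampleRate: 48000,
  numberOfChannels: 2,
  bitrate: 96_000,
});
for (const audioData of capturedAudio) {
  audioEncoder.encode(audioData);
  audioData.close();
}
await audioEncoder.flush();

// WebCodecs ends here; a container API (or JS/wasm muxer) is still needed
// to turn opusChunks into a playable WebM file:
// writeToWebM(opusChunks); // placeholder, no such built-in exists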

guest271314 commented on July 4, 2024

@steveanton

(or in parallel by a different group)

What is necessary to start such a group? Post the proposal at https://discourse.wicg.io? (Note: I am not a member of the W3C.)

guest271314 commented on July 4, 2024

FWIW https://discourse.wicg.io/t/webmediacontainers-proposal/3928

padenot commented on July 4, 2024

To write down what was said during TPAC, this might be very important to avoid the proliferation of badly muxed files. Muxing is rather hard to get right.

A possible solution might be a vouched library, as noted above, but there is always the problem of updating it for bug fixes.

chcunningham commented on July 4, 2024

Triage note: marking 'extension', as this would clearly be a new API.

chcunningham commented on July 4, 2024

@chrisn thanks for the use case.

I'm a little torn. I find the argument about existing demuxers persuasive, but less so on the muxing side. Browsers have long compiled in full-featured demuxers for <video> and MSE, but for muxing I think the only example is MediaRecorder, and the files it produces are pretty basic. For example, I don't think we currently ship a muxer that could produce a fragmented MP4.

We've found JS demuxing performance is quite good. Performance equal, there are some advantages to JS like rapid extensibility and perfect interoperability. My hope is that the download hit is largely amortized away by caching. WDYT?

But the JS answer rings a little hollow because the available libraries for this are pretty limited right now. If folks like the idea, we could organize a community / WG effort to build / centralize.

dalecurtis commented on July 4, 2024

I think containers are an entirely separate API from WebCodecs. The interfaces and processing model are likely entirely different from WebCodecs. E.g., it's likely a streams-based API would work very well for containers. It will also need its own containers registry which describes per-container behavior.

IMHO, the options for solving this use case are:

  • Do nothing and let the JS ecosystem for this flourish.
  • Create an entirely new WebContainers API for muxing/demuxing in a new spec.
  • Extend MediaSourceExtensions for demuxing and MediaRecorder for muxing in their respective specs.
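
To illustrate what "streams based" could mean here, a purely hypothetical sketch (MP4DemuxerStream does not exist anywhere; only the WHATWG Streams plumbing and the WebCodecs decoder are real, and videoDecoder is assumed to be configured elsewhere):

const response = await fetch("movie.mp4");
const demuxer = new MP4DemuxerStream(); // hypothetical TransformStream: bytes in, samples out
await response.body
  .pipeThrough(demuxer)
  .pipeTo(new WritableStream({
    write({ track, chunk }) {
      // chunk would be an EncodedVideoChunk or EncodedAudioChunk
      if (track.type === "video") {
        videoDecoder.decode(chunk);
      }
    },
  }));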

padenot commented on July 4, 2024

I agree with both @dalecurtis and @chcunningham. Gecko is also running in-content-process WASM demuxers for security reasons (essentially, libogg compiled to WASM running in process), and confirms the findings of the link above. This has been shipped in release for a few versions without a single problem reported.

I prefer options 1 and 2 in @dalecurtis's comment, and this can be a gradual solution (1, then 2 if really needed).

Option 3 I like less: those objects are not at the same abstraction level, and MediaRecorder only supports real-time media (not offline processing), although Gecko implements a proprietary extension that allows encoding faster than real time, which we only used for testing (not exposed to the web, of course).

davedoesdev commented on July 4, 2024

Is there a list of container projects? I've written a WebM muxer but this doesn't seem like the right place to keep track of them.

dalecurtis commented on July 4, 2024

I ended up writing a quick explainer for the third bullet in #24 (comment) (Extending MediaRecorder for muxing):

https://github.com/dalecurtis/mediarecorder-muxer/blob/main/explainer.md

Have your thoughts in #24 (comment) changed at all now that WebCodecs is more fleshed out @padenot?

At least internally folks don't seem to hate it. It's only targeted towards simple use cases as a hedge against a more complete containers API (which we (Chromium) are unlikely to undertake anytime soon). It looks like a fairly small implementation delta. Is this interesting at all?

cc: @youennf

guillaumebrunerie commented on July 4, 2024

Did anyone find a way to generate mp4 files client-side in Chrome? I tried all possible ways, but couldn't find one that works:

  • MediaRecorder only supports real-time encoding, and doesn't support mp4 in Chrome on Mac anyway (only mkv and webm).
  • WebCodecs doesn't give any way to save the encoded stream into an mp4 file.
  • I tried WebCodecs + mp4box.js, but it seems to require a pretty deep understanding of the mp4 format, boxes and stuff, which I don't have; I just want a "save to mp4" function.

I'm working on a 2D animation app in the browser. I can currently easily export animations as a sequence of frames, but not being able to export them as an mp4 file is pretty limiting. The format needs to be mp4 as the videos are meant to be imported in other programs that unfortunately only support mp4.

The other options I have left are

  • Ask my users to use Safari instead (!), as MediaRecorder on Safari does support mp4. Unfortunately my users typically use Chrome, and there are a number of other things that are better supported in Chrome, like file system access and multiscreen support.
  • Use MediaRecorder with something like ffmpeg-wasm, but last time I checked it was around 25 MB, which seems overkill for my app, which is otherwise only around 0.5 MB (and it would only support real-time encoding anyway).
  • Ask my users to download the animation as a sequence of frames, and to then use QuickTime to turn it into a video. Not ideal, obviously.

I'm not sure if it should be in this or another specification, but it seems like a pretty important missing use case. If the reason for not including it is that it can already be done in JavaScript, please link to a library that can actually do it.

dalecurtis commented on July 4, 2024

FWIW, MP4 support for MediaRecorder is being worked on in Chrome. You can follow along here: https://bugs.chromium.org/p/chromium/issues/detail?id=1072056

What went wrong with mp4box.js exactly? https://github.com/gpac/mp4box.js/blob/master/test/qunit-iso-creation.js shows how to handle creation. I think the only thing you might need to tweak is the segment size.

guillaumebrunerie commented on July 4, 2024

FWIW, MP4 support for MediaRecorder is being worked on in Chrome. You can follow along here: https://bugs.chromium.org/p/chromium/issues/detail?id=1072056

Great to hear, thanks for the link!

What went wrong with mp4box.js exactly? https://github.com/gpac/mp4box.js/blob/master/test/qunit-iso-creation.js shows how to handle creation. I think the only thing you might need to tweak is the segment size.

Actually I think it is mux.js that I have tried, not mp4box.js. It has a test file https://github.com/videojs/mux.js/blob/main/test/mp4-generator.test.js with a very promising name, but all the code there goes pretty deep into box types and things like that, so I could not manage to create a working mp4 file from that.
I did not know about this mp4box example, but I'll definitely give it another try, thank you!

dalecurtis commented on July 4, 2024

Thanks for sharing! Codec strings can be pretty annoying. Here are some good references if you haven't seen them:
https://developer.mozilla.org/en-US/docs/Web/Media/Formats/codecs_parameter
https://cconcolato.github.io/media-mime-support/

KevinBoeing commented on July 4, 2024

I am currently working on a video editor that runs entirely in the Chrome browser. I am struggling with the demuxing and muxing process. Mp4Box.js works great, but unfortunately it only allows .mp4 containers to be demuxed. Is there already a demuxer that allows you to demux any container type? I was thinking of ffmpeg.wasm (I saw Clipchamp uses it too), but since it's a CLI tool, I have no idea how to use it as an all-in-one demuxer in JavaScript. Is it even possible? The end result of the demuxing process should be to have all the EncodedVideoChunks in one array.
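
For .mp4 specifically, the usual mp4box.js pattern looks roughly like this (a simplified sketch: mp4box.js assumed loaded, whole file in memory, first video track only):

// Collect every sample of the first video track as EncodedVideoChunks.
const demuxMp4ToChunks = (arrayBuffer, onDone) => {
  const chunks = [];
  const mp4File = MP4Box.createFile();

  mp4File.onError = (e) => console.error(e);
  mp4File.onReady = (info) => {
    const track = info.videoTracks[0];
    // Ask for all of the track's samples in a single onSamples callback.
    mp4File.setExtractionOptions(track.id, null, { nbSamples: track.nb_samples });
    mp4File.start();
  };
  mp4File.onSamples = (trackId, user, samples) => {
    for (const sample of samples) {
      chunks.push(new EncodedVideoChunk({
        type: sample.is_sync ? "key" : "delta",
        timestamp: (sample.cts * 1_000_000) / sample.timescale, // microseconds
        duration: (sample.duration * 1_000_000) / sample.timescale,
        data: sample.data,
      }));
    }
    onDone(chunks);
  };

  arrayBuffer.fileStart = 0; // mp4box.js requires a fileStart property on the buffer
  mp4File.appendBuffer(arrayBuffer);
  mp4File.flush();
};

To actually decode these chunks you still need the track's avcC box for the VideoDecoder description, which is the part that requires digging into the container format.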

KevinBoeing commented on July 4, 2024

That's exactly what I needed. Thanks!

bartadaniel commented on July 4, 2024

I also work on a fully in-browser video editing experience. While I made it work with ffmpeg+webcodecs, I can see massive value in adding something like the WebContainers API. Maybe my angle is a little different here; I had edge cases where I had to do workarounds and fixups on the demuxed streams before I could feed them to the VideoDecoder. Some examples:

  • I had to handle negative presentation timestamps in streams. The negative timestamp is not a problem for the VideoDecoder, but it results in the VideoFrame having a different timestamp than what I gave when calling decode.
  • For some videos, the extradata I receive from ffmpeg is missing some information that is available somewhere in the stream I guess. Without that, the VideoDecoder fails.

Obviously, these things are out of the scope of WebCodecs. But I assume all these touches are already written somewhere in the major browsers because those problematic videos play just fine in Chrome. If I could have access to the video and audio stream in a way that the browser thinks it's appropriate for decoding, that would be a game-changer.

dalecurtis commented on July 4, 2024

Maybe unsurprisingly, edge cases are one of the reasons we wouldn't want to do this. The argument being that containers are all edge cases, and an external library meets those needs best. It's likely that if we did undertake this, the API would be limited to very common muxing and demuxing scenarios, e.g., playback with seeking and basic recording scenarios.

IIRC, decoded timestamps should just pass through per spec: https://w3c.github.io/webcodecs/#output-videoframes -- If you're not seeing that, please file an issue with the respective UA. https://crbug.com/new for Chromium.

aboba commented on July 4, 2024

Can we close this issue?

padenot commented on July 4, 2024

I think so.

chrisn commented on July 4, 2024

I think so too, but it's worth keeping track of developer interest, so I have created w3c/media-and-entertainment#108; anyone still interested is welcome to comment there.
