I've been following the general discussion around WebCodecs, and the need for a media container API seems to be recognised, but I thought I'd add my own use case as an example.
I maintain a library waveform-data.js that produces data for waveform visualisation from audio.
This uses Web Audio's decodeAudioData(), which has well-known problems: it runs on the main thread, so UI updates stall during decoding; it requires the entire encoded audio to be held in memory; it gives no progress indication, so I can't tell how long decoding will take; and there's no way to cancel the decode.
For this use case, the simplest solution would be to allow decodeAudioData() to run from a worker context, with an extended API that allows progress notifications and cancellation.
WebCodecs also solves these issues, but introduces a new one. Because the library is generic, it accepts any audio format that decodeAudioData supports. So to use WebCodecs, the library would have to include code to parse all the container formats, or define an API that moves container parsing to users of the library. Both options increase the amount of JavaScript that needs to be delivered, unnecessarily so, because parsing containers is a capability the browser already has. Leaving container parsing to library users would also make the library much harder to use.
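To make the use case concrete: the reduction such a waveform library performs once decoded samples are available is simple; the difficulty is entirely in obtaining those samples off the main thread. A rough sketch of the core computation (illustrative only, not waveform-data.js's actual code):

```javascript
// Min/max peaks per bucket of decoded PCM samples, the core of a
// waveform visualisation. `samples` is a plain array or Float32Array.
function waveformPeaks(samples, samplesPerPixel) {
  const peaks = [];
  for (let i = 0; i < samples.length; i += samplesPerPixel) {
    let min = Infinity;
    let max = -Infinity;
    const end = Math.min(i + samplesPerPixel, samples.length);
    for (let j = i; j < end; j++) {
      if (samples[j] < min) min = samples[j];
      if (samples[j] > max) max = samples[j];
    }
    peaks.push([min, max]);
  }
  return peaks;
}
```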
from webcodecs.
The concern I have with containers is that it exposes a lot of API and specification surface but does not unlock anything new that can't already be done (pretty efficiently even) today.
Following the principles of the Extensible Web Manifesto (https://extensiblewebmanifesto.org/), we should focus on delivering low-level codecs first. Only after that's done (or in parallel by a different group) consider tackling containers.
from webcodecs.
It's something that can be done in JS/wasm
While there is code that can take input images and write a WebM file (https://github.com/GoogleChromeLabs/webm-wasm; https://github.com/thenickdude/webm-writer-js), neither repository's author is interested in implementing audio writing to the same output file:
thenickdude/webm-writer-js#8 (comment)
I don't have any plans to work on adding audio, sorry, and I'm not sure where best to begin either (it probably depends on what format you can capture the audio in and what environment you expect to run in, Chrome, arbitrary browser, Electron, etc).
It seems this project doesn't support encoding audio+video yet, just video-only? Is this feasible, or would it be better to just use the more heavyweight ffmpeg.js project for this?
GoogleChromeLabs/webm-wasm#12 (comment)
Yeah, there’s no support for audio and I don’t have any plans to add it. This project was born out of the lackluster capabilities of MediaStreamRecorder. ffmpeg.js is definitely one choice. But if you already have an encoded audio and video stream, an mkv muxer might do. That would be a lot faster and smaller. Hope this helps!
It seems this project doesn't support encoding audio+video yet, just video-only? Is this feasible, or would it be better to just use the more heavyweight ffmpeg.js project for this?
There is an implementation which is capable of writing audio as Opus to a WebM container https://github.com/kbumsik/opus-media-recorder/.
Meaning this repository can reach maturity without a corresponding container writer of comparable maturity being available to write the output of WebCodecs to a file.
Thus, simply because "JS/wasm" exists does not mean that implementations exist to meet the requirements described at Key use-cases, particularly
- Non-realtime encoding/decoding/transcoding, such as for local file editing.
from webcodecs.
I managed to make MP4Box work with WebCodecs! See code below and a working example at https://codepen.io/Latcarf/pen/NWBmJVw.
The main thing I am still very confused about is the codec string. I couldn’t find a single example of a valid H.264 codec string on MDN, and after some trial and error I settled on avc1.64003d (found somewhere online), which seems to mostly work, but I have very little understanding of what it means (even after trying to read everything I can find about profiles and levels). It also doesn’t always work; for instance, if you change the size of the video to 200×200, it fails with a rather cryptic DOMException: Encoding error (without any further explanation).
It would be great if there were a catch-all codec string like h264 (or avc1, or mp4) meaning "choose whatever avc1.xxxxxx codec string you believe is most appropriate", or at the very least some examples on MDN, like "if you want H.264 HD video choose this, if you want a small H.264 video choose that". I guess the MediaRecorder API already chooses an appropriate codec string on behalf of the user based on the size of the canvas, so it would be great if WebCodecs could do the same.
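For what it's worth, the avc1.PPCCLL format is defined by ISO/IEC 14496-15: PP is the profile_idc, CC the constraint flags, and LL ten times the level, each as a hex byte. A small decoder makes avc1.64003d less opaque (a sketch with abbreviated profile names, not a validator):

```javascript
// Decode the three bytes of an avc1.PPCCLL codec string.
function parseAvcCodecString(codec) {
  const match = /^avc1\.([0-9a-fA-F]{6})$/.exec(codec);
  if (!match) throw new Error(`not an avc1.PPCCLL string: ${codec}`);
  const profileNames = { 66: "Baseline", 77: "Main", 88: "Extended", 100: "High", 110: "High 10" };
  const profile = parseInt(match[1].slice(0, 2), 16);
  return {
    profile: profileNames[profile] ?? `profile_idc ${profile}`,
    constraintFlags: parseInt(match[1].slice(2, 4), 16),
    level: parseInt(match[1].slice(4, 6), 16) / 10, // 0x3d = 61 -> Level 6.1
  };
}
```

So avc1.64003d asks for High profile at Level 6.1, and avc1.42001f (a common conservative choice) for Baseline at Level 3.1; the level constrains the maximum resolution, frame rate, and bitrate the encoder may assume.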
It also seems we cannot create the track upfront, because it needs metadata.decoderConfig.description (and I have no idea what that contains). That’s not a big issue, but it is a bit hard to guess.
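For AVC, that description buffer is an AVCDecoderConfigurationRecord, the same structure MP4 stores in the avcC box, which is why it can be handed straight to mp4box.js. Its fixed-layout head is easy to inspect (a sketch based on the ISO/IEC 14496-15 layout):

```javascript
// Peek at the fixed-layout head of an AVCDecoderConfigurationRecord.
// `bytes` is a Uint8Array over metadata.decoderConfig.description.
function peekAvcC(bytes) {
  if (bytes[0] !== 1) throw new Error("expected configurationVersion 1");
  return {
    profile: bytes[1],                    // AVCProfileIndication (e.g. 100 = High)
    compatibility: bytes[2],              // profile_compatibility flags
    level: bytes[3],                      // AVCLevelIndication, level * 10
    nalLengthSize: (bytes[4] & 0x03) + 1, // bytes per NAL length prefix in samples
  };
}
```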
Here is my function doing the encoding:
const encodeFramesToMP4 = async ({width, height, fps, frames, renderFrame}) => {
  const f = MP4Box.createFile();
  let track = null;
  const frameDuration = 1_000_000 / fps;
  const encoder = new VideoEncoder({
    output: (chunk, metadata) => {
      // The track can only be created once the first chunk arrives,
      // because it needs the decoder config description.
      if (track === null) {
        track = f.addTrack({
          timescale: 1_000_000,
          width,
          height,
          avcDecoderConfigRecord: metadata.decoderConfig?.description,
        });
      }
      const buffer = new ArrayBuffer(chunk.byteLength);
      chunk.copyTo(buffer);
      f.addSample(track, buffer, {duration: frameDuration});
    },
    error: (error) => {
      throw error;
    },
  });
  encoder.configure({
    codec: "avc1.64003d",
    width,
    height,
  });
  for (let i = 0; i < frames; i++) {
    const frame = new VideoFrame(renderFrame(i), {timestamp: i * frameDuration});
    encoder.encode(frame);
    frame.close();
  }
  await encoder.flush();
  encoder.close();
  return f;
};
And here is how it is used. We simply create an OffscreenCanvas and provide a function that can draw a given frame.
const renderExampleVideo = async () => {
  const width = 600;
  const height = 600;
  const canvas = new OffscreenCanvas(width, height);
  const file = await encodeFramesToMP4({
    width,
    height,
    fps: 30,
    frames: 30, // duration in frames
    renderFrame: (i) => {
      // ...
      // draw frame #i on the canvas
      // ...
      return canvas;
    },
  });
  file.save("Example.mp4");
};
Feel free to let me know if there is any issue in my code or anything that could be improved.
from webcodecs.
For all containers you'd definitely need something like ffmpeg.wasm. #549 shows how this might work. Even if browsers had a containers API, it'd likely only support the formats they already parse (mp4, webm, ogg, etc.).
from webcodecs.
I agree completely. That's a great way to put it.
from webcodecs.
- Non-realtime encoding/decoding/transcoding, such as for local file editing
- Decoding and encoding images
- Reencoding multiple input media streams in order to merge many encoded media streams into one encoded media stream.
and potentially
- Live stream uploading
each could be considered an API that, at least in part, performs some form of file editing in code. A file could be considered a "container" when technically compared to
- Extremely low latency live streaming (<3s delay)
- Cloud gaming
- Advanced Real-time Communications:
-- e2e encryption
-- control over buffer behavior
-- spatial and temporal scalability
and potentially
- Live stream uploading
where any and all of the above is encompassed within the use case of recording media into a container, whether that "container" is an array of images with an index element for "metadata" (width, height, frame duration, or other adjustments made "mid-stream" or post-production) in a .json file or, if preferred, a Matroska or WebM "container", for download of both the entire procedure output by WebCodecs and specific time slices into a single "container" (file structure).
Since the topic is at hand and the maintainers of this repository span a wide range of topics, it might be helpful to create a glossary pointing to exactly what you (this repository) mean by the terms you are using.
- Direct APIs for media containers (muxers/demuxers)
indicates the technical proximity of "codecs" to "containers".
Whether or not WebCodecs includes the reading, writing, and editing of both "codecs" and "containers", that internal decision will not change the actual use case: a single API capable of both codec and container creation, extension, and modification, which avoids having to combine what is available across different specifications that were never designed to interoperate.
One recent reason not to omit "container" support alongside "codec" support is that the two are symbiotic. A single API, designed and maintained with that consideration at the forefront, has the potential to solve more than one existing issue where separate pieces of code in the same domain (for example, media) can produce very different output due to different authors' intent at the time they wrote and merged the code: time passes, and new technologies that resolve one issue become an issue themselves because implementers may or may not be in accord across the various branches of "media". A single API covering media streams and files from creation to editing to production is the explicit use case.
from webcodecs.
Example use case: write audio to a WebM file https://plnkr.co/edit/Inb676?p=preview. Ideally, from a front-end perspective, this single API should be able to encode VP8 and Opus to a file, or if Opus is missing from an existing file, write the audio to the file.
from webcodecs.
(or in parallel by a different group)
What is necessary to start such a group? Posting the proposal at https://discourse.wicg.io? (Note: I am not a member of the W3C.)
from webcodecs.
FWIW https://discourse.wicg.io/t/webmediacontainers-proposal/3928
from webcodecs.
To write down what was said during TPAC, this might be very important to avoid the proliferation of badly muxed files. Muxing is rather hard to get right.
A possible solution might be a vouched library, as noted above, but there is always the problem of updating it for bug fixes.
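As a small taste of why muxing is hard to get right: even WebM's lowest-level primitive, the EBML variable-size integer, packs a length marker into its own leading bits, and the all-ones bit pattern is reserved for "unknown size". A sketch of an encoder (per RFC 8794; illustrative, not from any shipping muxer):

```javascript
// Encode an EBML vint: the position of the leading 1 bit in the first
// octet gives the total width; the remaining bits carry the value.
function encodeEbmlVint(value) {
  let width = 1;
  // All value bits set is reserved ("unknown size"), so widen at the boundary.
  while (value >= 2 ** (7 * width) - 1 && width < 8) width++;
  const bytes = new Uint8Array(width);
  for (let i = width - 1; i >= 0; i--) {
    bytes[i] = value % 256;
    value = (value - bytes[i]) / 256;
  }
  bytes[0] |= 0x80 >> (width - 1); // set the length marker bit
  return bytes;
}
```

Off-by-one mistakes here (for example, emitting the reserved all-ones pattern for 127) produce files that some players tolerate and others reject, which is exactly the proliferation problem described above.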
from webcodecs.
Triage note: marking 'extension', as this would clearly be a new API.
from webcodecs.
@chrisn thanks for the use case.
I'm a little torn. I find the argument about existing demuxers persuasive, but less so on the muxing side. Browsers have long shipped full-featured, compiled-in demuxers for <video>
and MSE, but for muxing I think the only example is MediaRecorder, and the files it produces are pretty basic. For example, I don't think we currently ship a muxer that could produce a fragmented MP4.
We've found JS demuxing performance is quite good. Performance being equal, there are some advantages to JS, like rapid extensibility and perfect interoperability. My hope is that the download hit is largely amortized away by caching. WDYT?
But the JS answer rings a little hollow, because the available libraries are pretty limited right now. If folks like the idea, we could organize a community/WG effort to build and centralize them.
from webcodecs.
I think containers are an entirely separate API from WebCodecs. The interfaces and processing model are likely entirely different from WebCodecs. E.g., it's likely a streams based API would work very well for containers. It will also need its own containers registry which describes per-container behavior.
IMHO, the options for solving this use case are:
- Do nothing and let the JS ecosystem for this flourish.
- Create an entirely new WebContainers API for muxing/demuxing in a new spec.
- Extend MediaSourceExtensions for demuxing and MediaRecorder for muxing in their respective specs.
from webcodecs.
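To illustrate how tractable option 1 can be for simple formats: IVF (the raw VP8/VP9/AV1 container) is a 32-byte file header plus a 12-byte header per frame, so a complete JS demuxer fits in a few lines. A sketch producing the kind of per-frame payloads an EncodedVideoChunk wants (field offsets per the IVF layout; not a hardened parser):

```javascript
// Minimal IVF demuxer. `bytes` is a Uint8Array over the whole file.
function demuxIvf(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const signature = String.fromCharCode(...bytes.subarray(0, 4));
  if (signature !== "DKIF") throw new Error('not an IVF file ("DKIF" missing)');
  const headerSize = view.getUint16(6, true); // normally 32
  const codec = String.fromCharCode(...bytes.subarray(8, 12)); // e.g. "VP80"
  const width = view.getUint16(12, true);
  const height = view.getUint16(14, true);
  const frames = [];
  let offset = headerSize;
  while (offset + 12 <= bytes.byteLength) {
    const size = view.getUint32(offset, true);
    const timestamp = view.getUint32(offset + 4, true); // low half of 64-bit PTS
    frames.push({ timestamp, data: bytes.subarray(offset + 12, offset + 12 + size) });
    offset += 12 + size;
  }
  return { codec, width, height, frames };
}
```

Real containers (MP4, WebM) are of course far more involved, which is where the edge cases accumulate.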
I agree with both @dalecurtis and @chcunningham. Gecko also runs in-content-process WASM demuxers for security reasons (essentially libogg compiled to WASM, running in-process), which confirms the findings linked above. This has shipped in release for a few versions without a single problem reported.
I prefer options 1 and 2 in @dalecurtis's comment, and this can be a gradual solution (1, then 2 if really needed).
Option 3 I like less: those objects are not at the same abstraction level, and MediaRecorder only supports real-time media (not offline processing), although Gecko implements a proprietary extension that allows encoding faster than real-time, which we only use for testing (not exposed to the web, of course).
from webcodecs.
Is there a list of container projects? I've written a WebM muxer but this doesn't seem like the right place to keep track of them.
from webcodecs.
I ended up writing a quick explainer for the third bullet in #24 (comment) (Extending MediaRecorder for muxing):
https://github.com/dalecurtis/mediarecorder-muxer/blob/main/explainer.md
Have your thoughts in #24 (comment) changed at all now that WebCodecs is more fleshed out @padenot?
At least internally folks don't seem to hate it. It's only targeted towards simple use cases as a hedge against a more complete containers API (which we (Chromium) are unlikely to undertake anytime soon). It looks like a fairly small implementation delta. Is this interesting at all?
cc: @youennf
from webcodecs.
Did anyone find a way to generate mp4 files client-side in Chrome? I tried all possible ways, but couldn't find one that works:
- MediaRecorder only supports real-time encoding, and doesn't support mp4 in Chrome on Mac anyway (only mkv and webm).
- WebCodecs doesn't give any way to save the encoded stream into an mp4 file.
- I tried WebCodecs + mp4box.js, but it seems to require a pretty deep understanding of the MP4 format (boxes and so on), which I don't have; I just want a "save to mp4" function.
I'm working on a 2D animation app in the browser, I can currently easily export animations as a sequence of frames but not being able to export them as an mp4 file is pretty limiting. The format needs to be mp4 as the videos are meant to be imported in other programs that unfortunately only support mp4.
The other options I have left are
- Ask my users to use Safari instead (!), as MediaRecorder on Safari does support mp4. Unfortunately my users typically use Chrome, and there are a number of other things that are better supported in Chrome, like file system access and multi-screen support.
- Use MediaRecorder with something like ffmpeg-wasm, but last time I checked it was around 25 MB, which seems overkill for my app, which is otherwise only around 0.5 MB (and it would only support real-time encoding anyway).
- Ask my users to download the animation as a sequence of frames, and to then use QuickTime to turn it into a video. Not ideal, obviously.
I'm not sure whether it should be in this or another specification, but it seems like a pretty important missing use case. If the reason for not including it is that it can already be done in JavaScript, please link to a library that can actually do it.
from webcodecs.
FWIW, MP4 support for MediaRecorder is being worked on in Chrome. You can follow along here: https://bugs.chromium.org/p/chromium/issues/detail?id=1072056
What went wrong with mp4box.js exactly? https://github.com/gpac/mp4box.js/blob/master/test/qunit-iso-creation.js shows how to handle creation. I think the only thing you might need to tweak is the segment size.
from webcodecs.
FWIW, MP4 support for MediaRecorder is being worked on in Chrome. You can follow along here: https://bugs.chromium.org/p/chromium/issues/detail?id=1072056
Great to hear, thanks for the link!
What went wrong with mp4box.js exactly? https://github.com/gpac/mp4box.js/blob/master/test/qunit-iso-creation.js shows how to handle creation. I think the only thing you might need to tweak is the segment size.
Actually I think it is mux.js that I have tried, not mp4box.js. It has a test file https://github.com/videojs/mux.js/blob/main/test/mp4-generator.test.js with a very promising name, but all the code there goes pretty deep into box types and things like that, so I could not manage to create a working mp4 file from that.
I did not know about this mp4box example, but I'll definitely give it another try, thank you!
from webcodecs.
Thanks for sharing! Codec strings can be pretty annoying. Here are some good references if you haven't seen them:
https://developer.mozilla.org/en-US/docs/Web/Media/Formats/codecs_parameter
https://cconcolato.github.io/media-mime-support/
from webcodecs.
I am currently working on a video editor that runs entirely in the Chrome browser, and I am struggling with the demuxing and muxing process. MP4Box.js works great, but unfortunately it only allows .mp4 containers to be demuxed. Is there already a demuxer that can demux any container type? I was thinking of ffmpeg.wasm (I saw Clipchamp uses it too), but since it's a CLI tool, I have no idea how to use it as an all-in-one demuxer in JavaScript. Is it even possible? The end result of the demuxing process should be an array of all the EncodedVideoChunks.
from webcodecs.
That's exactly what I needed. Thanks!
from webcodecs.
I also work on a fully in-browser video editing experience. While I made it work with ffmpeg+webcodecs, I can see massive value in adding something like the WebContainers API. Maybe my angle is a little different here; I had edge cases where I had to do workarounds and fixups on the demuxed streams before I could feed them to the VideoDecoder. Some examples:
- I had to handle negative presentation timestamps in streams. A negative timestamp is not a problem for the VideoDecoder, but it results in the VideoFrame having a different timestamp than the one I passed to decode().
- For some videos, the extradata I receive from ffmpeg is missing some information that is available somewhere in the stream I guess. Without that, the VideoDecoder fails.
Obviously, these things are out of the scope of WebCodecs. But I assume all these touches are already written somewhere in the major browsers because those problematic videos play just fine in Chrome. If I could have access to the video and audio stream in a way that the browser thinks it's appropriate for decoding, that would be a game-changer.
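For what it's worth, the negative-timestamp fixup can be as simple as rebasing every sample before constructing chunks; `pts` below is whatever the demuxer reports, in microseconds (a sketch, not the editor's actual workaround):

```javascript
// Shift all samples so the earliest presentation timestamp becomes zero,
// keeping the relative spacing between frames intact.
function normalizeTimestamps(samples) {
  const shift = Math.max(0, -Math.min(...samples.map((s) => s.pts)));
  return samples.map((s) => ({ ...s, pts: s.pts + shift }));
}
```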
from webcodecs.
Maybe unsurprisingly, edge cases are one of the reasons we wouldn't want to do this. The argument being that containers are all edge cases, and an external library meets those needs best. It's likely that if we did undertake this, the API would be limited to very common muxing and demuxing scenarios, e.g. playback with seeking and basic recording scenarios.
IIRC, decoded timestamps should just pass through per spec: https://w3c.github.io/webcodecs/#output-videoframes -- If you're not seeing that, please file an issue with the respective UA. https://crbug.com/new for Chromium.
from webcodecs.
Can we close this issue?
from webcodecs.
I think so.
from webcodecs.
I think so too, but worth keeping track of developer interest, so have created w3c/media-and-entertainment#108 - anyone still interested is welcome to comment there.
from webcodecs.