Comments (4)

pdeljanov commented on May 28, 2024

If I understand correctly, you're using FFmpeg to demux your audio and then passing the packets to Symphonia for decoding. If so, the issue is that the example is trying to do both demuxing and decoding with Symphonia.

When you have a WAVE file that contains uncompressed PCM audio, the audio packets are just the raw PCM samples. So when Symphonia tries to determine the audio format in order to select a demuxer (aka a FormatReader), it lands on an MP3 demuxer because the raw PCM audio data just so happened to look like an MP3 file.
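To illustrate how raw PCM can "look like" MP3, here is a minimal sketch that uses only the Rust standard library. It checks for the 11-bit MPEG audio frame sync pattern (a 0xFF byte followed by a byte whose top three bits are set). The function names are hypothetical, and this is not how Symphonia's probe is implemented; a real probe does more validation than this. The point is only that ordinary sample values can produce the pattern.

```rust
/// Serializes 16-bit samples as little-endian bytes (pcm_s16le layout).
fn pcm_s16le_bytes(samples: &[i16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_le_bytes()).collect()
}

/// Returns the offset of the first byte pair matching the 11-bit MPEG audio
/// frame sync pattern: 0xFF followed by a byte with its top three bits set.
/// Hypothetical helper for illustration only.
fn find_mpeg_sync(buf: &[u8]) -> Option<usize> {
    buf.windows(2)
        .position(|w| w[0] == 0xFF && (w[1] & 0xE0) == 0xE0)
}
```

For example, the sample value -1 is stored as the bytes FF FF, so `find_mpeg_sync(&pcm_s16le_bytes(&[0, -1, -2, 256]))` reports a match at offset 2 even though the data is plain PCM.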

Essentially, you want to skip the demuxing step with Symphonia entirely. To do this you can instantiate a PcmDecoder manually and convert the FFmpeg packets to Symphonia Packets.

I didn't quite understand why you need FFmpeg to do the demuxing, though. Take a look at the symphonia-check utility. It spawns an ffmpeg process and pipes the output WAVE to Symphonia. This works fine because Symphonia does the demuxing. Perhaps that utility can give you some ideas?

As for why AAC works: your ffmpeg command line for AAC creates an AAC file using the ADTS format. It looks like FFmpeg passes the ADTS packets straight through to the decoder without removing the ADTS headers. Therefore, since Symphonia has an ADTS demuxer, it is able to select the ADTS demuxer and everything works as intended.
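As a rough illustration of why packets that keep their ADTS headers are easy to recognize, the sketch below parses the fixed part of an ADTS header using only the standard library. The struct and function names are hypothetical and this is not Symphonia's ADTS demuxer; it only follows the publicly documented ADTS bit layout (12-bit syncword 0xFFF, layer bits zero, 13-bit frame length).

```rust
/// A few fields from an ADTS header (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
struct AdtsHeader {
    /// Index into the standard sampling-frequency table (3 = 48 kHz, 4 = 44.1 kHz).
    sample_rate_index: u8,
    /// Channel configuration (2 = stereo).
    channel_config: u8,
    /// Total frame length in bytes, header included.
    frame_len: usize,
    /// Header size: 7 bytes, or 9 bytes when a CRC follows.
    header_len: usize,
}

fn parse_adts_header(buf: &[u8]) -> Option<AdtsHeader> {
    if buf.len() < 7 {
        return None;
    }
    // Syncword 0xFFF (12 bits) and layer == 0; the MPEG version bit is ignored.
    if buf[0] != 0xFF || (buf[1] & 0xF6) != 0xF0 {
        return None;
    }
    let protection_absent = (buf[1] & 0x01) == 1;
    let header_len = if protection_absent { 7 } else { 9 };
    let sample_rate_index = (buf[2] >> 2) & 0x0F;
    let channel_config = ((buf[2] & 0x01) << 2) | (buf[3] >> 6);
    // 13-bit frame length spread over bytes 3, 4 and 5.
    let frame_len = (((buf[3] & 0x03) as usize) << 11)
        | ((buf[4] as usize) << 3)
        | ((buf[5] >> 5) as usize);
    if frame_len < header_len {
        return None;
    }
    Some(AdtsHeader { sample_rate_index, channel_config, frame_len, header_len })
}
```

Fed the bytes `FF F1 50 80 2E 7F FC`, this yields a 44.1 kHz (index 4), stereo frame of 371 bytes with a 7-byte header. A packet whose header has been stripped would fail the syncword check, which is consistent with the explanation above that detection works because the headers are passed through.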

from symphonia.

Lighty0410 commented on May 28, 2024

Thank you A LOT. Now it works :).
I have one question left.
There's the function to create a packet from a slice:
pub fn new_from_slice(track_id: u32, ts: u64, dur: u64, buf: &[u8])
If I fill everything except the buffer with 0s, it works fine. However, I'm curious whether that could affect anything in my case, since I'm manually decoding each and every packet?

> I didn't quite understand why you need FFmpeg to do the demuxing, though. Take a look at the symphonia-check utility. It spawns an ffmpeg process and pipes the output WAVE to Symphonia. This works fine because Symphonia does the demuxing. Perhaps that utility can give you some ideas?

I'm demuxing a .wav file here for the sake of simplicity. In a real-life scenario I'm demuxing an MPEG-TS stream, transcoding the AAC elementary stream to pcm_s16le, decoding it to f32 samples using Symphonia, and then feeding it to whisper-gcc in real time. By the way, this is the prototype; maybe it will be useful for someone in the future:
prototype
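For readers following along, the pcm_s16le to f32 step in that pipeline can be sketched with the standard library alone. This is not the linked prototype and not Symphonia's PcmDecoder; the function name is hypothetical, and the scaling by 32768 is one common convention that is assumed here.

```rust
/// Converts interleaved little-endian 16-bit PCM bytes to f32 samples in
/// the range [-1.0, 1.0). A trailing odd byte, if any, is ignored.
/// Hypothetical helper; the divisor 32768 is an assumed convention.
fn s16le_to_f32(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(2)
        .map(|b| i16::from_le_bytes([b[0], b[1]]) as f32 / 32768.0)
        .collect()
}
```

For example, the bytes `00 00 00 80 FF 7F` decode to the samples 0.0, -1.0 and 32767/32768.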

pdeljanov commented on May 28, 2024

> If I fill everything except the buffer with 0s, it works fine. However, I'm curious whether that could affect anything in my case, since I'm manually decoding each and every packet?

This is fine for PcmDecoder. However, some other decoders (generally for the lossy codecs) use the duration to trim the decoded audio buffer for gapless playback.
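To make the trimming idea concrete, here is a generic sketch of how a decoder might use a packet's duration to cut a decoded buffer down to its valid frames. The function and its parameters are hypothetical and this is not Symphonia's code; how Symphonia's lossy decoders actually treat a zero duration is not stated in this thread.

```rust
/// Keeps only the first `dur` frames of an interleaved sample buffer.
/// `channels` is the number of interleaved channels, `dur` the packet
/// duration in frames. Hypothetical helper for illustration only.
fn trim_to_duration(samples: &mut Vec<f32>, channels: usize, dur: u64) {
    let keep = (dur as usize).saturating_mul(channels);
    // `truncate` is a no-op when `keep` exceeds the current length.
    samples.truncate(keep);
}
```

In this sketch, a final stereo packet that decodes to 1024 frames but has a duration of 600 would be cut to 1200 samples, while a duration of 0 would discard the whole buffer. That is the kind of effect a placeholder duration could have with a decoder that trims, whereas a decoder that ignores the field is unaffected.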

> I'm demuxing a .wav file here for the sake of simplicity. In a real-life scenario I'm demuxing an MPEG-TS stream, transcoding the AAC elementary stream to pcm_s16le, decoding it to f32 samples using Symphonia, and then feeding it to whisper-gcc in real time. By the way, this is the prototype; maybe it will be useful for someone in the future:

Ah, I see. So you want to use ffmpeg because Symphonia doesn't have an MPEG-TS demuxer?

Lighty0410 commented on May 28, 2024

> Ah, I see. So you want to use ffmpeg because Symphonia doesn't have an MPEG-TS demuxer?

Yeah, exactly.

> This is fine for PcmDecoder. However, some other decoders (generally for the lossy codecs) use the duration to trim the decoded audio buffer for gapless playback.

Thanks for the answer!
