
pdeljanov / symphonia

Pure Rust multimedia format demuxing, tag reading, and audio decoding library

License: Mozilla Public License 2.0

Rust 99.86% Python 0.14%
aac adpcm alac apple-lossless audio audio-decoder flac id3v1 id3v2 m4a media mkv mp2 mp3 mp4 ogg pcm rust vorbis wav

symphonia's People

Contributors

101100, 5225225, aentity, aschey, be-ing, blackholefox, dedobbin, djugei, erikas-taroza, felixmcfelix, geckoxx, gnomeddev, herohtar, herschel, james7132, jasonlg1979, jesnor, karlri, nukeop, pdeljanov, perxjoh, richardmitic, sagudev, shnatsel, techno-coder, tenzap, terrorfisch, thomcc, timmmm, udoprog

symphonia's Issues

AIFF support

Uncompressed PCM data is sometimes stored in AIFF files rather than WAV. It would be nice to add support for this.

Panic in symphonia-format-ogg/src/demuxer.rs:229:9

These files cause a panic when fed to symphonia-play --decode-only: ogg-panic.tar.gz

thread 'main' panicked at 'assertion failed: self.pages.header().is_first_page', symphonia-format-ogg/src/demuxer.rs:229:9

These are fuzzer-generated files I had lying around from back when I fuzzed the lewton crate.

Tested on commit 7bbb8aa

Overflow error in MediaSourceStream

With a certain .flac file I managed to cause this error. In the release profile the panic does not appear, but decoding enters an infinite loop instead. Note that I'm on Windows 10 with the MSVC toolchain. The error below is from sonata-play, but I experienced the same in a separate project that only decodes the file.

thread 'main' panicked at 'attempt to subtract with overflow', sonata-core\src\io\media_source_stream.rs:384:12
stack backtrace:
   0: backtrace::backtrace::trace_unsynchronized
             at C:\Users\VssAdministrator\.cargo\registry\src\github.com-1ecc6299db9ec823\backtrace-0.3.46\src\backtrace\mod.rs:66
   1: std::sys_common::backtrace::_print_fmt
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\sys_common\backtrace.rs:78
   2: std::sys_common::backtrace::_print::{{impl}}::fmt
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\sys_common\backtrace.rs:59
   3: core::fmt::write
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libcore\fmt\mod.rs:1076
   4: std::io::Write::write_fmt<std::sys::windows::stdio::Stderr>
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\io\mod.rs:1537
   5: std::sys_common::backtrace::_print
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\sys_common\backtrace.rs:62
   6: std::sys_common::backtrace::print
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\sys_common\backtrace.rs:49
   7: std::panicking::default_hook::{{closure}}
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\panicking.rs:198
   8: std::panicking::default_hook
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\panicking.rs:218
   9: std::panicking::rust_panic_with_hook
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\panicking.rs:486
  10: std::panicking::begin_panic_handler
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\panicking.rs:388
  11: core::panicking::panic_fmt
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libcore\panicking.rs:101
  12: core::panicking::panic
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libcore\panicking.rs:56
  13: sonata_core::io::media_source_stream::{{impl}}::read_quad_bytes
             at .\sonata-core\src\io\media_source_stream.rs:384
  14: sonata_core::io::{{impl}}::read_quad_bytes
             at .\sonata-core\src\io\mod.rs:307
  15: sonata_core::io::scoped_stream::{{impl}}::read_quad_bytes
             at .\sonata-core\src\io\scoped_stream.rs:108
  16: sonata_core::io::ByteStream::read_be_u32
             at .\sonata-core\src\io\mod.rs:200
  17: sonata_utils_xiph::flac::metadata::read_picture_block<sonata_core::io::scoped_stream::ScopedStream<mut sonata_core::io::media_source_stream::MediaSourceStream*>>
             at .\sonata-utils-xiph\src\flac\metadata.rs:470
  18: sonata_codec_flac::demuxer::read_all_metadata_blocks
             at .\sonata-codec-flac\src\demuxer.rs:339
  19: sonata_codec_flac::demuxer::{{impl}}::try_new
             at .\sonata-codec-flac\src\demuxer.rs:85
  20: sonata_codec_flac::demuxer::{{impl}}::query::{{closure}}
             at .\sonata-core\src\probe.rs:326
  21: core::ops::function::FnOnce::call_once<closure-0,(sonata_core::io::media_source_stream::MediaSourceStream, sonata_core::formats::FormatOptions*)>
             at C:\Users\csany\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\src\libcore\ops\function.rs:232
  22: sonata_core::probe::Probe::format
             at .\sonata-core\src\probe.rs:295
  23: sonata_play::main
             at .\sonata-play\src\main.rs:118
  24: std::rt::lang_start::{{closure}}<()>
             at C:\Users\csany\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\src\libstd\rt.rs:67
  25: std::rt::lang_start_internal::{{closure}}
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\rt.rs:52
  26: std::panicking::try::do_call
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\panicking.rs:297
  27: std::panicking::try
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\panicking.rs:274
  28: std::panic::catch_unwind
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\panic.rs:394
  29: std::rt::lang_start_internal
             at /rustc/5c1f21c3b82297671ad3ae1e8c942d2ca92e84f2\/src\libstd\rt.rs:51
  30: std::rt::lang_start<()>
             at C:\Users\csany\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib\rustlib\src\rust\src\libstd\rt.rs:67
  31: main
  32: invoke_main
             at d:\A01\_work\6\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
  33: __scrt_common_main_seh
             at d:\A01\_work\6\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
  34: BaseThreadInitThunk
  35: RtlUserThreadStart

Files used for benchmarks are not published

In absence of files used for the benchmarks listed in README.md it's impossible to reproduce the performance measurement results. This hinders attempts to improve performance further.

Please share the files used for the measurements.

[Task] The Decoder trait needs a reset function

Currently, instantiating a new Decoder is required after a seek. This can be expensive for some codecs, such as AAC or Vorbis, which do a lot of pre-computation to set up the decoder. Reset would only reset the state necessary to start decoding the same stream from a new position.

This issue is for stubbing out that reset function in all the current decoders.
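As a rough illustration of the distinction between one-time setup and per-stream state, here is a minimal sketch. `ToyDecoder`, its fields, and its arithmetic are all hypothetical and are not Symphonia's actual `Decoder` trait or any real codec; the point is only that `reset` clears state that depends on previously decoded packets while leaving the expensive table untouched.

```rust
/// Hypothetical decoder used only to illustrate the idea of `reset`.
/// This is not Symphonia's `Decoder` trait.
pub struct ToyDecoder {
    /// Expensive pre-computed table (stands in for window or codebook setup).
    table: Vec<f32>,
    /// Per-stream state that depends on previously decoded packets.
    overlap: Vec<f32>,
    /// How many times the table was built; kept only to make the cost visible.
    pub table_builds: u32,
}

impl ToyDecoder {
    /// One-time setup: builds the table and zeroes the stream state.
    pub fn new(n: usize) -> Self {
        let table = (0..n).map(|i| (i as f32 * 0.5).sin()).collect();
        ToyDecoder { table, overlap: vec![0.0; n], table_builds: 1 }
    }

    /// Decodes one "packet". The output depends on the previous packet
    /// through `overlap`, which is why a seek needs some kind of reset.
    pub fn decode(&mut self, input: &[f32]) -> Vec<f32> {
        let mut out = Vec::with_capacity(input.len());
        for ((x, t), o) in input.iter().zip(&self.table).zip(self.overlap.iter_mut()) {
            out.push(x * t + *o);
            *o = *x;
        }
        out
    }

    /// Clears only the stream-dependent state; the table is reused as-is.
    pub fn reset(&mut self) {
        self.overlap.iter_mut().for_each(|s| *s = 0.0);
    }
}
```

After `reset`, decoding the same packet gives the same output as on a freshly constructed decoder, without rebuilding the table.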

ALAC support

Hi, I’m really excited to see this project! I have some experience with the ALAC codec and am eager to help. What would be the best way to collaborate?

Status of OPUS Codec

What is the status of work on the Opus codec? I see it marked as next in the README. Is there an ETA for release?

Matroska / WebM support

Hello.

I noticed you have plans to add support for MKV / WebM containers. Is there anybody actively working on this? I think I might be able to find some time to help.

read_id3v2 cannot read chapter markers from file

I could have missed where this data might show up, but I'm testing with some sample podcasts from Mac Power Users (e.g. https://www.podtrac.com/pts/redirect.mp3/traffic.libsyn.com/secure/relaympu/mpu554.mp3).

Dumping out the tags from the Metadata gives me this:

k: TALB, v: Mac Power Users
k: TPE1, v: Mac Power Users
k: TIT2, v: 554: Read-it-later Services
k: COMM!eng, v: Read-it-later services can be a great way to save and enjoy an article later, away from the noise of social media or an overflowing RSS client. This week, David and Stephen talk about some of the popular choices, and how to keep them from becoming just another thing to check. Then, a recap of Apple’s recent media event.
k: USLT!eng, v: Read-it-later services can be a great way to save and enjoy an article later, away from the noise of social media or an overflowing RSS client. This week, David and Stephen talk about some of the popular choices, and how to keep them from becoming just another thing to check. Then, a recap of Apple’s recent media event.
k: TLEN, v: 5997000
k: TYER, v: 2020
k: TENC, v: Forecast

Whereas ffprobe on the same file will output:

Input #0, mp3, from 'mpu554.mp3':
  Metadata:
    album           : Mac Power Users
    artist          : Mac Power Users
    title           : 554: Read-it-later Services
    comment         : Read-it-later services can be a great way to save and enjoy an article later, away from the noise of social media or an overflowing RSS client. This week, David and Stephen talk about some of the popular choices, and how to keep them from becoming just an
    lyrics-eng      : Read-it-later services can be a great way to save and enjoy an article later, away from the noise of social media or an overflowing RSS client. This week, David and Stephen talk about some of the popular choices, and how to keep them from becoming just an
    TLEN            : 5997000
    encoded_by      : Forecast
    date            : 2020
  Duration: 01:39:57.11, start: 0.000000, bitrate: 128 kb/s
    Chapter #0:0: start 0.000000, end 466.287000
    Metadata:
      title           : MPU 554
    Chapter #0:1: start 466.287000, end 964.340000
    Metadata:
      title           : Read-it-Later Services?
    Chapter #0:2: start 964.340000, end 1597.615000
    Metadata:
      title           : Safari Reading List
    Chapter #0:3: start 1597.615000, end 3042.409000
    Metadata:
      title           : Third-Party Services
    Chapter #0:4: start 3042.409000, end 3345.281000
    Metadata:
      title           : What We’re Using
    Chapter #0:5: start 3345.281000, end 4158.417000
    Metadata:
      title           : David’s Research Workflow
    Chapter #0:6: start 4158.417000, end 5997.000000
    Metadata:
      title           : Apple’s September
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 128 kb/s
    Stream #0:1: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 1400x1400, 90k tbr, 90k tbn, 90k tbc (attached pic)
    Metadata:
      comment         : Cover (front)

mp3 playing twice

I have a few mp3 files like this, where it plays the file and then plays it again. The file is 2:17, but it's played for about 4:34:
Kill Paris feat. Big Gigantic & Jimi Tents - Fizzy Lifting Drink.mp3.zip

I'm guessing there's something about how it's encoded that causes Symphonia to trip up, but no idea what that could be.

Cargo.toml

[package]
name = "mp3test"
version = "0.1.0"
edition = "2018"

[dependencies]
rodio = { git = "https://github.com/RustAudio/rodio", rev = "d40551d", features = ["symphonia-mp3"] }

main.rs

use rodio::{Decoder, OutputStream, Sink};
use std::fs::File;
use std::io::BufReader;
use std::sync::mpsc::channel;
use std::sync::mpsc::TryRecvError;
use std::thread;
use std::time::{Duration, Instant};

fn main() {
    let path = "Kill Paris feat. Big Gigantic & Jimi Tents - Fizzy Lifting Drink.mp3";
    let file = File::open(path).unwrap();
    let buf = BufReader::new(file);

    let decoder = Decoder::new(buf).unwrap();

    let output_stream = OutputStream::try_default();
    let (_stream, handle) = output_stream.unwrap();
    let sink = Sink::try_new(&handle).unwrap();

    sink.append(decoder);
    let (send, recv) = channel();
    thread::spawn(move || {
        sink.set_volume(0.25);
        let start = Instant::now();
        sink.play();
        sink.sleep_until_end();
        let dur = start.elapsed().as_secs_f64();
        println!("Ended, took {:.3}s", dur);
        send.send("End").unwrap();
    });

    let start = Instant::now();
    loop {
        match recv.try_recv() {
            Ok("End") => break,
            Err(TryRecvError::Empty) => {}
            _ => panic!(),
        };
        let dur = start.elapsed().as_secs_f64();
        println!("\u{23f1}  {:.3}s", dur);
        thread::sleep(Duration::from_millis(1000));
    }
}

ADPCM support

I'd like to use Symphonia with some WAV files containing ADPCM data. Specifically, I'd like to support the "Microsoft ADPCM" (RIFF format 0x02) [1] and "IMA/DVI ADPCM" (RIFF format 0x11) variants. There are also possibly a couple of extensions [2] that I need to support.

Just these two account for most of the ADPCM I've encountered in the wild, and to some extent round out the set of codecs that you most commonly find in WAV files, IME. The various other variants can of course be added later when/if needed; I think there are a couple that will likely be useful (like QuickTime's "IMA4 ADPCM").

Anyway, I'd be willing to provide the code for those two (I have some parsing code lying around for them, and encoding is easy enough as well).
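For reference, the format IDs mentioned above are the 16-bit format tag stored in a WAV file's `fmt ` chunk. The sketch below maps the well-known tag values to a codec; the `WavCodec` enum and `codec_from_format_tag` function are hypothetical names for illustration and are not part of Symphonia.

```rust
/// Hypothetical codec classification for a WAV `fmt ` chunk format tag.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum WavCodec {
    Pcm,
    MsAdpcm,
    IeeeFloat,
    ImaAdpcm,
    /// Any tag this sketch does not recognise, carried through unchanged.
    Unsupported(u16),
}

/// Maps a WAVE format tag to a codec.
pub fn codec_from_format_tag(tag: u16) -> WavCodec {
    match tag {
        0x0001 => WavCodec::Pcm,       // WAVE_FORMAT_PCM
        0x0002 => WavCodec::MsAdpcm,   // WAVE_FORMAT_ADPCM ("Microsoft ADPCM")
        0x0003 => WavCodec::IeeeFloat, // WAVE_FORMAT_IEEE_FLOAT
        0x0011 => WavCodec::ImaAdpcm,  // WAVE_FORMAT_DVI_ADPCM ("IMA/DVI ADPCM")
        other => WavCodec::Unsupported(other),
    }
}
```

Keying on the numeric tag rather than a name sidesteps the naming ambiguity between the various ADPCM flavors.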

The reason I'm filing an issue first (rather than just a PR) is:

  1. To make sure you'd accept it.

  2. I am unsure where they should live. Or rather, I suspect the answer is "A new symphonia-codec-adpcm crate", but it seems worth asking about this first.

    I think there's a strong possibility you considered the existence of ADPCM when writing the WAV code, and possibly have thoughts on where it should live, but I could be wrong

  3. So I have somewhere to ask questions when I inevitably get lost due to my vague-at-best understanding of how Symphonia is structured 😅

One final note is that while libavcodec (the codec library behind ffmpeg) has pretty thorough support for ADPCM, ffmpeg's usage of it used to introduce many strange bugs. So, I'm a little concerned about testing against it, as with symphonia-check... I guess we'll cross that bridge if it becomes an issue, though, since it may have been fixed in the 3 years since I last tried.


(footnotes)

  1. I'm specifying the format ID because pretty much every name you could use to describe the different flavors of ADPCM is somewhat ambiguous. There are at least 3 codecs reasonably called "DVI ADPCM" and two reasonably called "Microsoft ADPCM", and these have overlap (it's kind of a disaster).

  2. Specifically, the two important extensions are:

    Note that I'm unsure that these are actually extensions.

symphonia-play: CPAL ring buffer usage is incorrect

In the CPAL output code for symphonia-play, the usage of ring buffer's write_blocking is incorrect:

// Write as many samples as possible to the ring buffer. This blocks until some
// samples are written or the consumer has been destroyed (None is returned).
if let Some(written) = self.ring_buf_producer.write_blocking(writeable_samples) {
    i += written;
}
else {
    // Consumer destroyed, return an error.
    return Err(AudioOutputError::StreamClosedError);
}

According to the rb docs for write_blocking, a None return value only means that the given slice had zero length. It has nothing to do with whether the consumer has been dropped, as the code in symphonia-play implies. In practice this doesn't matter in this case because the code never calls write_blocking with an empty slice, so None is never returned, but the usage here is misleading.

Here's an alternative way to write the sample buffer to the ring buffer that I'm using in my project:

sample_buffer.copy_interleaved_ref(decoded);
let mut samples = sample_buffer.samples();
while let Some(written) = self.ring_buf_producer.write_blocking(samples) {
    samples = &samples[written..];
}

[Task] Rename Decoder::close to Decoder::finalize

There's really nothing being closed here so this function makes no sense as-is. However, some Decoders need to return a result for optional features such as verification.

Therefore, change the Decoder trait from:

trait Decoder {
    // ..
    fn close(&mut self);
}

to

struct FinalizeResult {
    verify_ok: bool,
}

trait Decoder {
    // ..
    fn finalize(&mut self) -> FinalizeResult;
}

The documentation should note that calling finalize is optional but necessary for some features such as verification.

Some additional work will need to be done to update FlacDecoder to return its verification result in FinalizeResult. Currently, the FLAC FormatReader reads the expected MD5 checksum of the decoded audio, and the FLAC Decoder calculates the MD5 checksum of what it decodes. The FinalizeResult provided by the FlacDecoder should set verify_ok = <calculated md5> == <expected MD5>. Therefore, the expected MD5 sum needs to be passed to FlacDecoder somehow. This could be done through the CodecParameters.
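A minimal sketch of how the comparison could look, assuming the expected digest is handed to the decoder at construction time. `VerifyingDecoder` and its methods are hypothetical stand-ins, not the real FlacDecoder, and the MD5 computation itself is left out: the calculated digest is assumed to be produced elsewhere and stored with `set_calculated_md5`.

```rust
/// Result returned by `finalize`, as proposed in the issue.
pub struct FinalizeResult {
    pub verify_ok: bool,
}

/// Hypothetical decoder that knows the expected digest up front.
pub struct VerifyingDecoder {
    /// Expected MD5 of the decoded audio (e.g. passed in via codec parameters).
    expected_md5: [u8; 16],
    /// MD5 of what was actually decoded; computed elsewhere in this sketch.
    calculated_md5: [u8; 16],
}

impl VerifyingDecoder {
    pub fn new(expected_md5: [u8; 16]) -> Self {
        VerifyingDecoder { expected_md5, calculated_md5: [0; 16] }
    }

    /// Stores the digest of the decoded audio once decoding is complete.
    pub fn set_calculated_md5(&mut self, digest: [u8; 16]) {
        self.calculated_md5 = digest;
    }

    /// Optional to call, but required to learn the verification outcome.
    pub fn finalize(&mut self) -> FinalizeResult {
        FinalizeResult { verify_ok: self.calculated_md5 == self.expected_md5 }
    }
}
```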

symphonia-play crashes on Windows 10 if the audio file's sample rate doesn't match the output device's default sample rate

If I try to play an audio file that has a sample rate that is different than the sample rate that is set as the default format for my audio device, symphonia-play crashes. I'm guessing that this might not strictly be a bug, per se, but just a side effect of opening the default audio device with its default configuration. If so, I think to resolve this it would have to check the sample rate of the file before opening the device, then try to open the device using a matching sample rate. Is that correct?

 ERROR symphonia_play::output::cpal > audio output stream open error: The requested stream configuration is not supported by the device.
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: OpenStreamError', symphonia-play\src\main.rs:248:74
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: process didn't exit successfully: `target\release\symphonia-play.exe D:\test.flac` (exit code: 101)
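The check suggested above (look at the file's sample rate first, then decide how to open the device) can be sketched without any audio API. Everything here is hypothetical: `RateChoice`, `choose_rate`, and the `(min, max)` pairs stand in for whatever supported-configuration ranges the output backend reports, and falling back to resampling is one possible policy, not what symphonia-play does.

```rust
/// Hypothetical outcome of matching a file's sample rate against a device.
#[derive(Debug, PartialEq)]
pub enum RateChoice {
    /// Open the device at the file's own sample rate.
    Native(u32),
    /// The device cannot run at the file's rate; resample to the device default.
    Resample { from: u32, to: u32 },
}

/// `supported` holds inclusive (min, max) sample-rate ranges reported by the device.
pub fn choose_rate(file_rate: u32, supported: &[(u32, u32)], device_default: u32) -> RateChoice {
    let native_ok = supported
        .iter()
        .any(|&(min, max)| (min..=max).contains(&file_rate));

    if native_ok {
        RateChoice::Native(file_rate)
    } else {
        RateChoice::Resample { from: file_rate, to: device_default }
    }
}
```

Either branch avoids unwrapping a failed stream open, which is what produces the panic shown in the log.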

Artifact when decoding MP3

symphonia-bundle-mp3 decodes the following MP3 incorrectly:

dash-fx.zip
(32-bit float WAV of symphonia output included).

Expected waveform:
image

Symphonia output:
image

Configure CI

Configuring CI would have caught the compilation issue I resolved in #10. I don't care whether the project uses GitHub Actions or another system, but I would strongly recommend setting up a minimal pipeline to at least run tests on every commit.

I can easily configure GitHub Actions if this is something the project would like to see done.

MediaSourceStream is not Send

Due to wrapping a Box<dyn MediaSource> instead of Box<dyn MediaSource + Send>, MediaSourceStream is not safe to send across thread boundaries. Is this intentional?
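The effect of the missing bound can be reproduced with standard-library types only. In this sketch `Source` and `Stream` are hypothetical stand-ins for `MediaSource` and `MediaSourceStream`; with the `+ Send` bound on the boxed trait object the stream can be moved into a worker thread, and without it the `thread::spawn` call would be rejected by the compiler.

```rust
use std::io::Read;
use std::thread;

/// Hypothetical stand-in for `MediaSource`.
pub trait Source: Read {}
impl<T: Read> Source for T {}

/// Hypothetical stand-in for `MediaSourceStream`.
pub struct Stream {
    // Changing this to `Box<dyn Source>` makes `Stream` non-Send,
    // and `read_on_worker` below would no longer compile.
    inner: Box<dyn Source + Send>,
}

impl Stream {
    pub fn new(inner: Box<dyn Source + Send>) -> Self {
        Stream { inner }
    }
}

/// Moves the stream to another thread and reads it to the end there.
pub fn read_on_worker(mut stream: Stream) -> Vec<u8> {
    thread::spawn(move || {
        let mut buf = Vec::new();
        stream.inner.read_to_end(&mut buf).expect("read failed");
        buf
    })
    .join()
    .expect("worker thread panicked")
}
```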

Decoded frame count doesn't match the file's frame count

First of all, thanks for making and sharing this library, I'm very happy to have discovered it, kudos!

I've noticed some behaviour that I don't understand while decoding some .wav files (I haven't checked all of their formats, but typically they're stereo 16bit, 44.1k, PCM).

Is it expected that the sum of all decoded AudioBufferRef frame counts should match the n_frames value defined in the track's codec params? I'm finding that in the last decoded packet, there are sometimes extra frames in the buffer, which then contain data that isn't in the file (e.g. Each buffer's frame count is 1158 through the file, then the final buffer contains 541 frames, when there are only 118 frames remaining according to the count).
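Whether or not the extra frames are expected, a caller can guard against them by clamping each decoded buffer to the number of frames still owed according to `n_frames`. The helper below is a hypothetical caller-side sketch and is not part of Symphonia's API.

```rust
/// Returns how many frames of a freshly decoded buffer should be kept,
/// given the track's total frame count and the frames consumed so far.
pub fn frames_to_keep(n_frames: u64, frames_so_far: u64, buffer_frames: usize) -> usize {
    // Frames still owed by the track; never negative.
    let remaining = n_frames.saturating_sub(frames_so_far);
    (buffer_frames as u64).min(remaining) as usize
}
```

With the numbers from the example above, a final buffer of 541 frames with 118 frames remaining is trimmed to 118.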

I can share an example if you like but I figured I'd better check first if I'm just misunderstanding something in the API.

FLAC: Mismatch with max delta from 0.085 to 1.998

These files exhibit mismatches compared to ffmpeg:

1.25170898 01 - Give It Away (In Progress).flac
1.93112183 05 - Киберпанк.flac
1.59304810 06. The Sunshine.flac
1.98730469 03 Roll Us A Giant.flac
1.72723389 10. The Parts.flac
1.08828735 07. The Grocery.flac
0.08523428 04- The Hours - Arr. Michael Riesman - Morning Passages.flac
1.66995239 16. Olivia's Doom (Chad Mossholder Remix).flac
1.92807007 09 - Путь прост.flac
1.92960131 40. Stumpy.flac
1.99769855 3-Blue Jeans.flac
1.92474365 Robert Rich - Echo of Small Things.flac

Moreover, one of the files triggers an error:

ERROR symphonia_core::probe > reached probe limit of 1048576 bytes.Test interrupted by error: unsupported feature: core (probe): no suitable reader found 01. You Are Light (feat. Felicia Farerre).flac

aac file doesn't play in rodio with error «invalid data»

I can play the file with ffplay or the Windows default decoder. When I played the file in symphonia-play, the error occurred at line 370 of aac.rs:

validate!(
    (self.window_sequence == ONLY_LONG_SEQUENCE)
        || (self.window_sequence == LONG_START_SEQUENCE)
);

I attached a sample of the first 1 second
error.zip

Panic in symphonia-codec-vorbis/src/dsp.rs:134:28

The attached file produces a panic: 02 - Shirashikkur.ogg.gz (gzipped so that github would accept it)

thread 'main' panicked at 'slice index starts at 18446744073709551168 but ends at 128', symphonia-codec-vorbis/src/dsp.rs:134:28

Backtrace:

stack backtrace:
   0: rust_begin_unwind
             at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:515:5
   1: core::panicking::panic_fmt
             at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/panicking.rs:92:14
   2: core::slice::index::slice_index_order_fail
             at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/slice/index.rs:48:5
   3: symphonia_codec_vorbis::dsp::DspChannel::synth
   4: <symphonia_codec_vorbis::VorbisDecoder as symphonia_core::codecs::Decoder>::decode
   5: symphonia_play::decode_only
   6: symphonia_play::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Tested with symphonia-play --decode-only on commit 7bbb8aa

Not getting length of files and other decode errors

I'm trying to create a realtime-safe disk-streaming algorithm for audio for our DAW project. https://github.com/RustyDAW/rt-audio-disk-stream

I figured Symphonia would be perfect for this. However, I can't seem to get it to work. I've created various test files in Audacity, and it's returning None as the number of frames in the file for some codecs like wav and mp3, and other codecs like ogg fail to decode.

I've created a test program with my decoder struct that I plan to use.
symphonia_test.zip

This program outputs this for me:

[src/main.rs:25] file = "./test_files/wav_u8_mono.wav"
[src/main.rs:34] e = NoNumFrames
[src/main.rs:25] file = "./test_files/wav_i16_mono.wav"
[src/main.rs:34] e = NoNumFrames
[src/main.rs:25] file = "./test_files/wav_i24_mono.wav"
[src/main.rs:34] e = NoNumFrames
[src/main.rs:25] file = "./test_files/wav_i32_mono.wav"
[src/main.rs:34] e = NoNumFrames
[src/main.rs:25] file = "./test_files/wav_f32_mono.wav"
[src/decoder.rs:115] e = "pcm: unknown bits per (coded) sample."
[src/main.rs:34] e = Format(
    IoError(
        Custom {
            kind: UnexpectedEof,
            error: "end of stream",
        },
    ),
)
[src/main.rs:25] file = "./test_files/wav_i24_stereo.wav"
[src/main.rs:34] e = NoNumFrames
[src/main.rs:25] file = "./test_files/ogg_mono.ogg"
[src/main.rs:34] e = Format(
    IoError(
        Custom {
            kind: UnexpectedEof,
            error: "end of stream",
        },
    ),
)
[src/main.rs:25] file = "./test_files/ogg_stereo.ogg"
[src/main.rs:34] e = Format(
    IoError(
        Custom {
            kind: UnexpectedEof,
            error: "end of stream",
        },
    ),
)
[src/main.rs:25] file = "./test_files/mp3_constant_mono.mp3"
[src/main.rs:34] e = NoNumFrames
[src/main.rs:25] file = "./test_files/mp3_constant_stereo.mp3"
[src/main.rs:34] e = NoNumFrames
[src/main.rs:25] file = "./test_files/mp3_variable_mono.mp3"
[src/main.rs:34] e = NoNumFrames
[src/main.rs:25] file = "./test_files/mp3_variable_stereo.mp3"
[src/main.rs:34] e = NoNumFrames

Compliance Testing Utility

Create a new bin crate, sonata-test, that accepts two input files: a WAV reference file, and a test file (any format). The application should decode both files and check if they are sample-for-sample equivalent. Samples that are not the same should have their timestamp printed.

Using the application it should be possible to, for example, compare an FFMPEG decoded MP3 file (converted to WAV) with Sonata's built-in decoder.

FLAC reference files could also be used since the FLAC decoder can verify if it has decoded the file correctly via the FLAC file's internal MD5 checksum.
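The core comparison described above could be sketched as follows, assuming both files have already been decoded to interleaved `f32` samples with the same channel count and sample rate. `Mismatch` and `find_mismatches` are hypothetical names, and the comparison is exact equality, as "sample-for-sample equivalent" suggests.

```rust
/// One differing sample, located by frame, channel, and time.
#[derive(Debug, PartialEq)]
pub struct Mismatch {
    pub frame: usize,
    pub channel: usize,
    pub seconds: f64,
}

/// Compares two interleaved sample buffers and reports every differing sample.
/// Only the common prefix is compared; a length difference must be checked separately.
pub fn find_mismatches(
    reference: &[f32],
    test: &[f32],
    channels: usize,
    sample_rate: u32,
) -> Vec<Mismatch> {
    assert!(channels > 0 && sample_rate > 0);

    reference
        .iter()
        .zip(test.iter())
        .enumerate()
        .filter(|(_, (a, b))| a != b)
        .map(|(i, _)| {
            let frame = i / channels;
            Mismatch {
                frame,
                channel: i % channels,
                // Timestamp of the frame containing the mismatch.
                seconds: frame as f64 / f64::from(sample_rate),
            }
        })
        .collect()
}
```

A real tool would likely also want a tolerance option for lossy codecs, where bit-exact output across decoders is not guaranteed.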

WAV: Error probing file that can be played by VLC

I have code that ran Probe::format on a file using the default format options and metadata options which resulted in an error malformed stream: wav: chunk length exceeds parent (list) chunk length. I'm not sure about the details of this error, but this file can be played fine in VLC. So I guess I have two questions:

  1. Would providing non-defaults for the format or metadata options possibly fix this?
  2. If not, is there a way that the probe can work around this error since the media seems to be readable by other media players?

An async version of Symphonia

Hello!

I'm currently investigating the use of Symphonia as a dependency in a pet project of my own. Trouble is, my project is based on Tokio and Futures so I'm working out if I can make Symphonia async. Is there any interest in this with the Symphonia devs? I'm considering adding an "async" feature which will not be default; I believe that the async code can be added without changing the current source, i.e. by just adding new source enabled by the async feature although I need to complete my investigation before being sure.

Basic examples

Hello,

Found your library while surveying the landscape of Rust audio decoders. I'm trying to decode arbitrary audio codec data into raw bytes for use in Alto.

Unfortunately, I've spent about half an hour looking through the code and can't seem to figure out what my entrypoint should be. Mind pointing me to where I'd look to, say, ingest arbitrary bytes and decode them? Or, if I need to know the incoming content type first, where I'd go to do that? I'm working with a game engine that hands me raw bytes, though it also gives me a path from which I can perform reasonable content-type detection. I'm

Thanks!

Add library to compare similarity of music files

I want to add support to my app for finding similar music (by content, not tags).

For images I use img_hash, which generates a 64-bit (and later maybe even 128-bit) hash that I store in a BK-tree, a structure that allows looking up similar hashes (similar images produce similar hashes).

I wanted to use chromaprint for this, but since my app is multiplatform and also supports Windows and macOS, there could be problems with shipping the app on those OSes and with music decoding.

Encoding support

Hello,
Are there plans to make Symphonia able to encode PCM data into various codecs/formats?

Implementing `Decoder` for external codecs

I'm currently trying to implement the Decoder trait around libopus (via the audiopus crate) for a downstream project. We're trying to move from being dependent on ffmpeg, and currently support some uncommon Opus framing mechanism and PCM in our mixing/decoding setup.

However, there are some differences between Symphonia's buffer formats and those output by libopus. In particular:

  • A stereo libopus decoder interleaves both audio planes, while AudioBuffer only allows stacked planes per frame. A consequence is that while I can get a large enough &mut [f32] to write into via buf.chan_mut(0) on a mono buffer, this interferes with correctly combining this with any other buffers. It can be wrangled into the correct planar format by simply creating a Vec<f32> and manually copying into the slices given by buf.chan_pair_mut(0, 1), but this probably adds quite an overhead (especially if the target mixing SampleBuffer must be interleaved to be re-encoded by libopus).
  • libopus can decode to stereo or mono independent of the number of channels in a Track, according to whatever the user needs (at the time the OpusDecoder is created).
  • Missed packets can be passed in to Opus in general: I assume a Packet from a null slice would suffice here though.
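The manual copy mentioned in the first bullet amounts to de-interleaving. A minimal sketch, where `left` and `right` stand for the two channel slices (in the setup described above they would come from `buf.chan_pair_mut(0, 1)`); the function name is hypothetical:

```rust
/// Splits interleaved stereo samples (L, R, L, R, ...) into two planar slices.
/// Returns the number of frames copied, limited by the shortest of the three buffers.
pub fn deinterleave_stereo(interleaved: &[f32], left: &mut [f32], right: &mut [f32]) -> usize {
    let frames = (interleaved.len() / 2).min(left.len()).min(right.len());

    for (i, frame) in interleaved.chunks_exact(2).take(frames).enumerate() {
        left[i] = frame[0];
        right[i] = frame[1];
    }

    frames
}
```

This is one extra pass over the samples per packet, which is the overhead the bullet is concerned about.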

(I haven't tried to implement DCA framing yet, but see no reason why it wouldn't work.)

Naturally, I could avoid registering a Symphonia codec entirely by just special-case handling any Opus tracks (which would also be used for e.g., direct packet passthrough), or taking the performance/memory hit to get data in and out of Symphonia's representation. But I'm wondering whether the following are future considerations for the design of the library:

  • Support for interleaved channels in AudioBuffer. While this makes it easier to interface with external codecs (not a project goal to my understanding), it also should reduce overhead if the user knows that an interleaved output is needed. Is there any way to reconcile this with the Signal trait, given that this would prevent retrieving a plane as a contiguous slice? How much additional work would this be for the existing codecs?
  • Passing in information about a target buffer--whether it is stereo/mono, f32/i16. Is DecoderOptions the right struct for this?

MP3: Mismatch with max delta from 0.002 to 2.0

These files show max delta ranging from 0.002 to 2.0 when compared against ffmpeg with symphonia-check.

Tested on commit 33f0701 (latest as of this writing).

As we have learned in #72, ffmpeg has divergences with other decoders, so running the check on these files against other decoders seems prudent.

The files are from the Pony Music Archive corpus and are free to redistribute. I have run the check on half of it because I don't have enough disk space for the entire thing. This amounts to roughly 90Gb of MP3 (12831 individual files).

FLAC: mismatches with max delta > 1.9 on a handful of packets per file

These 3 FLAC files exhibit large mismatches (max delta > 1.9) compared to ffmpeg.

Only a handful of packets (1 to 10) are affected, and the affected packets are not sequential, so it does not appear to be caused by misalignment.

Tested on commit f05f310 (the latest one as of this writing).

Only these 3 files out of ~10,000 (~400Gb) of FLAC from the Pony Music Archive corpus show discrepancies compared to ffmpeg. The entire corpus is 900Gb (a mix of MP3 and FLAC), but I don't have enough storage for that locally, so I've only tested the first 400Gb of it.

Investigate using ReadBytesExt for Bytestream

The byteorder crate provides the ReadBytesExt trait which implements typical bytestream functions. Sonata has a fairly identical set of functions in the Bytestream trait. Since byteorder is a well-regarded and popular crate, it may make sense to transition to byteorder's ReadBytesExt, or delegate the basic read functions in Sonata's Bytestream to ReadBytesExt.

Concerns

The ReadBytesExt trait is implemented automatically for anything that implements Read. Since MediaSourceStream implements Read then it should be trivial to get ReadBytesExt working. However, this also means that ReadBytesExt is relying on read_exact to get data from MediaSourceStream. This is likely more expensive than Sonata's Bytestream implementation on MediaSourceStream which can directly pull bytes from the internal buffer. This may be particularly bad in the case of read_u8. Since Sonata's BitReader depends on read_u8, performance must not be degraded.
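To make the concern concrete, here is a sketch of the two paths using only the standard library. `read_be_u32_via_read` shows what any `Read`-based extension has to do (go through `read_exact` into a temporary array), while `BufferedStream` is a hypothetical, much simplified stand-in for a stream that can pull bytes straight from its internal buffer. It illustrates the shape of the two approaches and makes no claim about their measured cost.

```rust
use std::io::{self, Read};

/// Generic path: works for any `Read`, but must go through `read_exact`.
pub fn read_be_u32_via_read<R: Read>(reader: &mut R) -> io::Result<u32> {
    let mut bytes = [0u8; 4];
    reader.read_exact(&mut bytes)?;
    Ok(u32::from_be_bytes(bytes))
}

/// Hypothetical buffered stream with direct access to its own buffer.
pub struct BufferedStream {
    buf: Vec<u8>,
    pos: usize,
}

impl BufferedStream {
    pub fn new(buf: Vec<u8>) -> Self {
        BufferedStream { buf, pos: 0 }
    }

    /// Direct path: reads four bytes from the internal buffer at the current position.
    pub fn read_be_u32(&mut self) -> io::Result<u32> {
        let bytes = self
            .buf
            .get(self.pos..)
            .and_then(|rest| rest.get(..4))
            .ok_or_else(|| io::Error::new(io::ErrorKind::UnexpectedEof, "end of stream"))?;

        let value = u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
        self.pos += 4;
        Ok(value)
    }
}
```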

Additionally, the solution should allow for future additions to the bytestream. ReadBytesExt has no direct analogue to scan_bytes, ignore_bytes, and read_boxed_slice_bytes. We may also want to support other functions on the bytestream, such as peeking or rewinding.

What's Needed

  • Rewrite the implementation of the Read trait on MediaSourceStream to have less overhead.
  • Replace the integer and floating point read functions on Sonata's Bytestream with byteorder's ReadBytesExt.
  • Update codecs and demuxers.
  • Benchmark performance against Sonata's existing Bytestream.

Criteria For Replacement

Since this is largely for developer comfort, and since the learning curve is minimal for Sonata's Bytestream, it is not worthwhile to make this change if performance is degraded. The criteria for replacement will therefore be equal or better performance.

Incorrect value used to construct SampleWriter

SampleWriter::from_buf uses an unsafe code block to create a mutable slice, with buf.capacity() as the length; however, it seems the intention is for the referenced slice to be of length n_samples. It appears that the method is always called with n_samples equal to buf.capacity(), but it could cause unintended behavior if the code is ever called with a different value.
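One way to remove the ambiguity is to make the slice length depend on n_samples alone. The sketch below is not the crate's actual SampleWriter: the struct layout and the write and len methods are assumptions for illustration. It also swaps the unsafe block for a safe clear and resize, which zero-fills the buffer and so costs more than the original approach.

```rust
/// Hypothetical writer modeled on the description above.
pub struct SampleWriter<'a> {
    buf: &'a mut [f32],
    next: usize,
}

impl<'a> SampleWriter<'a> {
    /// Builds a writer over exactly `n_samples` samples, regardless of
    /// how much capacity `buf` happens to have.
    pub fn from_buf(n_samples: usize, buf: &'a mut Vec<f32>) -> Self {
        // Safe alternative to an unsafe slice over spare capacity:
        // the vector's length becomes n_samples, zero-initialized.
        buf.clear();
        buf.resize(n_samples, 0.0);
        SampleWriter { buf: buf.as_mut_slice(), next: 0 }
    }

    /// Number of samples this writer may hold.
    pub fn len(&self) -> usize {
        self.buf.len()
    }

    /// Writes one sample; returns false once the writer is full.
    pub fn write(&mut self, sample: f32) -> bool {
        match self.buf.get_mut(self.next) {
            Some(slot) => {
                *slot = sample;
                self.next += 1;
                true
            }
            None => false,
        }
    }
}
```

If the unsafe version is kept for performance, asserting that n_samples does not exceed buf.capacity() and passing n_samples as the slice length would address the same mismatch.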

Seeking to a position other than zero decodes from the wrong position

In creating my real-time disk-streaming crate, I've noticed that calling reader.seek() doesn't start decoding from the correct position; instead, it seems to start decoding a bit after the requested position. I've only tested this on WAV files so far.

This is the code I have for my seek_to function:

pub fn seek_to(&mut self, frame: usize) -> Result<(), ReadError> {
    self.current_frame = frame;

    let seconds = self.current_frame as f64 / f64::from(self.sample_rate.unwrap_or(44100));

    self.reader.seek(SeekTo::Time {
        time: seconds.into(),
    })?;

    self.reset_smp_buffer = true;
    self.curr_smp_buf_i = 0;

    Ok(())
}

I noticed that if frame is 0, then it correctly starts decoding from the start of the file. For any frame other than 0, it starts decoding from a bit ahead of the given frame.

I've checked the output of reader.seek(), and it returns the same number of seconds as I put in for a WAV file. I know I'll have to keep track of this for other codecs, but I'm just trying to get WAV to work for now.

Maybe this has something to do with the reader.seek() function not using the proper sample rate?
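As a diagnostic aid, the offset can be measured in frames rather than seconds. This is a minimal sketch, assuming the position the reader actually landed on can be obtained as a frame count, which this report does not confirm. Landing, frame_to_seconds and classify_landing are hypothetical names and not part of the library.

```rust
use std::cmp::Ordering;

/// Where the reader landed relative to the requested frame.
#[derive(Debug, PartialEq, Eq)]
pub enum Landing {
    /// Reader landed exactly on the requested frame.
    Exact,
    /// Reader landed early: decode and discard this many frames.
    Early { frames_to_discard: u64 },
    /// Reader landed late: these frames were skipped over.
    Late { frames_missed: u64 },
}

/// Same conversion as in seek_to above, with an explicit sample rate.
pub fn frame_to_seconds(frame: u64, sample_rate: u32) -> f64 {
    frame as f64 / f64::from(sample_rate)
}

/// Compares the requested frame with the frame actually reached.
pub fn classify_landing(requested_frame: u64, actual_frame: u64) -> Landing {
    match actual_frame.cmp(&requested_frame) {
        Ordering::Equal => Landing::Exact,
        Ordering::Less => Landing::Early {
            frames_to_discard: requested_frame - actual_frame,
        },
        Ordering::Greater => Landing::Late {
            frames_missed: actual_frame - requested_frame,
        },
    }
}
```

A Late result for a WAV file would point at the seek itself rather than at the caller's buffer handling, since discarding samples cannot recover frames that were skipped.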

symphonia-check fails with 'Test interrupted by error: malformed stream: mp3: invalid main_data_begin'

Running symphonia-check results in an immediate error on roughly 40% of MP3 files from the MySpace Dragon Hoard dataset:

$ symphonia-check '1/std_cd687347995a4b159c43783fe74f46e2.mp3'
Input Path: 1/std_cd687347995a4b159c43783fe74f46e2.mp3

Test interrupted by error: malformed stream: mp3: invalid main_data_begin

However, both ffplay and symphonia-play seem to play the file without issue.

Tested on latest master, commit a723c15

P.S. I have only run the check on about a third of the hoard, but so far the highest deviation reported with the -q flag was 0.00001782, so the decoder seems to be in pretty good shape overall.

[Task] Test on a big-endian virtual machine

Symphonia should work properly on both big and little-endian devices. x86/x86_64 and ARM devices are all little-endian, but Power (and MIPS?) architecture devices can run in big-endian mode.

Therefore, let's explore using a virtual machine (QEMU?) to test if Symphonia works on big-endian devices and fix any issues found. If at all feasible, could this also be integrated into CI?
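The kind of bug such a test would catch can be sketched with the standard library alone. The function names below are hypothetical and not taken from Symphonia. The sketch shows that decoding with an explicit byte order gives the same result on every host, while a native-order read only matches the little-endian result on little-endian hosts.

```rust
/// Decodes a little-endian u32 from the first four bytes.
/// Portable: the result does not depend on the host's endianness.
pub fn read_u32_le(bytes: &[u8]) -> Option<u32> {
    let b = bytes.get(..4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Decodes a big-endian u32 from the first four bytes. Also portable.
pub fn read_u32_be(bytes: &[u8]) -> Option<u32> {
    let b = bytes.get(..4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Decodes using the host's byte order. This is the pattern that
/// works on x86 or ARM and silently breaks on a big-endian target.
pub fn read_u32_native(bytes: &[u8]) -> Option<u32> {
    let b = bytes.get(..4)?;
    Some(u32::from_ne_bytes([b[0], b[1], b[2], b[3]]))
}

/// True when the native read agrees with the little-endian read.
pub fn native_matches_le(bytes: &[u8]) -> bool {
    read_u32_native(bytes) == read_u32_le(bytes)
}
```

On a big-endian virtual machine, native_matches_le returns false for input such as [0x78, 0x56, 0x34, 0x12], which is the sort of divergence the proposed test run is meant to surface.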
