Giter Site home page Giter Site logo

Mkv/Webm seeking error about symphonia HOT 5 OPEN

GonnaHackIt avatar GonnaHackIt commented on May 27, 2024
Mkv/Webm seeking error

from symphonia.

Comments (5)

dedobbin avatar dedobbin commented on May 27, 2024

Hey!

I'm not super familiar with MKV but i studied the code for a bit to get a grasp at what is going on and i understand the problem you are running into, this seems valid yes.

I also think this is the correct place to add the fix.
I was thinking though, it might be bit more expressive/clear to have a conditional that check if the target cluster is the same as current cluster? The conditional might seem bit obscure if you don't know why it's there.
There doesn't seem a straight forward way to obtain the current cluster though, so a simple comment might be the better approach.

And as for the unwrap, i think if you use a switch statement (or is_some()) to prevent the panic. Something like

match self.frames.front(){
  Some(frame) => {
    if seek_ts > frame.current_ts || ts < frame.current_ts {
        self.iter.seek(pos)?;
    }
}

I didn't reproduce the problem though, if you have any file/stream that allows me to replicate it and there is no copyright issues with sharing it, would you mind sharing it? Thanks

from symphonia.

GonnaHackIt avatar GonnaHackIt commented on May 27, 2024

Hi @dedobbin!

So to begin with:

I was thinking though, it might be bit more expressive/clear to have a conditional that check if the target cluster is the same as current cluster? The conditional might seem bit obscure if you don't know why it's there.

That's what the conditional is supposed to check. If ts < frame.current_ts it means that we for sure want to go backwards and if seek_ts > frame.current_ts it means we want to go forward to some next cluster/block, otherwise we want to go forward but in the current cluster/block. Sorry for not adding a comment for that, my bad. Also, maybe this conditional would be more readable: if !(seek_ts <= frame.current_ts && ts > frame.current_ts)?

And as for the unwrap, i think if you use a switch statement (or is_some()) to prevent the panic. Something like

So my main concerns with this solution are:

  1. I think there could be a very rare case where we processed all frames from the current cluster (but there are still some clusters left) so none frames are left and this would result in not performing a seek backward (self.frames.front() would return None). Seeking forward would work tho, because function seek_track_by_ts_forward would load new frames by itself, but it might affect performance because we are skipping to target packet by packet (for some reason called frames in code)
  2. If we processed whole file/stream, there would be none frames left, so seeking backward becomes impossible because self.frames.front() will return None

But I think adding the while loop and None handling would resolve these issues:

use std::io;

fn seek_track_by_ts(&mut self, track_id: u32, ts: u64) -> Result<SeekedTo> {
    if self.clusters.is_empty() {
        self.seek_track_by_ts_forward(track_id, ts)
    } else {
        while self.frames.is_empty() {
            match self.next_element() {
                Err(Error::IoError(error)) => {
                    if matches!(error.kind(), io::ErrorKind::UnexpectedEof) {
                        // break so user can seek backward
                        break;
                    }
                    
                    return Err(error);
                }
                Err(error) => return Err(error),
                _ => {},
            }
        }

        /*
            Code for getting cluster, block and pos
        */

        let seek_ts = match target_block {
            Some(block) => block.timestamp,
            None => cluster.timestamp,
        };

        match self.frames.front() {
            Some(frame) => {
                // skip internal seeking if the target timestamp is in the current block/cluster
                if seek_ts > frame.timestamp || ts < frame.timestamp {
                    self.iter.seek(pos)?;
                }
            }
            // Should be end of stream, maybe user trying to seek backward
            None => self.iter.seek(pos)?,
        }

        /*
            The rest
         */
    }
}

There doesn't seem a straight forward way to obtain the current cluster though, so a simple comment might be the better approach.

Maybe we could somehow make use of self.current_cluster? It would make solution simpler and more readable. As I analyzed a code for a bit, it seems like in the seek_track_by_ts function, it can only be None if it's the end of track, but I might be wrong.

I didn't reproduce the problem though, if you have any file/stream that allows me to replicate it and there is no copyright issues with sharing it, would you mind sharing it? Thanks

I think the problem is a source not being seekable (read-only) as it didn't work for all streams I used (temporary links, limited to ip address, so can't share it). I managed to recreate the issue with the following code (file in attachments):

use std::fs::File;
use symphonia::{
    core::{
        formats::{SeekMode, SeekTo},
        io::{MediaSourceStream, ReadOnlySource},
    },
    default::formats::MkvReader,
};
use symphonia_core::formats::FormatReader;

#[tokio::main]
async fn main() {
    let file = File::open("nocopyright.webm").unwrap();
    let source = ReadOnlySource::new(file);

    let mss = MediaSourceStream::new(Box::new(source), Default::default());

    let mut mkv = MkvReader::try_new(mss, &Default::default()).unwrap();

    for _ in 0..5 {
        let packet = mkv.next_packet().unwrap();

        println!("{} - {}", packet.ts, packet.dur);
    }

    let result = mkv.seek(
        SeekMode::Accurate,
        SeekTo::TimeStamp {
            ts: 500,
            track_id: 1,
        },
    );

    println!("{:?}", result);
}
nocopyright.webm

from symphonia.

dedobbin avatar dedobbin commented on May 27, 2024

Ah super. Thanks for the file and way to reproduce it, appreciated.

That's what the conditional is supposed to check.

Yes i figured it out, no worries, just feels like stating it in a conditional that involves explicit comparing clusters is more intuitive.

Maybe we could somehow make use of self.current_cluster

Ah, i implied in previous post tat this would be difficult but when playing around with your provided code i think it's clearly viable yes. I think it's worth trying to use it! It would make it more clear.

As for the issues you raise with my provided code snippet; You seem very right, i was clearly thinking too naive. The solution you provide here does seem to make sense but i haven't tested it yet.

I've been spending some time trying to recreate these 2 "edge"-cases (if you can even call them that haha) and to get more input data so we can test for any regressions with your provided solution. If that all behaves as expected we can just focusing on the code structure itself.

I didn't have enough time to get that all done in a proper way though and it i'll be bit busy upcoming weeks, so not sure when i can give it the attention it deserves, so i felt it was good to give you this update for now.

You clearly spent some nice time with this though, so if you have any input or ideas on this don't hesitate to share it.

from symphonia.

GonnaHackIt avatar GonnaHackIt commented on May 27, 2024

Thanks for the response.

I managed to recreate these 2 "edge"-cases pretty easily with the audio file I provided recently. Here is the code with comments for respective case:

use std::fs::File;
use symphonia::{
    core::{
        formats::{SeekMode, SeekTo},
        io::MediaSourceStream,
    },
    default::formats::MkvReader,
};
use symphonia_core::formats::FormatReader;

fn main() {
    let file = File::open("nocopyright.webm").unwrap();
    let source = file;

    let mss = MediaSourceStream::new(Box::new(source), Default::default());

    let mut mkv = MkvReader::try_new(mss, &Default::default()).unwrap();

    // edge case number 1
    for _ in 0..500 {
        let packet = mkv.next_packet().unwrap();
    }

    // edge case number 2
    // while let Ok(_) = mkv.next_packet() {}

    let result = mkv.seek(
        SeekMode::Accurate,
        SeekTo::TimeStamp {
            ts: 500,
            track_id: 1,
        },
    );

    println!("{:?}", result);
}

But before running it, I put your code snippet in the symphonia's code:

let seek_ts = match target_block {
    Some(block) => block.timestamp,
    None => cluster.timestamp,
};

match self.frames.front() {
    Some(frame) => {
        if seek_ts > frame.timestamp || ts < frame.timestamp {
            self.iter.seek(pos)?;
        }
    }
    None => {}
}

So in the first scenario I read 500 packets because first cluster's timestamp is 0, second's is 10001 and each packet is 20. This means we read all packets from the first cluster but don't go to the next one, so later when we try to seek backward it doesn't, it seek forward to the next cluster (because of seek_track_by_ts_foward function).

In the second scenario we read all packets from track and then we try to seek backward, we can't.

When playing with these "edge"-cases I've come to the conclusion that the while loop from my previous solution can be deleted, meaning that this match is sufficient:

match self.frames.front() {
    Some(frame) => {
        // skip internal seeking if the target timestamp is in the current block/cluster
        if seek_ts > frame.timestamp || ts < frame.timestamp {
            self.iter.seek(pos)?;
        }
    }
    // Should be end of stream, maybe user trying to seek backward
    None => self.iter.seek(pos)?,
}

Because as far as I know there can be these cases:

  1. No frames left and it's the end of track - None handles this and seeks backward if user wants to
  2. No frames left but it's not the end of track - it must mean that we processed all frames from the current cluster. This imply that we can't seek forward in the current cluster so None handles this correctly too (seeks backward only when user wants to or to the next cluster)
  3. There are some frames left - well then Some(frame) handles this correctly too

I also tried to simplify the conditional with the self.current_cluster but it turned out to be worse in my opinion because the self.current_cluster is actually just a state (actually Option from state, so more handling) with a timestamp that is an Option (so we should probably handle it), so we need to get corresponding cluster for it, then we can check if the target cluster is the current cluster but that's not the end because we should also check if the target block is the current block which requires even more logic for obtaining it and comparing them and then we can decide whether to seek internally.


You clearly spent some nice time with this though, so if you have any input or ideas on this don't hesitate to share it.

I think I shared all my knowledge about this problem. I feel that the only possible things left are testing, which I unfortunately don't know how to do properly, maybe better solution and analyzing whole code for perhaps new nuances, but I think we have a pretty solid understanding on this one.

i'll be bit busy upcoming weeks, so not sure when i can give it the attention it deserves

No problem, I will be a bit more busy too, so might take longer to reply.

from symphonia.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.