Comments (14)

ChristianFeldmann commented on June 2, 2024

Hi!
That sounds like a YUV format that I have not encountered yet, but there is a huge number of specialized YUV formats out there. Do you have any specification for this YUV format? If not, can you share where it came from?
It would be great if you could provide a file in this format so I can test an implementation, or a way for me to create one.

from yuview.

chaplin89 commented on June 2, 2024

Hi Christian,
I didn't find any spec about this, I just found some reference here but I'm not even sure it's the same thing.

This format is used internally by Chromium. Sample.
The format that chromium assigns is this:
PIXEL_FORMAT_I420, 12bpp YUV planar 1x1 Y, 2x2 UV samples, a.k.a. YU12.

Not sure this is correct or makes sense though.
The following Python script displays the image correctly. Sorry that it is not polished; I'm sure there are better ways to do this with numpy, but this is the first time I'm using it:

from PIL import Image
import numpy

def getyuv():
    y = []
    uv = []

    with open('single_frame.yuv', 'rb') as f:
        for i in range(0,1080):
            row = list(f.read(1984))
            y.append(row)
        for i in range(0,540):
            row = list(f.read(1984))
            uv.append(row)
    return y,uv

def convert_input(y,uv):
    output = numpy.full((1080, 1984,3), (0,0,0), dtype=numpy.uint8)
    for row in range(0,len(uv)):
        for column in range(0,int(len(uv[row])/2)-1):
            output[row*2][column*2] = (y[row*2][column*2], uv[row][column], uv[row][column+992])
            output[row*2][column*2+1] = (y[row*2][column*2+1], uv[row][column], uv[row][column+992])
            output[row*2+1][column*2] = (y[row*2+1][column*2], uv[row][column], uv[row][column+992])
            output[row*2+1][column*2+1] = (y[row*2+1][column*2+1], uv[row][column], uv[row][column+992])
    return output

y, uv = getyuv()
out = convert_input(y,uv)

# Trim the last 64 px of each row (garbage)
out_2 = numpy.full((1080, 1920,3), (0,0,0), dtype=numpy.uint8)
for row in range(0,len(out)):
    out_2[row] = out[row][:-64]

img = Image.fromarray(out_2, mode='YCbCr')
img.show()

chaplin89 commented on June 2, 2024

Oddly enough, the memory dump of this image contains 541 rows in the UV matrix, which means it can't be opened directly in YUView in any case. The frame I shared in the previous comment does not contain these extra bytes.
A feature to discard XX bytes from the beginning or the end of each frame (and maybe also from the beginning of the file) would be useful when dealing with this kind of raw data.
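
To illustrate the idea, here is a minimal sketch of reading fixed-size frames while discarding padding bytes. The function name `iter_frames` and its parameters are hypothetical; this is not an existing YUView feature, only an assumption about how such an option could behave.

```python
import io

def iter_frames(stream, frame_size, skip_head=0, skip_per_frame=0):
    """Yield fixed-size frames from a binary stream, discarding padding.

    skip_head bytes are dropped once at the start of the stream and
    skip_per_frame bytes are dropped after every frame.
    """
    stream.read(skip_head)
    while True:
        frame = stream.read(frame_size)
        if len(frame) < frame_size:
            return  # end of data or truncated final frame
        yield frame
        stream.read(skip_per_frame)

# Toy data: a 1-byte header, then two 4-byte frames each followed by 2 padding bytes.
data = io.BytesIO(b"H" + b"AAAA" + b"xx" + b"BBBB" + b"xx")
frames = list(iter_frames(data, frame_size=4, skip_head=1, skip_per_frame=2))
```

For the dump described above, the per-frame padding would be one extra row of stride bytes, under the assumption that the extra row always sits at the end of the frame.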

chaplin89 commented on June 2, 2024

UPDATE: I tried many different pix_fmt values in ffmpeg and none of them could decode the image either. This is clearly not a common format; maybe Chromium only uses it internally and it is not meant to be shared or stored on disk. After all, I was just trying to fix an issue in Chromium, so that is plausible.

In any case, I built my own tooling for this. We can probably close the issue, as I don't think implementing this would bring much value to the project.

Here's a better script, in the unlikely case someone else runs into the same issue:

from PIL import Image
import numpy
import os

class Convert:
    def __init__(self, column, row, stride, filename) -> None:
        self.column = column
        self.row = row
        self.stride = stride
        self.filename = filename
        self.fpos = 0
        if os.path.exists(self.get_destination()):
            os.unlink(self.get_destination())


    def get_single_frame(self):
        y = []
        uv = []

        with open(self.filename, 'rb') as f:
            if f.seek(self.fpos) == -1:
                return None,None
            y = list(f.read(self.stride*self.row))
            if len(y)==0:
                return None, None
            y = numpy.array(y)
            y = y.reshape(self.row, self.stride)

            half_row = int(self.row/2)
            uv = list(f.read(self.stride*half_row))
            if len(uv)==0:
                return None, None
            uv = numpy.array(uv)
            uv = uv.reshape(half_row, self.stride)
            # Remove last line (contain garbage)
            self.fpos = f.tell() + self.stride

        return y,uv

    def trim(self, frame):
        output = numpy.full((self.row,self.column,3), (0,0,0), dtype=numpy.uint8)
        for row in range(0,self.row):
            # Keep only the visible columns (this also works when column == stride)
            trimmed_row = frame[row][:self.column]
            output[row] = trimmed_row
        return output

    def show(self, frame):
        image = Image.fromarray(frame, mode='YCbCr')
        image.show()

    def merge_yuv(self,y,uv):
        output = numpy.full((self.row, self.stride, 3), (0,0,0), dtype=numpy.uint8)
        half_stride = int(self.stride/2)
        uv_rows = len(uv)
        for row in range(0,uv_rows):
            uv_columns = len(uv[row])
            half_uv_columns = int(uv_columns/2)-1
            for column in range(0,half_uv_columns):
                output[row*2][column*2] = (y[row*2][column*2], uv[row][column], uv[row][column+half_stride])
                output[row*2][column*2+1] = (y[row*2][column*2+1], uv[row][column], uv[row][column+half_stride])
                output[row*2+1][column*2] = (y[row*2+1][column*2], uv[row][column], uv[row][column+half_stride])
                output[row*2+1][column*2+1] = (y[row*2+1][column*2+1], uv[row][column], uv[row][column+half_stride])
        return output

    def get_destination(self):
        return self.filename[:self.filename.rfind('.')] + ".converted.yuv"
    
    def save(self, y,uv):
        with open(self.get_destination(), 'ab') as f:
            f.write(bytes(y.reshape(y.shape[0]*y.shape[1]).tolist()))
            u = uv[:,0:int(uv.shape[1]/2)]
            v = uv[:,int(uv.shape[1]/2):]
            f.write(bytes(u.reshape(u.shape[0]*u.shape[1]).tolist()))
            f.write(bytes(v.reshape(v.shape[0]*v.shape[1]).tolist()))


target1 = 'your_file.yuv'

# change me according to target1 file spec
a = Convert(848,480,960, target1)

i=0
y, uv = a.get_single_frame()
while y is not None:
    a.save(y,uv)
    # Show an image every 20 frame for debug purposes
    if (i+1)%20 == 0:
        frame = a.merge_yuv(y,uv)
        frame = a.trim(frame)
        a.show(frame)
    y,uv = a.get_single_frame()
    i=i+1

This will take your_file.yuv in the format described above and generate `your_file.converted.yuv`, which can be encoded into H.264 with a command like this (the -video_size width is the stride, because the converted file keeps the padding columns):

ffmpeg -f rawvideo -pixel_format yuv420p -video_size 960x480 -framerate 24 -i ./your_file.converted.yuv ./your_file.mp4

The only things that need to be changed according to the video spec are the parameters of the Convert constructor:

  • Cols, Rows -> The size of the visible portion of the frame
  • Stride -> The real number of columns in the frame (greater than or equal to Cols)
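
For larger frames, the per-pixel loops in merge_yuv and trim can be slow. Below is a minimal vectorized sketch of the same idea, assuming y has shape (Rows, Stride) and uv has shape (Rows/2, Stride) with both Rows and Stride even. The function name `merge_yuv_vectorized` is hypothetical and not part of the script above; unlike the loop version, it also fills the last chroma column.

```python
import numpy as np

def merge_yuv_vectorized(y, uv, width):
    """Build a (rows, width, 3) YCbCr image from a luma plane and a
    chroma plane whose rows hold U in the first half and V in the second."""
    half = uv.shape[1] // 2
    u = uv[:, :half]
    v = uv[:, half:]
    # Nearest-neighbour 2x2 upsampling so chroma matches the luma size
    u_full = u.repeat(2, axis=0).repeat(2, axis=1)
    v_full = v.repeat(2, axis=0).repeat(2, axis=1)
    frame = np.dstack((y, u_full, v_full)).astype(np.uint8)
    # Drop the padding columns beyond the visible width
    return frame[:, :width]

# Toy 4x4 frame with a visible width of 3 columns
y = np.arange(16, dtype=np.uint8).reshape(4, 4)
uv = np.array([[100, 101, 200, 201],
               [102, 103, 202, 203]], dtype=np.uint8)
frame = merge_yuv_vectorized(y, uv, width=3)
```

The result can be passed to PIL in the same way as the output of trim.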

Cheers!

ChristianFeldmann commented on June 2, 2024

Ah wait a second! I think we do have support for these semi planar files. At least for some of them. In the link you provided I saw the name NV12 and that rang a bell. We do support that. So you can open the YUV file and go to YUV Format ... custom. In the dialog you have to select the UV(A) interleaved checkbox. Can you try that? It may be what you are looking for.

Alternatively you can put nv12 into the name of the file and YUView should apply the format based on that.

chaplin89 commented on June 2, 2024

I think at this point I have tried every combination of custom and non-custom decoding options, but the image is never displayed correctly.
I think the option you're suggesting expects alternating Cb and Cr samples in each row of the UV matrix.
The format I'm talking about has the Cb values in the first half of each row and the Cr values in the second half.
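
The difference between the two chroma layouts can be sketched with numpy. The helper names `split_interleaved` and `split_half_rows` are hypothetical, and the sample values are made up for illustration.

```python
import numpy as np

def split_interleaved(uv):
    """NV12-style rows: U and V alternate sample by sample (UVUV...)."""
    return uv[:, 0::2], uv[:, 1::2]

def split_half_rows(uv):
    """Layout from this thread: first half of each row is U, second half is V."""
    half = uv.shape[1] // 2
    return uv[:, :half], uv[:, half:]

# One chroma row of 8 bytes; 1..4 stand for U samples, 11..14 for V samples.
nv12_row = np.array([[1, 11, 2, 12, 3, 13, 4, 14]], dtype=np.uint8)
half_row = np.array([[1, 2, 3, 4, 11, 12, 13, 14]], dtype=np.uint8)

u_a, v_a = split_interleaved(nv12_row)
u_b, v_b = split_half_rows(half_row)
```

Both calls recover the same U and V samples; decoding one layout with the other function scrambles the chroma, which is consistent with the wrong colours seen when forcing an interleaved UV setting.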

ChristianFeldmann commented on June 2, 2024

Ah, sorry, then that is not exactly the format you are looking for. That would have been too easy anyway.
All of the info you provided is already super helpful. But can you please share a file in that format with me? It only needs to be a few frames.

chaplin89 commented on June 2, 2024

Yup, I already shared it here: #518 (comment)

Pasting the link again here: https://github.com/IENT/YUView/files/10848045/single_frame.zip

ChristianFeldmann commented on June 2, 2024

Ah sorry my bad I was blind. Got it!

chaplin89 commented on June 2, 2024

No worries, YW!

ChristianFeldmann commented on June 2, 2024

OK, so I looked through all the data and files, and it's still a bit strange:

  • Are you sure that this format is used in Chrome as PIXEL_FORMAT_I420? Because from the Chromium code it looks like this is a "normal" planar format with separate planes for Y, Cb and Cr.
  • The file looks like it's as you described, with each chroma line holding Cb and then Cr. I have never seen a format like that. But there are some extra bytes that I cannot account for. So if the resolution is 1984x1080, then there is one line of 1984 bytes that is not accounted for.

I have still not found any documentation of a format like this anywhere, although there are all sorts of strange YUV formats out there.

chaplin89 commented on June 2, 2024

Are you sure that this format is used in chrome as PIXEL_FORMAT_I420? Because from the chromium code it looks like this is a "normal" planar format with separate planes for Y Cb Cr.

Yes, I'm sure, and yes, the Chromium code "apparently" supports normal planar YUV files. However, if you debug the media part, you'll find that this is just illusory. A YUV frame in Chrome is represented as a contiguous memory area in which:

  • The first plane starts at offset 0
  • The second plane starts at offset stride*height
  • The third plane starts at offset stride*height + half stride

This means that in order to render the image correctly you still have to take into account that the UV plane is laid out in this way (half row U, half row V).
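
A minimal sketch of how those offsets map onto numpy views, assuming the luma plane occupies stride * height bytes, both chroma planes share the luma stride, and the frame is 4:2:0. The function name `plane_views` is hypothetical.

```python
import numpy as np

def plane_views(buf, stride, height):
    """Return the plane offsets and Y/U/V views for a contiguous frame in
    which each chroma row holds U in its first half and V in its second."""
    y_off = 0
    u_off = stride * height          # chroma starts right after the luma plane
    v_off = u_off + stride // 2      # V starts half a row after U
    data = np.frombuffer(buf, dtype=np.uint8)
    y = data[y_off:u_off].reshape(height, stride)
    chroma_rows = height // 2
    chroma = data[u_off:u_off + stride * chroma_rows].reshape(chroma_rows, stride)
    u = chroma[:, :stride // 2]
    v = chroma[:, stride // 2:]
    return (y_off, u_off, v_off), y, u, v

# Toy frame: stride 8, height 4 -> 32 luma bytes + 16 chroma bytes
buf = bytes(range(48))
offsets, y, u, v = plane_views(buf, stride=8, height=4)
```

Because U and V both advance by a full stride per row, a reader that treats them as two tightly packed planes sees every other chroma row coming from the wrong component.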

But there are some extra bytes that I can not account for.

Yup, this is what I was talking about in #518 (comment)

I thought I had removed this extra line, but perhaps I'm wrong. And yes, I agree it's strange. Even ffmpeg does not support it, which is why I was saying it probably doesn't add much value to the project.

ChristianFeldmann commented on June 2, 2024

Can you please point me to the code where this happens in the Chromium media part? I checked out the code, and to me it looks like the I420 format has 3 separate planes. E.g. here is the code from video_frame.cc:

    case PIXEL_FORMAT_I420: {
      int uv_width = (coded_size.width() + 1) / 2;
      int uv_height = (coded_size.height() + 1) / 2;
      int uv_stride = uv_width;
      int uv_size = uv_stride * uv_height;
      planes = std::vector<ColorPlaneLayout>{
          ColorPlaneLayout(coded_size.width(), 0, coded_size.GetArea()),
          ColorPlaneLayout(uv_stride, coded_size.GetArea(), uv_size),
          ColorPlaneLayout(uv_stride, coded_size.GetArea() + uv_size, uv_size),
      };
      break;
    }

The V plane here starts uv_size bytes after the U plane, which indicates that the 3 planes are completely separate.
The PIXEL_FORMAT_NV12 format seems to have 2 planes where U and V are packed together. But there, the UV values are interleaved per sample (UVUVUV) and not per line.
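
The arithmetic in the quoted PIXEL_FORMAT_I420 case can be reproduced in a few lines of Python to see where each plane would start. This is only a sketch that mirrors the snippet; the function name `i420_plane_layout` is hypothetical and the tuples stand in for ColorPlaneLayout(stride, offset, size).

```python
def i420_plane_layout(width, height):
    """Return (stride, offset, size) for the Y, U and V planes of a
    tightly packed I420 frame, following the quoted Chromium arithmetic."""
    uv_width = (width + 1) // 2
    uv_height = (height + 1) // 2
    uv_stride = uv_width
    uv_size = uv_stride * uv_height
    area = width * height
    return [
        (width, 0, area),                       # Y plane
        (uv_stride, area, uv_size),             # U plane, right after Y
        (uv_stride, area + uv_size, uv_size),   # V plane, right after U
    ]

layout = i420_plane_layout(1920, 1080)
```

With this layout the V plane begins a whole U plane after the end of Y, not half a row after it, which is the discrepancy with the dump discussed in this thread.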

I keep pressing on this because if we find out the name of this format, then we can use it too. I don't want to invent a new name for it, as there must be one if it is used in Chromium.

I think I found one reference to a format like this in the Microsoft docs: https://learn.microsoft.com/en-us/windows/win32/medfound/recommended-8-bit-yuv-formats-for-video-rendering#imc2 . They call it IMC2.

FaiScofield commented on June 2, 2024

From the point of view of data arrangement, we can divide YUV formats into 3 basic types: planar (3 planes, or 4 with alpha), semi-planar (2 planes), and interleaved (also called packed, only 1 plane). Semi-planar can then be divided into 2 subclasses: uv_interleaved (UVUV...UVUV) or uv_followed (UU...UUVV...VV). I believe this is enough to distinguish between YUV formats.
