mpv-prescalers's Issues

Possibly share the gradient calculations?

Depending on shmem requirements, it might be possible to share the gradients as well. Basically, instead of just sampling the input texture once per thread, sample an entire 2x2 quad (use textureGather) and store both the local sample and the gradients to its neighbours into the shmem arrays.
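
A minimal sketch of the idea; the array sizes, names, and indexing are illustrative assumptions, not the repo's actual code:

// Each thread issues a single textureGather for its 2x2 quad, then
// stores the local sample plus the gradients to the +x and +y
// neighbours into shmem. Sizes assume an 8x8 workgroup with a
// one-texel apron.
shared float inp[9][9];
shared vec2 inp_grad[9][9];

void load_sample(sampler2D tex, vec2 pos, ivec2 lid) {
    // textureGather component mapping: .w=(0,0), .z=(1,0), .x=(0,1), .y=(1,1)
    vec4 quad = textureGather(tex, pos, 0);
    inp[lid.y][lid.x] = quad.w;
    inp_grad[lid.y][lid.x] = vec2(quad.z - quad.w, quad.x - quad.w);
}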

compute/ravu-r3-x3 fails with libplacebo-3.120.3 and shaderc-2022.1

Throws error:

[vo/gpu/vulkan/libplacebo] shaderc output:
[vo/gpu/vulkan/libplacebo] input:9: warning: '#extension' : extension not supported: GL_ARB_texture_buffer_object
[vo/gpu/vulkan/libplacebo] shaderc: internal error: compilation succeeded but failed to optimize: In SPIR-V 1.6 or later, sampled image dimension must not be Buffer
[vo/gpu/vulkan/libplacebo]   %57 = OpTypeSampledImage %56
[vo/gpu/vulkan/libplacebo] 
[vo/gpu/vulkan/libplacebo] 
[vo/gpu/vulkan/libplacebo] 
[vo/gpu/vulkan/libplacebo] shaderc compile status 'error' (0 errors, 1 warnings)
[vo/gpu/vulkan/libplacebo] [  1] #version 450
[vo/gpu/vulkan/libplacebo] [  2] #extension GL_ARB_compute_shader : enable
[vo/gpu/vulkan/libplacebo] [  3] #extension GL_KHR_shader_subgroup_basic : enable 
[vo/gpu/vulkan/libplacebo] [  4] #extension GL_KHR_shader_subgroup_vote : enable 
[vo/gpu/vulkan/libplacebo] [  5] #extension GL_KHR_shader_subgroup_arithmetic : enable 
[vo/gpu/vulkan/libplacebo] [  6] #extension GL_KHR_shader_subgroup_ballot : enable 
[vo/gpu/vulkan/libplacebo] [  7] #extension GL_KHR_shader_subgroup_shuffle : enable 
[vo/gpu/vulkan/libplacebo] [  8] #extension GL_ARB_shader_image_load_store : enable
[vo/gpu/vulkan/libplacebo] [  9] #extension GL_ARB_texture_buffer_object : enable
[vo/gpu/vulkan/libplacebo] [ 10] layout(binding=0) uniform samplerBuffer _data_0_0;
[vo/gpu/vulkan/libplacebo] [ 11] layout(binding=1, rgba16f) writeonly restrict uniform image2D _image_1_0;
[vo/gpu/vulkan/libplacebo] [ 12] layout (local_size_x = 256, local_size_y = 1) in;
[vo/gpu/vulkan/libplacebo] [ 13] 
[vo/gpu/vulkan/libplacebo] [ 14] void _main_2_0() {
[vo/gpu/vulkan/libplacebo] [ 15] if (gl_GlobalInvocationID.x >= 26) 
[vo/gpu/vulkan/libplacebo] [ 16]     return;                        
[vo/gpu/vulkan/libplacebo] [ 17] vec4 color = vec4(0.0);                                       
[vo/gpu/vulkan/libplacebo] [ 18] ivec3 pos = ivec3(gl_GlobalInvocationID) + ivec3(0, 0, 0); 
[vo/gpu/vulkan/libplacebo] [ 19] int base = ((pos.z * 216 + pos.y) * 26 + pos.x) * 4;          
[vo/gpu/vulkan/libplacebo] [ 20] color[0] = texelFetch(_data_0_0, base + 0).r; 
[vo/gpu/vulkan/libplacebo] [ 21] color[1] = texelFetch(_data_0_0, base + 1).r; 
[vo/gpu/vulkan/libplacebo] [ 22] color[2] = texelFetch(_data_0_0, base + 2).r; 
[vo/gpu/vulkan/libplacebo] [ 23] color[3] = texelFetch(_data_0_0, base + 3).r; 
[vo/gpu/vulkan/libplacebo] [ 24] imageStore(_image_1_0, ivec2(pos), color);
[vo/gpu/vulkan/libplacebo] [ 25] 
[vo/gpu/vulkan/libplacebo] [ 26] }
[vo/gpu/vulkan/libplacebo] [ 27] 
[vo/gpu/vulkan/libplacebo] [ 28] void main() {
[vo/gpu/vulkan/libplacebo] [ 29] _main_2_0();
[vo/gpu/vulkan/libplacebo] [ 30] }
[vo/gpu/vulkan/libplacebo] vk_compile_glsl(gpu, tmp, GLSL_SHADER_COMPUTE, params->glsl_shader, &comp): VK_ERROR_INITIALIZATION_FAILED (../libplacebo-v3.120.3/src/vulkan/gpu.c:2739)
[vo/gpu/vulkan/libplacebo] Failed creating render pass for dispatch

Please clarify the license of code vs generated shaders?

I noticed that in your ravu branch, you have the LICENSE file set up for a BSD-3 clause license, but in your master branch, it instead refers to the LGPLv3 (as do your generated shaders).

Is this mismatch intentional? Do I understand it correctly that you license specifically your trained weights under LGPLv3 but the generating code under BSD3?

Play around with the texture order

Is it possible that, due to the linear interpolation of weights, we would be better off with a different sampling order? As far as I can tell, you use the x coordinate to store multiple weights and the y coordinate to store a projected offset based on the entire key, using a 2D texture overall. But this is not a requirement - you could instead use a 3D texture, or even a “4D” texture (texture array / multiple textures) in theory.

Then you can use one “linear” coefficient for each coordinate, allowing possibly better linear interpolation of the values. Basically use vec3(angle, strength, coherence) as your base coordinate. To deal with the fact that you need to sample multiple times (for many weights), you can pick one coordinate (ideally the “smallest”) as your “offset” coordinate, and just repeat the entire texture a few times in that dimension - for the different rows of the weights.

Perhaps with this organization you can get a better result with a smaller weights texture, which benefits from better caching and therefore typically better performance.
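
A sketch of what sampling such a layout could look like; the sampler name, row count, and normalization are assumptions for illustration:

// Hypothetical 3D LUT indexed by vec3(angle, strength, coherence), each
// normalized to [0,1], tiled NUM_ROWS times along z for the weight rows.
uniform sampler3D ravu_lut3d;
const float NUM_ROWS = 4.0; // assumed number of weight rows

vec4 fetch_weights(vec3 key, float row) {
    vec3 coord = key;
    coord.z = (coord.z + row) / NUM_ROWS; // select this row's tile
    return texture(ravu_lut3d, coord);
}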

Play around with the kernel shape

Instead of sampling and training the entire R*R kernel, it's possible that you could get a better result with fewer required samples by using a more circular organization (e.g. store a diamond-like pattern or a circle approximation), much in the same way that polar EWA sampling does.

Since you can just use any pattern of samples you want for your weights and interpolation, this should be relatively easy to accomplish in practice, I would imagine?
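
For illustration, a diamond of taxicab radius <= 2 needs only 13 taps, versus 25 for the full 5x5 window:

// Hypothetical diamond-shaped tap pattern (13 taps vs. 25 for 5x5):
const ivec2 taps[13] = ivec2[](
                  ivec2( 0,-2),
    ivec2(-1,-1), ivec2( 0,-1), ivec2( 1,-1),
    ivec2(-2, 0), ivec2(-1, 0), ivec2( 0, 0), ivec2( 1, 0), ivec2( 2, 0),
    ivec2(-1, 1), ivec2( 0, 1), ivec2( 1, 1),
                  ivec2( 0, 2));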

add SmoothD2

I'd be grateful if the SmoothD2 filter were ported to mpv-prescalers or mpv.

You can see the result in these videos:
(screenshot: Captura de pantalla 2019-03-24 18-16-13)

This filter cleans up block artifacts and color banding; please consider implementing SmoothD2 in your project.

The AviSynth plugin page for SmoothD2:

http://avisynth.nl/index.php/SmoothD2

@bjin

Configuration guide?

Now that these are user shaders, how does one specify the parameters, e.g. the downscaling threshold?

Thank you!

nnedi3 picture shift

Hello, it seems that the current nnedi3 implementation shifts the image a little bit. Here's what I did:

  • mpv --no-config --screenshot-format=png --glsl-shaders=~/.config/mpv/shaders/nnedi3-nns16-win8x6.hook --start 00:02:02.455 my.mkv
  • Hit s and ^s
  • Scale the original one with vapoursynth.
  • convert -filter box -resize 400% -crop 800x800+0+0 +repage in.png out_crop.png for both
  • convert -delay 30 *crop.png out.gif

What I get:
(animation: out.gif)

PS: Oh yeah, I compared to a plain mpv lanczos upscale too (so vapoursynth isn't the bad one).

ravu-lite: implement direct 3x and 4x prescaling

RAVU could be trained to perform direct 3x and 4x prescaling, and this is actually fairly straightforward in a compute shader. As for fragment shaders, there is a restriction that only 4 components can be stored in the intermediate texture. Therefore, 2 passes are required to gather the 8 components for 3x, and 4 passes are required to gather the 16 components for 4x prescaling (assuming no offset; for 3x, one of the 9 output positions coincides with the source sample, which is why only 8 components are needed). It might be the case that only the compute shader implementation makes sense in terms of performance over repeated use of the old 2x shader.

RAVU LUTs in PNG format?

Hi, I was interested in trying to port RAVU to work with a program that needs LUTs in png format. Can you provide any guidance on generating/converting such a thing? Or would you happen to have them lying around? :)

Shader from HLSL, how to obtain the .hook?

Hi, I normally use MPC-HC, but now I use mpv because of its better CPU/GPU usage. However, there's a pixel shader (.hlsl) for MPC that I want to use as a shader in mpv; is it possible to generate the .hook/.glsl equivalent?
This is the small pixel shader code, which expands colors from 16-235 to 0-255:

sampler s0 : register(s0);

#define const_1 ( 16.0 / 255.0)
#define const_2 (255.0 / 219.0)

float4 main(float2 tex : TEXCOORD0) : COLOR
{
	// original pixel
	float4 c0 = tex2D(s0, tex);

	return ((c0 - const_1) * const_2);
}
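
For reference, a rough .hook equivalent might look like the sketch below (untested, written against mpv's user-shader syntax; note that mpv can normally perform this limited-to-full range expansion by itself):

//!HOOK MAIN
//!BIND HOOKED
//!DESC expand limited range (16-235) to full range (0-255)

vec4 hook() {
    // original pixel
    vec4 c0 = HOOKED_tex(HOOKED_pos);
    return (c0 - 16.0 / 255.0) * (255.0 / 219.0);
}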

Thanks

Possible RAVU improvement ideas

To make sure they don't get lost to time:

  • Add naive anti-ringing (see the sketch after this list). You could do something like bilinear or bicubic sampling along the “edge direction” in a line (or small bundle of offset lines), gathering a few samples, and clamping your output pixel to this value range. Alternatively, you could try adapting the “in-place” antiringing filter from my antiring.hook and adding it as a separate post-RAVU pass, slightly adjusted to account for the fact that you introduce an offset.

  • Train kernels differently? Right now, you said you use a linear regression to combine multiple scaler kernels - but how do you actually choose the kernels to combine to begin with? Are they jinc functions? Or am I misunderstanding how the algorithm works?
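
A sketch of the naive clamp idea; the names are placeholders, and a real version would sample along the detected edge direction rather than a fixed 2x2 neighbourhood:

// Clamp the upscaled result into the range of the four nearest source
// texels to suppress ringing overshoot.
vec4 antiring(sampler2D tex, vec2 pos, vec4 color) {
    vec4 a = textureOffset(tex, pos, ivec2(0, 0));
    vec4 b = textureOffset(tex, pos, ivec2(1, 0));
    vec4 c = textureOffset(tex, pos, ivec2(0, 1));
    vec4 d = textureOffset(tex, pos, ivec2(1, 1));
    vec4 lo = min(min(a, b), min(c, d));
    vec4 hi = max(max(a, b), max(c, d));
    return clamp(color, lo, hi);
}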

Implement Winograd optimization properly

I tried to implement the 3x3 Winograd convolution algorithm months ago, based on the scripts from the wincnn repo. In theory it could make 3x3 convolution2d layers (in CNN models) 2.25 times faster, since F(2x2, 3x3) computes a 2x2 output tile with 16 multiplies instead of the naive 36. It would benefit mostly ResNet-based models, but could also help other CNN models.

The initial result is not promising; it's slower than the naive implementation. The code can be found in the conv2d-slow branch. There are two approaches that I tried:

  1. commit 9324c5a uses 2x2 groups and mat4-vec4 multiplication, and is just slightly slower than the naive approach
  2. commit ea93872 uses 4x4 groups and mat4-mat4 multiplication, and is about 2 times slower.

A proper implementation requires finding the overhead first. Some low-level primitives might also be required, probably from vendor-specific extensions.
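
For reference, the 1-D F(2,3) building block computes two outputs of a 3-tap convolution with 4 multiplies instead of 6 (a self-contained sketch, not the code from that branch; in practice the filter-side transform involving g would be precomputed):

// d = 4 input samples, g = 3 filter taps; returns the 2 outputs.
vec2 winograd_f23(vec4 d, vec3 g) {
    float m1 = (d.x - d.z) * g.x;
    float m2 = (d.y + d.z) * 0.5 * (g.x + g.y + g.z);
    float m3 = (d.z - d.y) * 0.5 * (g.x - g.y + g.z);
    float m4 = (d.y - d.w) * g.z;
    return vec2(m1 + m2 + m3, m2 - m3 - m4);
}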

RAVU 3x

Is it possible to generate a RAVU shader that triples the image without compute? I have an Apple Silicon MacBook, which doesn't support the required OpenGL or Vulkan compute features, so the compute branch just displays a blue image.

Crash with >= nns64 versions of NNEDI3

I wanted to test these shaders today and made my first try with nnedi3-nns64-win8x4-yuv.hook from your repo. The result was mpv crashing (segmentation fault) right at startup.
I remember I successfully tested your shaders some time ago, so I checked the other files: while nnedi3-nns64-win8x4.hook seems to work, nnedi3-nns64-win8x4-chroma.hook and nnedi3-nns64-win8x4-all.hook result in crashes again.
One more test showed that all nns32 variants work without crashing, so maybe the problem is the shader's size? I know nothing about shaders; how could I confirm this? Is there something my GPU has to support for bigger shader files?

This is on OS X 10.11, running an AMD 6870 (a hackintosh) with support for OpenGL 4.1 (thanks, Apple).
Not sure if it's useful, but this is the stack trace I get:

Thread 23 Crashed:: mpv/vo
0   com.apple.AMDRadeonX3000GLDriver	0x0000000127645d53 0x1275d9000 + 445779
1   com.apple.AMDRadeonX3000GLDriver	0x0000000127648248 0x1275d9000 + 455240
2   com.apple.AMDRadeonX3000GLDriver	0x00000001276582f7 0x1275d9000 + 520951
3   com.apple.AMDRadeonX3000GLDriver	0x000000012763b78f gldUpdateDispatch + 2853
4   GLEngine                      	0x00007fff93fb3037 gleDoDrawDispatchCoreGL3 + 533
5   GLEngine                      	0x00007fff93f637a8 gleDrawArraysOrElements_Entries_Body + 128
6   GLEngine                      	0x00007fff93f5c632 glDrawArrays_GL3Exec + 173
7   mpv                           	0x00000001093dd460 gl_vao_draw_data + 112
8   mpv                           	0x00000001093e8353 finish_pass_direct + 1283
9   mpv                           	0x00000001093e9c9d pass_hook + 2237
10  mpv                           	0x00000001093e2d43 pass_render_frame + 2707
11  mpv                           	0x00000001093e0cbc gl_video_render_frame + 1052
12  mpv                           	0x00000001093f2307 draw_frame + 103
13  mpv                           	0x00000001093ef346 vo_thread + 1030
14  libsystem_pthread.dylib       	0x00007fff8288699d _pthread_body + 131
15  libsystem_pthread.dylib       	0x00007fff8288691a _pthread_start + 168
16  libsystem_pthread.dylib       	0x00007fff82884351 thread_start + 13

More neurons?

Hi!

Is it possible to make nnedi3 with more neurons, like 256 or 512?
Currently the script's --nns option accepts a maximum of 128.

Regards,
Alex

I cannot load shaders in .hook format

Hello,

I use the latest stable version of mpv (and the latest April nightly build). I can't manage to load shaders in the .hook format, while shaders in the .glsl format work perfectly.

I should mention that I am on Windows 10 and that my hardware is recent. The video part of my mpv.conf file is as follows:

profile=opengl-hq
hwdec=no
target-prim=auto
target-trc=auto
format=default:colormatrix=auto
video-output-levels=auto
deinterlace=auto

interpolation
video-sync=display-resample
framedrop=vo
deband
deband-grain=40
dither-depth=auto
scaler-resizes-only

opengl-shader="C:\Users\thoach\AppData\Roaming\mpv\shaders\mpv-prescalers\test\ravu-r3-smoothtest1.hook"
#opengl-shader="C:\Users\thoach\AppData\Roaming\mpv\shaders\adaptive-antiringing.glsl"

sigmoid-upscaling
scale=ewa_lanczossharp
scale-antiring=1.0
cscale=ewa_lanczossoft
cscale-antiring=1.0
dscale=lanczos
dscale-antiring=1.0
tscale=mitchell

I tried several combinations, with different hook shaders, but they are never loaded.

Thank you for your answers.
Thoach

scale = bilinear (rgb) query

@bjin In the stats, I am seeing scale = bilinear (rgb) with ravu-zoom-r4.
Without RAVU it shows scale = ewa_lanczossharp (rgb), as specified in mpv.conf.

(screenshot: with ravu-zoom)
(screenshot: without ravu)

You had explained to another user:

Chroma is upscaled to luma resolution (video size), and then the converted RGB is upscaled to target resolution (screen size).

Isn't bilinear a low-quality algorithm for scaling? Is there a way to switch it to a better algorithm, or am I missing something?

glsl-shader = E:/Softwares/Video Player\mpv/shaders/KrigBilateral.glsl
glsl-shader = E:/Softwares/Video Player\mpv/shaders/SSimDownscaler.glsl
glsl-shader = E:/Softwares/Video Player\mpv/shaders/SSimSuperRes.glsl
glsl-shader = E:/Softwares/Video Player\mpv/shaders/ravu-zoom-r4.hook

scale = ewa_lanczossharp
cscale = ewa_lanczossharp
dscale = mitchell
correct-downscaling = yes

gpu-api = d3d11
hwdec = no

Different shaders?

Hi!

Sorry for a dumb question, but if I take your Python scripts unchanged and generate the shader hooks myself on my system, will my hooks be any different from the ones in your master branch?

Regards,
Alex

mpv crashes when using vulkan/compute/ravu-zoom-r3

This is the log before the crash:

[   0.317][v][cplayer] Setting option 'glsl-shaders-append' = '/usr/share/mpv-prescalers/vulkan/compute/ravu-zoom-r3.hook' (flags = 4)
[   0.318][v][cplayer] Setting option 'glsl-shaders-append' = '~~/glsl/SSimDownscaler.glsl' (flags = 4)
[   0.318][v][ao] Trying audio driver 'pipewire'
[   0.318][v][ao/pipewire] requested format: 192000 Hz, stereo channels, double
[   0.318][v][ao/pipewire] Headers version: 0.3.65
[   0.318][v][ao/pipewire] Library version: 0.3.65
[   0.319][d][global] user path: '/usr/share/mpv-prescalers/vulkan/compute/ravu-zoom-r3.hook' -> '/usr/share/mpv-prescalers/vulkan/compute/ravu-zoom-r3.hook'
[   0.321][v][vo/gpu-next/libplacebo] Registering hook pass: RAVU-Zoom (luma, r3, compute)
[   0.321][t][ffmpeg/video] hevc: Decoded frame with POC 1.
[   0.324][v][ao/pipewire] Core user: sommio
[   0.324][v][ao/pipewire] Core host: arch
[   0.324][v][ao/pipewire] Core version: 0.3.65
[   0.324][v][ao/pipewire] Core name: pipewire-0
[   0.324][d][ao/pipewire] Stream state changed: old_state=unconnected state=connecting error=(null)
[   0.325][t][ffmpeg/video] hevc: Decoded frame with POC 3.
[   0.325][d][ao/pipewire] Stream state changed: old_state=connecting state=paused error=(null)
[   0.330][v][ao/pipewire] device buffer: 3840 samples.
[   0.330][v][ao/pipewire] using soft-buffer of 38400 samples.
[   0.330][i][cplayer] AO: [pipewire] 192000Hz stereo 2ch double
[   0.330][v][cplayer] AO: Description: PipeWire audio output
[   0.330][t][ffmpeg/video] hevc: Decoded frame with POC 9.
[   0.335][t][vo/gpu-next/libplacebo] vk->CreateImage(vk->dev, &iinfo, PL_VK_ALLOC, &tex_vk->img)
[   0.335][d][vo/gpu-next/libplacebo] Allocating 4194304 memory of type 0x1 (id 7) in heap 0
[   0.336][t][vo/gpu-next/libplacebo] vk->BindImageMemory(vk->dev, tex_vk->img, mem->vkmem, mem->offset)
[   0.336][t][vo/gpu-next/libplacebo] vk->CreateSemaphore(vk->dev, &sinfo, PL_VK_ALLOC, &sem->semaphore)
[   0.336][t][vo/gpu-next/libplacebo] vk->CreateImageView(vk->dev, &vinfo, PL_VK_ALLOC, &tex_vk->view)
[   0.336][t][vo/gpu-next/libplacebo] vk->CreateSemaphore(vk->dev, &sinfo, PL_VK_ALLOC, &sem->semaphore)
[   0.336][t][vo/gpu-next/libplacebo] vk->CreateBuffer(vk->dev, &binfo, PL_VK_ALLOC, &slab->buffer)
[   0.336][d][vo/gpu-next/libplacebo] Allocating 3932160 memory of type 0x1 (id 7) in heap 0
[   0.337][t][vo/gpu-next/libplacebo] vk->BindBufferMemory(vk->dev, slab->buffer, slab->mem, 0)
[   0.337][t][vo/gpu-next/libplacebo] vk->AllocateCommandBuffers(vk->dev, &ainfo, &cmd->buf)
[   0.337][t][vo/gpu-next/libplacebo] vk->CreateFence(vk->dev, &finfo, PL_VK_ALLOC, &cmd->fence)
[   0.337][t][vo/gpu-next/libplacebo] vk->BeginCommandBuffer(cmd->buf, &binfo)
[   0.337][t][vo/gpu-next/libplacebo] Using multiple vkCmdUpdateBuffer calls to upload large buffer. Consider using buffer-buffer transfers instead!

vulkan/compute/ravu-r3 works fine:

[   0.353][v][cplayer] Setting option 'glsl-shaders-append' = '/usr/share/mpv-prescalers/vulkan/compute/ravu-r3.hook' (flags = 4)
[   0.354][v][cplayer] Setting option 'glsl-shaders-append' = '~~/glsl/SSimDownscaler.glsl' (flags = 4)
[   0.354][d][cplayer] Run command: change-list, flags=64, args=[name="shared-script-properties", operation="append", value="osc-margins=0.000000,0.000000,0.000000,0.000000"]
[   0.354][v][cplayer] Set property: shared-script-properties -> 1
[   0.354][d][global] user path: '/usr/share/mpv-prescalers/vulkan/compute/ravu-r3.hook' -> '/usr/share/mpv-prescalers/vulkan/compute/ravu-r3.hook'
[   0.355][v][ao] Trying audio driver 'pipewire'
[   0.355][v][ao/pipewire] requested format: 192000 Hz, stereo channels, double
[   0.355][v][ao/pipewire] Headers version: 0.3.65
[   0.355][v][ao/pipewire] Library version: 0.3.65
[   0.355][v][vo/gpu-next/libplacebo] Registering hook pass: RAVU (step1, luma, r3, compute)
[   0.355][v][vo/gpu-next/libplacebo] Registering hook pass: RAVU (step2, luma, r3, compute)
[   0.355][t][vo/gpu-next/libplacebo] vk->CreateImage(vk->dev, &iinfo, PL_VK_ALLOC, &tex_vk->img)
[   0.355][d][vo/gpu-next/libplacebo] Allocating 262144 memory of type 0x1 (id 7) in heap 0
[   0.356][t][vo/gpu-next/libplacebo] vk->BindImageMemory(vk->dev, tex_vk->img, mem->vkmem, mem->offset)
[   0.356][t][vo/gpu-next/libplacebo] vk->CreateSemaphore(vk->dev, &sinfo, PL_VK_ALLOC, &sem->semaphore)
[   0.357][t][ffmpeg/video] hevc: Decoded frame with POC 1.
[   0.357][t][vo/gpu-next/libplacebo] vk->CreateImageView(vk->dev, &vinfo, PL_VK_ALLOC, &tex_vk->view)
[   0.357][t][vo/gpu-next/libplacebo] vk->CreateSemaphore(vk->dev, &sinfo, PL_VK_ALLOC, &sem->semaphore)
[   0.359][t][vo/gpu-next/libplacebo] vk->CreateBuffer(vk->dev, &binfo, PL_VK_ALLOC, &slab->buffer)
[   0.359][d][vo/gpu-next/libplacebo] Allocating 786432 memory of type 0x1 (id 7) in heap 0
[   0.360][t][ffmpeg/video] hevc: Decoded frame with POC 3.
[   0.360][t][vo/gpu-next/libplacebo] vk->BindBufferMemory(vk->dev, slab->buffer, slab->mem, 0)
[   0.360][t][vo/gpu-next/libplacebo] vk->AllocateCommandBuffers(vk->dev, &ainfo, &cmd->buf)
[   0.361][t][vo/gpu-next/libplacebo] vk->CreateFence(vk->dev, &finfo, PL_VK_ALLOC, &cmd->fence)
[   0.361][t][vo/gpu-next/libplacebo] vk->BeginCommandBuffer(cmd->buf, &binfo)
[   0.361][v][vo/gpu-next/libplacebo] Registering named texture 'ravu_lut3'
[   0.361][d][global] config path: 'glsl/SSimDownscaler.glsl' -> '/home/sommio/.config/mpv/glsl/SSimDownscaler.glsl'
[   0.361][d][global] user path: '~~/glsl/SSimDownscaler.glsl' -> '/home/sommio/.config/mpv/glsl/SSimDownscaler.glsl'
[   0.362][v][vo/gpu-next/libplacebo] Registering hook pass: SSimDownscaler L2 pass 1
[   0.362][v][vo/gpu-next/libplacebo] Registering hook pass: SSimDownscaler L2 pass 2
[   0.362][v][vo/gpu-next/libplacebo] Registering hook pass: SSimDownscaler mean & R
[   0.362][v][vo/gpu-next/libplacebo] Registering hook pass: SSimDownscaler final pass

mpv:

mpv 0.35.0-221-ga40958cdf8 Copyright © 2000-2023 mpv/MPlayer/mplayer2 projects
 built on Sat Feb 11 09:34:01 2023
libplacebo version: v5.229.1-157-g85528096
FFmpeg version: n5.1.2
FFmpeg library versions:
   libavutil       57.28.100
   libavcodec      59.37.100
   libavformat     59.27.100
   libswscale      6.7.100
   libavfilter     8.44.100
   libswresample   4.7.100

README for precompiled shaders outdated?

Hi,

Looks like the NNEDI3 shaders have lost their suffixes (i.e. -yuv, -chroma, etc.) and the README is now outdated in this regard?

Is it still possible to use NNEDI3 upscaling for only, say, chroma planes?

Regards,
Alex

UPD. Didn't notice that RAVU still has suffixes. Sorry.

Try using GL_NV_shader_thread_shuffle

https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_shuffle.txt

Not sure if AMD/Intel implement this too; if not, it's probably not worth trying.

In theory, this would allow us to directly share samples between threads in the same warp without going through shmem, which should be even faster. I believe the change required would be essentially rewriting the code that loads the samples (float lumaNN = ...) to load them in groups of 32 where each thread loads one value and then uses the warp exchange primitives to directly shuffle them with the other 31 threads.
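
A sketch of what the load path could look like under that change; all names here are placeholders rather than the actual shader's variables:

#extension GL_NV_shader_thread_shuffle : require

// Each lane of a 32-wide warp loads one texel, then reads the other
// lanes' samples directly via shuffles instead of staging them in shmem.
float val = texelFetch(tex, base + ivec2(int(gl_LocalInvocationID.x & 31u), 0), 0).x;
for (uint i = 0u; i < 32u; i++) {
    float other = shuffleNV(val, i, 32u); // lane i's sample, no shmem round-trip
    // ... feed "other" into the NNEDI3 window sums ...
}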

NNEDI3: #pragma optionNV(fastprecision on) causes issues on some GPUs

I have a GeForce GTX 970 and those pragma lines seem to cause trouble for me.

(screenshot: raw.png)

nnedi3-nns32-win8x4.hook from master

mpv --no-config --window-scale=4 --vf=format=gray --opengl-shader=nnedi3-nns32-win8x4.hook --pause raw.png

(screenshot: with fastprecision on)

nnedi3-nns32-win8x4.hook with all occurrences of '#pragma optionNV(fastprecision on)' removed

mpv --no-config --window-scale=4 --vf=format=gray --opengl-shader=nnedi3-nns32-win8x4-nopragma.hook --pause raw.png

(screenshot: with fastprecision off)

High-level description of RAVU algorithm?

I'd like to see if I can figure out any tricks to make it faster, especially with compute shaders.

How does RAVU work on a high level? It has four passes, what do those passes do? Why the weird weights texture?

Broken alpha channel when using ravu-*-rgb on formats with alpha channels

Using e.g. ravu-r3-rgb together with an RGBA source that has an alpha channel, RAVU currently produces very broken output. This is down to a combination of two issues:

  1. RAVU always outputs the alpha channel as 0.0, when it should really output something like 1.0 instead. This would be a marginal improvement over the status quo by making RAVU effectively perform alpha blending against black.

  2. Even better would be to pass through the alpha channel of, say, the center pixel, untouched. (This introduces slight drift, but whatever; it's better than completely breaking alpha handling.) See the sketch below.
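
For idea 2, the change could be as small as the line below at the end of the hook (a sketch; "color" stands for whatever variable the shader writes out, and HOOKED_texOff is mpv's user-shader macro):

// Pass through the source pixel's alpha instead of leaving it at 0.0:
color.a = HOOKED_texOff(vec2(0.0)).a;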

Does ravu-zoom work with --cscale ?

I see that "ravu-zoom also comes with a chroma variant which is mostly a drop-in replacement of --cscale."
But I can still see the --cscale in the stats information. Does it really work, or is that just a display error?

(screenshot: Snipaste_2021-01-10_15-31-23-tuya)

Vulkan-compatible versions of RAVU

The RAVU weights texture currently uses rgba16f, which is an OpenGL-only texture format. For Vulkan you need to use either rgba16hf (16-bit half float) or just plain rgba16 (normalized integer).
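
In user-shader terms this would only change the texture header, e.g. (the size values below are made up; the hex-encoded texel data follows the header as usual):

//!TEXTURE ravu_lut3
//!SIZE 45 2592
//!FORMAT rgba16hf
//!FILTER LINEAR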

Release RAVU training code

Would it be possible for you to release the RAVU training code so we can train our own set of weights on specific content we're interested in, rather than anime?

Usage update

Hi,

It seems that the usage description in the README should be updated, because it still shows the deprecated mpv option syntax: --vo=opengl:user-shaders => --opengl-shaders

Regards,
Alex

Upscaling Lower Resolution Video to 4K (Primer)

@haasn, @bjin, and @wm4

I have been following your collective work on Vulkan (v2) and RAISR/RAVU since the summer. I have attempted to distill it into an article on my blog: "Upscaling Lower Resolution Video to 4K". It is amazing work, though I am still trying to wrap my head around it.

  1. I would appreciate your feedback if there are any mistakes or missing information in the primer. My specific focus is providing a multi-room 4K set-top box solution using Nvidia Shield TVs, not a single-room Windows PC solution. Full disclosure: my experience to date has been with Kodi/SPMC. But I know that mpv is embedded in Plex, which runs on the Shield.

I am a bit fuzzy on the exact application by source video resolution. Presumably, for a given broadcast/streamed/stored video:

1080p:

  • no deinterlacer needed
  • debanding needed to convert from Rec709 to Rec2020
  • luma upscaled 2x by RAVU (1080 > 2160)
  • YUV420 chroma upscaled 4x by RAVU (540 > 2160)
  • no downscaling needed

1080i:

  • same as above but deinterlacer needed first

720p:

  • no deinterlacer needed
  • debanding needed to convert from Rec709 to Rec2020
  • luma upscaled 4x by RAVU (720 > 2880)
  • luma downscaled by ewa_lanczossharp (2880 > 2160)
  • YUV420 chroma upscaled 8x by RAVU (360 > 2880)
  • YUV420 chroma downscaled by ewa_lanczossharp (2880 > 2160)

480i:

  • deinterlacer needed
  • debanding needed to convert from Rec601 to Rec2020
  • luma upscaled 8x by RAVU (480 > 3840)
  • luma downscaled by ewa_lanczossharp (3840 > 2160)
  • YUV420 chroma upscaled 16x by RAVU (240 > 3840)
  • YUV420 chroma downscaled by ewa_lanczossharp (3840 > 2160)

I am not clear on whether RAVU smooth is needed to avoid an additional anti-ringing function.

My assumptions are that:

  • you want to use RAVU for both chroma and luma, separately (rather than upscaling chroma first 2x and then upscaling both simultaneously).
  • I read that bjin was exploring using the learning data from luma to influence chroma upscaling
  • RAVU is efficient enough to use for "over" interpolation, then downscaling (as opposed to using RAVU for a lesser 2x and then upscaling using an inferior algorithm).

Several additional questions:

  1. Have you done any performance testing in general to compare Vulkan-enabled RAVU to OpenGL EWA upscaling? I saw the NNEDI3 comparison numbers.

  2. Same question, specific to the Shield? Are there any performance tradeoffs / limitations (e.g. limited to 2x, limited to the luma channel only, different radius such as RAVU r1 vs r5)?

Thanks!

Use textureGatherOffset for performance

textureGatherOffset can be used to gather a single component of four adjacent pixels at the same time. This seems ideal for NNEDI3, which basically samples a single component of an entire 8x4 window area - this can be collapsed down to eight textureGatherOffset calls instead of 8*4 texture()[0] calls.

You can use it like this:

Instead of:

vec4 color;
color.x = textureOffset(tex, pos, ivec2(0, 1))[n];
color.y = textureOffset(tex, pos, ivec2(1, 1))[n];
color.z = textureOffset(tex, pos, ivec2(1, 0))[n];
color.w = textureOffset(tex, pos, ivec2(0, 0))[n];

You can do this:

vec4 color = textureGatherOffset(tex, pos, ivec2(0, 0), n);

Please note that the samples are returned starting from the bottom-left pixel and going counter-clockwise. You also need a new enough version of mpv to take advantage of GLSL 400 (it was previously clamped to 330 for legacy reasons).

Convert .hlsl to .glsl/.hook (open new issue because of blocked comments)

I understand your reply, but I'm asking you for a personal favor: can you send the port to my email? It doesn't matter if the shader is incorrect, I want to try it. My email is:

"[email protected]"

I repeat the code:

sampler s0 : register(s0);

#define const_1 ( 16.0 / 255.0)
#define const_2 (255.0 / 219.0)

float4 main(float2 tex : TEXCOORD0) : COLOR
{
	// original pixel
	float4 c0 = tex2D(s0, tex);

	return ((c0 - const_1) * const_2);
}

SSBOs or UBOs for NNEDI3 weights?

I know we tested UBOs way back in the day, but I would be curious to try them again, especially with Vulkan etc. Also, instead of UBOs, we should try SSBOs for this sort of stuff. I'd love to hear the results of using either.

I realize this requires changes in mpv, maybe we could work together to try and implement the necessary API?
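
For illustration, the GLSL side could be as simple as the following declaration (hypothetical; the names are invented, and mpv would still need an API to create and bind the buffer):

// Hypothetical SSBO holding the NNEDI3 weights:
layout(std430, binding = 0) readonly buffer nnedi3_weights_ssbo {
    vec4 weights[];
};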

Why isn't there a Left/Center choice for "RAVU-Zoom-R2-Chroma.hook"?

I'm afraid that I've chosen the wrong type of RAVU chroma upscaling file.

These shaders are in my conf by the way:
glsl-shaders = "~~/shaders/ravu-zoom-r3.hook"
glsl-shaders-append = "~~/shaders/ravu-zoom-r2-chroma.hook"
glsl-shaders-append = "~~/shaders/SSimDownscaler.glsl"
glsl-shaders-append = "~~/shaders/adaptive-sharpen.glsl"
