realsr-ncnn-vulkan's Introduction

RealSR ncnn Vulkan

An ncnn implementation of the super-resolution method from Real-World Super-Resolution via Kernel Estimation and Noise Injection.

realsr-ncnn-vulkan uses the ncnn project as its universal neural network inference framework.

Download Windows/Linux/MacOS Executable for Intel/AMD/Nvidia GPU

https://github.com/nihui/realsr-ncnn-vulkan/releases

This package includes all the binaries and models required. It is portable, so no CUDA or Caffe runtime environment is needed :)

About RealSR

Real-World Super-Resolution via Kernel Estimation and Noise Injection (CVPRW 2020)

https://github.com/jixiaozhong/RealSR

Xiaozhong Ji, Yun Cao, Ying Tai, Chengjie Wang, Jilin Li, and Feiyue Huang

Tencent YouTu Lab

Our solution is the winner of CVPR NTIRE 2020 Challenge on Real-World Super-Resolution in both tracks.

https://arxiv.org/abs/2005.01996

Usage

Example Command

realsr-ncnn-vulkan.exe -i input.jpg -o output.png -s 4

Full Usage

Usage: realsr-ncnn-vulkan -i infile -o outfile [options]...

  -h                   show this help
  -v                   verbose output
  -i input-path        input image path (jpg/png/webp) or directory
  -o output-path       output image path (jpg/png/webp) or directory
  -s scale             upscale ratio (4, default=4)
  -t tile-size         tile size (>=32/0=auto, default=0) can be 0,0,0 for multi-gpu
  -m model-path        realsr model path (default=models-DF2K_JPEG)
  -g gpu-id            gpu device to use (-1=cpu, default=0) can be 0,1,2 for multi-gpu
  -j load:proc:save    thread count for load/proc/save (default=1:2:2) can be 1:2,2,2:2 for multi-gpu
  -x                   enable tta mode
  -f format            output image format (jpg/png/webp, default=ext/png)
  • input-path and output-path accept either file path or directory path
  • scale = scale level, 4 = upscale 4x
  • tile-size = tile size, use smaller value to reduce GPU memory usage, default selects automatically
  • load:proc:save = thread count for the three stages (image decoding + realsr upscaling + image encoding). Larger values may increase GPU usage and consume more GPU memory. You can try "4:4:4" for many small images and "2:2:2" for large images. The default setting works fine in most situations. If your GPU is underutilized, try increasing the thread count for faster processing.
  • format = the output image format. png is more widely supported, while webp generally yields smaller files; both are losslessly encoded.
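
The options above can also be assembled programmatically. The following is a minimal Python sketch, not part of the project: `build_command` and its parameter names are hypothetical, the executable is assumed to be reachable as `realsr-ncnn-vulkan`, and only flags listed in the usage text are emitted.

```python
from typing import List, Optional, Sequence


def build_command(
    input_path: str,
    output_path: str,
    scale: int = 4,
    gpu_ids: Sequence[int] = (0,),
    tile_size: int = 0,
    load_threads: int = 1,
    proc_threads: int = 2,
    save_threads: int = 2,
    model_path: Optional[str] = None,
    tta: bool = False,
    exe: str = "realsr-ncnn-vulkan",  # assumed executable name/location
) -> List[str]:
    """Hypothetical helper: build an argument list from the documented flags."""
    if scale != 4:
        # The usage text lists 4 as the only accepted upscale ratio.
        raise ValueError("only scale=4 is listed as supported")
    if tile_size != 0 and tile_size < 32:
        raise ValueError("tile size must be 0 (auto) or >= 32")
    if not gpu_ids:
        raise ValueError("at least one gpu id is required")

    n = len(gpu_ids)
    cmd = [exe, "-i", input_path, "-o", output_path, "-s", str(scale)]
    # Per-GPU values are comma-separated, e.g. -g 0,1,2 and -t 0,0,0.
    cmd += ["-g", ",".join(str(g) for g in gpu_ids)]
    cmd += ["-t", ",".join([str(tile_size)] * n)]
    # -j is load:proc:save; proc takes one value per GPU, e.g. 1:2,2,2:2.
    proc = ",".join([str(proc_threads)] * n)
    cmd += ["-j", f"{load_threads}:{proc}:{save_threads}"]
    if model_path is not None:
        cmd += ["-m", model_path]
    if tta:
        cmd.append("-x")
    return cmd


if __name__ == "__main__":
    # Many small images: the text suggests trying 4:4:4.
    print(" ".join(build_command("in_dir", "out_dir", load_threads=4,
                                 proc_threads=4, save_threads=4)))
```

The returned list can be passed directly to `subprocess.run`; passing `gpu_ids=(-1,)` corresponds to the documented CPU mode.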

If you encounter a crash or an error, try upgrading your GPU driver.

Build from Source

  1. Download and set up the Vulkan SDK from https://vulkan.lunarg.com/
  • On Linux distributions, you can instead get the essential build requirements from the package manager
dnf install vulkan-headers vulkan-loader-devel
apt-get install libvulkan-dev
pacman -S vulkan-headers vulkan-icd-loader
  2. Clone this project with all submodules
git clone https://github.com/nihui/realsr-ncnn-vulkan.git
cd realsr-ncnn-vulkan
git submodule update --init --recursive
  3. Build with CMake
  • You can pass the -DUSE_STATIC_MOLTENVK=ON option to avoid linking the Vulkan loader library on MacOS
mkdir build
cd build
cmake ../src
cmake --build . -j 4

Sample Images

Original Image

[image: origin]

Upscale 4x with ImageMagick Lanczos4 Filter

convert origin.jpg -resize 400% output.png

[image: browser]

Upscale 4x with srmd scale=4 noise=-1

srmd-ncnn-vulkan.exe -i origin.jpg -o 4x.png -s 4 -n -1

[image: waifu2x]

Upscale 4x with realsr model=DF2K scale=4 tta=1

realsr-ncnn-vulkan.exe -i origin.jpg -o output.png -s 4 -x -m models-DF2K

[image: realsr]

Original RealSR Project

Other Open-Source Code Used

realsr-ncnn-vulkan's Issues

Include DPED model

Hi, can you please also include the DPED model for noisy smartphone images.
Thanks!

core dump on GPU H100

I'm trying to run realsr-ncnn-vulkan on an H100 GPU and I get a core dump.

Hardware/software config:
2 x Intel Xeon Platinum 8468
GRAPHICS: llvmpipe
BAR1 / Visible vRAM: 131072 MiB
OpenGL: 4.5 Mesa 20.3.3 (LLVM 11.0.0 256 bits)
Display Driver: NVIDIA 530.30.02
Screen: 640x480
MEMORY: 16 x 64 GB 4800MT/s
OPERATING SYSTEM: Red Hat Enterprise Linux 8.4
Kernel: 4.18.0-305.25.1.el8_4.x86_64 (x86_64)
Desktop: GNOME Shell 3.32.2
Display Server: X Server 1.20.10
Compiler: GCC 8.4.1 20200928 + Clang 11.0.1 + CUDA 12.1

First I used Phoronix Test Suite v10.8.4, which installs pts/realsr-ncnn-1.0.0, and the following error is shown:

realsr-ncnn-vulkan-20200818-linux]$ ./realsr-ncnn-vulkan -i low-end-image-sample1.JPG -o out.png
more than 64 cpu detected, thread affinity may not work properly :(
double free or corruption (!prev)
Aborted (core dumped)

Then I compiled the latest version from GitHub, and the following error is shown:

$ sudo ./realsr-ncnn-vulkan -i /home/user/realsr-ncnn-vulkan/images/2.png -o output.png -s 4
Segmentation fault

I also tried the binaries from stable release 20220728, and the following error is shown:

realsr-ncnn-vulkan-20220728-ubuntu]$ sudo ./realsr-ncnn-vulkan -i /home/user/realsr-ncnn-vulkan/images/0.png -o output.png -s 4
double free or corruption (!prev)
Aborted

I followed the same Phoronix installation process on a system with the same software but with A100 GPUs, and it worked fine.

How can I build it with VS2017?

When I run CMake, I get this error:
CMake Error at CMakeLists.txt:5 (project):
Failed to run MSBuild command:
MSBuild/Current/Bin/MSBuild.exe
get the value of VCTargetsPath:

Is there something wrong with the CMakeLists.txt?

Invalid scale argument

Hi, I tried another upscale ratio, 3, with
'realsr-ncnn-vulkan.exe -i input.jpg -o output.png -s 3',
and the following error was printed:
'invalid scale argument'. Do I need to recompile somehow?
I am using Windows.

Thanks so much in advance.

Can't run on your PC

My laptop is a Legion Y7000p 2019 with a GTX 1660 Ti, running Windows 1809 LTSC. When I run the exe from cmd, it shows "Can't run on your PC". I also tried to run it from Waifu2x-GUI, and it crashed immediately. The drivers are up to date, so is there something wrong with my system or graphics card?

Failure to deallocate memory when batch processing.

If I specify an input directory instead of a file, it works, but nvidia-smi shows memory usage growing continuously, eventually leading to "vkAllocateMemory failed -2" and other errors, and the output pictures become garbled.

Version is "Release 20210210" from Github releases, running on Linux.

Training models

Would it be possible to train the models using this vulkan repository? Are there any guides for this, or plans to release that code?

Problem with other model

I need to load another model, but the program does not accept it.

realsr-ncnn-vulkan -i in.png -m /usr/share/realsr-ncnn-vulkan/realesrgan-x4plus/ -o out.png
unknown model dir type

realsr-ncnn-vulkan -i in.png -m realesrgan-x4plus -o out.png
unknown model dir type

I copied this model, and it is compatible if I force it, but the program does not recognize it from the path.

https://github.com/xinntao/Real-ESRGAN

weird error message, if the output directory doesn't exist

Hi, thanks for this cool tool.
When running the Windows executable against an existing input directory (but with a nonexistent output directory), it fails with the message: invalid outputpath extension type. It's not a big deal to create the output directory manually beforehand, but it took a while to understand the actual problem the first time.
It would be very helpful either to change that message to something more meaningful or (better) to add automatic creation of the output directory.
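
Until such a change exists, a wrapper script can create the output directory before invoking the tool. This is a minimal Python sketch under that assumption; `ensure_output_dir` is a hypothetical helper and not part of realsr-ncnn-vulkan.

```python
import os
import tempfile


def ensure_output_dir(output_path: str) -> str:
    """Create output_path (and any missing parents); return it unchanged."""
    # exist_ok=True makes the call a no-op when the directory already exists.
    os.makedirs(output_path, exist_ok=True)
    return output_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = ensure_output_dir(os.path.join(tmp, "upscaled", "batch1"))
        # The directory now exists and can be passed to -o in directory mode.
        print(os.path.isdir(out_dir))
```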

Security Address

Hello!

I may have found a security issue in the latest version of realsr-ncnn-vulkan. Following responsible disclosure, is there an email or other private channel where I could share the details?
Thank you

Will this work without GPU?

Hi.

I'm very interested in trying out realsr-ncnn-vulkan.

I have a Windows 10 laptop (Lenovo Yoga 3 Pro), which has no external GPU.

I tried downloading the binary for Windows and running a script like your example in "README", but I got "no command found". (I tried this in Powershell, CMD, and CMD as Administrator.)

I then tried downloading the binary for Ubuntu and running a script like your example within the Ubuntu 18.04 app (through WSL on Windows). Likewise, I got "no command found".

I searched some of your references in "README" and saw Tencent/ncnn list intel-cpu on Windows as known to work. But I don't know how this applies to realsr in particular.

Your advice on this would be greatly appreciated.

Buggy outputs with single channel grayscale inputs

Hi,

First, congratulations on the great work. I am quite impressed by your paper and by your efforts to make it testable by anyone.

I tried your program on many inputs and found that single-channel grayscale inputs give a buggy output (sometimes black, sometimes a mix of black and some white, etc.).
This is solved by taking the very same image and replicating the channel to get a 3-channel image. The result is quite good even though the image is not in color.
I tried both png and jpg, and both have the bug.

I guess that, to fix the issue, the code should replicate the channel three times when the input is a single-channel grayscale image.
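
The workaround described above, replicating the single channel three times, can be sketched in plain Python. This is an illustration on nested lists only, not the project's code; `gray_to_rgb` and the type aliases are hypothetical names. With an image library the same step is typically a one-liner, for example Pillow's `Image.convert("RGB")`.

```python
from typing import List, Tuple

GrayImage = List[List[int]]                 # rows of 0-255 intensities
RgbImage = List[List[Tuple[int, int, int]]]  # rows of (r, g, b) triples


def gray_to_rgb(gray: GrayImage) -> RgbImage:
    """Replicate each intensity into identical R, G and B values."""
    return [[(v, v, v) for v in row] for row in gray]


if __name__ == "__main__":
    tiny = [[0, 128], [255, 64]]
    for row in gray_to_rgb(tiny):
        print(row)
```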

vkCreateInstance failed -9

Message:
vkCreateInstance failed -9
invalid gpu device

System information:
ubuntu 18.04 NVIDIA-SMI 435.21 Driver Version: 435.21 CUDA Version: 10.1 Tesla V100-PCIE

vkCreateInstance failed -9, invalid gpu device on google.colab

Dear all,

I ran into issues when using it on Google Colab with a T4 (or any other GPU).

Driver Version: 525.85.12 CUDA Version: 12.0

/content/realsr-ncnn-vulkan-20220728-ubuntu/realsr-ncnn-vulkan -v -s 4 -t 0 -g 0 -j 1:1:1 -f png -i origin.png -o output.png

vkCreateInstance failed -9
vkCreateInstance failed -9
invalid gpu device

Any help appreciated.

Thanks a lot!

Can only scale up by 4X

Hey, I noticed that if I go lower than -s 4, I get "invalid scale argument". Is there any way to scale up by only 2x instead of 4x? Thanks.

What if the gpu doesn't support support_int8_storage?

Some devices don't support support_int8_storage; the program just prints “fp16p/s/a=1/1/1 int8-p/s/a=1/0/1” and "vkQueueSubmit failed -4".

I found this

net.opt.use_int8_storage = true;

and this

if (net.opt.use_fp16_storage && net.opt.use_int8_storage)
    realsr_preproc->create(realsr_preproc_int8s_spv_data, sizeof(realsr_preproc_int8s_spv_data), specializations);
else if (net.opt.use_fp16_storage)
    realsr_preproc->create(realsr_preproc_fp16s_spv_data, sizeof(realsr_preproc_fp16s_spv_data), specializations);
else
    realsr_preproc->create(realsr_preproc_spv_data, sizeof(realsr_preproc_spv_data), specializations);

if (net.opt.use_fp16_storage && net.opt.use_int8_storage)
    realsr_postproc->create(realsr_postproc_int8s_spv_data, sizeof(realsr_postproc_int8s_spv_data), specializations);
else if (net.opt.use_fp16_storage)
    realsr_postproc->create(realsr_postproc_fp16s_spv_data, sizeof(realsr_postproc_fp16s_spv_data), specializations);
else
    realsr_postproc->create(realsr_postproc_spv_data, sizeof(realsr_postproc_spv_data), specializations);

(net.opt.use_fp16_storage && net.opt.use_int8_storage) is therefore always true on such a device, even though int8 storage is not supported.

So I fixed the code as:


RealSR::RealSR(int gpuid, bool _tta_mode)
{
    const ncnn::GpuInfo& gpuInfo =  ncnn::get_gpu_info(gpuid);

    net.opt.use_vulkan_compute = true;
    net.opt.use_fp16_packed = gpuInfo.support_fp16_packed();
    net.opt.use_fp16_storage = gpuInfo.support_fp16_storage();
    net.opt.use_fp16_arithmetic = false;
    net.opt.use_int8_storage = gpuInfo.support_int8_storage();
    net.opt.use_int8_arithmetic = false;

    net.set_vulkan_device(gpuid);

    realsr_preproc = 0;
    realsr_postproc = 0;
    bicubic_4x = 0;
    tta_mode = _tta_mode;
}


Is this modification unnecessary?

Use ESRGAN (old Architecture) weights

I made a Google Colab script to convert ESRGAN (old architecture) .pth files to .param and .bin:

import shutil
import os
!7z x '/content/drive/My Drive/TFMLz/ESRGAN_oldarch/ncnnconv.7z' -o/content
!apt-get install -y libomp-dev
!pip install onnx-simplifier
!chmod u+x onnx2ncnn
!chmod u+x ncnnoptimize
adst='/content/esr.pth'

'''
asrc='/content/drive/My Drive/TFMLz/ESRGAN_oldarch/models/FatalimiX.pth'
shutil.copyfile(asrc,adst)
'''
'''
!7z e '/content/drive/My Drive/TFMLz/ESRGAN_oldarch/models/falcoon300.7z' -o/content
ren xxx.pth esr.pth
...


!python to_onnx.py
!python -m onnxsim '/content/esr.onnx' '/content/esr_simp.onnx'
!/content/onnx2ncnn /content/esr_simp.onnx /content/x4_o1.param /content/x4_o1.bin
!/content/ncnnoptimize /content/x4_o1.param /content/x4_o1.bin /content/x4_fp32.param /content/x4_fp32.bin 0
!/content/ncnnoptimize /content/x4_o1.param /content/x4_o1.bin /content/x4_fp16.param /content/x4_fp16.bin 1

ncnnconv.zip (rename it to .7z)
(to_onnx.py etc. from github.com/achie27/super-resolution)

Replace 'models-DF2K/x4.bin' with the generated x4_fp32.bin or x4_fp16.bin, and that's it.

There is a bias of about ±3 in value per channel compared to the original PyTorch results.

Feature requests - Model Human Faces

Real-SR: training code is released at GitHub/Tencent.
Is a model for human faces possible? For example, the Remini app focuses on deblurring and denoising faces and also regenerates pixelated faces.
The DF2K-JPEG model works really well,
but sometimes it is not enough for human faces.
I don't know how to create a model; I'm not a programmer, just an end user.
If this is possible, could you make such a model?
Many thanks and keep going. 👍

Upscaling slow and produces noise only

If I try to upscale any PNG picture with RealSR ncnn, it is unusably slow at around 10 s/image (SRMD does 2 fps) and produces these results:
[image: output]
I updated the GPU drivers to the newest version (445.87), but the result is the same. What is going wrong?

My system config:
Windows 10
RTX 2060 Super
Ryzen 7 3700X
32GB DDR4

Most of the upscaling time is spent decoding the picture (which is where I think the problem lies). The actual upscale is really fast.
@nihui

-i/-o in directory mode does not replace the file extension

realsr-ncnn-vulkan.exe -s 4 -t 0 -j 2:2:2 -m models-DF2K_JPEG -x -g 0 -i G:\Users\susu\Desktop\16x9 -o G:\Users\susu\Desktop\16x9\re

input:
U1RTKCPC7UY.jpg
output:
U1RTKCPC7UY.jpg.png
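
As a post-processing workaround for the doubled extension shown above, output names can be mapped back to a single extension. This is a minimal Python sketch, not a fix inside the tool; `strip_inner_extension` is a hypothetical helper, and the set of inner extensions is an assumption based on the input formats listed in the usage text.

```python
from pathlib import Path

# Assumed set of input extensions that may be left inside the output name.
INNER_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def strip_inner_extension(path: Path) -> Path:
    """Map 'name.jpg.png' to 'name.png'; leave other names untouched."""
    suffixes = path.suffixes
    if len(suffixes) < 2:
        return path
    inner, outer = suffixes[-2], suffixes[-1]
    if inner.lower() not in INNER_EXTENSIONS:
        return path
    # Cut both trailing suffixes off the file name, then re-attach the outer one.
    base = path.name[: -(len(inner) + len(outer))]
    return path.with_name(base + outer)


if __name__ == "__main__":
    print(strip_inner_extension(Path("U1RTKCPC7UY.jpg.png")))
```

To apply it to real files, each output path `p` could be renamed with `p.rename(strip_inner_extension(p))`, after checking that the target name does not already exist.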

Your Results in New Super-Resolution Benchmarks

Hello,

MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.

Your method achieved 4th place in Video Upscalers Benchmark: Quality Enhancement in 'Animation 2x' category and 1st place in Super-Resolution for Video Compression Benchmark in 'x264 compression' category. We congratulate you on your result and look forward to your future work!

We would be grateful for your feedback on our work.

vkEnumeratePhysicalDevices failed

Dear nihui, the build succeeded, but I ran into the problem below when running realsr-ncnn-vulkan:

command : ./realsr-ncnn-vulkan -i ../images/timg.jpg -o output.png -s 4
vkEnumeratePhysicalDevices failed -3
invalid gpu device

Is there something wrong with my GPU device, or is it a problem with the Vulkan SDK?

Graphical errors (?) when handling small images

Recently, one of my users has reported that there's an error with the output if the image is particularly small (18x18).
I checked all the drivers and only realsr appears to have this issue.

Input image:

small-sample

Output image:

small-sample_output

On the other hand, waifu2x-ncnn-vulkan handles it without problems.

output-waifu2x-ncnn-vulkan

No Standard Output?

Hello there,

I'm trying to read the output of the executable, but it is always empty. This is because stdout is not used.

You can test it quickly on Windows by running srmd-ncnn-vulkan.exe -h > out.txt and checking that the file is empty.

Any workaround for this?
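
One possible workaround, assuming the tool writes its messages to stderr rather than stdout (which would explain the empty file), is to capture stderr as well. In a shell this is usually done with `2> out.txt`. The Python sketch below shows the same idea; `run_and_capture` is a hypothetical helper, and a small child Python process stands in for the real executable.

```python
import subprocess
import sys
from typing import List, Tuple


def run_and_capture(cmd: List[str]) -> Tuple[int, str, str]:
    """Run cmd and return (exit code, stdout text, stderr text)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in child that prints only to stderr, as the tool is assumed to do.
    child = [sys.executable, "-c", "import sys; sys.stderr.write('Usage: demo\\n')"]
    code, out, err = run_and_capture(child)
    print(repr(out), repr(err))
```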

CMake build with USE_STATIC_MOLTENVK=ON is still linking the vulkan loader library on MacOS

Hi,
I followed the documentation and passed USE_STATIC_MOLTENVK=ON to cmake on MacOS, but the generated binary still links to libvulkan.dylib. Please help.

Thanks.

$ otool -l ./realsr-ncnn-vulkan
Load command 13
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name @rpath/libvulkan.1.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 1.2.198
compatibility version 1.0.0

Feature requests

1) Support input and output pipes instead of relying on large numbers of temporary images
2) Add VapourSynth filter support

%1 is not recognized...

I am trying to add a batch file to the Windows context menu so that I can convert a JPG image directly, as I do with ffmpeg, but I was surprised to find that the program does not accept %1. Please fix it!

Thanks in advance...

DeepDeblur to NCNN-Vulkan

Hello, I'm sorry this isn't related to RealSR, I just don't see a way to contact you outside a github issue.

I'm a sucker for image quality, and have been using RealESRGAN-NCNN-Vulkan (https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan) for some time now, and it's really good. I found that it heavily borrows from this repo, and would like to request you port SimDeblur (https://github.com/ljzycmd/SimDeblur) to NCNN-Vulkan. I'd hugely appreciate it. If you won't, that's fine, I understand.

Thanks in advance, regardless of decision.

Is the code of the real-esrgan-ncnn-vulkan release package the same as this?

Hi, I found that Real-ESRGAN is a Python project, but its release package is built with ncnn. I tried to find the code used to compile the release package but failed. Then I noticed this project, which also uses ncnn, and when I used this project's release package to run the real-esrgan-ncnn model, I got exactly the same result as real-esrgan-ncnn.
Considering that Xinntao is your partner, I guess the real-esrgan-ncnn release package may have been compiled by you, or perhaps even built from this project's code?
Is there any problem with running real-esrgan's model using realsr-ncnn's code?

Why did you choose libwebp

I read the code for file input and saving,
and I found that libwebp is used to read and write png, jpeg, and webp files.
I am wondering why you chose libwebp instead of OpenCV.
Maybe OpenCV could handle this with simpler code.

I tried custom model to load with executable...

Hi, @nihui
Does the Windows executable work with custom models?
In my case, I tried an ncnn-converted model (with .bin and .param files), but the exe throws an error. I am 99% sure that the model was correctly converted from .pth to TorchScript and then with pnnx to ncnn.
The error I got:
"layer GroupNorm not exists or registered"

Thank you

Results in MSU Video Super Resolution Benchmark

Hello,
MSU Video Group has recently launched Video Super Resolution Benchmark and evaluated this algorithm.

It takes 7th place by subjective score, 9th place by PSNR, and 5th by our metric ERQAv1.0. You can see the results here.

If you have any other VSR method you want to see in our benchmark, we kindly invite you to participate.
You can submit it for the benchmark, following the submission steps.
