DeepWPT

Hardware

GPU: ZOTAC GAMING GeForce RTX 3060 Ti Twin Edge
CPU: AMD Ryzen 9 3900X desktop processor
RAM: 16GB 3200MHz DDR4

Environment setup

Operating system: Tested on Ubuntu 20.04.5 LTS
Package management system: conda
Deep learning framework: PyTorch
GPU Driver Version: 515.76
Python version: Python 3.9.13
cudatoolkit version: 11.6

Code explanation

Terminology:

gc: Growth channels, i.e. the intermediate channels. The growth rate is the number of output feature maps each layer produces. Defined and tested in "Residual Dense Network for Image Super-Resolution" (CVPR 2018).

1x1 convolutions: Used to increase or decrease the number of feature-map channels (e.g. from 48 to 64 channels, or from 64 back to 48). The spatial size of the feature map is unchanged.

stride=1: The kernel/filter moves one pixel at a time.

Parameters vs hyperparameters: see this video

VGG19: what is vgg19

Epoch: The number of epochs is a hyperparameter that defines how many times the learning algorithm works through the entire training dataset. (We set it to 50.)
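
To make the 1x1 convolution and stride=1 entries concrete, here is a minimal PyTorch sketch. The 32x32 feature-map size and the variable names are assumptions chosen for illustration; they are not taken from the DeepWPT code.

```python
import torch
import torch.nn as nn

# A 1x1 convolution changes only the channel count.
# With kernel_size=1 and stride=1, height and width stay the same.
expand = nn.Conv2d(in_channels=48, out_channels=64, kernel_size=1, stride=1)
reduce = nn.Conv2d(in_channels=64, out_channels=48, kernel_size=1, stride=1)

x = torch.randn(1, 48, 32, 32)  # hypothetical input: 1 sample, 48 channels, 32x32
y = expand(x)                   # 48 -> 64 channels
z = reduce(y)                   # 64 -> 48 channels

print(y.shape)  # torch.Size([1, 64, 32, 32])
print(z.shape)  # torch.Size([1, 48, 32, 32])
```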

Wavelet packet transform:

The implementation of the wavelet packet transform used in our paper originally comes from "Wavelet Domain Generative Adversarial Network for Multi-Scale Face Hallucination", code

Some resources for understanding wavelets and their different implementations:

  1. Wavelets: a mathematical microscope

  2. Discrete Wavelet Transform of Images (Haar and Hadamard)
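
As a rough illustration of the idea, the sketch below applies a Haar wavelet packet transform to an image tensor in PyTorch. It is not the implementation referenced above: the function names `haar_dwt2` and `haar_wpt2`, the choice of the Haar wavelet, and the two decomposition levels are all assumptions made for the example.

```python
import torch

def haar_dwt2(x):
    """One level of the orthonormal 2D Haar transform on (N, C, H, W), H and W even."""
    a = x[..., 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2  # average of the block (low frequency)
    lh = (a + b - c - d) / 2  # difference between the two rows
    hl = (a - b + c - d) / 2  # difference between the two columns
    hh = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh     # sub-band naming conventions vary between libraries

def haar_wpt2(x, levels=2):
    """Wavelet packet transform: every sub-band is decomposed again at each level."""
    bands = [x]
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_dwt2(band)]
    return torch.cat(bands, dim=1)  # stack all sub-bands along the channel axis

img = torch.rand(1, 3, 256, 256)    # hypothetical RGB input
coeffs = haar_wpt2(img, levels=2)
print(coeffs.shape)  # torch.Size([1, 48, 64, 64])
```

Unlike a plain discrete wavelet transform, which only keeps splitting the low-frequency band, the packet transform splits every band, so two levels give 16 sub-bands per input channel.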

Loss Function:

loss_G = (1*loss_p) + loss_sr.mul(100) + loss_lr.mul(10) + loss_textures.mul(5)

We did not use the attention loss because no IRNN is used.

  1. loss_p = perceptual loss

    ABOUT:

    Perceptual loss functions are used to compare two images that look similar, such as the same photo shifted by one pixel. They compare high-level differences rather than individual pixel values.

    To decide whether two images look alike, we could compare them pixel by pixel, but this is unlikely to produce good results. Two images can look the same to a human and still be very different numerically (e.g. a picture of a man versus the same picture with the man shifted one pixel to the left). A perceptual loss function addresses this by comparing the images through a neural network that recognizes image features; such networks include autoencoders, image classifiers, etc.

    They make use of a loss network φ pretrained for image classification, meaning that these perceptual loss functions are themselves deep convolutional neural networks. In all our experiments φ is the 16-layer VGG network pretrained on the ImageNet dataset.

    SOURCE

    CODE

  2. loss_sr = MAE (mean absolute error) loss for the SR (Short Reach), i.e. high-frequency, components.

    loss_lr = MAE (mean absolute error) loss for the LR (Long Reach), i.e. low-frequency, components.

    SOURCE

    CODE

  3. loss_textures = Wavelet Reconstruction Loss

    ABOUT:

    Minimizing MSE loss can hardly capture high-frequency texture details to produce satisfactory perceptual results. As texture details can be depicted by high-frequency wavelet coefficients, we transform the super-resolution problem from the original image pixel domain to the wavelet domain and introduce wavelet-domain loss functions to help texture reconstruction.

    SOURCE

    CODE
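
The sketch below shows how the four terms could be combined with the weights from the loss_G formula at the start of this section. It rests on several assumptions: `feature_net` is a tiny untrained stand-in for a frozen pretrained VGG, the tensors are random placeholders, and the texture term is replaced by a simple MSE on high-frequency coefficients purely so the example runs. The actual wavelet reconstruction loss is defined in the linked source.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def perceptual_loss(feature_net, pred, target):
    """Compare two images in the feature space of a fixed network."""
    with torch.no_grad():
        target_feats = feature_net(target)
    return F.mse_loss(feature_net(pred), target_feats)

def generator_loss(loss_p, loss_sr, loss_lr, loss_textures):
    """Weighted sum using the weights 1, 100, 10 and 5 from the formula."""
    return (1 * loss_p) + loss_sr.mul(100) + loss_lr.mul(10) + loss_textures.mul(5)

torch.manual_seed(0)

# Assumption: toy feature extractor standing in for a frozen pretrained VGG.
feature_net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU()).eval()
for p in feature_net.parameters():
    p.requires_grad_(False)

# Hypothetical network output and ground truth in the pixel domain.
pred_img = torch.rand(1, 3, 64, 64, requires_grad=True)
target_img = torch.rand(1, 3, 64, 64)

# Hypothetical high- and low-frequency wavelet coefficients.
pred_hf, target_hf = torch.rand(1, 45, 16, 16), torch.rand(1, 45, 16, 16)
pred_lf, target_lf = torch.rand(1, 3, 16, 16), torch.rand(1, 3, 16, 16)

loss_p = perceptual_loss(feature_net, pred_img, target_img)
loss_sr = F.l1_loss(pred_hf, target_hf)         # MAE on high-frequency components
loss_lr = F.l1_loss(pred_lf, target_lf)         # MAE on low-frequency components
loss_textures = F.mse_loss(pred_hf, target_hf)  # placeholder, not the real texture loss

loss_G = generator_loss(loss_p, loss_sr, loss_lr, loss_textures)
loss_G.backward()
print(loss_G.item())
```

The large weight on loss_sr means errors in the high-frequency components dominate the total unless the other terms are much larger in magnitude.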

Dataset:

Download dataset from here

paper: "Moiré Photo Restoration Using Multiresolution Convolutional Neural Networks"

Code

Images: 130,307 pairs of RGB images (90% for training and 10% for testing).

Type: PNG

Resolution: average 850x850. Converted to 256x256 for training and testing.

Created from: ImageNet ILSVRC 2012 dataset
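
For reference, a 90/10 split of 130,307 pairs can be reproduced as in the sketch below. The shuffling, the fixed seed, and rounding the training count down are assumptions; the README does not say how the split was actually made.

```python
import random

TOTAL_PAIRS = 130_307
TRAIN_FRACTION = 0.9

indices = list(range(TOTAL_PAIRS))
random.Random(0).shuffle(indices)  # fixed seed (assumption) keeps the split repeatable

n_train = int(TOTAL_PAIRS * TRAIN_FRACTION)  # rounds down
train_ids = indices[:n_train]
test_ids = indices[n_train:]

print(len(train_ids), len(test_ids))  # 117276 13031
```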

Directional Residual Dense Network:

Used the Residual Dense Block (RDB) from “Residual Dense Network for Image Super-Resolution”

Code of RDB
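
For orientation, a plain residual dense block can be sketched in PyTorch as follows. This is a generic version based on the description in the RDN paper, not the code linked above, and it does not cover the directional part of the network. The class name, `channels=64`, `gc=32` and `num_layers=3` are assumptions.

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Densely connected 3x3 convs, 1x1 fusion, and a local residual connection."""

    def __init__(self, channels=64, gc=32, num_layers=3):
        super().__init__()
        # Layer i sees the block input plus the gc-channel outputs of all earlier layers.
        self.layers = nn.ModuleList([
            nn.Conv2d(channels + i * gc, gc, kernel_size=3, stride=1, padding=1)
            for i in range(num_layers)
        ])
        # A 1x1 convolution fuses everything back to the input channel count.
        self.fuse = nn.Conv2d(channels + num_layers * gc, channels, kernel_size=1, stride=1)
        self.act = nn.ReLU()

    def forward(self, x):
        feats = [x]
        for conv in self.layers:
            feats.append(self.act(conv(torch.cat(feats, dim=1))))
        return x + self.fuse(torch.cat(feats, dim=1))  # local residual learning

block = ResidualDenseBlock(channels=64, gc=32, num_layers=3)
x = torch.randn(1, 64, 32, 32)  # hypothetical feature map
out = block(x)
print(out.shape)  # torch.Size([1, 64, 32, 32])
```

Here gc plays the role of the growth channels described in the terminology section: each inner layer adds gc feature maps to the concatenated input of the next one.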
