DeepWPT

Hardware

GPU: ZOTAC GAMING GeForce RTX 3060 Ti Twin Edge
CPU: AMD Ryzen 9 3900X desktop processor
RAM: 16GB 3200MHz DDR4

Environment setup

Operating system: Tested on Ubuntu 20.04.5 LTS
Package management system: conda
Deep learning framework: PyTorch
GPU Driver Version: 515.76
Python version: Python 3.9.13
cudatoolkit version: 11.6

Code explanation

Some terminologies:

gc: Growth channels, also called intermediate channels. The growth rate is the number of output feature maps that each convolution layer inside a dense block produces. Defined and tested in "Residual Dense Network for Image Super-Resolution", CVPR 2018.

1x1 convolutions: Used to increase or decrease the number of feature-map channels (e.g. from 48 to 64 channels, or from 64 back to 48). The spatial size of the feature maps is unchanged.
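As a minimal sketch of the 48 to 64 and 64 to 48 example above, the following uses two PyTorch 1x1 convolutions. The layer names and the 64x64 input size are illustrative assumptions, not taken from the DeepWPT code.

```python
import torch
import torch.nn as nn

# Hypothetical layer names; only the channel counts come from the text above.
expand_conv = nn.Conv2d(48, 64, kernel_size=1)  # 48 -> 64 channels
reduce_conv = nn.Conv2d(64, 48, kernel_size=1)  # 64 -> 48 channels

x = torch.randn(1, 48, 64, 64)  # (batch, channels, height, width), size assumed
wide = expand_conv(x)
narrow = reduce_conv(wide)

# Only the channel dimension changes; height and width stay the same.
print(wide.shape)
print(narrow.shape)
```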

stride=1: the kernel/filter moves one pixel at a time across the input.
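The effect of the stride on the output size can be checked with the standard convolution size formula (dilation 1). The helper below is a hypothetical illustration, not part of the repository.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

# A 3x3 kernel with padding 1 and stride 1 keeps a 256-pixel side unchanged.
same_size = conv_output_size(256, kernel=3, stride=1, padding=1)

# The same kernel with stride 2 halves it, because the filter skips every other pixel.
halved = conv_output_size(256, kernel=3, stride=2, padding=1)

print(same_size, halved)
```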

Parameters vs hyperparameters: see this video

VGG19: what is vgg19

Epoch: the number of epochs is a hyperparameter that defines how many times the learning algorithm works through the entire training dataset. (We set it to 50.)

Wavelet packet transform:

The wavelet packet transform implementation used in our paper comes from "Wavelet Domain Generative Adversarial Network for Multi-Scale Face Hallucination" (code).
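For readers new to the idea, here is a minimal sketch of a wavelet packet transform built from the Haar wavelet. It is not the implementation referenced above: the Haar filter, the function names, and the two decomposition levels are assumptions chosen for illustration. Unlike a plain wavelet transform, which only splits the low-frequency band again, the packet transform splits every subband at each level.

```python
import torch

def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform on a (N, C, H, W) tensor.

    H and W must be even. Returns four subbands of size (N, C, H/2, W/2).
    """
    a = x[..., 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2        # local average (low frequency)
    col_diff = (a - b + c - d) / 2   # difference between columns
    row_diff = (a + b - c - d) / 2   # difference between rows
    diag = (a - b - c + d) / 2       # diagonal difference
    return low, col_diff, row_diff, diag

def wavelet_packet(x, levels=2):
    """Apply haar_dwt2 to every subband at each level, then stack on channels."""
    bands = [x]
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_dwt2(band)]
    return torch.cat(bands, dim=1)

image = torch.rand(1, 3, 256, 256)       # one RGB image, size assumed
packets = wavelet_packet(image, levels=2)
print(packets.shape)                      # channels grow 4x per level, sides halve
```

With these assumed settings, a 3-channel 256x256 image becomes 3 x 16 = 48 channels of size 64x64, and because the Haar filters here are orthonormal the total signal energy is preserved.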

Some resources for understanding wavelets and their different implementations:

Loss Function:

loss_G = (1*loss_p) + loss_sr.mul(100) + loss_lr.mul(10) + loss_textures.mul(5)

We did not use the attention loss because no IRNN is used.
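The weighting in loss_G can be read as a plain weighted sum. The sketch below wraps it in a hypothetical helper (generator_loss is not a name from the repository) and uses ordinary arithmetic, so it works for Python floats as well as PyTorch tensors, where `x.mul(100)` is equivalent to `x * 100`.

```python
def generator_loss(loss_p, loss_sr, loss_lr, loss_textures):
    """Weighted sum matching the loss_G expression above (weights 1, 100, 10, 5)."""
    return (1 * loss_p) + (loss_sr * 100) + (loss_lr * 10) + (loss_textures * 5)

# Made-up loss values, only to show how strongly each term is weighted.
total = generator_loss(loss_p=1.0, loss_sr=0.5, loss_lr=0.2, loss_textures=0.1)
print(total)
```

The large weight on loss_sr means that, for losses of similar magnitude, the high-frequency term dominates the generator objective.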

loss_p = perceptual loss

ABOUT:

Perceptual loss functions are used when comparing two different images that look similar, like the same photo but shifted by one pixel. The function is used to compare high level differences.

When we want to know whether two images look like each other, we could compare them pixel by pixel with a mathematical formula, but this is unlikely to produce good results. Two images can look the same to humans and still be very different mathematically (e.g. a picture of a man versus the same picture with the man shifted one pixel to the left). A perceptual loss function addresses this by comparing the features that a neural network extracts from each image; such networks include autoencoders, image classifiers, etc.

They make use of a loss network φ pretrained for image classification, meaning that these perceptual loss functions are themselves deep convolutional neural networks. In all our experiments φ is the 16-layer VGG network pretrained on the ImageNet dataset.
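A minimal sketch of the idea follows. The class name, the use of mean squared error between feature maps, and the small random network standing in for φ are all assumptions made so that the example runs without downloading weights. In practice φ would be a pretrained classifier, for example the convolutional layers of a torchvision VGG model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerceptualLoss(nn.Module):
    """Compare two images in the feature space of a frozen network (phi)."""

    def __init__(self, feature_extractor):
        super().__init__()
        self.features = feature_extractor.eval()
        for p in self.features.parameters():
            p.requires_grad_(False)  # phi is fixed; only the generator is trained

    def forward(self, output, target):
        # Distance between feature maps instead of between raw pixels.
        return F.mse_loss(self.features(output), self.features(target))

torch.manual_seed(0)
# Stand-in for phi (assumption): a tiny random conv stack, NOT a pretrained VGG.
stand_in_phi = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
)
criterion = PerceptualLoss(stand_in_phi)

img = torch.rand(1, 3, 32, 32)
same = criterion(img, img)                            # identical images
different = criterion(img, torch.rand(1, 3, 32, 32))  # unrelated image
print(same.item(), different.item())
```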

SOURCE

CODE

loss_sr = MAE (mean absolute error) loss for the SR (Short Reach), i.e. high-frequency, components.

loss_lr = MAE (mean absolute error) loss for the LR (Long Reach), i.e. low-frequency, components.
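Mean absolute error is available in PyTorch as `torch.nn.functional.l1_loss`. The tiny example below uses made-up values rather than real wavelet components, only to show what the loss computes.

```python
import torch
import torch.nn.functional as F

pred = torch.tensor([1.0, 2.0, 3.0])    # made-up predicted values
target = torch.tensor([2.0, 2.0, 5.0])  # made-up ground-truth values

# Mean of |1-2|, |2-2|, |3-5| = (1 + 0 + 2) / 3
mae = F.l1_loss(pred, target)
print(mae.item())
```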

SOURCE

CODE

loss_textures = Wavelet Reconstruction Loss

ABOUT:

Minimizing MSE loss can hardly capture high-frequency texture details to produce satisfactory perceptual results. As texture details can be depicted by high-frequency wavelet coefficients, we transform the super-resolution problem from the original image pixel domain to the wavelet domain and introduce wavelet-domain loss functions to help texture reconstruction.

SOURCE

CODE

Dataset:

Download dataset from here

paper: "Moiré Photo Restoration Using Multiresolution Convolutional Neural Networks"

Code

Images: 130,307 pairs of RGB images (90% for training and 10% for testing).

Type: PNG

Resolution: average 850x850. Converted to 256x256 for training and testing.

Created from: ImageNet ILSVRC 2012 dataset
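The 90%/10% split translates into approximate pair counts as sketched below. The helper name and the rounding (truncating the training count) are assumptions; the source does not state how the split was rounded or how pairs were assigned.

```python
def split_counts(num_pairs, train_fraction=0.9):
    """Number of training and testing pairs for a given fraction (rounding assumed)."""
    num_train = int(num_pairs * train_fraction)
    return num_train, num_pairs - num_train

num_train, num_test = split_counts(130_307)
print(num_train, num_test)
```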

Directional Residual Dense Network:

Used the Residual Dense Block (RDB) from “Residual Dense Network for Image Super-Resolution”

Code of RDB
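For orientation, here is a minimal sketch of a plain Residual Dense Block in the spirit of the cited paper: densely connected 3x3 convolutions that each add gc channels, a 1x1 convolution that fuses them back to the input width, and a residual connection. The default values (64 channels, gc=32, 4 layers) are illustrative assumptions, and the directional variant used in DeepWPT is not reproduced here.

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Plain RDB sketch; sizes are assumed, not taken from the DeepWPT code."""

    def __init__(self, channels=64, gc=32, num_layers=4):
        super().__init__()
        # Layer i sees the block input plus the gc-channel outputs of all earlier layers.
        self.layers = nn.ModuleList([
            nn.Conv2d(channels + i * gc, gc, kernel_size=3, stride=1, padding=1)
            for i in range(num_layers)
        ])
        self.relu = nn.ReLU()
        # 1x1 convolution fuses all concatenated features back to `channels`.
        self.fuse = nn.Conv2d(channels + num_layers * gc, channels, kernel_size=1)

    def forward(self, x):
        features = [x]
        for conv in self.layers:
            features.append(self.relu(conv(torch.cat(features, dim=1))))
        # Local residual learning: add the block input to the fused features.
        return x + self.fuse(torch.cat(features, dim=1))

block = ResidualDenseBlock()
x = torch.randn(2, 64, 16, 16)
y = block(x)
print(y.shape)  # same shape as the input, so blocks can be stacked
```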

Repository: zareefjafar / deepwpt