
Supplementary material for the paper "MTJND: MULTI-TASK DEEP LEARNING FRAMEWORK FOR IMPROVED JND PREDICTION", IEEE ICIP 2023


MTJND: MULTI-TASK DEEP LEARNING FRAMEWORK FOR IMPROVED JND PREDICTION

Introduction

This is the TensorFlow implementation of the paper "MTJND: MULTI-TASK DEEP LEARNING FRAMEWORK FOR IMPROVED JND PREDICTION". A preprint is available here.

Abstract

The limitation of the Human Visual System (HVS) in perceiving small distortions allows us to lower the bitrate required to achieve a certain visual quality. Predicting and applying the Just Noticeable Distortion (JND), which is a threshold for maximum unperceived level of distortions, is among the popular ways to do so. Recently, machine learning based methods have been able to reduce bitrate even further by improving JND prediction accuracy. However, accurate modeling of JND is very challenging, as it is highly content dependent. Furthermore, existing datasets provide little information to learn the best parameters. To remedy this issue, we propose a multi-task deep learning framework that jointly learns various complementary visual information. We design three separate methods and training strategies that jointly learn: (1) three JND levels, (2) visual attention map and a JND level, and (3) three JND levels and the visual attention map. We show that accumulating information from multiple tasks leads to a more robust prediction of JND. Experimental results confirm the superiority of our framework compared to the state-of-the-art.

The proposed framework

Figure: The proposed framework and its components. (a) Overall framework. (b) Shared feature backbone. (c) Decision tail for visual attention modeling. (d) Decision tail for JND prediction. (e) MT_3LJND. (f) MT_1LJND_VA. (g) MT_3LJND_VA.

Requirements

  • TensorFlow
  • FFmpeg

Dataset

Our evaluation is conducted on the VideoSet and MCL-JCI datasets.

Pre-trained Models

Our pre-trained models can be downloaded from this link on the Zenodo repository, or from the mirror.

Usage

Our pre-trained models can predict JND values, and they can also be used for training on a custom dataset.

Note: The dataset used for training and testing should have the following structure.
- rootdir/
     - train/
         - img#1
         - img#2
         - ...
         - JND-Levels.txt (a file containing the 3 JND levels per image: first column for the first JND, second column for the second JND, and third column for the third JND level)
     - valid/
         - img#1
         - img#2
         - ...
         - JND-Levels.txt (a file containing the 3 JND levels per image: first column for the first JND, second column for the second JND, and third column for the third JND level)
     - test/
         - img#1
         - img#2
         - ...
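
Before training, it can help to confirm that a dataset folder matches the layout above. The following is a minimal sketch, not part of the repository: the helper names `read_jnd_levels` and `check_dataset` are hypothetical, and it assumes that JND-Levels.txt holds one line per image with three whitespace-separated numbers and that every other file in a split folder is an image. Adjust the parsing if your label file uses a different delimiter.

```python
from pathlib import Path


def read_jnd_levels(path):
    """Parse JND-Levels.txt into a list of (jnd1, jnd2, jnd3) tuples.

    Assumption: one image per line, three whitespace-separated numbers.
    """
    levels = []
    lines = Path(path).read_text().splitlines()
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(
                f"{path}:{line_no}: expected 3 columns, got {len(fields)}"
            )
        levels.append(tuple(float(f) for f in fields))
    return levels


def check_dataset(root):
    """Return a list of problems found under root (empty list = layout looks fine)."""
    root = Path(root)
    problems = []
    for split in ("train", "valid", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing folder: {split}/")
            continue
        if split == "test":
            continue  # test/ has no JND-Levels.txt in the layout above
        labels = split_dir / "JND-Levels.txt"
        if not labels.is_file():
            problems.append(f"missing file: {split}/JND-Levels.txt")
            continue
        # Assumption: every file other than the label file is an image.
        n_images = sum(
            1 for p in split_dir.iterdir()
            if p.is_file() and p.name != "JND-Levels.txt"
        )
        n_rows = len(read_jnd_levels(labels))
        if n_images != n_rows:
            problems.append(f"{split}/: {n_images} images but {n_rows} label rows")
    return problems
```

Calling `check_dataset("rootdir/")` returns an empty list when all three subfolders exist and the label rows match the image counts; otherwise each entry describes one mismatch.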

Testing

For prediction with MT_3LJND or MT_3LJND_VA, the following command can be used.

python3 MT_3LJND.py test --data_dir "Path-to-the-folder-containing-train,valid,and-test-subfolders/" --model_weights_path "Path-to-the-pretrained-model/model-name.h5" --result_path "Path-to-save-test-results/result.csv"

For prediction with MT_1LJND_VA, the following command can be used.

python3 MT_1LJND_VA.py test --data_dir "Path-to-the-folder-containing-train,valid,and-test-subfolders/" --model_weights_path "Path-to-the-pretrained-model" --jnd_column int --result_path "Path-to-save-test-results/result.csv"
Note: For "jnd_column", the choices are 0, 1, and 2 (0 for JND1, 1 for JND2, and 2 for JND3).
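
When the test commands are launched from a script, building the argument list in Python avoids shell-quoting mistakes and lets the `jnd_column` range be checked up front. This is a minimal sketch, not part of the repository: `build_test_command` is a hypothetical helper, the paths in the example are placeholders, and only the flags shown in the commands above are used.

```python
import shlex


def build_test_command(script, data_dir, model_weights_path, result_path,
                       jnd_column=None):
    """Build the argv list for a test run of one of the two scripts shown above."""
    if script not in ("MT_3LJND.py", "MT_1LJND_VA.py"):
        raise ValueError(f"unexpected script: {script}")
    cmd = [
        "python3", script, "test",
        "--data_dir", data_dir,
        "--model_weights_path", model_weights_path,
    ]
    if script == "MT_1LJND_VA.py":
        # 0 selects JND1, 1 selects JND2, 2 selects JND3.
        if jnd_column not in (0, 1, 2):
            raise ValueError("jnd_column must be 0, 1, or 2")
        cmd += ["--jnd_column", str(jnd_column)]
    cmd += ["--result_path", result_path]
    return cmd


# Placeholder paths for illustration only.
example = build_test_command(
    "MT_1LJND_VA.py", "data/", "weights/model.h5", "out/result.csv", jnd_column=0
)
print(shlex.join(example))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` to execute the prediction.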

Training

For training with MT_3LJND, the following command can be used.

python3 MT_3LJND.py train --data_dir "Path-to-the-folder-containing-train,valid,and-test-subfolders/" --checkpoint_path "Path-to-save-checkpoints/checkpoint.h5" --csv_log_path "Path-to-save-CSV-logs-during-training/log.txt" --epochs Number-of-training-epochs --batch_size Batch-size-for-training --learning_rate Learning-rate-for-optimizer

For training with MT_1LJND_VA, the following command can be used.

python3 MT_1LJND_VA.py train --data_dir "Path-to-the-folder-containing-train,valid,and-test-subfolders/" --checkpoint_path "Path-to-save-checkpoints/checkpoint.h5" --csv_log_path "Path-to-save-CSV-logs-during-training/log.txt" --epochs Number-of-training-epochs --batch_size Batch-size-for-training --learning_rate Learning-rate-for-optimizer --jnd_column int
Note: For "jnd_column", the choices are 0, 1, and 2 (0 for JND1, 1 for JND2, and 2 for JND3).
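
Because MT_1LJND_VA is trained for one JND level at a time, covering all three levels takes three runs. The sketch below generates one training command per `jnd_column` value, each with its own checkpoint and log file. It is an illustration under stated assumptions, not part of the repository: `build_train_commands` is a hypothetical helper, the file-naming scheme (`jnd1.h5`, `jnd1.txt`, and so on) is invented here, and the hyperparameter values in the example are placeholders rather than recommended settings.

```python
def build_train_commands(data_dir, checkpoint_dir, log_dir,
                         epochs, batch_size, learning_rate):
    """Return three argv lists, one MT_1LJND_VA training run per JND level."""
    checkpoint_dir = checkpoint_dir.rstrip("/")
    log_dir = log_dir.rstrip("/")
    commands = []
    for jnd_column in (0, 1, 2):  # 0 = JND1, 1 = JND2, 2 = JND3
        name = f"jnd{jnd_column + 1}"  # hypothetical naming scheme
        commands.append([
            "python3", "MT_1LJND_VA.py", "train",
            "--data_dir", data_dir,
            "--checkpoint_path", f"{checkpoint_dir}/{name}.h5",
            "--csv_log_path", f"{log_dir}/{name}.txt",
            "--epochs", str(epochs),
            "--batch_size", str(batch_size),
            "--learning_rate", str(learning_rate),
            "--jnd_column", str(jnd_column),
        ])
    return commands


# Placeholder values for illustration only.
for cmd in build_train_commands("data/", "checkpoints/", "logs/",
                                epochs=10, batch_size=16, learning_rate=0.001):
    print(" ".join(cmd))
```

Each list can be run in turn with `subprocess.run(cmd, check=True)`, so that a failed run stops the sequence instead of silently continuing.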

Citation

If our work is useful for your research, please cite our paper:

@inproceedings{nami2023mtjnd,
	title={MTJND: MULTI-TASK DEEP LEARNING FRAMEWORK FOR IMPROVED JND PREDICTION},
	author={Nami, Sanaz and Pakdaman, Farhad and Hashemi, Mahmoud Reza and Shirmohammadi, Shervin and Gabbouj, Moncef},
	booktitle={Proceedings of the IEEE International Conference on Image Processing (ICIP)},
	year={2023}
}

Project information

This repository is associated with the project FALCON, under Work Package 3 (WP3). This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 101022466.

Contact

If you have any questions, leave a message here or contact Sanaz Nami ([email protected], [email protected]).
