[ECCV 2022] Ghost-free High Dynamic Range Imaging with Context-aware Transformer

By Zhen Liu¹, Yinglong Wang², Bing Zeng³ and Shuaicheng Liu³,¹*

¹Megvii Technology, ²Noah's Ark Lab, Huawei Technologies, ³University of Electronic Science and Technology of China

This is the official MegEngine implementation of our ECCV 2022 paper: Ghost-free High Dynamic Range Imaging with Context-aware Transformer (HDR-Transformer). The PyTorch version is available at HDR-Transformer-PyTorch.

News

  • 2022.08.26 The PyTorch implementation is now available.
  • 2022.08.11 The arXiv version of our paper is now available.
  • 2022.07.19 The source code is now available.
  • 2022.07.04 Our paper has been accepted by ECCV 2022.

Abstract

High dynamic range (HDR) deghosting algorithms aim to generate ghost-free HDR images with realistic details. Restricted by the locality of the receptive field, existing CNN-based methods are typically prone to producing ghosting artifacts and intensity distortions in the presence of large motion and severe saturation. In this paper, we propose a novel Context-Aware Vision Transformer (CA-ViT) for ghost-free high dynamic range imaging. The CA-ViT is designed as a dual-branch architecture, which can jointly capture both global and local dependencies. Specifically, the global branch employs a window-based Transformer encoder to model long-range object movements and intensity variations to solve ghosting. For the local branch, we design a local context extractor (LCE) to capture short-range image features and use the channel attention mechanism to select informative local details across the extracted features to complement the global branch. By incorporating the CA-ViT as basic components, we further build the HDR-Transformer, a hierarchical network to reconstruct high-quality ghost-free HDR images. Extensive experiments on three benchmark datasets show that our approach outperforms state-of-the-art methods qualitatively and quantitatively with considerably reduced computational budgets.
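
The global branch's window-based encoder follows the Swin design: feature maps are split into non-overlapping windows, and self-attention runs inside each window so the cost grows linearly with image size rather than quadratically. Below is a minimal sketch of the partition step, assuming MegEngine; the function is illustrative, not the repository's implementation:

import megengine.functional as F
from megengine import Tensor

def window_partition(x: Tensor, window_size: int) -> Tensor:
    """Split a (B, H, W, C) feature map into (B * num_windows, ws * ws, C) tokens."""
    B, H, W, C = x.shape
    ws = window_size  # assumes H and W are divisible by the window size
    x = x.reshape(B, H // ws, ws, W // ws, ws, C)
    x = F.transpose(x, (0, 1, 3, 2, 4, 5))  # bring the two window axes together
    return x.reshape(-1, ws * ws, C)        # one token sequence per window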

Pipeline

Illustration of the proposed CA-ViT (see the pipeline figure in the repository). As shown in Fig. (a), the CA-ViT is designed as a dual-branch architecture where the global branch models long-range dependency among image contexts through a multi-head Transformer encoder, and the local branch explores both intra-frame local details and inter-frame feature relationships through a local context extractor. Fig. (b) depicts the key insight of our HDR deghosting approach with CA-ViT. To remove the residual ghosting artifacts caused by large motions of the hand (marked with blue), long-range contexts (marked with red), which are required to hallucinate reasonable content in the ghosting area, are modeled by the self-attention in the global branch. Meanwhile, the well-exposed non-occluded local regions (marked with green) can be effectively extracted with convolutional layers and fused by the channel attention in the local branch.
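
The local branch pairs convolutions with channel attention to pick out informative local details. A minimal sketch of that idea, assuming MegEngine; the class and parameter names are illustrative and may differ from the repository's modules:

import megengine.module as M
import megengine.functional as F

class LocalContextExtractor(M.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Short-range feature extraction with a small convolutional stack.
        self.conv = M.Sequential(
            M.Conv2d(channels, channels, 3, padding=1),
            M.ReLU(),
        )
        # Squeeze-and-excitation style channel attention.
        self.attn = M.Sequential(
            M.Conv2d(channels, channels // reduction, 1),
            M.ReLU(),
            M.Conv2d(channels // reduction, channels, 1),
            M.Sigmoid(),
        )

    def forward(self, x):
        feat = self.conv(x)
        # Global average pooling produces one weight per channel.
        weight = self.attn(F.adaptive_avg_pool2d(feat, 1))
        return feat * weight  # reweight local features channel-wise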

Usage

Requirements

  • Python 3.7.0
  • MegEngine 1.8.3+
  • CUDA 10.0 on Ubuntu 18.04

Install the required dependencies:

conda create -n hdr_transformer python=3.7
conda activate hdr_transformer
pip install -r requirements.txt
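
Optionally, verify that MegEngine imports correctly (a quick sanity check, not part of the original setup steps):

python -c "import megengine as mge; print(mge.__version__)"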

Dataset

  1. Download the dataset (including the training and test sets) from Kalantari17's dataset.
  2. Move the dataset to ./data and reorganize the directories as follows:
./data/Training
|--001
|  |--262A0898.tif
|  |--262A0899.tif
|  |--262A0900.tif
|  |--exposure.txt
|  |--HDRImg.hdr
|--002
...
./data/Test (includes 15 scenes from `EXTRA` and `PAPER`)
|--001
|  |--262A2615.tif
|  |--262A2616.tif
|  |--262A2617.tif
|  |--exposure.txt
|  |--HDRImg.hdr
...
|--BarbequeDay
|  |--262A2943.tif
|  |--262A2944.tif
|  |--262A2945.tif
|  |--exposure.txt
|  |--HDRImg.hdr
...
  3. Prepare the cropped training set by running:
cd ./dataset
python gen_crop_data.py
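
The script prepares fixed-size training patches from the full-resolution scenes. A hypothetical illustration of what such cropping does (the actual gen_crop_data.py may use different patch sizes, strides, and file handling):

import numpy as np

def crop_patches(img: np.ndarray, patch_size: int = 256, stride: int = 128):
    """Slide a window over an (H, W, C) image and collect training patches."""
    h, w = img.shape[:2]
    patches = []
    for top in range(0, h - patch_size + 1, stride):
        for left in range(0, w - patch_size + 1, stride):
            patches.append(img[top:top + patch_size, left:left + patch_size])
    return patches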

Training & Evaluation

cd HDR-Transformer

To train the model, run:

python train.py --model_dir experiments

To evaluate, run:

python evaluate.py --model_dir experiments --restore_file experiments/val_model_best.pth
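
Evaluation on this benchmark conventionally reports PSNR-μ, computed on μ-law tonemapped images (μ = 5000). A minimal sketch of that mapping in NumPy; evaluate.py may implement it differently:

import numpy as np

def mu_tonemap(hdr: np.ndarray, mu: float = 5000.0) -> np.ndarray:
    """Map linear HDR values in [0, 1] to the mu-law tonemapped domain."""
    return np.log(1.0 + mu * hdr) / np.log(1.0 + mu)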

Results

(Results figure: quantitative and qualitative comparisons with state-of-the-art methods on the benchmark datasets.)

Acknowledgement

The MegEngine version of the Swin-Transformer is based on Swin-Transformer-MegEngine. Our work is inspired by several prior works and uses parts of their official implementations.

We thank the respective authors for open sourcing their methods.

Citation

@inproceedings{liu2022ghost,
  title={Ghost-free High Dynamic Range Imaging with Context-aware Transformer},
  author={Liu, Zhen and Wang, Yinglong and Zeng, Bing and Liu, Shuaicheng},
  booktitle={European Conference on Computer Vision},
  pages={344--360},
  year={2022},
  organization={Springer}
}

Contact

If you have any questions, feel free to contact Zhen Liu at [email protected].


Issues

import megbrain to avoid dead lock bug

Hi, thank you for your excellent work. I'm a newcomer to MegEngine. When I execute the training script, it hangs in a deadlock, and I don't know what causes it. The only changes I made were the dataset path in params.json, and commenting out line 13 in evaluate.py.

Errors generated during training

Firstly, thanks for your code! When I train it with CUDA 11.2 under Windows 10, the following errors occur:

DeprecationWarning: Call to deprecated function get_device_count. (use megengine.device.get_device_count instead) -- Deprecated since version 1.5.
train_proc = dist.launcher(main) if dist.helper.get_device_count_by_fork("gpu") > 1 else main
2022-11-10 17:37:21 root Output and logs will be saved to experiments\train.log
2022-11-10 17:37:21 root Loading the datasets from ./data
2022-11-10 17:37:21 dataset.data_loader Dataset type: sig17, transform type: hdr_transformer
2022-11-10 17:37:21 dataset.transformations Train transforms:
2022-11-10 17:37:21 dataset.transformations Val and Test transforms:
10 17:37:27[mgb] WRN [dnn]
Cudnn8 will jit ptx code with cache. You can set
CUDA_CACHE_MAXSIZE and CUDA_CACHE_PATH environment var to avoid repeat jit(very slow).
For example export CUDA_CACHE_MAXSIZE=2147483647 and export CUDA_CACHE_PATH=/data/.cuda_cache
2022-11-10 17:37:27 root Starting training for 100 epoch(s)
0%| | 0/1424 [00:00<?, ?it/s]
Process finished with exit code -1073741819 (0xC0000005)

I installed megengine 1.10.0_cu112 via pip, along with CUDA 11.2 and cuDNN 8.4.0.

pretrained model

Hi, can you share the pretrained model? The model I trained with the default parameters is not very effective, and some test images still show obvious ghosting. I suspect it may be caused by untuned hyperparameters such as the learning rate and optimizer settings?

Stuck at the beginning of training.

I commented out the `import megbrain` line in evaluate.py, and also commented out `mge.dtr.enable()` in train.py.
The output log is shown below:

2022-08-02 10:48:03 root Output and logs will be saved to experiments/train.log
2022-08-02 10:48:03 root Loading the datasets from /local/scratch3/data_for_hdr_transformer
2022-08-02 10:48:03 dataset.data_loader Dataset type: hdm, transform type: hdr_transformer
2022-08-02 10:48:03 dataset.transformations Train transforms:
2022-08-02 10:48:03 dataset.transformations Val and Test transforms:

After training started, the log stopped here. I used a dataset I constructed myself, cropped in advance with the script you provided. Is it normal that 'Train transforms:' and 'Val and Test transforms:' are empty?

Inference time

Thank you for your great work.

In Table 2 of your paper, you report an inference time of 0.15 s for the proposed network, which is twice as fast as the previous state-of-the-art model, HDR-GAN.

Using a single Quadro P6000, I obtain an inference time of 6 seconds for HDR-Transformer with your MegEngine repository and 7 seconds with your PyTorch version, while I am able to reproduce the inference times of the other networks.

Can you give more information on how to achieve this inference time?
Which hardware was used? Do you use post-training optimisation?

Thank you
