huizeng / image-adaptive-3dlut

Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time

License: Apache License 2.0

Python 64.73% C 8.43% C++ 6.15% Cuda 10.93% MATLAB 9.24% Shell 0.19% Objective-C 0.33%
3d-luts photo-retouching image-enhancement color-enhancement color-manipulation computational-photography image-processing

image-adaptive-3dlut's Introduction

Image-Adaptive-3DLUT

Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time

Downloads

The full datasets used in the paper are over 300 GB. Only the FiveK dataset resized to 480p resolution is provided here (including 8-bit sRGB and 16-bit XYZ inputs, and 8-bit sRGB targets), along with 10 full-resolution images for speed testing. To obtain all of the full-resolution images, it is recommended to convert them from the original FiveK dataset.

A model trained at 480p resolution can be applied directly to images of 4K (or higher) resolution without a performance drop. This significantly speeds up training, since the very heavy high-resolution images never need to be loaded.

Abstract

Recent years have witnessed the increasing popularity of learning based methods to enhance the color and tone of photos. However, many existing photo enhancement methods either deliver unsatisfactory results or consume too many computational and memory resources, hindering their application to high-resolution images (usually with more than 12 megapixels) in practice. In this paper, we learn image-adaptive 3-dimensional lookup tables (3D LUTs) to achieve fast and robust photo enhancement. 3D LUTs are widely used for manipulating color and tone of photos, but they are usually manually tuned and fixed in camera imaging pipeline or photo editing tools. We, for the first time to our best knowledge, propose to learn 3D LUTs from annotated data using pairwise or unpaired learning. More importantly, our learned 3D LUT is image-adaptive for flexible photo enhancement. We learn multiple basis 3D LUTs and a small convolutional neural network (CNN) simultaneously in an end-to-end manner. The small CNN works on the down-sampled version of the input image to predict content-dependent weights to fuse the multiple basis 3D LUTs into an image-adaptive one, which is employed to transform the color and tone of source images efficiently. Our model contains less than 600K parameters and takes less than 2 ms to process an image of 4K resolution using one Titan RTX GPU. While being highly efficient, our model also outperforms the state-of-the-art photo enhancement methods by a large margin in terms of PSNR, SSIM and a color difference metric on two publicly available benchmark datasets.
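The weighted fusion described above can be pictured in a few lines. A minimal sketch, with the function name and tensor shapes assumed for illustration rather than taken from the repository:

import torch

def fuse_luts(weights, basis_luts):
    # weights: (N,) content-dependent weights predicted by the small CNN
    # from a downsampled input; basis_luts: (N, 3, D, D, D).
    # Returns one image-adaptive LUT of shape (3, D, D, D).
    return torch.einsum("n,ncijk->cijk", weights, basis_luts)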

Framework

Usage

Useful issues

Replace the trilinear interpolation with torch.nn.functional.grid_sample [#14].
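For reference, a hedged sketch of such a replacement, assuming the LUT is stored as (3, D, D, D) indexed as lut[c, b, g, r] and image values lie in [0, 1] (a sketch under these assumptions, not the code from issue #14):

import torch
import torch.nn.functional as F

def apply_lut(lut, img):
    # img: (B, 3, H, W) in [0, 1]; lut: (3, D, D, D), assumed lut[c, b, g, r].
    # Build a sampling grid whose (x, y, z) coords are the (r, g, b) values.
    grid = img.permute(0, 2, 3, 1).unsqueeze(1) * 2.0 - 1.0   # (B, 1, H, W, 3) in [-1, 1]
    lut = lut.unsqueeze(0).expand(img.size(0), -1, -1, -1, -1)
    # For 5D inputs, mode='bilinear' performs trilinear interpolation;
    # align_corners=True maps 0 and 1 exactly onto the first/last LUT nodes.
    out = F.grid_sample(lut, grid, mode='bilinear',
                        padding_mode='border', align_corners=True)
    return out.squeeze(2)                                     # (B, 3, H, W)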

Requirements

Python 3; see requirements.txt for the package list.
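With a standard pip setup, the dependencies can be installed with:

pip3 install -r requirements.txt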

Build

By default, the code is built against PyTorch 0.4.1:

cd trilinear_c
sh make.sh

For pytorch 1.x:

cd trilinear_cpp
sh setup.sh

Please also replace the following lines:

# in image_adaptive_lut_train_paired.py, image_adaptive_lut_evaluation.py, demo_eval.py, and image_adaptive_lut_train_unpaired.py
from models import * --> from models_x import *
# in demo_eval.py
result = trilinear_(LUT, img) --> _, result = trilinear_(LUT, img)
# in image_adaptive_lut_train_paired.py and image_adaptive_lut_evaluation.py
combine_A = trilinear_(LUT,img) --> _, combine_A = trilinear_(LUT,img)

Training

Paired training

python3 image_adaptive_lut_train_paired.py

Unpaired training

python3 image_adaptive_lut_train_unpaired.py

Evaluation

  1. Use Python to generate and save the test images:

    python3 image_adaptive_lut_evaluation.py

Speed can also be measured with the above script.

  2. Use MATLAB to calculate the metrics reported in our paper:

    average_psnr_ssim.m
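For reference, a hedged Python equivalent of the PSNR part (the MATLAB script remains the authoritative evaluation; this helper is an assumption-laden stand-in):

import numpy as np

def psnr(img, ref, peak=255.0):
    # Both images are HxWx3 uint8 arrays of identical shape; assumes mse > 0.
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)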
    

Demo

python3 demo_eval.py

Tools

  1. You can generate an identity 3D LUT of arbitrary dimension using utils/generate_identity_3DLUT.py as follows (a conceptual sketch of the output follows this list):

    # you can replace 33 with any dimension you want
    python3 utils/generate_identity_3DLUT.py -d 33

  2. You can visualize a learned 3D LUT either with the MATLAB code in visualization_lut or with the Python script utils/visualize_lut.py as follows:

    python3 utils/visualize_lut.py path/to/your/lut
    # you can also specify the dimension of the LUT as follows
    python3 utils/visualize_lut.py path/to/your/lut --lut_dim 64
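For intuition, a conceptual sketch of what an identity 3D LUT contains, assuming a float LUT in [0, 1] stored as (3, dim, dim, dim) and indexed as lut[c, b, g, r] (this illustrates the idea, it is not the repository's script):

import numpy as np

def identity_3dlut(dim=33):
    # Each lattice entry maps an input colour to itself.
    grid = np.linspace(0.0, 1.0, dim, dtype=np.float32)
    b, g, r = np.meshgrid(grid, grid, grid, indexing="ij")
    return np.stack([r, g, b], axis=0)   # shape (3, dim, dim, dim)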

Citation

@article{zeng2020lut,
  title={Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time},
  author={Zeng, Hui and Cai, Jianrui and Li, Lida and Cao, Zisheng and Zhang, Lei},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={04},
  pages={2058--2073},
  year={2022},
  publisher={IEEE Computer Society}
}

@inproceedings{zhang2022clut,
  title={CLUT-Net: Learning Adaptively Compressed Representations of 3DLUTs for Lightweight Image Enhancement},
  author={Zhang, Fengyi and Zeng, Hui and Zhang, Tianjun and Zhang, Lin},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={6493--6501},
  year={2022}
}

image-adaptive-3dlut's People

Contributors

flyingzhao, hkzhang95, huizeng


image-adaptive-3dlut's Issues

Hello, I ran into two problems while running the code and hope for your reply

Hello, when I run image_adaptive_lut_evaluation.py with the environment configured according to your requirements, I get the error RuntimeError: PyTorch was compiled without NumPy support, and the fixes I found online do not help.
I would also like to ask whether this code supports multi-GPU training.
Thank you very much for sharing the code; looking forward to your reply. Best wishes!

There may be a mismatch in the choice of input

[image]

As far as I know, the input setting used in the DeepUPE experiment is the default one, but after downloading the samples you provided, I found that the input used in your experiment is the fourth option (highlighted).
The default input setting is darker than the fourth one, which may be why your method's SSIM is particularly high.
In addition, your train/test split of the FiveK dataset is also inconsistent with DeepUPE's. Considering that DeepUPE does not share its training code, I would like to know whether you re-implemented and trained the DeepUPE model yourself?

The input of a5000 in your work:
[image: lut5000]

The input of a5000 in DeepUPE:
[image: a5000-kme_0204]

Environment problem

Hi! I would like to ask which CUDA version you used for this project. Whenever I run python3 image_adaptive_lut_evaluation.py, I hit errors, which I suspect are caused by my environment. Looking forward to your reply, thanks! The output is as follows:
/root/anaconda3/envs/pytorch041_2/lib/python3.6/site-packages/torch/nn/modules/upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")
/root/anaconda3/envs/pytorch041_2/lib/python3.6/site-packages/torch/nn/functional.py:1961: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=663 error=11 : invalid argument

About multi-stage tasks

Hello, may I ask for your advice? I am working on an image enhancement task in which an image goes through three sequential stages: color temperature adjustment, contrast adjustment, and color adjustment, producing three differently adjusted output images. I would like to chain these three processes in series on top of your network architecture and supervise each stage separately (with three groups of LUTs). My training data consists of one input image paired with three target images. Is this workflow feasible?

Abnormal loss of D during training

Hey, thanks for your great work!

I've trained the model using the unpaired version of the code and got great results. But I found the loss of D quite abnormal: D always gives the fake and real inputs very close scores (at least the same sign), no matter how many epochs we train, even when the 800 epochs come to an end and the visualized results are quite good. I wonder how the results can be this good with such a weak D and such unbalanced scores between G and D?

something wrong with unpaired pretrained model

Is the unpaired pretrained model mismatched?

RuntimeError: Error(s) in loading state_dict for Classifier:
Missing key(s) in state_dict: "model.7.weight", "model.7.bias", "model.9.weight", "model.9.bias", "model.13.weight", "model.13.bias", "model.16.weight", "model.16.bias".
Unexpected key(s) in state_dict: "model.8.weight", "model.8.bias".
size mismatch for model.6.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for model.6.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for model.10.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
size mismatch for model.12.weight: copying a param with shape torch.Size([3, 128, 8, 8]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for model.12.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([128]).
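A hedged debugging sketch for mismatches like this (the checkpoint path is hypothetical): print the checkpoint's keys and shapes and compare them against the model, which quickly shows whether the weights were saved from models.py or models_x.py.

import torch

state = torch.load("path/to/classifier.pth", map_location="cpu")
for key, tensor in state.items():
    print(key, tuple(tensor.shape))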

LUTs with values outside [0,1] range

Hi there,

This is probably more of a modelling question than an issue with the code, but it might be worth discussing here anyway.

I am training your model and found that the trained LUTs can actually map normalized RGB values outside the range [0, 1]. Don't you think this makes little sense?

Image Gradient in TrilinearBackward is not well calculated

Hi there,

In TrilinearBackwardCpu, I see that only lut_grad is properly calculated, while image_grad is treated as identity.

In this article, the gradient with respect to the image also needs to be back-propagated to the CNN weight-generation layers. A more accurate gradient calculation would make this project more elegant.

Yours,

How to test a video or an image folder in demo_eval.py

Thank you for your efforts. I have tested a single image with demo_eval.py and it outputs a satisfactory result! Now I would like to know how to test a video or an image folder with demo_eval.py. Looking forward to your reply! Have a good day!

Different SSIM results between matlab and python-skimage

Hi, I am trying to replace the MATLAB evaluation script with a Python one for various reasons. I do something like this:

from skimage.metrics import structural_similarity as img_ssim
# To match the implementation of Wang et. al. [1]_, set `gaussian_weights` to True, `sigma` to 1.5, and `use_sample_covariance` to False.
ssim_val = img_ssim(test_img, gt_img, multichannel=True, data_range=255.0, gaussian_weights=True, sigma=1.5, use_sample_covariance=False)

However, the results are as follows (for 'a4994.png'):

[image]

Could you please tell me where the difference between the two implementations lies? Or could you provide a Python script for evaluating SSIM and Lab?

Thank you very much.

How to generate training images?

Hi, I was wondering how you generated the training images. For the sRGB input, I tried 'dcraw -T', 'dcraw -w -T' and 'dcraw -w -M -T', but I can't get the same images as yours.

Also, how did you get the 16-bit XYZ inputs? I tried taking your sRGB input, removing the gamma, converting to XYZ, re-applying the gamma, and saving as 16-bit with OpenCV, but the result is still different from your XYZ input.

Thanks.

License

Could you please specify the license for the repo?
Thanks
BR
IG

About detailed architecture of the CNN model

1) Given an input image of any resolution, you simply use bilinear interpolation to downsample it to 256 x 256. Why did you put this step inside the network instead of in the image preprocessing? Are the two equivalent?
2) If I don't want to use InstanceNorm, do you have any suggestions for modifying the CNN architecture?
3) Since bilinear interpolation is time-consuming, will changing it to nearest-neighbor affect the results?

Looking forward to your reply!

Error when batch size > 1

Hi, thanks for your great work.

I tried your code on the FiveK dataset and found that it runs successfully with the default setting (batch size = 1). However, when I switch to a larger batch size (e.g. 2 or 16), the following error occurs:

RuntimeError: stack expects each tensor to be equal size, but got [3, 507, 472] at entry 0 and [3, 408, 482] at entry 1

From my previous experience, you may need to align the sizes of the different images before stacking them into one tensor.
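A hedged sketch of one way to do that alignment (a hypothetical collate_fn; the key names mirror what datasets.py appears to return, and resizing the targets may or may not be appropriate for training):

import torch
import torch.nn.functional as F

def resize_collate(batch, size=(448, 448)):
    # Resize every image tensor in the batch to a common size so that
    # torch.stack succeeds; pass non-tensor fields (e.g. names) through.
    out = {}
    for key in batch[0]:
        values = [item[key] for item in batch]
        if torch.is_tensor(values[0]):
            values = [F.interpolate(v.unsqueeze(0), size=size, mode="bilinear",
                                    align_corners=False).squeeze(0)
                      for v in values]
            out[key] = torch.stack(values)
        else:
            out[key] = values
    return out

# usage: torch.utils.data.DataLoader(dataset, batch_size=16, collate_fn=resize_collate)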

build error

torch.utils.ffi is deprecated. Please use cpp extensions instead
torch 1.x cannot build successfully. Could you provide a modified version using torch.utils.cpp_extension?

Bug in demo_eval.py

Hi, thanks again for the great work! I just found a bug in demo_eval.py: OpenCV reads images as BGR, so after line 77, I think the image should be converted to RGB before proceeding.
Please let me know if I'm right.

Thanks again for the great work!
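A minimal sketch of the suggested fix (the path is hypothetical):

import cv2

img = cv2.imread("path/to/image.jpg")        # OpenCV loads images as BGR
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)   # convert before feeding the model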

About UIE [11]

Good day, Dr. Zeng,

Thanks for your great work. Here I have some questions about UIE [11] as cited in your paper.

  1. It actually doesn't iteratively enhance the input image. It runs the forward pass once for each input image. You can check the description in the section "Generator" in [11].
  2. About the running time: the slow speed of UIE is likely related to the fact that it reproduces the Adobe Lightroom adjustments in Python, which might not be GPU-friendly. In any case, its speed should have nothing to do with its number of iterations, contrary to what is stated in your paper in Sec. 4.6.

Please see if I have missed something in your arguments. All in all, the above doesn't change much about your work's value; the discussion here is just to view things more objectively and accurately.

Thanks for your attention :-)

About batch size

Hello, I modified the code to train with a larger batch size on a single GPU, but the PSNR is actually worse than with batch size 1. What could be causing this? And if I train on multiple GPUs with a batch size of 1 per GPU, would the final result also be worse than single-GPU training with batch size 1?

Tensor device problems in code

In some situation, training code raises a device error:

cuda error: an illegal memory access was encountered

After debugging, I found that the main reason is that the tensors used during training are not all on the same device. For example:

In Generator3DLUT_identity and Generator3DLUT_zero, self.LUT.device is cpu. In TrilinearInterpolationFunction, int_package and float_package are also on cpu. However, the input and output of the network are CUDA tensors, so device errors sometimes occur when running the model.

To solve the problem, it is better to create all tensors on the same, dynamically determined device, instead of initializing them on fixed devices.
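A hedged sketch of that pattern (the attribute name LUT mirrors the repository's Generator3DLUT_* classes; this is an illustration, not a patch):

import torch

def lut_on_input_device(module, x):
    # Move the module's LUT to the device of the incoming tensor, so the
    # custom trilinear op never mixes CPU and CUDA tensors.
    if module.LUT.device != x.device:
        module.LUT.data = module.LUT.data.to(x.device)
    return module.LUT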

'ImportError: libc10.so: cannot open shared object file' after build cpp

Sorry, I am a newbie for this part

I am having a difficulty with building the package 'trilinear'

I have followed the guide in README but it doesn't work with the message 'ImportError: libc10.so: cannot open shared object file'

Do you have any idea of this phenomenon?

I am using following environment

Python 3.7.6
torch==1.7.0
torchvision==0.8.1
numpy==1.19.4
opencv-python==4.4.0

nvcc v10.0
GPU: Tesla K80

Any comments are welcome. Thank you!

Error occurred when I replace the image in sRGB

Hi,
I tested it successfully with the original image(a duck eating grass).
But when I replace it with another jpg image, an error occurred:

Traceback (most recent call last):
File "demo_eval.py", line 86, in
LUT = generate_LUT(img)
File "demo_eval.py", line 66, in generate_LUT
pred = classifier(img).squeeze()
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/dockerdata/ruodai1/Image-Adaptive-3DLUT/models_x.py", line 99, in forward
return self.model(img_input)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
input = module(input)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 419, in forward
return self._conv_forward(input, self.weight)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 416, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [16, 3, 3, 3], expected input[1, 4, 256, 256] to have 3 channels, but got 4 channels instead

I used the snipping tool in Windows to capture this image and saved it in jpg format. Is there anything wrong? Thanks!
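A hedged sketch of a likely workaround (path hypothetical): screenshots often carry an alpha channel, which would produce the 4-channel input seen in the traceback, so drop it before the 3-channel classifier runs.

import cv2

img = cv2.imread("path/to/screenshot.png", cv2.IMREAD_UNCHANGED)
if img is not None and img.ndim == 3 and img.shape[2] == 4:
    img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)   # discard the alpha channel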

How to improve results on high-resolution (6000x4000) images

Hello, I want to process high-resolution 6000x4000 images. Would increasing the dimension of the 3D LUT (e.g. to 256), or using a larger classification backbone, improve the results? Could you offer some suggestions? Thank you very much.

cudaCheckError() failed : invalid device function

PyTorch: 1.6, CUDA: 10.0
Following the README, I modified the code, and also modified setup.sh to export CUDA_HOME=/usr/local/cuda-10.0.
But when I try to run demo_eval.py, I meet the error:

cudaCheckError() failed : invalid device function
Process finished with exit code 139

Does anyone know how to solve this? I have no idea after searching on Google.

Cannot reproduce the paper result

Hi, I ran your code with the following environment:

  • python 3.7.5
  • CUDA 10.1
  • pytorch 1.5.0

I tried the default paired setting without changing any hyper-parameters, but I can only get 23.67 PSNR, 0.897 SSIM and 9.19 Lab on the FiveK test set. I also trained with the same settings several times and got the following results:

Model       PSNR       Epoch
baseline_0  23.107576  121
baseline_1  22.001934  236
baseline_2  18.170982  358
baseline_3  23.083334  332

Note that the 23.67 PSNR result was obtained with model baseline_0 at epoch 121.

From these results, I find that the performance is not stable, and I would like to know how to reproduce the result reported in the paper.

Thank you very much.

27472 Segmentation fault

Hello, how can I solve this problem?
I looked it up online and it said it was related to memory or pointers?

Question about quantitative comparison of photo retouching results

I'm very interested in your work! But when I read the paper, I found a problem: why is the PSNR value of UPE much lower than that of HDRNet in your paper? As shown in the figure below:
[image]
However, the paper on Underexposed Photo Enhancement using Deep Illumination Estimation indicates that their model's performance is higher than HDRNet's:
[image]

color mismatch

Thanks for your great work.

My PyTorch version is 1.6.0. I built the project via 'cd trilinear_cpp; sh setup.sh', then ran 'python demo_eval.py' and obtained the enhanced version of a1629.jpg. But I find that the 'red' flowers become 'blue' flowers, and the 'blue' tail of the bird becomes 'brown'; this differs from the result shown in your paper. Is something wrong with the model provided here, or is there something else I should do before running 'python demo_eval.py'? The picture below is the result I get.

Looking forward to your reply.

[image: result]

Dataset download failed

Thank you for your efforts. At the dataset link, the downloaded dataset is incomplete or the download fails. I want to confirm whether the dataset path has changed. Could you please also attach a Baidu Cloud link? Thank you very much!

About wlsTonemap

Hi, the provided local tone mapping code is written in MATLAB:

%% Finally, shift, scale, and gamma correct the result
gamma = 1.0/1.0;
bias = -min(OUT(:));
gain = 0.8;
OUT = (gain*(OUT + bias)).^gamma;
% figure
imshow(OUT);
imwrite(OUT,'a1509_T.jpg');

I found that the range of OUT is not within [0, 1]. When I convert this script to Python, the result is not correct when using min-max normalization. Do you know how to normalize it in Python?
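For comparison, a hedged Python port of the MATLAB snippet above (the function name is an assumption). Note that the MATLAB code is not min-max normalization: it only shifts by the minimum and applies a fixed gain, so values can still exceed 1.0 and need clipping before saving.

import numpy as np

def finalize_tonemap(out, gain=0.8, gamma=1.0):
    # Shift, scale, and gamma correct, then clip to the displayable range.
    out = (gain * (out - out.min())) ** gamma
    return np.clip(out, 0.0, 1.0)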

about training data

Hi, I want to train the model. I have downloaded the dataset from the link you provided, but I am not sure where train_label.txt and train_input.txt are, which are used in datasets.py. Can you provide examples of these two files?

Can you please upload the names of HDR+ dataset?

Hi, thanks again for the great work! I was trying to reproduce the imaging-pipeline enhancement result. I checked the HDR+ dataset but could not find the aligned Nexus 6p subset you mentioned in the paper. Can you please give me a hint about the relevant file names in the HDR+ dataset?

Thanks.

There is a bug in the trilinear_cpp version.

Pytorch version: 1.5.1, cuda: 9.0
Error follows: AttributeError: module 'trilinear_cpp.trilinear' has no attribute 'trilinear_forward_cuda'

So you have to modify
https://github.com/HuiZeng/Image-Adaptive-3DLUT/blob/master/trilinear_cpp/src/trilinear_cuda.cpp#L32
to
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def("trilinear_forward_cuda", &trilinear_forward_cuda, "Trilinear forward");
m.def("trilinear_backward_cuda", &trilinear_backward_cuda, "Trilinear backward");
}

Finally, I can run it on the demo images, and the results are very good.

About the acceptance status of this work

Hello, has this paper been accepted by TPAMI 2020? I am very interested in its content, so I am curious whether the work has been accepted. Thank you for your reply!

Pretrained model only provides three LUTs?

I downloaded the pretrained model to test it and found that it only provides the first three LUTs. The evaluation code indicates that a full test should use 5 LUTs. Will the other pretrained LUTs be released, so that I can test the performance of your work to its full extent?

Thanks!

Visualization of LUTs

Hi, it is an honor to see your work.

Actually, I have trained on my custom dataset, but there are some issues I cannot verify with my own knowledge.

So I am trying to analyze the shapes of the learned LUTs.

Is there a tool to visualize the LUTs that can show results just like the ones in your paper?

Thank you for the answer

How does "Generator3DLUT_identity" work?

In models.py and the training code:

[image]

What is the difference between "Generator3DLUT_identity" and "Generator3DLUT_zero"?

At first, my understanding was that the parameters of the N=3 LUTs are learned during training. If that is the case, why is "Generator3DLUT_identity" designed at all? Looking forward to a reply!

[image]

It's a good job actually!

In TV_3D, what does weight_r(g or b) do?

In the case of weight_r (the g and b statements are deleted for brevity):

    self.weight_r = torch.ones(3, dim, dim, dim - 1, dtype=torch.float)
    self.weight_r[:, :, :, (0, dim - 2)] *= 2.0

def forward(self, LUT):
    dif_r = LUT.LUT[:, :, :, :-1] - LUT.LUT[:, :, :, 1:]
    tv = torch.mean(torch.mul((dif_r ** 2), self.weight_r)) \
       + torch.mean(torch.mul((dif_g ** 2), self.weight_g)) \
       + torch.mean(torch.mul((dif_b ** 2), self.weight_b))

Here, 1) I wonder why we multiply dif_r ** 2 by self.weight_r instead of using just dif_r ** 2, and 2) why are the edge elements of weight_r multiplied by 2.0?

I really appreciate to have this work :)

What is the role of img2 in the unpaired setting?

return {"A_input": img_input, "A_exptC": img_exptC, "B_exptC": img2, "input_name": img_name}

I would like to understand what img2 is for in the unpaired form of training. I see this parameter used in both the FiveK and HDR+ datasets but have never figured it out. Hoping for an explanation, thank you very much!
