jihyongoh / jsi-gan

[AAAI 2020] Official repository of JSI-GAN.

MATLAB 5.82% Python 94.18%
aaai aaai2020 convolutional-neural-networks deep-learning divide-and-conquer generative-adversarial-network inverse-tone-mapping joint-models jsi-gan sr-itm super-resolution

jsi-gan's Introduction

Hi there 👋


  • 👨🏻‍💻 I am currently an assistant professor at CMLab (Creative Vision and Multimedia Lab.), Chung-Ang Univ. (CAU).
  • 👨🏻‍⚕️ Please visit my personal homepage (here).
  • 🔬 I primarily focus on a variety of deep-learning-based Computer Vision research areas, such as:
   Neural Radiance Fields (NeRF)
   Video Frame Interpolation / Super Resolution / Deblurring / Colorization
   Optical Flow Estimation
   Computational Photography
   SDR-to-HDR Inverse Tone Mapping
   Generative AI: Diffusion Models, GANs
   GAN/CNN-based Synthetic Aperture Radar (SAR) Target Recognition/Generation
  • 💻 If you are interested in collaborating with me, please don't hesitate to contact me at the email address below.
  • 📧 Contact: [email protected]

jsi-gan's People

Contributors

jihyongoh, kaist-viclab, ryul99, sooyekim


jsi-gan's Issues

HR-SDR datasets

Hi, thanks for your work!

I am very interested in your work related to SR-ITM, such as Multi-purpose CNN, Deep SR-ITM, and JSI-GAN.

I want to reproduce and follow up on your work, which may require the high-resolution SDR dataset (HR-SDR dataset).

I downloaded the datasets you provided from this link, but they only contain LRx2-SDR, LRx4-SDR, and HR-HDR frames. Could you provide the HR-SDR dataset?

Thanks!

will the inference code be released?

I tested the Lena image using the 'predictor' function below, but the result is far from ideal. What is wrong with my 'predictor'?

    def predictor(self, input_path, output_path):
        # initialize variables and restore the trained JSI-GAN checkpoint
        tf.global_variables_initializer().run()
        self.saver = tf.train.Saver()
        self.load(self.checkpoint_dir)  # for testing JSI-GAN

        """ Test """
        data_path_test = glob.glob(os.path.join(input_path, '*.png'))

        patch_boundary = 10  # set patch boundary to reduce edge effect around patch edges
        import cv2  # better placed at module level; kept here so the snippet stays self-contained
        for index in range(len(data_path_test)):
            img = cv2.imread(data_path_test[index], -1)  # BGR
            img = img[:, :, [2, 1, 0]]  # BGR -> RGB
            img = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)  # RGB -> YUV
            img = np.expand_dims(img, axis=0)

            data_sz = img.shape
            test_pred_full = np.zeros((data_sz[1] * self.scale_factor, data_sz[2] * self.scale_factor, data_sz[3]))
            img = np.array(img, dtype=np.double) / 255.
            data_test = np.clip(img, 0, 1)

            ###======== Divide Into Patches ========###
            for p in range(self.test_patch[0] * self.test_patch[1]):
                pH = p // self.test_patch[1]
                pW = p % self.test_patch[1]
                sH = data_sz[1] // self.test_patch[0]
                sW = data_sz[2] // self.test_patch[1]
                # compute patch indices, extended by the boundary margin to reduce edge effects
                H_low_ind, H_high_ind, W_low_ind, W_high_ind = \
                    get_HW_boundary(patch_boundary, data_sz[1], data_sz[2], pH, sH, pW, sW)
                data_test_p = data_test[:, H_low_ind: H_high_ind, W_low_ind: W_high_ind, :]
                ###======== Run Session ========###
                st = time.time()
                test_pred_o = self.sess.run(self.test_pred, feed_dict={self.test_input_ph: data_test_p})
                # trim the extra boundary from the prediction
                test_pred_t = trim_patch_boundary(test_pred_o, patch_boundary, data_sz[1], data_sz[2],
                                                  pH, sH, pW, sW, self.scale_factor)
                # place the trimmed patch into the full-resolution prediction
                test_pred_full[pH * sH * self.scale_factor: (pH + 1) * sH * self.scale_factor,
                               pW * sW * self.scale_factor: (pW + 1) * sW * self.scale_factor, :] = np.squeeze(test_pred_t)
            ###======== Save Prediction as an Image ========###
            test_pred_full = np.floor(255.0 * np.clip(test_pred_full, 0, 1) + 0.5).astype(np.uint8)
            test_pred_full = cv2.cvtColor(test_pred_full, cv2.COLOR_YUV2RGB)
            test_pred_full = test_pred_full[:, :, [2, 1, 0]]  # RGB -> BGR for cv2.imwrite
            cv2.imwrite(os.path.join(output_path, "test.png"), test_pred_full)
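
One possibility worth checking (an assumption on my part, not something confirmed by the authors): cv2.COLOR_RGB2YUV applies BT.601-style full-range coefficients, which may not match the YUV convention used to prepare the JSI-GAN training/test .mat files. If the data were prepared with a BT.709 limited-range conversion, a sketch like the following could be tried in place of the cvtColor call, after first scaling the image to [0, 1]:

import numpy as np

def rgb_to_yuv709_limited(rgb):
    # rgb: float array in [0, 1], shape (H, W, 3); returns YUV renormalized to [0, 1]
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b   # BT.709 luma
    u = (b - y) / 1.8556                       # Cb = (B' - Y') / (2 * (1 - Kb))
    v = (r - y) / 1.5748                       # Cr = (R' - Y') / (2 * (1 - Kr))
    # map to limited range (16-235 luma, 16-240 chroma) and renormalize to [0, 1]
    y = (219.0 * y + 16.0) / 255.0
    u = (224.0 * u + 128.0) / 255.0
    v = (224.0 * v + 128.0) / 255.0
    return np.stack([y, u, v], axis=-1).astype(np.float32)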

questions about how to get dataset ready for testing or training on my own

hi, thanks for your fancy work!

I am now trying to prepare the test set on my own.

I downloaded the test videos from 4kmedia.org as you suggested.

For HDR video:

  1. extract frames from the 4k video: ffmpeg -i movie_name -color_primaries bt2020 ./4k/LG_Daylight_4K_Demo_BG/%08d.png
  2. read and write them to a .mat file for inference evaluation:
import os
import cv2
import h5py
import numpy as np

files = os.listdir('./4k/LG_Daylight_4K_Demo_BG/')
files.sort()
imgs = []
for i, fi in enumerate(files):
    img = cv2.imread(os.path.join('./4k/LG_Daylight_4K_Demo_BG/', fi), cv2.IMREAD_UNCHANGED)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    H_Y, H_u, H_v = rgb2yuv_hdr(np.array(img))
    H_Y = H_Y[:, :, np.newaxis]
    H_u = H_u[:, :, np.newaxis]
    H_v = H_v[:, :, np.newaxis]
    img_array = np.concatenate((H_Y, H_u, H_v), axis=2)
    img_array = np.transpose(img_array, (2, 1, 0))  # store as (C, W, H) to match the MATLAB-style layout
    imgs.append(img_array)
imgs_mat = np.array(imgs)
file = h5py.File('LG_DayLight_HDR.mat', 'w')
file.create_dataset('HDR_YUV', data=imgs_mat)
file.close()

For the rgb2yuv_hdr() function, I referred to issue #19 and rewrote it in Python:

def rgb2yuv_hdr(linRGB):
    """
    Convert linear RGB (values in [0, 10000]) to quantized Y'CbCr planes.
    return: H_Y, H_u, H_v
    """

    Bitdepth = 10

    hdri = linRGB.astype(float)
    hdri = np.clip(hdri, 0, 10000)
    # crop to even height/width so that 4:2:0 subsampling (if used) is well defined
    r, c, _ = hdri.shape
    if np.mod(r, 2) == 1:
        hdri = hdri[:r-1, :, :]
    if np.mod(c, 2) == 1:
        hdri = hdri[:, :c-1, :]


    # Coding TF: normalize to [0, 1] relative to 10,000 cd/m^2 and clip
    Clip_hdri = np.clip(hdri / 10000.0, 0.0, 1.0)

    # The PQ (SMPTE ST 2084) encoding is intentionally skipped here to keep linear values.
    # MATLAB reference (from issue 19):
    # m1=(2610/4096)*0.25;
    # m2=(2523/4096)*128;
    # c1=3424/4096;
    # c2=(2413/4096)*32;
    # c3=(2392/4096)*32;
    # PQTF_hdri=((c1+c2*(Clip_hdri.^m1))./(1+c3*(Clip_hdri.^m1))).^m2;
    PQTF_hdri = Clip_hdri

    # R'G'B' to Y'CbCr (BT.2020 coefficients)
    Y = 0.2627*PQTF_hdri[:,:,0] + 0.6780*PQTF_hdri[:,:,1] + 0.0593*PQTF_hdri[:,:,2]  # G channel is index 1
    Cb = (PQTF_hdri[:,:,2]-Y)/1.8814
    Cr = (PQTF_hdri[:,:,0]-Y)/1.4746

    # Quant 10b
    toren = np.power(2, Bitdepth - 8)
    toren_1 = np.power(2, Bitdepth)
    D_Y=np.clip(np.round(toren * (219*Y + 16)), 0, toren_1-1)
    D_Cb=np.clip(np.round(toren * (224*Cb + 128)), 0, toren_1-1)
    D_Cr=np.clip(np.round(toren * (224*Cr + 128)), 0, toren_1-1)
    

    
    # 4:4:4 to 4:4:4 (same filter weights as the 4:2:0 path below, but applied without subsampling)
    D_Cb_h, D_Cb_w = D_Cb.shape
    D_Cb_Hor = 64*D_Cb
    D_Cb_Hor=D_Cb_Hor + 384*D_Cb
    D_Cb_Hor=D_Cb_Hor + 64*D_Cb

    
    D_Cr_h, D_Cr_w = D_Cr.shape
    D_Cr_Hor=64*D_Cr
    D_Cr_Hor=D_Cr_Hor + 384*D_Cr
    D_Cr_Hor=D_Cr_Hor + 64*D_Cr
    
    D_Cb_Hor_h, D_Cb_Hor_w = D_Cb_Hor.shape
    D_Cb_ver=0*D_Cb_Hor
    D_Cb_ver=D_Cb_ver + 256*D_Cb_Hor
    D_Cb_ver=D_Cb_ver + 256*D_Cb_Hor

    
    D_Cr_Hor_h, D_Cr_Hor_w = D_Cr_Hor.shape
    D_Cr_ver=0*D_Cr_Hor
    D_Cr_ver=D_Cr_ver + 256*D_Cr_Hor
    D_Cr_ver=D_Cr_ver + 256*D_Cr_Hor

    """
    # 4:4:4 to 4:2:0
    D_Cb_h, D_Cb_w = D_Cb.shape
    D_Cb_padd = np.concatenate((D_Cb[:,0:1], D_Cb), axis=1)
    D_Cb_Hor = 64*D_Cb_padd[:,0:D_Cb_w:2]
    D_Cb_Hor=D_Cb_Hor + 384*D_Cb[:,0:D_Cb_w:2]
    D_Cb_Hor=D_Cb_Hor + 64*D_Cb[:,1:D_Cb_w:2]

    
    D_Cr_h, D_Cr_w = D_Cr.shape
    D_Cr_padd = np.concatenate((D_Cr[:,0:1], D_Cr), axis=1)
    D_Cr_Hor=64*D_Cr_padd[:,0:D_Cr_w:2]
    D_Cr_Hor=D_Cr_Hor + 384*D_Cr[:,0:D_Cr_w:2]
    D_Cr_Hor=D_Cr_Hor + 64*D_Cr[:,1:D_Cr_w:2]
    
    D_Cb_Hor_h, D_Cb_Hor_w = D_Cb_Hor.shape
    D_Cb_Hor_padd = np.concatenate((D_Cb_Hor[0:1, :], D_Cb_Hor), axis=0)
    D_Cb_ver=0*D_Cb_Hor_padd[0:D_Cb_Hor_h:2,:]
    D_Cb_ver=D_Cb_ver + 256*D_Cb_Hor[0:D_Cb_Hor_h:2,:]
    D_Cb_ver=D_Cb_ver + 256*D_Cb_Hor[1:D_Cb_Hor_h:2,:]

    
    D_Cr_Hor_h, D_Cr_Hor_w = D_Cr_Hor.shape
    D_Cr_Hor_padd = np.concatenate((D_Cr_Hor[0:1, :], D_Cr_Hor), axis=0)
    D_Cr_ver=0*D_Cr_Hor_padd[0:D_Cr_Hor_h:2,:]
    D_Cr_ver=D_Cr_ver + 256*D_Cr_Hor[0:D_Cr_Hor_h:2,:]
    D_Cr_ver=D_Cr_ver + 256*D_Cr_Hor[1:D_Cr_Hor_h:2,:]
    """

    toren_2 = np.power(0.5, 18)
    D_Cb = (D_Cb_ver+131072.0) * toren_2
    D_Cr = (D_Cr_ver+131072.0) * toren_2

    H_Y=D_Y
    H_u=D_Cb
    H_v=D_Cr

    return H_Y, H_u, H_v
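
For reference, the commented-out MATLAB block above is the PQ (SMPTE ST 2084) encoding; a rough Python equivalent, using the same constants, is sketched below. It is only needed if PQ-encoded rather than linear values are wanted, which the function above deliberately avoids.

import numpy as np

def pq_oetf(clip_hdri):
    # SMPTE ST 2084 (PQ) encoding of linear values already normalized to [0, 1]
    # (i.e. divided by 10,000 cd/m^2); Python port of the MATLAB comment above.
    m1 = (2610.0 / 4096.0) * 0.25
    m2 = (2523.0 / 4096.0) * 128.0
    c1 = 3424.0 / 4096.0
    c2 = (2413.0 / 4096.0) * 32.0
    c3 = (2392.0 / 4096.0) * 32.0
    lp = np.power(clip_hdri, m1)
    return np.power((c1 + c2 * lp) / (1.0 + c3 * lp), m2)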

The questions I ran into are below:

1. If I use 4:4:4 -> 4:2:0, the Y, U, and V channels end up with different shapes, whereas in your .mat file the Y, U, and V channels are all 3840x2160.
2. So I decided to use 4:4:4, but when I save the .mat file and read it back with the code below, the image looks wrong:

import cv2
import h5py
import numpy as np

datafile = './LG_DayLight_HDR.mat'
mat = h5py.File(datafile, 'r')
imgs = mat['HDR_YUV']

def yuv_to_bgr(img, Bitdepth=10):
    height, width = img.shape[:2]
    D_Y = img[:, :, 0]
    D_u = img[:, :, 1]
    D_v = img[:, :, 2]
    # inverse 10-bit quantization
    toren = np.power(2, Bitdepth - 8)
    Y = np.clip((D_Y / toren - 16) / 219, 0, 1)
    D_Cb = np.clip((D_u / toren - 128) / 224, -0.5, 0.5)
    D_Cr = np.clip((D_v / toren - 128) / 224, -0.5, 0.5)
    # Y'CbCr to R'G'B' (BT.2020)
    A = [[1, 0.00000000000000, 1.47460000000000],
         [1, -0.16455312684366, -0.57135312684366],
         [1, 1.88140000000000, 0.00000000000000]]
    A = np.array(A)

    RGB = np.zeros([height, width, 3])
    RGB[:, :, 0] = np.clip(Y + A[0, 2] * D_Cr, 0, 1)
    RGB[:, :, 1] = np.clip(Y + A[1, 1] * D_Cb + A[1, 2] * D_Cr, 0, 1)
    RGB[:, :, 2] = np.clip(Y + A[2, 1] * D_Cb, 0, 1)

    RGB = (RGB * 65535).astype(np.uint16)
    BGR = cv2.cvtColor(RGB, cv2.COLOR_RGB2BGR)
    return BGR

for i, img in enumerate(imgs):
    img = np.transpose(img, (2, 1, 0))  # (C, W, H) -> (H, W, C)
    print(img.shape)
    BGR = yuv_to_bgr(img, 10)
    cv2.imwrite('./hdr_{}.tiff'.format(i), BGR)

Could you please help me with this problem? How should one write a .mat file from a video?

I just want to reproduce your results, but the released test set is quite small. I want to test on more data and obtain quantitative results.

Looking forward to your reply soooo much! @sooyekim @JihyongOh

Padding output shape is different from the mentioned comments in the code

Hi @sooyekim ,
Thanks for the great work.
I am running inference on test.mat files with scaling factor 4.
According to the comment on this line of the code: https://github.com/JihyongOh/JSI-GAN/blob/master/ops.py#L249
If the input size is (1,100,170,3), then the output should be (1,120,190,3) since pad is 20 (41//2), but when I run the code on my machine I get an output shape of (1,140,170,3). Is this behavior expected?
Below I have attached the pdb log for your reference:

img_pad = tf.pad(x, tf.constant([[0, 0], [pad, pad], [0, 0], [0, 0]])) # [B, H+pad, W+pad, C]
(Pdb) x.shape
TensorShape([Dimension(1), Dimension(100), Dimension(170), Dimension(3)])
(Pdb) img_pad.shape
TensorShape([Dimension(1), Dimension(140), Dimension(170), Dimension(3)])
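
For what it's worth, the observed shape is consistent with what tf.pad is asked to do on that line: the padding spec [[0, 0], [pad, pad], [0, 0], [0, 0]] adds pad rows on both sides of the height axis and leaves the width untouched, so (1, 100, 170, 3) becomes (1, 100 + 2*20, 170, 3) = (1, 140, 170, 3). The "[B, H+pad, W+pad, C]" comment in ops.py therefore looks misleading rather than the padding itself behaving unexpectedly. A minimal check in the repo's TF 1.x style:

import tensorflow as tf

pad = 41 // 2  # 20
x = tf.placeholder(tf.float32, [1, 100, 170, 3])
# pad only the height axis, by `pad` on each side; width is untouched
img_pad = tf.pad(x, tf.constant([[0, 0], [pad, pad], [0, 0], [0, 0]]))
print(img_pad.shape)  # (1, 140, 170, 3)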

Inverse tone mapping without Super Resolution.

Hi @sooyekim ,

Thanks for your great work!

Let's say we just want a model that converts SDR images to HDR images without applying super-resolution. Can such a model be obtained by modifying the provided code? If yes, which modifications would you suggest?

Thanks much.

Weird results!?

Hi, I've tried to run your code on RGB images, so I changed it like this:

yuv = cv2.imread(data_path_test[index])
yuv = cv2.cvtColor(yuv,cv2.COLOR_BGR2YUV)
img = np.expand_dims(yuv, axis=0)

And then saved the output like this:

out = np.squeeze(test_pred_full)
out = np.clip(out, 0, 1) * 255
out = out.astype('uint8')
out = cv2.cvtColor(out,cv2.COLOR_YUV2BGR)
cv2.imwrite(f"./results/{index}.jpg",out)

but outputs are weird like this:
[attached output image]

Weird .mp4 video after converting raw .yuv video with ffmpeg

Hi @sooyekim @JihyongOh

Thanks for your great work!

I am using the following command to convert the raw .yuv video to .mp4, but the result is very weird:

ffmpeg -f rawvideo -vcodec rawvideo -s 3840x2160 -r 25 -pix_fmt yuv420p -i ./test_img_dir/JSI-GAN_x2_exp1/result.yuv -c:v libx264 -preset ultrafast -qp 0 ./test_img_dir/JSI-GAN_x2_exp1/result.mp4

[Screenshot from 2020-09-29 17-44-55]

Did I do something wrong?

Kind regards,
Oliver
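
One thing to check (an assumption about how result.yuv was written, not verified against the repo code): ffmpeg's -pix_fmt must match the actual sample layout of the raw file. If the HDR output was dumped as 10-bit or 16-bit planar samples rather than 8-bit yuv420p, reading it as yuv420p produces exactly this kind of scrambled frame. For example, assuming 16-bit little-endian planar 4:2:0 data:

ffmpeg -f rawvideo -s 3840x2160 -r 25 -pix_fmt yuv420p16le -i ./test_img_dir/JSI-GAN_x2_exp1/result.yuv -c:v libx264 -preset ultrafast -qp 0 ./test_img_dir/JSI-GAN_x2_exp1/result_16bit.mp4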

Confused about the test PSNR

Hi, thanks for your great work!

I am now trying to test the pre-trained model you provide.

When I test the test set you provide, I get a PSNR of about 35.77 dB.

However, when I test the LG_Daylight video (1000th-1050th frames), the PSNR is around 26-28 dB.

I downloaded the 4K video from 4kmedia.org, extracted frames as you described, and saved them as a .mat file to get the quantitative results.

Could you help me with that? Thanks a lot! @sooyekim

Calculation of SSIM

Hi,
Thanks for the awesome work.
I can't reproduce the SSIM reported in the JSI-GAN paper when using Deep-SR-ITM/utils/ssim_index_new.m to calculate the SSIM between GT and Pred. The Pred is obtained by direct inference from JSI-GAN/checkpoint_dir. Specifically, the SSIMs I calculated for JSInet and JSI-GAN are 0.9401 and 0.9330, respectively. The code is as follows:

gt_img = double(imread(gt_folder))
pred_img = double(imread(pred_folder))
[mssim, ssim_map, mcs, cs_map] = ssim_index_new(gt_img, pred_img, K, win, 1023)
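
As a hedged cross-check (not the evaluation script used in the paper), an independent SSIM implementation such as scikit-image's can be run with the same 10-bit data range; its window and downsampling choices differ from ssim_index_new.m, so the numbers will not match it exactly. The file names below are placeholders:

from skimage.metrics import structural_similarity as ssim
import numpy as np
import cv2

gt_img = cv2.imread('gt_frame.png', cv2.IMREAD_UNCHANGED).astype(np.float64)      # hypothetical file name
pred_img = cv2.imread('pred_frame.png', cv2.IMREAD_UNCHANGED).astype(np.float64)  # hypothetical file name
# data_range=1023 assumes 10-bit data, matching the MATLAB call above;
# channel_axis requires scikit-image >= 0.19 (older versions use multichannel=True)
print(ssim(gt_img, pred_img, data_range=1023, channel_axis=-1))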

Test PSNR is not the same as in the paper

Hi, it is much appreciated that you open-sourced your work!

I am now trying to reproduce your work and found that running inference on the test .mat files with the pretrained weights gives a very low PSNR:

The command I use is:
python main.py --phase test_mat --scale_factor 2 --test_data_path_LR_SDR ./data/test/testset_SDR_x2.mat --test_data_path_HR_HDR ./data/test/testset_HDR.mat

[ 20/ 28]-th images, time: 7.6218(minutes), test_PSNR: -8.98004621[dB]
[ 21/ 28]-th images, time: 7.8429(minutes), test_PSNR: -9.02793094[dB]
[ 22/ 28]-th images, time: 8.0659(minutes), test_PSNR: -9.09201666[dB]
[ 23/ 28]-th images, time: 8.2833(minutes), test_PSNR: -9.27487238[dB]
[ 24/ 28]-th images, time: 8.5779(minutes), test_PSNR: -9.01922608[dB]
[ 25/ 28]-th images, time: 8.9090(minutes), test_PSNR: -8.93554880[dB]
[ 26/ 28]-th images, time: 9.2477(minutes), test_PSNR: -8.83121465[dB]
[ 27/ 28]-th images, time: 9.5102(minutes), test_PSNR: -8.97071717[dB]
######### Average Test PSNR: -9.21918013[dB] #########
######### Estimated Inference Time (per 4K frame): 2.20851422[s] #########

The pretrained weights I use are JSI-GAN_x2_exp1.

Did I do something wrong to get these results?

looking forward to your reply!
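
A negative PSNR usually means the predictions and the ground truth live on completely different scales (for example [0, 1] floats versus 10-bit integers). A quick, hedged way to sanity-check the value ranges stored in the .mat files before digging further (dataset key names depend on how the files were created, so they are enumerated rather than assumed):

import h5py
import numpy as np

for path in ['./data/test/testset_SDR_x2.mat', './data/test/testset_HDR.mat']:
    with h5py.File(path, 'r') as f:
        for key in f.keys():
            data = f[key]
            print(path, key, data.shape, data.dtype,
                  float(np.min(data[0])), float(np.max(data[0])))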

Convert a different aspect ratio than 16:9

Hey, I am trying to convert pictures that have a different aspect ratio than 16:9.

E.g. my pictures have a 2:1 ratio (2048x1024 px). Is there a way to convert them without cutting or resizing them? :)

greetings :)

Please add requirements.txt or environment.yml

Thank you for the nice work and for releasing the code!
But since there is no requirements.txt or environment.yml, I have trouble installing the dependencies.
Could you please add a requirements.txt or environment.yml?

RaHinge GAN loss definition

Hi @sooyekim ,
Thanks for the great work.
I have a question regarding the RaHinge GAN loss definition used here.
Shouldn't it be loss = (real_loss + fake_loss) / 2, since it is a relativistic average hinge GAN loss?
Just for your reference, I saw an implementation of RaGAN here.
It would be very helpful if you could clarify this.
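
For reference, a common way the relativistic average hinge loss for the discriminator is written (a TF 1.x-style sketch, not necessarily what the authors intended; whether the two terms are averaged or simply summed varies between implementations, which is exactly the question above):

import tensorflow as tf

def discriminator_ra_hinge_loss(real_logits, fake_logits):
    # relativistic average logits: compare each sample against the mean of the opposite set
    real_ra = real_logits - tf.reduce_mean(fake_logits)
    fake_ra = fake_logits - tf.reduce_mean(real_logits)
    real_loss = tf.reduce_mean(tf.nn.relu(1.0 - real_ra))
    fake_loss = tf.reduce_mean(tf.nn.relu(1.0 + fake_ra))
    # some repositories return the plain sum, others divide by 2; the factor only
    # rescales the gradient and can be folded into the adversarial loss weight
    return real_loss + fake_loss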
