jihyongoh / jsi-gan

[AAAI 2020] Official repository of JSI-GAN.

MATLAB 5.82% Python 94.18%
aaai aaai2020 convolutional-neural-networks deep-learning divide-and-conquer generative-adversarial-network inverse-tone-mapping joint-models jsi-gan sr-itm super-resolution

jsi-gan's Introduction

Hi there 👋


  • 👨🏻‍💻 I am currently an assistant professor at CMLab (Creative Vision and Multimedia Lab.), Chung-Ang Univ. (CAU).
  • 👨🏻‍⚕️ Please visit my personal homepage (here).
  • 🔬 I primarily focus on a variety of deep-learning-based Computer Vision research areas, such as:
   Neural Radiance Fields (NeRF)
   Video Frame Interpolation / Super Resolution / Deblurring / Colorization
   Optical Flow Estimation
   Computational Photography
   SDR-to-HDR Inverse Tone Mapping
   Generative AI: Diffusion Models, GANs
   GAN/CNN-based Synthetic Aperture Radar (SAR) Target Recognition/Generation
  • 💻 If you are interested in collaborating with me, please don't hesitate to contact me at the email address below.
  • 📧 Contact: [email protected]

jsi-gan's People

Contributors

jihyongoh, kaist-viclab, ryul99, sooyekim


jsi-gan's Issues

HR-SDR datasets

Hi, thanks for your work!

I am very interested in your work related to SR-ITM, such as Multi-purpose CNN, Deep SR-ITM, and JSI-GAN.

I want to reproduce and follow up on your work, which may require the high-resolution SDR dataset (HR-SDR dataset).

I downloaded the datasets you provided from this link, but they only contain LRx2-SDR, LRx4-SDR, and HR-HDR frames. Could you provide the HR-SDR dataset?

Thanks!

will the inference code be released?

I tested the Lena image using the 'predictor' function below, but the result is far from ideal. What is wrong with my 'predictor'?

    def predictor(self, input_path, output_path):
        # initialize variables and restore the trained JSI-GAN checkpoint
        tf.global_variables_initializer().run()
        self.saver = tf.train.Saver()
        self.load(self.checkpoint_dir)  # for testing JSI-GAN

        """ Test """
        data_path_test = glob.glob(os.path.join(input_path, '*.png'))

        patch_boundary = 10  # set patch boundary to reduce edge effect around patch edges
        import cv2  # better placed at module level; kept here so the snippet stays self-contained
        for index in range(len(data_path_test)):
            img = cv2.imread(data_path_test[index], -1)  # BGR
            img = img[:, :, [2, 1, 0]]  # BGR -> RGB
            img = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)  # RGB -> YUV
            img = np.expand_dims(img, axis=0)

            data_sz = img.shape
            test_pred_full = np.zeros((data_sz[1] * self.scale_factor, data_sz[2] * self.scale_factor, data_sz[3]))
            img = np.array(img, dtype=np.double) / 255.
            data_test = np.clip(img, 0, 1)

            ###======== Divide Into Patches ========###
            for p in range(self.test_patch[0] * self.test_patch[1]):
                pH = p // self.test_patch[1]
                pW = p % self.test_patch[1]
                sH = data_sz[1] // self.test_patch[0]
                sW = data_sz[2] // self.test_patch[1]
                # compute patch indices, extended by the boundary margin to reduce edge effects
                H_low_ind, H_high_ind, W_low_ind, W_high_ind = \
                    get_HW_boundary(patch_boundary, data_sz[1], data_sz[2], pH, sH, pW, sW)
                data_test_p = data_test[:, H_low_ind: H_high_ind, W_low_ind: W_high_ind, :]
                ###======== Run Session ========###
                st = time.time()
                test_pred_o = self.sess.run(self.test_pred, feed_dict={self.test_input_ph: data_test_p})
                # trim the extra boundary from the prediction
                test_pred_t = trim_patch_boundary(test_pred_o, patch_boundary, data_sz[1], data_sz[2],
                                                  pH, sH, pW, sW, self.scale_factor)
                # place the trimmed patch into the full-resolution prediction
                test_pred_full[pH * sH * self.scale_factor: (pH + 1) * sH * self.scale_factor,
                               pW * sW * self.scale_factor: (pW + 1) * sW * self.scale_factor, :] = np.squeeze(test_pred_t)
            ###======== Save Prediction as an Image ========###
            test_pred_full = np.floor(255.0 * np.clip(test_pred_full, 0, 1) + 0.5).astype(np.uint8)
            test_pred_full = cv2.cvtColor(test_pred_full, cv2.COLOR_YUV2RGB)
            test_pred_full = test_pred_full[:, :, [2, 1, 0]]  # RGB -> BGR for cv2.imwrite
            cv2.imwrite(os.path.join(output_path, "test.png"), test_pred_full)
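
One possibility worth checking (an assumption on my part, not something confirmed by the authors): cv2.COLOR_RGB2YUV applies BT.601-style full-range coefficients, which may not match the YUV convention used to prepare the JSI-GAN training/test .mat files. If the data were prepared with a BT.709 limited-range conversion, a sketch like the following could be tried in place of the cvtColor call, after first scaling the image to [0, 1]:

import numpy as np

def rgb_to_yuv709_limited(rgb):
    # rgb: float array in [0, 1], shape (H, W, 3); returns YUV renormalized to [0, 1]
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b   # BT.709 luma
    u = (b - y) / 1.8556                       # Cb = (B' - Y') / (2 * (1 - Kb))
    v = (r - y) / 1.5748                       # Cr = (R' - Y') / (2 * (1 - Kr))
    # map to limited range (16-235 luma, 16-240 chroma) and renormalize to [0, 1]
    y = (219.0 * y + 16.0) / 255.0
    u = (224.0 * u + 128.0) / 255.0
    v = (224.0 * v + 128.0) / 255.0
    return np.stack([y, u, v], axis=-1).astype(np.float32)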

questions about how to get dataset ready for testing or training on my own

hi, thanks for your fancy work!

I am now trying to prepare the test set on my own.

I downloaded the test videos from 4kmedia.org as you suggested.

For HDR video:

  1. extract frames from the 4k video: ffmpeg -i movie_name -color_primaries bt2020 ./4k/LG_Daylight_4K_Demo_BG/%08d.png
  2. read and write them to a .mat file for inference evaluation:
import os
import cv2
import h5py
import numpy as np

files = os.listdir('./4k/LG_Daylight_4K_Demo_BG/')
files.sort()
imgs = []
for i, fi in enumerate(files):
    img = cv2.imread(os.path.join('./4k/LG_Daylight_4K_Demo_BG/', fi), cv2.IMREAD_UNCHANGED)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    H_Y, H_u, H_v = rgb2yuv_hdr(np.array(img))
    H_Y = H_Y[:, :, np.newaxis]
    H_u = H_u[:, :, np.newaxis]
    H_v = H_v[:, :, np.newaxis]
    img_array = np.concatenate((H_Y, H_u, H_v), axis=2)
    img_array = np.transpose(img_array, (2, 1, 0))  # store as (C, W, H) to match the MATLAB-style layout
    imgs.append(img_array)
imgs_mat = np.array(imgs)
file = h5py.File('LG_DayLight_HDR.mat', 'w')
file.create_dataset('HDR_YUV', data=imgs_mat)
file.close()

For the rgb2yuv_hdr() function, I referred to issue #19 and rewrote it in Python:

def rgb2yuv_hdr(linRGB):
    """
    Convert linear RGB (values in [0, 10000]) to quantized Y'CbCr planes.
    return: H_Y, H_u, H_v
    """

    Bitdepth = 10

    hdri = linRGB.astype(float)
    hdri = np.clip(hdri, 0, 10000)
    # crop to even height/width so that 4:2:0 subsampling (if used) is well defined
    r, c, _ = hdri.shape
    if np.mod(r, 2) == 1:
        hdri = hdri[:r-1, :, :]
    if np.mod(c, 2) == 1:
        hdri = hdri[:, :c-1, :]


    # Coding TF: normalize to [0, 1] relative to 10,000 cd/m^2 and clip
    Clip_hdri = np.clip(hdri / 10000.0, 0.0, 1.0)

    # The PQ (SMPTE ST 2084) encoding is intentionally skipped here to keep linear values.
    # MATLAB reference (from issue 19):
    # m1=(2610/4096)*0.25;
    # m2=(2523/4096)*128;
    # c1=3424/4096;
    # c2=(2413/4096)*32;
    # c3=(2392/4096)*32;
    # PQTF_hdri=((c1+c2*(Clip_hdri.^m1))./(1+c3*(Clip_hdri.^m1))).^m2;
    PQTF_hdri = Clip_hdri

    # R'G'B' to Y'CbCr (BT.2020 coefficients)
    Y = 0.2627*PQTF_hdri[:,:,0] + 0.6780*PQTF_hdri[:,:,1] + 0.0593*PQTF_hdri[:,:,2]  # G channel is index 1
    Cb = (PQTF_hdri[:,:,2]-Y)/1.8814
    Cr = (PQTF_hdri[:,:,0]-Y)/1.4746

    # Quant 10b
    toren = np.power(2, Bitdepth - 8)
    toren_1 = np.power(2, Bitdepth)
    D_Y=np.clip(np.round(toren * (219*Y + 16)), 0, toren_1-1)
    D_Cb=np.clip(np.round(toren * (224*Cb + 128)), 0, toren_1-1)
    D_Cr=np.clip(np.round(toren * (224*Cr + 128)), 0, toren_1-1)
    

    
    # 4:4:4 to 4:4:4 (same filter weights as the 4:2:0 path below, but applied without subsampling)
    D_Cb_h, D_Cb_w = D_Cb.shape
    D_Cb_Hor = 64*D_Cb
    D_Cb_Hor=D_Cb_Hor + 384*D_Cb
    D_Cb_Hor=D_Cb_Hor + 64*D_Cb

    
    D_Cr_h, D_Cr_w = D_Cr.shape
    D_Cr_Hor=64*D_Cr
    D_Cr_Hor=D_Cr_Hor + 384*D_Cr
    D_Cr_Hor=D_Cr_Hor + 64*D_Cr
    
    D_Cb_Hor_h, D_Cb_Hor_w = D_Cb_Hor.shape
    D_Cb_ver=0*D_Cb_Hor
    D_Cb_ver=D_Cb_ver + 256*D_Cb_Hor
    D_Cb_ver=D_Cb_ver + 256*D_Cb_Hor

    
    D_Cr_Hor_h, D_Cr_Hor_w = D_Cr_Hor.shape
    D_Cr_ver=0*D_Cr_Hor
    D_Cr_ver=D_Cr_ver + 256*D_Cr_Hor
    D_Cr_ver=D_Cr_ver + 256*D_Cr_Hor

    """
    # 4:4:4 to 4:2:0
    D_Cb_h, D_Cb_w = D_Cb.shape
    D_Cb_padd = np.concatenate((D_Cb[:,0:1], D_Cb), axis=1)
    D_Cb_Hor = 64*D_Cb_padd[:,0:D_Cb_w:2]
    D_Cb_Hor=D_Cb_Hor + 384*D_Cb[:,0:D_Cb_w:2]
    D_Cb_Hor=D_Cb_Hor + 64*D_Cb[:,1:D_Cb_w:2]

    
    D_Cr_h, D_Cr_w = D_Cr.shape
    D_Cr_padd = np.concatenate((D_Cr[:,0:1], D_Cr), axis=1)
    D_Cr_Hor=64*D_Cr_padd[:,0:D_Cr_w:2]
    D_Cr_Hor=D_Cr_Hor + 384*D_Cr[:,0:D_Cr_w:2]
    D_Cr_Hor=D_Cr_Hor + 64*D_Cr[:,1:D_Cr_w:2]
    
    D_Cb_Hor_h, D_Cb_Hor_w = D_Cb_Hor.shape
    D_Cb_Hor_padd = np.concatenate((D_Cb_Hor[0:1, :], D_Cb_Hor), axis=0)
    D_Cb_ver=0*D_Cb_Hor_padd[0:D_Cb_Hor_h:2,:]
    D_Cb_ver=D_Cb_ver + 256*D_Cb_Hor[0:D_Cb_Hor_h:2,:]
    D_Cb_ver=D_Cb_ver + 256*D_Cb_Hor[1:D_Cb_Hor_h:2,:]

    
    D_Cr_Hor_h, D_Cr_Hor_w = D_Cr_Hor.shape
    D_Cr_Hor_padd = np.concatenate((D_Cr_Hor[0:1, :], D_Cr_Hor), axis=0)
    D_Cr_ver=0*D_Cr_Hor_padd[0:D_Cr_Hor_h:2,:]
    D_Cr_ver=D_Cr_ver + 256*D_Cr_Hor[0:D_Cr_Hor_h:2,:]
    D_Cr_ver=D_Cr_ver + 256*D_Cr_Hor[1:D_Cr_Hor_h:2,:]
    """

    toren_2 = np.power(0.5, 18)
    D_Cb = (D_Cb_ver+131072.0) * toren_2
    D_Cr = (D_Cr_ver+131072.0) * toren_2

    H_Y=D_Y
    H_u=D_Cb
    H_v=D_Cr

    return H_Y, H_u, H_v
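
For reference, the commented-out MATLAB block above is the PQ (SMPTE ST 2084) encoding; a rough Python equivalent, using the same constants, is sketched below. It is only needed if PQ-encoded rather than linear values are wanted, which the function above deliberately avoids.

import numpy as np

def pq_oetf(clip_hdri):
    # SMPTE ST 2084 (PQ) encoding of linear values already normalized to [0, 1]
    # (i.e. divided by 10,000 cd/m^2); Python port of the MATLAB comment above.
    m1 = (2610.0 / 4096.0) * 0.25
    m2 = (2523.0 / 4096.0) * 128.0
    c1 = 3424.0 / 4096.0
    c2 = (2413.0 / 4096.0) * 32.0
    c3 = (2392.0 / 4096.0) * 32.0
    lp = np.power(clip_hdri, m1)
    return np.power((c1 + c2 * lp) / (1.0 + c3 * lp), m2)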

The questions I ran into are below:

1. If I use 4:4:4 -> 4:2:0, the Y, U, and V channels end up with different shapes, whereas in your .mat file the Y, U, and V channels are all 3840x2160.
2. So I decided to use 4:4:4, but when I save the .mat file and read it back with the code below, the image looks wrong:

import cv2
import h5py
import numpy as np

datafile = './LG_DayLight_HDR.mat'
mat = h5py.File(datafile, 'r')
imgs = mat['HDR_YUV']

def yuv_to_bgr(img, Bitdepth=10):
    height, width = img.shape[:2]
    D_Y = img[:, :, 0]
    D_u = img[:, :, 1]
    D_v = img[:, :, 2]
    # inverse 10-bit quantization
    toren = np.power(2, Bitdepth - 8)
    Y = np.clip((D_Y / toren - 16) / 219, 0, 1)
    D_Cb = np.clip((D_u / toren - 128) / 224, -0.5, 0.5)
    D_Cr = np.clip((D_v / toren - 128) / 224, -0.5, 0.5)
    # Y'CbCr to R'G'B' (BT.2020)
    A = [[1, 0.00000000000000, 1.47460000000000],
         [1, -0.16455312684366, -0.57135312684366],
         [1, 1.88140000000000, 0.00000000000000]]
    A = np.array(A)

    RGB = np.zeros([height, width, 3])
    RGB[:, :, 0] = np.clip(Y + A[0, 2] * D_Cr, 0, 1)
    RGB[:, :, 1] = np.clip(Y + A[1, 1] * D_Cb + A[1, 2] * D_Cr, 0, 1)
    RGB[:, :, 2] = np.clip(Y + A[2, 1] * D_Cb, 0, 1)

    RGB = (RGB * 65535).astype(np.uint16)
    BGR = cv2.cvtColor(RGB, cv2.COLOR_RGB2BGR)
    return BGR

for i, img in enumerate(imgs):
    img = np.transpose(img, (2, 1, 0))  # (C, W, H) -> (H, W, C)
    print(img.shape)
    BGR = yuv_to_bgr(img, 10)
    cv2.imwrite('./hdr_{}.tiff'.format(i), BGR)

Could you please help me with this problem? How should one write a .mat file from a video?

I just want to reproduce your results, but the released test set is quite small. I want to test on more data and obtain quantitative results.

Looking forward to your reply soooo much! @sooyekim @JihyongOh

Padding output shape is different from the mentioned comments in the code

Hi @sooyekim ,
Thanks for the great work.
I am running inference on test.mat files with scaling factor 4.
According to the comment on this line of the code: https://github.com/JihyongOh/JSI-GAN/blob/master/ops.py#L249
If the input size is (1,100,170,3), then the output should be (1,120,190,3) since pad is 20 (41//2), but when I run the code on my machine I get an output shape of (1,140,170,3). Is this behavior expected?
Below I have attached the pdb log for your reference:

img_pad = tf.pad(x, tf.constant([[0, 0], [pad, pad], [0, 0], [0, 0]])) # [B, H+pad, W+pad, C]
(Pdb) x.shape
TensorShape([Dimension(1), Dimension(100), Dimension(170), Dimension(3)])
(Pdb) img_pad.shape
TensorShape([Dimension(1), Dimension(140), Dimension(170), Dimension(3)])
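
For what it's worth, the observed shape is consistent with what tf.pad is asked to do on that line: the padding spec [[0, 0], [pad, pad], [0, 0], [0, 0]] adds pad rows on both sides of the height axis and leaves the width untouched, so (1, 100, 170, 3) becomes (1, 100 + 2*20, 170, 3) = (1, 140, 170, 3). The "[B, H+pad, W+pad, C]" comment in ops.py therefore looks misleading rather than the padding itself behaving unexpectedly. A minimal check in the repo's TF 1.x style:

import tensorflow as tf

pad = 41 // 2  # 20
x = tf.placeholder(tf.float32, [1, 100, 170, 3])
# pad only the height axis, by `pad` on each side; width is untouched
img_pad = tf.pad(x, tf.constant([[0, 0], [pad, pad], [0, 0], [0, 0]]))
print(img_pad.shape)  # (1, 140, 170, 3)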

Inverse tone mapping without Super Resolution.

Hi @sooyekim ,

Thanks for your great work!

Let's say we just want a model that converts SDR images to HDR images without applying super-resolution. Can such a model be obtained by modifying the provided code? If yes, which modifications would you suggest?

Thanks much.

Weird results!?

Hi, I've tried to run your code on RGB images, so I changed it like this:

yuv = cv2.imread(data_path_test[index])
yuv = cv2.cvtColor(yuv,cv2.COLOR_BGR2YUV)
img = np.expand_dims(yuv, axis=0)

And then saved the output like this:

out = np.squeeze(test_pred_full)
out = np.clip(out, 0, 1) * 255
out = out.astype('uint8')
out = cv2.cvtColor(out,cv2.COLOR_YUV2BGR)
cv2.imwrite(f"./results/{index}.jpg",out)

but outputs are weird like this:
[attached output image]

Weird .mp4 video after converting raw .yuv video with ffmpeg

Hi @sooyekim @JihyongOh

Thanks for your great work!

I am using the following command to convert the raw .yuv video to .mp4, but the result is very weird:

ffmpeg -f rawvideo -vcodec rawvideo -s 3840x2160 -r 25 -pix_fmt yuv420p -i ./test_img_dir/JSI-GAN_x2_exp1/result.yuv -c:v libx264 -preset ultrafast -qp 0 ./test_img_dir/JSI-GAN_x2_exp1/result.mp4

[Screenshot from 2020-09-29 17-44-55]

Did I do something wrong?

Kind regards,
Oliver
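
One thing to check (an assumption about how result.yuv was written, not verified against the repo code): ffmpeg's -pix_fmt must match the actual sample layout of the raw file. If the HDR output was dumped as 10-bit or 16-bit planar samples rather than 8-bit yuv420p, reading it as yuv420p produces exactly this kind of scrambled frame. For example, assuming 16-bit little-endian planar 4:2:0 data:

ffmpeg -f rawvideo -s 3840x2160 -r 25 -pix_fmt yuv420p16le -i ./test_img_dir/JSI-GAN_x2_exp1/result.yuv -c:v libx264 -preset ultrafast -qp 0 ./test_img_dir/JSI-GAN_x2_exp1/result_16bit.mp4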

Confused about the test PSNR

Hi, thanks for your great work!

I am now trying to test the pre-trained model you provide.

When I test the test set you provide, I get a PSNR of about 35.77 dB.

However, when I test the LG_Daylight video (1000th-1050th frames), the PSNR is around 26-28 dB.

I downloaded the 4K video from 4kmedia.org, extracted frames as you described, and saved them as a .mat file to get the quantitative results.

Could you help me with that? Thanks a lot! @sooyekim

Calculation of SSIM

Hi,
Thanks for the awesome work.
I can't reproduce the SSIM reported in the JSI-GAN paper when using Deep-SR-ITM/utils/ssim_index_new.m to calculate the SSIM between GT and Pred. The Pred is obtained by direct inference from JSI-GAN/checkpoint_dir. Specifically, the SSIMs I calculated for JSInet and JSI-GAN are 0.9401 and 0.9330, respectively. The code is as follows:

gt_img = double(imread(gt_folder))
pred_img = double(imread(pred_folder))
[mssim, ssim_map, mcs, cs_map] = ssim_index_new(gt_img, pred_img, K, win, 1023)
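
As a hedged cross-check (not the evaluation script used in the paper), an independent SSIM implementation such as scikit-image's can be run with the same 10-bit data range; its window and downsampling choices differ from ssim_index_new.m, so the numbers will not match it exactly. The file names below are placeholders:

from skimage.metrics import structural_similarity as ssim
import numpy as np
import cv2

gt_img = cv2.imread('gt_frame.png', cv2.IMREAD_UNCHANGED).astype(np.float64)      # hypothetical file name
pred_img = cv2.imread('pred_frame.png', cv2.IMREAD_UNCHANGED).astype(np.float64)  # hypothetical file name
# data_range=1023 assumes 10-bit data, matching the MATLAB call above;
# channel_axis requires scikit-image >= 0.19 (older versions use multichannel=True)
print(ssim(gt_img, pred_img, data_range=1023, channel_axis=-1))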

Test PSNR is not the same as in the paper

Hi, it is much appreciated that you open-sourced your work!

I am now trying to reproduce your work and found that running inference on the test .mat files with the pretrained weights gives a very low PSNR:

The command I use is:
python main.py --phase test_mat --scale_factor 2 --test_data_path_LR_SDR ./data/test/testset_SDR_x2.mat --test_data_path_HR_HDR ./data/test/testset_HDR.mat

[ 20/ 28]-th images, time: 7.6218(minutes), test_PSNR: -8.98004621[dB]
[ 21/ 28]-th images, time: 7.8429(minutes), test_PSNR: -9.02793094[dB]
[ 22/ 28]-th images, time: 8.0659(minutes), test_PSNR: -9.09201666[dB]
[ 23/ 28]-th images, time: 8.2833(minutes), test_PSNR: -9.27487238[dB]
[ 24/ 28]-th images, time: 8.5779(minutes), test_PSNR: -9.01922608[dB]
[ 25/ 28]-th images, time: 8.9090(minutes), test_PSNR: -8.93554880[dB]
[ 26/ 28]-th images, time: 9.2477(minutes), test_PSNR: -8.83121465[dB]
[ 27/ 28]-th images, time: 9.5102(minutes), test_PSNR: -8.97071717[dB]
######### Average Test PSNR: -9.21918013[dB] #########
######### Estimated Inference Time (per 4K frame): 2.20851422[s] #########

The pretrained weights I use are JSI-GAN_x2_exp1.

Did I do something wrong to get these results?

looking forward to your reply!
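
A negative PSNR usually means the predictions and the ground truth live on completely different scales (for example [0, 1] floats versus 10-bit integers). A quick, hedged way to sanity-check the value ranges stored in the .mat files before digging further (dataset key names depend on how the files were created, so they are enumerated rather than assumed):

import h5py
import numpy as np

for path in ['./data/test/testset_SDR_x2.mat', './data/test/testset_HDR.mat']:
    with h5py.File(path, 'r') as f:
        for key in f.keys():
            data = f[key]
            print(path, key, data.shape, data.dtype,
                  float(np.min(data[0])), float(np.max(data[0])))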

Convert a different aspect ratio than 16:9

Hey, I am trying to convert pictures that have a different aspect ratio than 16:9.

E.g. my pictures have a 2:1 ratio (2048x1024 px). Is there a way to convert them without cutting or resizing them? :)

greetings :)

Please add requirements.txt or environment.yml

Thank you for the nice work and for releasing the code!
But since there is no requirements.txt or environment.yml, I have trouble installing the dependencies.
Could you please add a requirements.txt or environment.yml?

RaHinge GAN loss definition

Hi @sooyekim ,
Thanks for the great work.
I have a question regarding the RaHinge GAN loss definition used here.
Shouldn't it be loss = (real_loss + fake_loss) / 2, since it is a relativistic average hinge GAN loss?
Just for your reference, I saw an implementation of RaGAN here.
It would be very helpful if you could clarify this.
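
For reference, a common way the relativistic average hinge loss for the discriminator is written (a TF 1.x-style sketch, not necessarily what the authors intended; whether the two terms are averaged or simply summed varies between implementations, which is exactly the question above):

import tensorflow as tf

def discriminator_ra_hinge_loss(real_logits, fake_logits):
    # relativistic average logits: compare each sample against the mean of the opposite set
    real_ra = real_logits - tf.reduce_mean(fake_logits)
    fake_ra = fake_logits - tf.reduce_mean(real_logits)
    real_loss = tf.reduce_mean(tf.nn.relu(1.0 - real_ra))
    fake_loss = tf.reduce_mean(tf.nn.relu(1.0 + fake_ra))
    # some repositories return the plain sum, others divide by 2; the factor only
    # rescales the gradient and can be folded into the adversarial loss weight
    return real_loss + fake_loss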
