
microsoft / cocosnet

Cross-domain Correspondence Learning for Exemplar-based Image Translation. (CVPR 2020 Oral)

License: MIT License

Python 100.00%
image-synthesis image-translation generative-adversarial-network image-manipulation gans cocosnet pytorch computer-vision deep-learning

cocosnet's Introduction

Cross-domain Correspondence Learning for Exemplar-based Image Translation (CVPR 2020 oral, official PyTorch implementation)

Teaser

Pan Zhang, Bo Zhang, Dong Chen, Lu Yuan, and Fang Wen.

Abstract

We present a general framework for exemplar-based image translation, which synthesizes a photo-realistic image from an input in a distinct domain (e.g., a semantic segmentation mask, an edge map, or pose keypoints), given an exemplar image. The output has a style (e.g., color, texture) consistent with the semantically corresponding objects in the exemplar. We propose to jointly learn the cross-domain correspondence and the image translation, where both tasks facilitate each other and can thus be learned with weak supervision. The images from distinct domains are first aligned to an intermediate domain where dense correspondence is established. The network then synthesizes images based on the appearance of semantically corresponding patches in the exemplar. We demonstrate the effectiveness of our approach on several image translation tasks. Our method significantly outperforms state-of-the-art methods in terms of image quality, with the image style faithful to the exemplar and semantically consistent. Moreover, we show the utility of our method for several applications.

✨ News

2022.12 We propose Paint by Example, which enables in-the-wild image editing according to an exemplar, based on Stable Diffusion. Feel free to try our online demo.

2022.8 We recently proposed PITI, a state-of-the-art image-to-image translation method based on a pretrained diffusion model.

2021.5 We recently proposed CoCosNet v2, which brings even more stunning results for high-resolution images. Feel free to give it a try.

Demo

Installation

Clone the Synchronized-BatchNorm-PyTorch repository.

cd models/networks/
git clone https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
cp -rf Synchronized-BatchNorm-PyTorch/sync_batchnorm .
cd ../../

Install dependencies:

pip install -r requirements.txt
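As an optional sanity check (not part of the original instructions), the snippet below, run from the repository root after the steps above, confirms that the sync_batchnorm package is importable; the import path is the one the repository's normalization.py uses.

# Optional check: this import should succeed once sync_batchnorm has been copied
# into models/networks/ as described above.
from models.networks.sync_batchnorm import SynchronizedBatchNorm2d
print(SynchronizedBatchNorm2d)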

Inference Using Pretrained Model

1) ADE20k (mask-to-image)

Download the pretrained model from here and save it in checkpoints/ade20k. Then run:

python test.py --name ade20k --dataset_mode ade20k --dataroot ./imgs/ade20k --gpu_ids 0 --nThreads 0 --batchSize 6 --use_attention --maskmix --warp_mask_losstype direct --PONO --PONO_C

The results are saved in output/test/ade20k. If you don't want to use the mask of the exemplar image at test time, download the model from here instead, save it in checkpoints/ade20k, and run

python test.py --name ade20k --dataset_mode ade20k --dataroot ./imgs/ade20k --gpu_ids 0 --nThreads 0 --batchSize 6 --use_attention --maskmix --noise_for_mask --warp_mask_losstype direct --PONO --PONO_C --which_epoch 90

2) Celebahq (mask-to-face)

Download the pretrained model from here, save it in checkpoints/celebahq, then run:

python test.py --name celebahq --dataset_mode celebahq --dataroot ./imgs/celebahq --gpu_ids 0 --nThreads 0 --batchSize 4 --use_attention --maskmix --warp_mask_losstype direct --PONO --PONO_C --warp_bilinear --adaptor_kernel 4

The results will be saved in output/test/celebahq.

3) Celebahq (edge-to-face)

Download the pretrained model from here, save it in checkpoints/celebahqedge, then run:

python test.py --name celebahqedge --dataset_mode celebahqedge --dataroot ./imgs/celebahqedge --gpu_ids 0 --nThreads 0 --batchSize 4 --use_attention --maskmix --PONO --PONO_C --warp_bilinear --adaptor_kernel 4

The results will be stored in output/test/celebahqedge.

4) DeepFashion (pose-to-image)

Download the pretrained model from here, save it in checkpoints/deepfashion, then run:

python test.py --name deepfashion --dataset_mode deepfashion --dataroot ./imgs/DeepFashion --gpu_ids 0 --nThreads 0 --batchSize 4 --use_attention --PONO --PONO_C --warp_bilinear --no_flip --warp_patch --video_like --adaptor_kernel 4

The results are saved in output/test/deepfashion.

Training

Pretrained VGG model: download it from here and move it to models/. This model is used to compute the training loss.
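As a small convenience (not in the original instructions), a short script can verify the VGG weights are in place before training; the filename vgg19_conv.pth is taken from the issues section below and may differ for your download.

# Sanity check for the pretrained VGG weights; the path is an assumption based on
# the vgg19_conv.pth filename mentioned in the issues below.
import os
import torch

vgg_path = 'models/vgg19_conv.pth'
assert os.path.isfile(vgg_path), 'Download the pretrained VGG model into models/ first'
state_dict = torch.load(vgg_path, map_location='cpu')
print(f'Loaded {len(state_dict)} parameter tensors from {vgg_path}')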

1) ADE20k (mask-to-image)

  • Dataset Download ADE20k, then move ADEChallengeData2016/annotations/ADE_train_*.png to ADEChallengeData2016/images/training/ and ADEChallengeData2016/annotations/ADE_val_*.png to ADEChallengeData2016/images/validation/ (a sketch of this step appears after this subsection's commands).

  • Retrieval_pairs We use image retrieval to find exemplars for exemplar-based training. Download ade20k_ref.txt and ade20k_ref_test.txt from here, save or replace them in data/

  • Run the command below; note that dataset_path is your ADE20k root, e.g., /data/Dataset/ADEChallengeData2016/images. We use eight 32GB Tesla V100 GPUs for training. With fewer GPUs, you can set batchSize to 16, 8, or 4 and change gpu_ids.

    python train.py --name ade20k --dataset_mode ade20k --dataroot dataset_path --niter 100 --niter_decay 100 --use_attention --maskmix --warp_mask_losstype direct --weight_mask 100.0 --PONO --PONO_C --batchSize 32 --vgg_normal_correct --gpu_ids 0,1,2,3,4,5,6,7
  • If you don't want to use the mask of the exemplar image at test time, you can instead run

    python train.py --name ade20k --dataset_mode ade20k --dataroot dataset_path --niter 100 --niter_decay 100 --use_attention --maskmix --noise_for_mask --mask_epoch 150 --warp_mask_losstype direct --weight_mask 100.0 --PONO --PONO_C --vgg_normal_correct --batchSize 32 --gpu_ids 0,1,2,3,4,5,6,7
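The Dataset step above can be scripted roughly as follows; this is a minimal sketch, the ade_root path is hypothetical, and the recursive glob merely tolerates either a flat or per-split annotations/ layout.

# Sketch of moving the ADE20k annotation PNGs next to the images (Dataset step above).
import glob
import os
import shutil

ade_root = '/data/Dataset/ADEChallengeData2016'  # hypothetical path; adjust to your download
for split, pattern in [('training', 'ADE_train_*.png'), ('validation', 'ADE_val_*.png')]:
    dst = os.path.join(ade_root, 'images', split)
    for src in glob.glob(os.path.join(ade_root, 'annotations', '**', pattern), recursive=True):
        shutil.move(src, os.path.join(dst, os.path.basename(src)))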

2) Celebahq (mask-to-face)

  • Dataset Download Celebahq; we combine all parsing masks except glasses. You can download and unzip the annotations, then move the folder all_parts_except_glasses/ to CelebAMask-HQ/CelebAMask-HQ-mask-anno/.
  • Retrieval_pairs We use image retrieval to find exemplars for exemplar-based training. Download celebahq_ref.txt and celebahq_ref_test.txt from here, and save or replace them in data/.
  • Train_Val split We randomly split the images into training and validation sets. Download train.txt and val.txt from here and save them in CelebAMask-HQ/.
  • Run the command below; note that dataset_path is your CelebAHQ root, e.g. /data/Dataset/CelebAMask-HQ. In our experiments we use eight 32GB Tesla V100 GPUs for training. With fewer GPUs, you can set batchSize to 16, 8, or 4 and change gpu_ids.
    python train.py --name celebahq --dataset_mode celebahq --dataroot dataset_path --niter 30 --niter_decay 30 --which_perceptual 4_2 --weight_perceptual 0.001 --use_attention --maskmix --warp_mask_losstype direct --weight_mask 100.0 --PONO --PONO_C --vgg_normal_correct --fm_ratio 1.0 --warp_bilinear --warp_cycle_w 0.1 --batchSize 32 --gpu_ids 0,1,2,3,4,5,6,7

3) Celebahq (edge-to-face)

  • Dataset Download the dataset
  • Retrieval_pairs same as Celebahq (Mask-to-face)
  • Train_Val split same as Celebahq (Mask-to-face)
  • Run the following command. Note that dataset_path is your CelebAHQ root, e.g. /data/Dataset/CelebAMask-HQ. We use eight 32GB Tesla V100 GPUs to train the network. With fewer GPUs, you can set batchSize to 16, 8, or 4 and change gpu_ids.
    python train.py --name celebahqedge --dataset_mode celebahqedge --dataroot dataset_path --niter 30 --niter_decay 30 --which_perceptual 4_2 --weight_perceptual 0.001 --use_attention --maskmix --PONO --PONO_C --vgg_normal_correct --fm_ratio 1.0 --warp_bilinear --warp_cycle_w 1 --batchSize 32 --gpu_ids 0,1,2,3,4,5,6,7

4) DeepFashion (pose-to-image)

  • Dataset Download DeepFashion; we use OpenPose to estimate the poses of the DeepFashion images. Download and unzip the OpenPose results, then move the folder pose/ to DeepFashion/.
  • Retrieval_pairs Download deepfashion_ref.txt, deepfashion_ref_test.txt and deepfashion_self_pair.txt from here, save or replace them in data/
  • Train_Val split Download train.txt and val.txt from here, save them in DeepFashion/
  • Run the following command. Note that dataset_path is your DeepFashion root, e.g. /data/Dataset/DeepFashion. We use eight 32GB Tesla V100 GPUs to train the network. With fewer GPUs, you can set batchSize to 16, 8, or 4 and change gpu_ids.
    python train.py --name deepfashion --dataset_mode deepfashion --dataroot dataset_path --niter 50 --niter_decay 50 --which_perceptual 4_2 --weight_perceptual 0.001 --use_attention --real_reference_probability 0.0 --PONO --PONO_C --vgg_normal_correct --fm_ratio 1.0 --warp_bilinear --warp_self_w 100 --no_flip --warp_patch --video_like --batchSize 32 --gpu_ids 0,1,2,3,4,5,6,7

Citation

If you use this code for your research, please cite our papers.

@inproceedings{zhang2020cross,
  title={Cross-domain Correspondence Learning for Exemplar-based Image Translation},
  author={Zhang, Pan and Zhang, Bo and Chen, Dong and Yuan, Lu and Wen, Fang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5143--5153},
  year={2020}
}

Also, welcome to refer to our CoCosNet v2:

@InProceedings{Zhou_2021_CVPR,
  author={Zhou, Xingran and Zhang, Bo and Zhang, Ting and Zhang, Pan and Bao, Jianmin and Chen, Dong and Zhang, Zhongfei and Wen, Fang},
  title={CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021},
  pages={11465--11475}
}

Acknowledgments

This code borrows heavily from SPADE. We also thank Jiayuan Mao for his Synchronized Batch Normalization code.

cocosnet's People

Contributors

microsoftopensource, pan6zhang, panzhang0212, zhangmozhe


cocosnet's Issues

DeepFashion Training Pairs

Hi,
Thanks for your awesome work.
I have a question about the DeepFashion training pairs. Reading the code, I found that the DeepFashion pairs are established by 'deepfashion_self_pair.txt' and 'deepfashion_ref.txt'. How did you generate these two files (especially the latter)? Were some images removed from the DeepFashion dataset (since the number of images does not match)?

How to train edge to face on my own data

Thanks for the great work! If I have a face dataset, what do I need to do to prepare it for edge-to-face training? Do I have to extract segmentation masks for all face parts as in CelebAMask?

Cycle consistency weight = 0

In the provided training script, opt.warp_cycle_w is set to 0 by default. What weight should be used for the cycle correspondence regularization? Also, what is a good way to set the flags opt.warp_patch and opt.two_cycle?

Finally, is it possible to pretrain the correspondence module with the current code? Thanks!

How to convert the OpenPose JSON output to the candidate.txt and subset.txt used for pose-to-image?

Hi,

Thanks for sharing the project. I want to use OpenPose to generate my own dataset for pose-to-image. OpenPose outputs a JSON file (https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/output.md), which is very different from the candidate.txt and subset.txt included in the CoCosNet project.

What is the relationship between the OpenPose JSON file and candidate.txt & subset.txt? How can one be converted to the other?

Thanks.

Domain alignment loss before channel-wise normalization

coor_out['loss_novgg_featpair'] = F.l1_loss(adaptive_feature_seg, adaptive_feature_img_pair) * self.opt.novgg_featpair

In correspondence.py, the domain alignment loss seems to be computed before the channel-wise normalization. While this doesn't affect model inference, it might cause trivial solutions during training (as suggested by Eq. 9 of the paper). I am not sure whether this is a bug that should be fixed.

Many thanks!
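For illustration, here is a minimal sketch of the variant the question refers to, computing the alignment loss on channel-wise normalized features; the normalization below is an assumption about what is meant, and the feature tensors are dummy stand-ins rather than the repository's variables.

import torch
import torch.nn.functional as F

def channelwise_norm(feat, eps=1e-5):
    # Normalize each spatial feature vector across the channel dimension (assumed meaning).
    mean = feat.mean(dim=1, keepdim=True)
    std = feat.std(dim=1, keepdim=True)
    return (feat - mean) / (std + eps)

# Dummy tensors standing in for adaptive_feature_seg / adaptive_feature_img_pair.
adaptive_feature_seg = torch.randn(2, 256, 64, 64)
adaptive_feature_img_pair = torch.randn(2, 256, 64, 64)

# Alignment loss on normalized features, so it cannot be trivially reduced by
# shrinking the feature magnitudes.
loss_novgg_featpair = F.l1_loss(
    channelwise_norm(adaptive_feature_seg),
    channelwise_norm(adaptive_feature_img_pair),
)
print(loss_novgg_featpair.item())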

SWD metric

Would you mind sharing the repository used for the SWD metric evaluation? I used https://github.com/koshian2/swd-pytorch to evaluate the results, but the SWD score of the pretrained model is quite different from the paper (112.4 on ADE20K). I think this is caused by using a different SWD implementation. @panzhang0212

wrong padding of AdaptiveFeatureGenerator

Hi, I noticed that in AdaptiveFeatureGenerator the encoder convolutions use zero padding:

self.layer1 = norm_layer(nn.Conv2d(opt.spade_ic, ndf, kw, stride=1, padding=pw))

However, the warped images have strong artifacts at the boundary.

When the padding is replaced with reflect padding, the result is significantly improved.

Because of this, there seems to be a mismatch between the pretrained model and the testing code. Also, when using reflect padding for training, I often find that training collapses with an unknown error after several epochs. Did you notice the same issue?
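For reference, a minimal sketch of the change this report describes, using PyTorch's padding_mode='reflect'; spade_ic, ndf, kw, pw and the spectral-norm wrapper are illustrative stand-ins for the repository's variables, not its actual values.

import torch.nn as nn

# Illustrative stand-ins for the names in the quoted line; the values are made up.
spade_ic, ndf, kw, pw = 3, 64, 3, 1
norm_layer = nn.utils.spectral_norm  # assumption about what norm_layer wraps

# Reflect padding instead of the default zero padding, as the report suggests.
layer1 = norm_layer(
    nn.Conv2d(spade_ic, ndf, kw, stride=1, padding=pw, padding_mode='reflect')
)
print(layer1)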

How does the random flip work in the losses for pseudo exemplar pairs?

I am trying to train CoCosNet on my own dataset, but whether I get the right result seems random because of the flip probability. So I want to ask: how does the random flip work in the losses for pseudo exemplar pairs? If I choose not to flip the exemplar, can this difference influence the final result? Thanks for your answer.

How to get the candidate and subset scores?

I find that the pose data (candidate.txt and subset.txt) used in your model differ from the original OpenPose output. How can I generate candidate.txt and subset.txt? Thanks a lot.

training model for pose transfer on custom dataset

For pose transfer, what changes or extra preprocessing steps do we need to make before running data/deepfashion_dataset.py?
From what I have read about DeepFashion, it seems we first need to obtain keypoints using OpenPose and then generate the pairs.
Apart from that, is there anything missing?

train problem

Traceback (most recent call last):
File "train.py", line 58, in
trainer.run_discriminator_one_step(data_i)
File "C:\Users\64883\Desktop\ref_code\CoCosNet-master\trainers\pix2pix_trainer.py", line 70, in run_discriminator_one_step
d_losses = self.pix2pix_model(data, mode='discriminator', GforD=GforD)
File "D:\Anaconda3\envs\paddlepaddle\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\64883\Desktop\ref_code\CoCosNet-master\models\pix2pix_model.py", line 52, in forward
input_label, input_semantics, real_image, self_ref, ref_image, ref_label, ref_semantics = self.preprocess_input(data, )
File "C:\Users\64883\Desktop\ref_code\CoCosNet-master\models\pix2pix_model.py", line 193, in preprocess_input
input_semantics[:,-3:-2,:,:] = glasses
RuntimeError: The expanded size of the tensor (1) must match the existing size (0) at non-singleton dimension 1. Target sizes: [1, 1, 256, 256]. Tensor sizes: [0, 256, 256]

The problem is solved by changing self.preprocess_input(data, ) to self.preprocess_input(data.copy(), ) at line 52 of pix2pix_model.py.

thanks to #3

contain_dontcare_label?

In the code, with contain_dontcare_label, input_semantics seems to gain one extra channel whose value is zero. Why is this done?

Evaluation script

Hello!
Can you point me to a reference implementation for the FID and SWD metrics? I've tried to reproduce the results, but I couldn't achieve the same numbers as reported in the paper, and I don't know where the difference comes from. Thank you!

Error when running ADE20k pre-trained model

Thanks for your wonderful work. I want to use your pretrained model on my own datasets, using this command:

python test.py --name ade20k --dataset_mode ade20k --dataroot ./imgs/ade20k --gpu_ids 0 --nThreads 0 --batchSize 6 --use_attention --maskmix --noise_for_mask --warp_mask_losstype direct --PONO --PONO_C --which_epoch 90

I thought it was not necessary to use the reference mask when using a pretrained model, so I removed all the .png images in imgs/ade20k. But there was an error.

I wonder whether I can use your pretrained model if I don't have segmentation (mask) images. Thanks.

Out of memory error

What is the minimum amount of GPU memory required for training with batch_size=1?

I installed a 2060 Super GPU in my computer, but this error occurred.

demo for makeup transfer?

Hi, do you have any plan to release the code for makeup transfer? Or could you please explain the implementations in detail? Thank you!

KeyError

When I try to run test.py with the DeepFashion dataset, an error occurs:
[error screenshot]

Data error in celebahqedge

The input edge image style does not match the paper.

Edge image style generated by the code:
[image]

Edge image style from the paper:
[image]

How to use your model to train or test on another dataset?

Thanks for your awesome work. I want to test your model on my own dataset, which consists of 1024x768 images. How should I pre-process the dataset to use your model? And can your pretrained model be used to test other datasets?

Retrieval pairs for other dataset

Hi, how can I generate retrieval pairs for another dataset such as Cityscapes (not provided in this repository)? Can you give me some tips? Thanks!

How to convert ade20k color label to gray values used in CoCosNet

@panzhang0212

Thank you for the CoCosNet code; the results look very impressive.

However, I just wanted to use CoCosNet on new images with your ADE20K model, but when I run inference with a semantic segmentation model I get color labels as in the ADE20K dataset.

As I understand it, the CoCosNet ADE20k model needs a grayscale mask for inference.
When I convert the color ADE20K semantic mask using cv2, I don't get the labels used in your examples.

Therefore, how do I convert the color semantic labels given in ADE20k to the gray labels used in your model?

Best regards,
Peu

how to achieve image translation

Thanks for your work! I wonder how I can perform image translation using your method. I see that ref_test.txt only pairs .jpg files with .jpg files, but I just have mask images in .png format (like what the demo uses), and following the inference instructions doesn't work. Can you please tell me what I should do? @panzhang0212

No module named 'models.networks.sync_batchnorm'

When I ran DeepFashion (pose-to-image) with the pretrained model, the following error occurred:

Traceback (most recent call last):
  File "test.py", line 13, in <module>
    from models.pix2pix_model import Pix2PixModel
  File "E:\pythonWork\CoCosNet-master\models\pix2pix_model.py", line 6, in <module>
    import models.networks as networks
  File "E:\pythonWork\CoCosNet-master\models\networks\__init__.py", line 8, in <module>
    from models.networks.loss import *
  File "E:\pythonWork\CoCosNet-master\models\networks\loss.py", line 7, in <module>
    from models.networks.architecture import VGG19
  File "E:\pythonWork\CoCosNet-master\models\networks\architecture.py", line 9, in <module>
    from models.networks.normalization import SPADE, equal_lr, SPADE_TwoPath
  File "E:\pythonWork\CoCosNet-master\models\networks\normalization.py", line 10, in <module>
    from models.networks.sync_batchnorm import SynchronizedBatchNorm2d
ModuleNotFoundError: No module named 'models.networks.sync_batchnorm'

How can I resolve this missing module?

FID on the deepfashion dataset

I calculated the FID score on DeepFashion with your pretrained results, but the FID is 13.05, which is not consistent with yours. My FID calculation uses the test-set ground truth. Can you share more details about your evaluation?

Bad correspondence results but good translation results

I trained on the celebahqedge dataset with two GPUs; the correspondence results are quite bad, but the translation generator produces good results. I found that only adaptor_kernel and batchSize differ between the two models.

Our results:
[image]

Your checkpoint's results:
[image]

Is WTA_Scale Never Used?

I notice that this function is used in serious work, but it is never mentioned in the options. Is WTA_scale ever used?

Attention in SPADEGenerator

Hi, I notice there is an attention model in the SPADEGenerator, defined at line 97 of CoCosNet/models/networks/architecture.py. What is the purpose of this module? Can you give a reference? Thanks a lot!

running error about ade20k

I want to reproduce the mask-to-image training.

I moved the images named ADE_train_*.png and ADE_val_*.png to the destination folder and used the following command:

CUDA_VISIBLE_DEVICES=0,1 python train.py --name ade20k --dataset_mode ade20k --dataroot /data/ADE20K_2016_07_26/images --niter 100 --niter_decay 100 --use_attention --maskmix --warp_mask_losstype direct --weight_mask 100.0 --PONO --PONO_C --batchSize 4 --vgg_normal_correct --gpu_ids 0,1

then the following error was reported

AssertionError: The label-image pair (/data/ADE20K_2016_07_26/images/training/a/airport/airport/ADE_train_00001069_parts_1.png, /data/ADE20K_2016_07_26/images/training/a/airport/airport/ADE_train_00001069.jpg) do not look like the right pair because the filenames are quite different. Are you sure about the pairing? Please see data/pix2pix_dataset.py to see what is going on, and use --no_pairing_check to bypass this.

I noticed that all of the images in ADE20K are RGB images, so how do you convert them into grayscale masks? The images located in CoCosNet/imgs/ade20k/training/ in your project are pairs of an RGB image and a grayscale image.

waiting for your reply

vgg19_conv.pth

vgg19_conv.pth is not provided. Can anybody share it?

RuntimeError: Given groups=1, weight of size [64, 18, 4, 4], expected input[2, 4, 256, 256] to have 18 channels, but got 4 channels instead

We run the command
python train.py --name celebahqedge --dataset_mode celebahqedge --dataroot dataset_path --niter 30 --niter_decay 30 --which_perceptual 4_2 --weight_perceptual 0.001 --use_attention --maskmix --PONO --PONO_C --vgg_normal_correct --fm_ratio 1.0 --warp_bilinear --warp_cycle_w 1 --batchSize 1 --gpu_ids 0

but get the following errors:

/home/dc2-user/.local/lib/python3.7/site-packages/torch/nn/functional.py:1558: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
Traceback (most recent call last):
File "train.py", line 58, in
trainer.run_discriminator_one_step(data_i)
File "/home/dc2-user/tomyuan-workstation/CoCosNet/trainers/pix2pix_trainer.py", line 70, in run_discriminator_one_step
d_losses = self.pix2pix_model(data, mode='discriminator', GforD=GforD)
File "/home/dc2-user/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/home/dc2-user/tomyuan-workstation/CoCosNet/models/pix2pix_model.py", line 75, in forward
input_semantics, real_image, GforD, label=input_label)
File "/home/dc2-user/tomyuan-workstation/CoCosNet/models/pix2pix_model.py", line 289, in compute_discriminator_loss
input_semantics, fake_image, real_image)
File "/home/dc2-user/tomyuan-workstation/CoCosNet/models/pix2pix_model.py", line 352, in discriminate
discriminator_out, seg, cam_logit = self.net'netD'
File "/home/dc2-user/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/home/dc2-user/tomyuan-workstation/CoCosNet/models/networks/discriminator.py", line 62, in forward
out, cam_logit = D(input)
File "/home/dc2-user/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/home/dc2-user/tomyuan-workstation/CoCosNet/models/networks/discriminator.py", line 152, in forward
intermediate_output = submodel(x)
File "/home/dc2-user/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/home/dc2-user/.local/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
input = module(input)
File "/home/dc2-user/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/home/dc2-user/.local/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 353, in forward
return self._conv_forward(input, self.weight)
File "/home/dc2-user/.local/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 350, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [64, 18, 4, 4], expected input[2, 4, 256, 256] to have 18 channels, but got 4 channels instead

Requires scikit-image==0.14.2

After installing the requirements, I received ImportError: cannot import name '_validate_lengths' upon importing skimage somewhere in the testing script. I found this scikit-image issue which says it's fixed in version 0.14.2. After upgrading scikit-image to 0.14.2, the issue disappears.
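A quick way to confirm the installed version (a small convenience, not part of the original report):

# Check the scikit-image version; 0.14.2 is the version reported above as fixing
# the '_validate_lengths' ImportError.
import skimage
print(skimage.__version__)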

Why is label_except_glasses_tensor multiplied by 255 while glasses_tensor is not?

in get_label_tensor(self, path) of celebahq_dataset.py

transform_label = get_transform(self.opt, params, method=Image.NEAREST, normalize=False)
label_except_glasses_tensor = transform_label(label_except_glasses) * 255.0
glasses_tensor = transform_label(glasses)

Why is label_except_glasses_tensor multiplied by 255 while glasses_tensor is not?

KeyError: '22233.jpg' on celebahqedge test script

Running the example test command from the README on celebahqedge:

python test.py --name celebahqedge --dataset_mode celebahqedge --dataroot ./imgs/celebahqedge/ --gpu_ids 0 --nThreads 0 --batchSize 4 --use_attention --maskmix --PONO --PONO_C --warp_bilinear --adaptor_kernel 4

I get this error:

Traceback (most recent call last):
  File "test.py", line 19, in <module>
    dataloader.dataset[0]
  File "/home/bzion/projects/CoCosNet/data/pix2pix_dataset.py", line 90, in __getitem__
    val = self.ref_dict[key]
KeyError: '22233.jpg'

Digging around, I notice that the function get_ref in pix2pix_dataset.py is implicated, and it appears not to be implemented yet. Maybe this is the source of the issue.

    def get_ref(self, opt):
        pass

This repo looks very exciting.

Question about Edge to face?

Hi, I followed your code to train and test it, but I still have several questions:

When I test edge2face, I found that I do not need to provide a semantic segmentation mask.
Does that mean the semantic input is only used to generate a cleaner edge with Canny?
If that is the case, does edge2face not transfer any semantic information?
I really have no idea.

Nice work, some questions about the Perceptual Loss

Nice work and thanks for sharing the code!

I noticed that the CVPR 2019 paper "Deep Exemplar-based Video Colorization" is also from your team, and the two papers share a similar perceptual loss. In this work, all the images fed to VGG are in RGB space; however, in that work the generated images are in CIELAB space, so how are they fed to a VGG pretrained on RGB images?

Did you retrain VGG in CIELAB space, or use another method to resolve this color-space inconsistency? Can you share your solution? That would be very helpful!

Thanks in advance.

Question about contextual loss

Nice work and thanks for sharing the code!
I'm a little confused about the contextual loss. As stated in the paper, it should use low-level VGG features, and you therefore choose relu2_2 up to relu5_2. But relu5_2 is actually a high-level feature; shouldn't relu1_2 and relu2_2 be used instead?
Thanks in advance.

Weird style synthesis result

Hi, I have tried to implement CoCosNet using SPADE in the generator. When the model finished training, the output was strange. The figure below shows the translation results for animal-to-animal translation: the left column contains the content images, the middle column the style images, and the right column the results. The trained generator outputs the entire style image.

[epoch80_fakeStyle_result]

Here is the SPADE normalization function, following the code. Could you give me some advice? Thank you.

[SPADE]

How to train pose&face&hand to image?

Hi,

In order to make the translation more vivid for people, I plan to use face and hand keypoints as well as body pose, all of which can be detected by OpenPose at the same time.

Could you give me instructions for training? Thanks.

the real_reference_probability and hard_reference_probability

Firstly, thanks again for your patient reply! When I train CoCosNet on my own edge-to-image dataset, the first twenty iterations generate good results, but after twenty iterations the translation result is identical to the exemplar and the edge image no longer controls the pose. I tried to find an answer in the paper but could not. So I want to ask: what do real_reference_probability and hard_reference_probability do? Can I change them to better match my own dataset?

Questions about the ADE20K dataset

Hi, when training on the ADE20k dataset, I find that the original semantic masks of ADE20K are colorized, and you use this code ("label = Image.open(path)") to read the semantic mask. However, a label loaded this way has 3 channels (b x 3 x h x w).

When we convert the label into one-hot label, there will be an error.

/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = -1]: block: [183,0,0], thread: [96,0,0] Assertion indexValue >= 0 && indexValue < tensor.sizes[dim] failed.

It seems that values of the semantic mask are out of the range of label_nc (182, as predefined in your code). Therefore, how should I preprocess these colorized original semantic masks?

CBN_intype

What is the meaning of the parameter 'CBN_intype'?
