microsoft / cocosnet-v2

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

Home Page: https://arxiv.org/abs/2012.02047

License: MIT License

Python 100.00%
pytorch generative-adversarial-network image-synthesis image-translation gans image-manipulation cocosnet computer-vision deep-learning image-generation

cocosnet-v2's Introduction

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation (CVPR 2021, oral presentation)

[teaser image]

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation
CVPR 2021, oral presentation
Xingran Zhou, Bo Zhang, Ting Zhang, Pan Zhang, Jianmin Bao, Dong Chen, Zhongfei Zhang, Fang Wen

Abstract

We present the full-resolution correspondence learning for cross-domain images, which aids image translation. We adopt a hierarchical strategy that uses the correspondence from coarse level to guide the fine levels. At each hierarchy, the correspondence can be efficiently computed via PatchMatch that iteratively leverages the matchings from the neighborhood. Within each PatchMatch iteration, the ConvGRU module is employed to refine the current correspondence considering not only the matchings of larger context but also the historic estimates. The proposed CoCosNet v2, a GRU-assisted PatchMatch approach, is fully differentiable and highly efficient. When jointly trained with image translation, full-resolution semantic correspondence can be established in an unsupervised manner, which in turn facilitates the exemplar-based image translation. Experiments on diverse translation tasks show that CoCosNet v2 performs considerably better than state-of-the-art literature on producing high-resolution images.

✨ News

2022.12 We propose Paint by Example, which allows in-the-wild image editing according to an exemplar, based on Stable Diffusion. You can try our online demo.

2022.8 We recently proposed PITI, a state-of-the-art image-to-image translation method based on a pretrained diffusion model.

Installation

First, install the dependencies for the experiments:

pip install -r requirements.txt

We recommend installing PyTorch 1.6.0 or later, since we use automatic mixed precision (AMP) for acceleration. (We used PyTorch 1.7.0 in our experiments.)
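The --amp flag in the commands below enables this mixed-precision path. For illustration only, and not the repository's actual training loop (the model and loss here are toy placeholders), the underlying PyTorch AMP pattern looks roughly like this:

import torch

# Requires a CUDA device, since AMP targets GPU training.
# Illustrative only: a toy model and loss stand in for the repository's networks.
model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

for step in range(10):
    x = torch.randn(8, 512, device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():       # forward pass runs in mixed precision
        loss = model(x).pow(2).mean()
    scaler.scale(loss).backward()         # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)                # unscale gradients, then step
    scaler.update()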

Prepare the dataset

1. First, download the DeepFashion dataset (high-resolution version) from this link. Note that the file name is img_highres.zip. Unzip the file and rename the resulting folder to img.
2. If a password is required, please request access to the dataset via this link.
3. We use OpenPose to estimate the poses for DeepFashion (HD). We provide the keypoint detection results used in our experiments at this link. Download and unzip the results file.
4. Since the original resolution of DeepFashionHD is 750x1101, we use a Python script to process the images to a resolution of 512x512. You can find the script in data/preprocess.py (a minimal resize sketch is shown after the directory tree below). Note that you need to download our train-val split lists train.txt and val.txt from this link for this step.
5. Download the train-val lists from this link and the retrieval pair lists from this link. train.txt and val.txt are our train-val lists; deepfashion_ref.txt, deepfashion_ref_test.txt, and deepfashion_self_pair.txt are the pairing lists used in our experiments. Download them all and move them into the data/ folder.
6. Finally, create the root folder deepfashionHD and move the folders img and pose into it. The directory structure should now look like:

deepfashionHD
│
└─── img
│   │
│   └─── MEN
│   │   │   ...
│   │
│   └─── WOMEN
│       │   ...
│   
└─── pose
│   │
│   └─── MEN
│   │   │   ...
│   │
│   └─── WOMEN
│       │   ...
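For reference, here is a minimal sketch of how the 750x1101 images could be resized to 512x512 with Pillow. This is an illustration only: the actual logic lives in data/preprocess.py and may differ (for example in how the aspect ratio is handled), and the source/destination paths below are assumptions.

import os
from PIL import Image

src_root = "img_highres"        # the unzipped high-resolution images (assumed path)
dst_root = "deepfashionHD/img"  # destination inside the dataset root (assumed path)

for dirpath, _, filenames in os.walk(src_root):
    for name in filenames:
        if not name.lower().endswith(".jpg"):
            continue
        src_path = os.path.join(dirpath, name)
        dst_path = os.path.join(dst_root, os.path.relpath(src_path, src_root))
        os.makedirs(os.path.dirname(dst_path), exist_ok=True)
        # Resize each image to 512x512 and save it under the same relative path.
        Image.open(src_path).convert("RGB").resize((512, 512), Image.BICUBIC).save(dst_path)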

Inference Using Pretrained Model

Download the pretrained model from this link.
Move the model files into the folder checkpoints/deepfashionHD, then run the following command:

python test.py --name deepfashionHD --dataset_mode deepfashionHD --dataroot dataset/deepfashionHD --PONO --PONO_C --no_flip --batchSize 8 --gpu_ids 0 --netCorr NoVGGHPM --nThreads 16 --nef 32 --amp --display_winsize 512 --iteration_count 5 --load_size 512 --crop_size 512

The inference results are saved in the folder checkpoints/deepfashionHD/test.
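To quickly browse the generated images, a short snippet like the one below can stitch the first few results into a single preview sheet. It assumes the outputs are written as .jpg files directly under that folder, which may need adjusting.

import glob
from PIL import Image

# Collect the first few result images (file extension and layout are assumptions).
paths = sorted(glob.glob("checkpoints/deepfashionHD/test/*.jpg"))[:4]
images = [Image.open(p).convert("RGB") for p in paths]
if images:
    w, h = images[0].size
    sheet = Image.new("RGB", (w * len(images), h))
    for i, im in enumerate(images):
        sheet.paste(im.resize((w, h)), (i * w, 0))  # place results side by side
    sheet.save("results_preview.jpg")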

Training from scratch

Make sure you have prepared the DeepFashionHD dataset as described in the instructions above.
Download the pretrained VGG model from this link and move it into the vgg/ folder. We use this model to compute the training loss.
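For intuition only, a VGG-based perceptual loss compares intermediate features of the generated and target images. The sketch below uses torchvision's vgg19 purely for illustration; the repository loads its own VGG weights from the vgg/ folder, and the layer choice here (up to relu4_2, matching --which_perceptual 4_2) is an assumption, not the repository's exact loss.

import torch
import torchvision

# Feature extractor up to relu4_2 of VGG19 (the layer index is an assumption).
vgg = torchvision.models.vgg19(pretrained=True).features[:23].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def perceptual_loss(fake, real):
    """L1 distance between VGG features of the generated and target images.
    In practice, inputs are usually normalized with ImageNet statistics first."""
    return torch.nn.functional.l1_loss(vgg(fake), vgg(real))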

Run the following command for training from scratch.

python train.py --name deepfashionHD --dataset_mode deepfashionHD --dataroot dataset/deepfashionHD --niter 100 --niter_decay 0 --real_reference_probability 0.0 --hard_reference_probability 0.0 --which_perceptual 4_2 --weight_perceptual 0.001 --PONO --PONO_C --vgg_normal_correct --weight_fm_ratio 1.0 --no_flip --video_like --batchSize 16 --gpu_ids 0,1,2,3,4,5,6,7 --netCorr NoVGGHPM --match_kernel 1 --featEnc_kernel 3 --display_freq 500 --print_freq 50 --save_latest_freq 2500 --save_epoch_freq 5 --nThreads 16 --weight_warp_self 500.0 --lr 0.0001 --nef 32 --amp --weight_warp_cycle 1.0 --display_winsize 512 --iteration_count 5 --temperature 0.01 --continue_train --load_size 550 --crop_size 512 --which_epoch 15

Note that the --dataroot parameter is the root of your DeepFashionHD dataset, e.g. dataset/DeepFashionHD.
We train the network on 8 Tesla V100 GPUs (32 GB each). With fewer GPUs, you can set --batchSize to 16, 8, or 4 and adjust --gpu_ids accordingly.

Citation

If you use this code for your research, please cite our papers.

@InProceedings{Zhou_2021_CVPR,
author={Zhou, Xingran and Zhang, Bo and Zhang, Ting and Zhang, Pan and Bao, Jianmin and Chen, Dong and Zhang, Zhongfei and Wen, Fang},
title={CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2021},
pages={11465-11475}
}

We also welcome you to refer to our CoCosNet v1:

@InProceedings{Zhang_2020_CVPR,
author={Zhang, Pan and Zhang, Bo and Chen, Dong and Yuan, Lu and Wen, Fang},
title={Cross-Domain Correspondence Learning for Exemplar-Based Image Translation},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2020},
pages={5143-5153}
}

Acknowledgments

This code borrows heavily from CoCosNet and DeepPruner. We also thank SPADE and RAFT.

License

The code and the pretrained model in this repository are released under the MIT license, as specified in the LICENSE file.
This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

cocosnet-v2's People

Contributors

elliottzheng, microsoftopensource, xingranzh, zhangmozhe


cocosnet-v2's Issues

ade20

Hello, can you provide the pretrained model for ADE20K? When I add the ADE20K dataset for training, an error is reported.

Concerns about dissimilar real image and ref image pairs

Hi,

This is a really impressive project, but I have some concerns about deepfashion_ref.txt: a small number of dissimilar real-image and ref-image pairs are included in training and evaluation, according to the code in the function 'get_ref_video_like' of deepfashionHD_dataset.py, if I understand correctly. An example used in the evaluation in deepfashion_ref_test.txt is:

real_image: img\MEN\Denim\id_00000089\02_7_additional.jpg
ref_image: img\WOMEN\Dresses\id_00001748\03_4_full.jpg.

I wonder whether this hurts training and evaluation.

Thanks so much.

KeyError: 'dataset/deepfashionHD\\img\\MEN\\Denim\\id_00000089\\02_7_additional.jpg'

When I run the project with the following command

python ./test.py --name deepfashionHD --dataset_mode deepfashionHD --dataroot dataset/deepfashionHD --PONO --PONO_C --no_flip --batchSize 8 --gpu_ids 0 --netCorr NoVGGHPM --nThreads 16 --nef 32 --amp --display_winsize 512 --iteration_count 5 --load_size 512 --crop_size 512

I get the following error:

Traceback (most recent call last):
  File "./test.py", line 25, in <module>
    for i, data_i in enumerate(dataloader):
  File "C:\ProgramData\Anaconda3\envs\init_CoCosNet_V2_231021\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "C:\ProgramData\Anaconda3\envs\init_CoCosNet_V2_231021\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "C:\ProgramData\Anaconda3\envs\init_CoCosNet_V2_231021\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data
    data.reraise()
  File "C:\ProgramData\Anaconda3\envs\init_CoCosNet_V2_231021\lib\site-packages\torch\_utils.py", line 428, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\init_CoCosNet_V2_231021\lib\site-packages\torch\utils\data\_utils\worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\ProgramData\Anaconda3\envs\init_CoCosNet_V2_231021\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\ProgramData\Anaconda3\envs\init_CoCosNet_V2_231021\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "F:\PycharmProjects_04_05_2021\init_CoCosNet_V2_231021\data\pix2pix_dataset.py", line 72, in __getitem__
    val = self.ref_dict[key]
KeyError: 'dataset/deepfashionHD\\img\\MEN\\Denim\\id_00000089\\02_7_additional.jpg'

Can anyone please help me solve this?

Hello, I placed all the datasets in the corresponding folders, but after running the python train command I got an error. Could you help me figure out what the problem is?

Traceback (most recent call last):
  File "train.py", line 21, in <module>
    dataloader = data.create_dataloader(opt)
  File "/home/kas/CoCosNet-v2/data/__init__.py", line 33, in create_dataloader
    instance.initialize(opt)
  File "/home/kas/CoCosNet-v2/data/pix2pix_dataset.py", line 33, in initialize
    self.ref_dict, self.train_test_folder = self.get_ref(opt)
TypeError: cannot unpack non-iterable NoneType object

xxx_ref.txt and xxx_ref_test.txt for other datasets

Can you provide the configuration files for the other datasets mentioned in the paper, for example xxx_ref.txt and xxx_ref_test.txt, as provided for CoCosNet v1? In particular, configuration files for the MetFaces dataset would be appreciated; otherwise comparisons cannot be made.

Cannot find file and cannot run test

running "Inference Using Pretrained Model" ,but I get an error like the one below

./dataset/deepfashionHD/pose\MEN\Denim\id_00000089\02_7_additional_candidate.txt not found.

Different image size

Many thanks for your work. Is it possible to use the proposed patch-matching method with different image sizes for training and testing?

google colab

Please add a Google Colab notebook for inference.

I got an error when running the test

When I delete latest_net_D.pth and latest_net_G.pth from latest_net_Corr.pth and then run the code, an error occurs:
python test.py --name deepfashionHD --dataset_mode deepfashionHD --dataroot dataset/deepfashionHD --PONO --PONO_C --no_flip --batchSize 8 --gpu_ids 0 --netCorr NoVGGHPM --nThreads 16 --nef 32 --amp --display_winsize 512 --iteration_count 5 --load_size 512 --crop_size 512
[error screenshot]

training on a custom dataset

Hello,
you used OpenPose to create the pose *.txt files.
How can I create those files for another dataset?
I followed the link for OpenPose, but I can't see where the pose *.txt files are produced; I can only see the pose image results.
Thank you.

about mask to image

Thank you for this great work!

I want to know whether this code can be used to train a mask-to-image CoCosNet v2. I found that some CoCosNet v1 mask options (maskmix, warp_mask_losstype, ...) have been removed in this version.

If I use this code to train a mask-to-image CoCosNet v2, what changes should I make?

Thanks!

training and testing from scratch

Hey, when I train the model from random weights, I can see some results during training (every N epochs), but when I run test.py with the newly trained models, the predictions are just a white background, with no image at all.

[result image]

Running Error

file in archive is not in a subdirectory archive/: latest_net_D.pth

train Metfaces dataset error

Hello friends, I used your model to train on the MetFaces dataset, creating paired labels (edge labels). After 100 epochs of training, I classify the pictures according to the cosine similarity of VGG-extracted features. The test results are as follows. What could the problem be? Thank you, and I wish you success in your research!
[result images]

Pretrained model "latest_net_netCorr.pth" corrupted

After following the instructions and running test.py, the following error pops up:
RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\caffe2\serialize\inline_container.cc:231] . file in archive is not in a subdirectory archive/: latest_net_D.pth

when executing

net['netCorr'] = util.load_network(net['netCorr'], 'Corr', opt.which_epoch, opt)

in pix2pix_model.py (line 119).

Does this mean the .pth file is corrupted?

I got a KeyError when running the test

When I run the test, I get a KeyError.
python test.py --name deepfashionHD --dataset_mode deepfashionHD --dataroot deepfashionHD --PONO --PONO_C --no_flip --batchSize 8 --gpu_ids 0 --netCorr NoVGGHPM --nThreads 16 --nef 32 --amp --display_winsize 512 --iteration_count 5 --load_size 512 --crop_size 512

[screenshot]

(the images have been resized to 512)

I got wrong results when running the test

When I run the test, I get wrong results. The script is as follows (note that the images have been resized to 512):
python test.py --name deepfashionHD --dataset_mode deepfashionHD \
    --dataroot /datasets/GANDatasets/deepfashionHD \
    --PONO --PONO_C --no_flip --batchSize 4 --gpu_ids 0 \
    --netCorr NoVGGHPM --nThreads 16 --nef 32 --amp --display_winsize 512 \
    --iteration_count 5 --load_size 512 --crop_size 512

[result image]

test error!

Hi, thanks for your work, but when I call torch.load(), I get:
[enforce fail at ..\caffe2\serialize\inline_container.cc:211] . file in archive is not in a subdirectory archive/: latest_net_D.pth
