
ha0tang / lggan

145.0 11.0 13.0 34.83 MB

[CVPR 2020] Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

License: Other

Shell 1.68% Python 95.53% MATLAB 2.79%
image-translation image-manipulation image-generation cross-view gan generative-adversarial-network generative-model local global pytorch

lggan's Introduction

  • 👯 We are looking for self-motivated researchers to join or visit our group.


Hao Tang

[Homepage] [Google Scholar] [Twitter]

I am currently a postdoctoral researcher at Computer Vision Lab, ETH Zurich, Switzerland.

News

We have released the code of XingVTON and CIT for virtual try-on, TransDA for source-free domain adaptation using Transformers, IEPGAN for 3D pose transfer, TransDepth for monocular depth prediction using Transformers, GLANet for unpaired image-to-image translation, and MHFormer for 3D human pose estimation.

🌱 My Repositories

3D-Aware Image/Video Generation

3D Human Pose Estimation

Text-to-Image Synthesis

3D Object Generation

Monocular Depth Prediction

Face Anonymisation

Person Image Generation

Scene Image Generation

Unsupervised Image Translation

Deep Dictionary Learning

Virtual Try-On

Hand Gesture Recognition

Source-Free Domain Adaptation

lggan's People

Contributors

ha0tang


lggan's Issues

Where is SAU?

Could you please point to the part of the code that corresponds to the SAU module from Section 3.2, "Semantic-Aware Upsampling"?

About arxiv paper image

Hello. I saw your paper on arXiv:
https://arxiv.org/pdf/1912.12215.pdf

I have a question: in Fig. 1, Fig. 16, and Fig. 17, the Global and Global+Local images look very similar.

If Local contributes to Global+Local, I would expect the Local pixels in the white areas of the Local weight map to appear in the result, but I do not see that tendency. Am I misunderstanding something?
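For readers following this thread: as the question itself notes, the Local and Global results are combined with pixel-wise weight maps, so how visible the Local branch is depends on what those weights look like. Below is a minimal, hypothetical sketch of that kind of fusion; the module name, the 3x3 weight-prediction head, and the tensor shapes are my assumptions, not the repository's actual code:

```python
# Hypothetical sketch of fusing a global and a local generation result with
# learned pixel-wise weight maps; the real LGGAN code may differ in details.
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    def __init__(self, feat_channels):
        super().__init__()
        # Predict a 2-channel weight map (one channel per branch) from features.
        self.weight_head = nn.Conv2d(feat_channels, 2, kernel_size=3, padding=1)

    def forward(self, feat, img_global, img_local):
        # Softmax over the branch dimension, so the weights sum to 1 per pixel.
        w = torch.softmax(self.weight_head(feat), dim=1)   # (B, 2, H, W)
        w_global, w_local = w[:, 0:1], w[:, 1:2]           # (B, 1, H, W) each
        # Pixels where w_local is near 1 (white in a visualised weight map)
        # are drawn mostly from the local branch, the rest from the global one.
        return w_global * img_global + w_local * img_local

# Usage with dummy tensors:
# fusion = WeightedFusion(feat_channels=64)
# out = fusion(torch.randn(1, 64, 256, 256),
#              torch.randn(1, 3, 256, 256),
#              torch.randn(1, 3, 256, 256))
```

With this kind of fusion, if the learned weights lean heavily toward the global branch, the fused result can look almost identical to the global output even though the local branch still contributes, which may be consistent with the figures mentioned above.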

question

Sorry to bother you, but I have a question: I cannot access your pretrained datasets and models. Could you please share them?

size mismatch for conv weight when running test_ade.sh

Hi,
Thanks for sharing your work. When I tried to reproduce the results using the ADE20K pretrained checkpoint, I came across the following error. I hope you can take a look:

```
LGGAN/semantic_image_synthesis$ sh test_ade.sh
----------------- Options ---------------
aspect_ratio: 1.0
batchSize: 1 [default: 2]
cache_filelist_read: False
cache_filelist_write: False
checkpoints_dir: ./checkpoints
contain_dontcare_label: True
crop_size: 256
dataroot: ./datasets/ade20k [default: ./datasets/cityscapes/]
dataset_mode: ade20k [default: coco]
display_winsize: 256
gpu_ids: 0 [default: 0,1]
how_many: inf
init_type: xavier
init_variance: 0.02
isTrain: False [default: None]
label_nc: 150
load_from_opt_file: False
load_size: 256
max_dataset_size: 9223372036854775807
model: pix2pix
nThreads: 0
name: LGGAN_ade [default: label2coco]
nef: 16
netG: lggan
ngf: 64
no_flip: True
no_instance: True
no_pairing_check: False
norm_D: spectralinstance
norm_E: spectralinstance
norm_G: spectralspadesyncbatch3x3
num_upsampling_layers: normal
output_nc: 3
phase: test
preprocess_mode: resize_and_crop
results_dir: ./results [default: ./results/]
serial_batches: True
use_vae: False
which_epoch: 200 [default: latest]
z_dim: 256
----------------- End -------------------
dataset [ADE20KDataset] of size 2000 was created
Network [LGGANGenerator] was created. Total number of parameters: 114.6 million. To see the architecture, do print(network).
Traceback (most recent call last):
File "test_ade.py", line 20, in
model = Pix2PixModel(opt)
File "/home/you/Work/LGGAN/semantic_image_synthesis/models/pix2pix_model.py", line 25, in init
self.netG, self.netD, self.netE = self.initialize_networks(opt)
File "/home/you/Work/LGGAN/semantic_image_synthesis/models/pix2pix_model.py", line 121, in initialize_networks
netG = util.load_network(netG, 'G', opt.which_epoch, opt)
File "/home/you/Work/LGGAN/semantic_image_synthesis/util/util.py", line 208, in load_network
net.load_state_dict(weights)
File "/home/you/anaconda3/envs/torch1.4-py36-cuda10.1-tf1.14/lib/python3.6/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for LGGANGenerator:
Unexpected key(s) in state_dict: "deconv5_35.weight", "deconv5_35.bias", "deconv5_36.weight", "deconv5_36.bias", "deconv5_37.weight", "deconv5_37.bias", "deconv5_38.weight", "deconv5_38.bias", "deconv5_39.weight", "deconv5_39.bias", "deconv5_40.weight", "deconv5_40.bias", "deconv5_41.weight", "deconv5_41.bias", "deconv5_42.weight", "deconv5_42.bias", "deconv5_43.weight", "deconv5_43.bias", "deconv5_44.weight", "deconv5_44.bias", "deconv5_45.weight", "deconv5_45.bias", "deconv5_46.weight", "deconv5_46.bias", "deconv5_47.weight", "deconv5_47.bias", "deconv5_48.weight", "deconv5_48.bias", "deconv5_49.weight", "deconv5_49.bias", "deconv5_50.weight", "deconv5_50.bias", "deconv5_51.weight", "deconv5_51.bias".
size mismatch for conv1.weight: copying a param with shape torch.Size([64, 151, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 36, 7, 7]).
size mismatch for deconv9.weight: copying a param with shape torch.Size([3, 156, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 105, 3, 3]).
size mismatch for fc2.weight: copying a param with shape torch.Size([51, 64]) from checkpoint, the shape in current model is torch.Size([35, 64]).
size mismatch for fc2.bias: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([35]).

```
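For anyone hitting the same error: the unexpected deconv5_35 … deconv5_51 keys and the 151-vs-36 channel mismatch on conv1.weight suggest the checkpoint was saved for the 150-label ADE20K setting (150 classes plus a don't-care channel), while the generator being instantiated expects 35 labels plus one, which looks like a Cityscapes-style label count somewhere in the network definition. The sketch below is a small, repository-independent diagnostic (the function name and checkpoint path are made up for illustration) that lists exactly which keys and shapes disagree before load_state_dict is called:

```python
# Standalone diagnostic sketch (not part of the LGGAN repository): compare the
# tensor shapes stored in a checkpoint against those of the freshly built
# generator to see which keys and label counts disagree.
import torch

def diff_state_dicts(model, checkpoint_path):
    ckpt = torch.load(checkpoint_path, map_location="cpu")
    model_sd = model.state_dict()

    missing = [k for k in model_sd if k not in ckpt]
    unexpected = [k for k in ckpt if k not in model_sd]
    mismatched = [
        (k, tuple(ckpt[k].shape), tuple(model_sd[k].shape))
        for k in ckpt
        if k in model_sd and ckpt[k].shape != model_sd[k].shape
    ]

    print("missing keys:", missing)
    print("unexpected keys:", unexpected)
    for name, ckpt_shape, model_shape in mismatched:
        print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")

# Hypothetical usage (the path is illustrative only):
# diff_state_dicts(netG, "./checkpoints/LGGAN_ade/200_net_G.pth")
```

Once the disagreeing keys are listed, it should be easier to tell whether the checkpoint or the generator configuration needs to change.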

Links to the pretrained models for Semantic Image Synthesis are broken

Hey!

Thank you for making the source code available.
The links to the pretrained models are broken. Could you fix them?
Thanks!

disi.unitn.it/~hao.tang/uploads/models/LGGAN/cityscapes_pretrained.tar.gz
disi.unitn.it/~hao.tang/uploads/models/LGGAN/ade_pretrained.tar.gz
