junyanz / bicyclegan

1.5K 48.0 253.0 30.85 MB

Toward Multimodal Image-to-Image Translation

Home Page: https://junyanz.github.io/BicycleGAN/

License: Other

Python 90.34% Shell 8.06% TeX 1.60%

pytorch pix2pix gans generative-adversarial-network deep-learning

bicyclegan's Issues

About conditioning the discriminator

Hi,

In Section 4 Implementation Details

Training details ...
We also find that not conditioning the discriminator D on input A leads to
better results (also discussed in [34]), ...

Does this mean that the discriminator has no information to ensure that the generated image is conditioned on image A?
Say we are generating shoes from edges. Being unconditioned on the input image (edges), the discriminator should only be able to tell whether the generated shoes are real or fake, not whether they match the condition (edges)?
Or is some other mechanism handling the conditioning?

Looking forward to your reply, thanks! Nice work!

Training on Piano Roll data

Hey Jun-Yan, thanks for putting this repo together.
I'm trying to train it on piano roll data and have been seeing unexpected behavior: the generator outputs the same image even though the conditions, i.e. noise vector and real A, change.

Any thoughts on what it could be? I've added the loss log, options, output images during training and output images during inference.

loss_log.txt
opt.txt

Model outputs during training (fake_b_encoded, fake_b_random, real_a_encoded, real_b_encoded)

model.set_input(data)
encode = False
z_samples = model.get_z_random(1, opt.nz)
real_A, fake_B, real_B = model.test(z_samples, encode=encode)

Model outputs during inference (real_a, real_b, fake_b)

About disentangled representation of latent code z

Hi, thanks for your amazing work.
I have trained the model on my own dataset and generated multi-style samples for each input by varying the latent code z.

However, I find it difficult to generate a sample with a specific style, and working out the mapping from latent code z to style is very painful.
Is it possible to disentangle the latent code z by adding a mutual information term? Or is there another method to solve this?

Error loading state_dict for G_Unet_add_all

I'm getting a size mismatch when trying to load a saved model. Training works fine.
I compared training and test OPTs and can't find any difference that would produce the error below.

model = create_model(opt)
#Loading model bicycle_gan...
#initialize network with xavier
#initialize network with xavier
#model [BiCycleGANModel] was created
model.setup(opt)
RuntimeError: Error(s) in loading state_dict for G_Unet_add_all:
	While copying the parameter named "model.down.0.weight", whose dimensions in the model are torch.Size([64, 9, 4, 4]) and whose dimensions in the checkpoint are torch.Size([64, 49, 4, 4]).
	While copying the parameter named "model.submodule.down.1.weight", whose dimensions in the model are torch.Size([128, 72, 4, 4]) and whose dimensions in the checkpoint are torch.Size([128, 112, 4, 4]).
...

train_opt.txt
test_opt.txt
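
One way to narrow this down is to compare the input-channel counts reported in the error. The sketch below only does arithmetic on the shapes quoted above. It assumes (this is not confirmed in the thread) that G_Unet_add_all concatenates the latent code to the input of every down block, so each block's input channels would be feature channels plus nz. Under that assumption the numbers would be consistent with nz=8 in the test options and nz=48 in the checkpoint.

```python
# Shapes copied from the error message: (out_channels, in_channels, kH, kW).
model_shapes = {
    "model.down.0.weight": (64, 9, 4, 4),
    "model.submodule.down.1.weight": (128, 72, 4, 4),
}
checkpoint_shapes = {
    "model.down.0.weight": (64, 49, 4, 4),
    "model.submodule.down.1.weight": (128, 112, 4, 4),
}


def input_channel_gap(model, checkpoint):
    """Return checkpoint_in_channels - model_in_channels for each shared key."""
    return {k: checkpoint[k][1] - model[k][1] for k in model if k in checkpoint}


def extra_channels(shapes):
    """Second block's input channels minus first block's output channels.

    Assumption: the surplus is the latent code concatenated at that block.
    """
    first = shapes["model.down.0.weight"]
    second = shapes["model.submodule.down.1.weight"]
    return second[1] - first[0]


gaps = input_channel_gap(model_shapes, checkpoint_shapes)
print(gaps)                               # same gap at both layers
print(extra_channels(model_shapes))       # surplus channels in the test-time model
print(extra_channels(checkpoint_shapes))  # surplus channels in the checkpoint
```

If the surplus really is the latent dimension, checking the nz value in train_opt.txt against test_opt.txt would be the first thing to try.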

why "self.real_B_random = self.real_B[half_size:]" but not the first half

For the line below: since the batch size is 2, the two images in self.real_B are different.

self.real_B_random = self.real_B[half_size:]

BicycleGAN/models/bicycle_gan_model.py

Line 100 in 9cd9081

self.real_B_random = self.real_B[half_size:]

BicycleGAN/models/bicycle_gan_model.py

Line 178 in 9cd9081

self.loss_D2, self.losses_D2 = self.backward_D(self.netD2, self.real_data_random, self.fake_data_random)

It seems that D tries to distinguish between a fake encoded B and a real image, but in the code above the real image comes from a different image file.

So why not use self.real_B_random = self.real_B[0:half_size], which is the same image, instead?

If the two images are very different (say the first, encoded image comes from a "sneaker" and the second, real random image is a "high heel"), does this help improve D?

Adding information to BicycleGAN network

As far as I understand the paper, it is possible to add images and additional information to G and D in BicycleGAN. Can this be achieved with the add_to_input function? If so, where do I set the number and the path of the additional information? Otherwise, which function should be used for this purpose?

Thank you in advance.

Odd batch sizes

I tried to classify some images with pix2pix + CycleGAN and with BicycleGAN. In pix2pix the best results occurred with a batch size of 1. However, in BicycleGAN I get an error that the batch size should be even. When removing the corresponding assertion, I get a cuDNN error due to a bad parameter. Is the assertion necessary for aligned datasets?
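
The forward code quoted elsewhere on this page computes half_size = batch_size // 2 and slices real_B[half_size:], so each batch is split into an "encoded" half and a "random" half. A minimal sketch with plain lists (split_batch is a hypothetical helper, not a function from the repository) shows what happens when the batch size is 1:

```python
def split_batch(batch):
    """Split a batch into an 'encoded' half and a 'random' half."""
    if len(batch) % 2 != 0:
        raise ValueError("batch size must be even, got %d" % len(batch))
    half_size = len(batch) // 2
    return batch[:half_size], batch[half_size:]


encoded, random_half = split_batch(["img0", "img1"])
print(encoded, random_half)

# Without the check, a batch of one leaves the first half empty.
single = ["img0"]
half_size = len(single) // 2
empty_half, rest = single[:half_size], single[half_size:]
print(empty_half, rest)
```

An empty half would then be passed to the networks, which may be what the backend rejects once the assertion is removed.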

cuDNN error

I am using a training dataset of 203 samples and a validation set of 70 samples.
Is this error due to the small number of samples?

Installation issue

We are trying to install your package but have run into a couple of problems.
We installed VS Code and the Python plugins, and made sure it runs on conda.
It couldn't process the bash command ('bash' is not recognized as an internal or external command, operable program or batch file.).
So we installed Cygwin, enabled the Linux subsystem feature, enabled bash, downloaded libusbwin32 and ran C:\WINDOWS\system32>lxrun /install in the command prompt.
Now when I try to run bash install_conda, VS Code reports "/bin/bash: /scripts/install_conda.sh: No such file or directory" even though I'm sure I'm using the correct path for the aforementioned script.

Multiple inputs network

Hi,
Thanks for sharing this amazing work!
I have applied BicycleGAN to MRI image translation tasks, and it works well!
Now I am trying to change the network into a multiple-input network. My idea is that, given multiple corresponding inputs, the output will be more realistic and accurate.
Do you think this is possible to achieve based on BicycleGAN?

Can I use my own data for training?

A resize_ error

Hello! When I run test_edges2handbags.sh, I get an error. What's the problem?

model [BiCycleGANModel] was created
Loading model bicycle_gan
process input image 000/010
Traceback (most recent call last):
File "./test.py", line 41, in
model.set_input(data)
File "/home/jaheimlee/gitrep/BicycleGAN/models/base_model.py", line 193, in set_input
self.input_A.resize_(input_A.size()).copy_(input_A)
RuntimeError: calling resize_ on a tensor that has non-resizable storage. Clone it first or create a new tensor instead.

Question about backward_G_alone

In this line could you explain why torch.mean(torch.abs(self.mu2 - self.z_random)) not torch.mean(torch.abs(self.z_predict - self.z_random))?

[Question] The bidirectional cycle-consistency losses advantage compared to pix2pix?

Hi, I am really impressed by the awesome work you have achieved, but I am confused about how BicycleGAN's results differ from those of pix2pix and CycleGAN.

Does BicycleGAN give better results on edges2shoes than pix2pix with the same training set and number of epochs?
Does BicycleGAN train faster using bidirectional cycle-consistency losses?
Under what conditions is BicycleGAN better than ... others, and when is it not?

Should this be real_A_random instead of real_A_encoded?

BicycleGAN/models/bicycle_gan_model.py

Line 64 in 944a87d

[self.real_A_encoded, self.fake_B_random], 1)

Running BicycleGAN on CPU

I have an issue regarding the execution of BicycleGAN on the CPU. As far as I understood the code, the parameter gpu_ids has to be set to '-1'. How can I achieve this by running the script train_edges2shoes.sh?

In other words, is there a command similar to: "bash ./scripts/train_edges2shoes.sh --gpu_ids=-1" (without quotation marks) which allows us to run BicycleGAN on the CPU?

Thank you in advance.

Why use parallel_forward

I have a question about the code of discriminator.
In Class D_NLayersMulti
it uses parallel_forward.

But I don't see why parallel_forward is used. Besides, the paper says it uses PatchGAN, but I could not find any code that creates a PatchGAN.

how was wgan-gp used in the network?

I noticed that there is a function cal_gradient_penalty in your code, but it seems that you didn't use it?

RuntimeError: size mismatch , when using my own 512x512 dataset

python train.py --dataroot ./datasets/maps --name maps_bicyclegan --model bicycle_gan --display_id 0 --nThread 0 --loadSize 512 --fineSize 512 --display_winsize 512 --which_direction 'AtoB' --use_dropout --gpu_ids 0

model [BiCycleGANModel] was created
create web directory ./checkpoints/maps_bicyclegan/web...
Traceback (most recent call last):
  File "train.py", line 25, in <module>
    model.update_D(data)
  File "/workdir/BicycleGAN/models/bicycle_gan_model.py", line 120, in update_D
    self.forward()
  File "/workdir/BicycleGAN/models/bicycle_gan_model.py", line 43, in forward
    self.mu, self.logvar = self.netE.forward(self.real_B_encoded)
  File "/workdir/BicycleGAN/models/networks.py", line 697, in forward
    output = self.fc(conv_flat)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/linear.py", line 55, in forward
    return F.linear(input, self.weight, self.bias)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 835, in linear
    return torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch at /pytorch/torch/lib/THC/generic/THCTensorMathBlas.cu:243

I tested 256x256 and it works fine, but any other size raises the exception above.
I have not been able to fix this on my own after hours of trying ...
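
The traceback fails in self.fc(conv_flat) inside the encoder, which is the usual symptom of a fully-connected layer whose input width was fixed for one image size. The sketch below illustrates the arithmetic for a hypothetical encoder that halves the resolution a fixed number of times and then flattens. The stage count and channel count are made-up values for illustration, not read from networks.py.

```python
def flattened_features(image_size, num_downsamples, channels):
    """Number of values fed to a linear layer after flattening a conv feature map."""
    spatial = image_size // (2 ** num_downsamples)
    return channels * spatial * spatial


NUM_DOWNSAMPLES = 8  # hypothetical: eight stride-2 stages
CHANNELS = 256       # hypothetical channel count of the last conv layer

# The linear layer is sized once, for the resolution the network was designed for.
fc_in_features = flattened_features(256, NUM_DOWNSAMPLES, CHANNELS)

for size in (256, 512):
    n = flattened_features(size, NUM_DOWNSAMPLES, CHANNELS)
    status = "ok" if n == fc_in_features else "size mismatch"
    print(size, n, status)
```

In a design like this, other resolutions need either more downsampling or pooling to a fixed spatial size before the linear layer.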

Why doesn't E_ResNet have activation functions in the fully-connected layers?

I've noticed that self.fc = nn.Sequential(*[nn.Linear(output_ndf, output_nc)]) or self.fcVar = nn.Sequential(*[nn.Linear(output_ndf, output_nc)]) is the last layer of E_ResNet, depending on whether vaeLike is True. So why doesn't it need an activation function before the output?

How to reproduce the data of LPIPS distance?

Your work gets surprising results, and I would like to reproduce the LPIPS distances listed in Figure 6. Given one input image, you sample 19 outputs. For every input (maps), do you calculate the LPIPS distance between the given image (maps) and the corresponding 19 samples (satellite)? Do you then sum those 19 values and take the average? Is it the same for the other 99 input images in your experiment? I'm confused about this and look forward to your reply, thank you!
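
The thread does not record the authors' answer, so the following is only one possible reading of a diversity score: average a distance over pairs of samples generated for the same input, then average over inputs. The abs-difference distance and the scalar "samples" are stand-ins for LPIPS and images, and both function names are hypothetical.

```python
from itertools import combinations


def mean_pairwise_distance(samples, distance):
    """Average distance over all unordered pairs of samples for one input."""
    pairs = list(combinations(samples, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


def dataset_score(samples_per_input, distance):
    """Average the per-input scores over all inputs."""
    scores = [mean_pairwise_distance(s, distance) for s in samples_per_input.values()]
    return sum(scores) / len(scores)


def toy_distance(a, b):
    # Stand-in for a perceptual distance such as LPIPS.
    return abs(a - b)


toy_samples = {"input_a": [0.0, 1.0, 3.0], "input_b": [2.0, 2.0, 2.0]}
score_a = mean_pairwise_distance(toy_samples["input_a"], toy_distance)
score_all = dataset_score(toy_samples, toy_distance)
print(score_a, score_all)
```

Comparing each sample against the input image instead would measure distance across domains rather than diversity among outputs, which is why the pairwise reading is sketched here.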

Differences between released model and training script

First, thanks for releasing a clean and well trained code for your paper!
I tried retraining the edges2shoes model using the "scripts/train_edges2shoes.sh" script. However, the validation results generated by my retrained model look clearly inferior to results from your released trained model. So, are there any extra tricks used for the released trained model (e.g. longer training, different hyper parameter or loss settings, ... etc)?
Details: I trained using a single Quadro P6000 GPU (with cuda 9.0 and python 2.7) for the full 60 epochs without changing your code/scripts.

Is it possible to generate labels from facades images?

Is it possible to generate labels from facades images?
We tried both --which_direction B2A and --which_direction A2B, but the results are the same. Is there anything I missed?
Thanks a lot!

the database of day2night

I sent an email to [email protected] but did not get the link to the database. Could you send a link to this database to my email, [email protected]? Thank you.

undefined name 'OneDirectionTestModel'

flake8 testing of https://github.com/junyanz/BicycleGAN on Python 2.7.14

$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

./models/models.py:10:17: F821 undefined name 'OneDirectionTestModel'
        model = OneDirectionTestModel()
                ^

out of memory

When I run the command bash ./scripts/train_edges2shoes.sh
I get the following error:
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/THC/generic/THCStorage.cu:58
How can I solve this problem?

LeakyReLU object has no attribute conv

When testing my own network trained with BicycleGAN, sometimes there appears an error termed "LeakyReLU object has no attribute conv". What is the reason for this behaviour and which part of the code should be adapted to alleviate this error? I suppose that there might be a mistake when defining the paths in the testing script.

Additionally, how should the testing script be adapted when the input images are 512x512 pixels?

I have some question

Is it also "Unsupervised or Unpaired Image-to-Image Translation" like CycleGAN?
When I skimmed the paper, I thought it was supervised.

But when I read the code, it looks unsupervised ...

What does this mean?

$GMDQ_LVP0YS78IY~@ $@3{W$

I can't train or use a pre-trained model because of this. Has anyone fixed it?

About Encoder

line 682 and 683 in class E_ResNet in models/networks.py :
input_ndf = ndf * min(max_ndf, n) # 2**(n-1)
output_ndf = ndf * min(max_ndf, n+1) # 2**n

The code and code comment are not consistent. The multiplier of the number of first filters is not growing exponentially.
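
The mismatch is easy to see by tabulating both expressions. The sketch below compares the multiplier from the quoted code, min(max_ndf, n), with the capped power of two that the comment suggests. The value max_ndf = 4 and the range of n are assumptions chosen for illustration.

```python
def multipliers(max_ndf, n_values):
    """Return (n, multiplier from the code, capped power of two from the comment)."""
    rows = []
    for n in n_values:
        from_code = min(max_ndf, n)
        from_comment = min(max_ndf, 2 ** (n - 1))
        rows.append((n, from_code, from_comment))
    return rows


MAX_NDF = 4  # assumed cap, for illustration only

table = multipliers(MAX_NDF, range(1, 5))
for n, from_code, from_comment in table:
    print(n, from_code, from_comment)
```

With this cap the two schedules differ only at n = 3, where the code gives a multiplier of 3 and the comment implies 4.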

What's the difference between D_NLayersMulti and D_NLayers?

It seems that D_NLayersMulti has two discriminators while D_NLayers has only one. But how do two discriminators work?

not sure if the code is written deliberately about the loss

day-night

hello junyanz
Can you share day-night with me?
thank you very much

Unexpected pdb debugger trace statement in commit

Hi, in your commit
219b3f9
there are lots of import pdb; pdb.set_trace() debugger statements within the code

   def forward(self):
        # get real images
        half_size = self.opt.batchSize // 2
        half_size = self.opt.batch_size // 2
        import pdb; pdb.set_trace()
        # A1, B1 for encoded; A2, B2 for random

I am not sure if this is intended, or if you just forgot to remove them after debugging?

what is the meaning of fake_B_random and fake_B_encoded

Hi!

Thank you for sharing this wonderful work.
I have one simple question.
In the result visualizations, what is the meaning of fake_B_random and fake_B_encoded?

Thank you

Hi folks, could you try the latest commit? Hopefully, it can address your issues.

Originally posted by @junyanz in #54 (comment)

Yes, now its working great!! Thank you.

python3 and slice indices must be integers...

I tried training (based on my experience with pytorch-CycleGAN-pix2pix); my PyTorch setup is based on Python 3.6. I got the error "slice indices must be integers or None or have an __index__ method" because of the division at line https://github.com/junyanz/BicycleGAN/blob/master/models/bicycle_gan_model.py#L35

I suggest adding explicit conversion to int at this line.
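
The error comes from Python 3's true division: / always returns a float, and floats cannot be used as slice bounds. A small standalone sketch with a list in place of a tensor reproduces it and shows floor division as one alternative to an explicit int() conversion:

```python
batch_size = 2
images = ["img0", "img1"]

half_true = batch_size / 2    # float on Python 3
half_floor = batch_size // 2  # int on both Python 2 and Python 3

try:
    images[:half_true]
    message = None
except TypeError as err:
    message = str(err)

print(type(half_true).__name__, message)
print(type(half_floor).__name__, images[:half_floor])
```

Floor division keeps the result an integer on both interpreter versions.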

About EOFError: Ran out of input

Hello, sorry to bother you. Could you please help me fix this problem? When I run the command bash ./scripts/test_edges2shoes.sh
I got the following error:

File "/home/ysgx/.local/lib/python3.6/site-packages/torch/serialization.py", line 368, in load
return _load(f, map_location, pickle_module)
File "/home/ysgx/.local/lib/python3.6/site-packages/torch/serialization.py", line 532, in _load
magic_number = pickle_module.load(f)
EOFError: Ran out of input

Does that mean there is something wrong with my installation of PyTorch?
I look forward to receiving your reply as soon as possible.
Thank you very much.

skip this point data_size = 1

Dear sir, when I run the script bash ./scripts/train_edges2shoes.sh, the following RuntimeError occurs.
(epoch: 1, iters: 49400, time: 0.311) , z_encoded_mag: 0.577, G_total: 4.293, G_L1_encoded: 2.367, z_L1: 0.259, KL: 0.076, G_GAN: 1.002, D_GAN: 0.498, G_GAN2: 0.589, D_GAN2: 0.988
(epoch: 1, iters: 49600, time: 0.322) , z_encoded_mag: 0.409, G_total: 2.001, G_L1_encoded: 0.385, z_L1: 0.302, KL: 0.069, G_GAN: 0.794, D_GAN: 0.960, G_GAN2: 0.450, D_GAN2: 1.137
(epoch: 1, iters: 49800, time: 0.311) , z_encoded_mag: 0.939, G_total: 3.441, G_L1_encoded: 1.597, z_L1: 0.373, KL: 0.079, G_GAN: 0.833, D_GAN: 0.774, G_GAN2: 0.560, D_GAN2: 1.015
skip this point data_size = 1
Traceback (most recent call last):
File "./train.py", line 28, in
model.update_G()
File "/home/rharad/junyanz/BicycleGAN/models/bicycle_gan_model.py", line 148, in update_G
self.backward_EG()
File "/home/rharad/junyanz/BicycleGAN/models/bicycle_gan_model.py", line 114, in backward_EG
self.loss_G.backward(retain_graph=True)
File "/home/rharad/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
File "/home/rharad/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
variables, grad_variables, retain_graph)
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.

How can I solve this problem.....

narrow tensor to zero elements fails

When I ran the code on a single GPU, I got an error:

...
RuntimeError: start (0) + length (0) exceeds dimension size (0). (narrow at /opt/conda/conda-bld/pytorch_1533672544752/work/aten/src/ATen/native/TensorShape.cpp:157)
frame #0: at::Type::narrow(at::Tensor const&, long, long, long) const + 0x80 (0x7f5e4df3de80 in /data/yahui/anaconda2/envs/pytorch-CycleGAN-and-pix2pix/lib/python3.5/site-packages/torch/lib/libcaffe2.so)
...

I have checked the installation and configuration; both are correct.
So, do you have any ideas about the problem?

Issue while running test.py

Hi, I am getting the following error when running the code. I am using Python 3.6 and have installed the latest PyTorch version (1.0).

Could you suggest what the issue is?

Training cLR-GAN or cVAE-GAN

Hi,
Thank you very much for sharing this fantastic work!
I would like to train a cLR-GAN or cVAE-GAN model.
I think the default setting is MODEL='bicycle_gan'.
Is there any way to train each model separately?

Thank you!

About val data

Hi, thanks for your amazing work.
I trained the network on my own database, and it works well.
However, I don't have much training data (only about 700 images).
My questions are:

How much data should I put in the val folder? Or is the val folder just for testing the network?
Besides the paired data, I also have some unpaired data (for example, many B but few A; training direction: A2B). Let me know if you are considering an algorithm for such a semi-supervised situation.

Thanks.

Is there any reason why relu is the default option?

I wonder why you used relu as the default non-linearity. I know many models use leaky relu; is there any reason why relu is the default option? Did experiments show better results?

An error in `UnetBlock_with_z`

The way you inject noise in UnetBlock_with_z is possibly ineffective.

Let's consider an example:

import torch
import torch.nn as nn

s = 16
x = torch.rand(1, 10, s, s)
p = 0

# your downsampling block
down = nn.Sequential(
    nn.LeakyReLU(0.2),
    nn.Conv2d(18, 10, kernel_size=4, stride=2, padding=p),
    nn.InstanceNorm2d(10)
)

z = torch.rand(1, 8)  # noise
z_img = z.view(z.size(0), z.size(1), 1, 1).expand(z.size(0), z.size(1), x.size(2), x.size(3))

x_and_z = torch.cat([x, z_img], 1)
y = down(x_and_z)

y doesn't depend on z if p is equal to zero.
If p is equal to 1 and s is large (~128) then y changes in some weird way.

I am talking about this part of your code:
https://github.com/junyanz/BicycleGAN/blob/master/models/networks.py#L729
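
The claim can be checked without PyTorch. The sketch below is a 1D, single-output-channel analogue written with the standard library only, not the repository's block: a spatially constant z channel passes through an unpadded convolution as a constant offset, and instance normalization subtracts that offset again. The weights and input are arbitrary illustrative values, and the LeakyReLU is omitted because it maps a constant channel to another constant channel.

```python
import math


def conv1d(channels, weights, stride, padding):
    """Multi-channel 1D cross-correlation with zero padding and one output channel."""
    padded = [[0.0] * padding + list(c) + [0.0] * padding for c in channels]
    k = len(weights[0])
    length = len(padded[0])
    out = []
    for start in range(0, length - k + 1, stride):
        out.append(sum(w[j] * c[start + j]
                       for c, w in zip(padded, weights) for j in range(k)))
    return out


def instance_norm(values, eps=1e-5):
    """Normalize one channel to zero mean and unit variance over its positions."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def block(x, z, weights, padding):
    z_channel = [z] * len(x)  # z broadcast along the spatial axis
    return instance_norm(conv1d([x, z_channel], weights, stride=2, padding=padding))


x = [math.sin(i) for i in range(16)]
weights = [[0.5, -0.25, 0.75, 0.1],  # kernel applied to x
           [0.3, 0.2, -0.4, 0.6]]    # kernel applied to the z channel


def max_diff(padding):
    """Largest change in the output when z goes from 0.1 to 0.9."""
    a = block(x, 0.1, weights, padding)
    b = block(x, 0.9, weights, padding)
    return max(abs(p - q) for p, q in zip(a, b))


print(max_diff(0))  # no padding: the output ignores z up to rounding error
print(max_diff(1))  # zero padding: z only leaks in through the border positions
```

With padding, the dependence on z comes only from border positions where the kernel overlaps the zero padding, which may explain the "weird" changes described for p = 1.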

Optimize G only for latent L1 loss

BicycleGAN/models/bicycle_gan_model.py

Line 161 in 5bd76ac

self.optimizer_G.step()

Hi, Junyan,
I am confused about why the latent L1 loss is used to optimize only G and not E.

By the way this line
self.fake_data_random = torch.cat([self.real_A_encoded, self.fake_B_random], 1)
should be
self.fake_data_random = torch.cat([self.real_A_random, self.fake_B_random], 1).
Right?

An Error in train or test `RuntimeError: output with shape [1, 256, 256] doesn't match the broadcast shape [3, 256, 256]`

Hello!
While using the code, I ran into a confusing bug:
RuntimeError: output with shape [1, 256, 256] doesn't match the broadcast shape [3, 256, 256]
I don't know why this error appears. Can you help me fix it?
In any case, thank you for your great project!

about multi-gpu support

Hi, thanks for your amazing work. I have a problem with multi-GPU support. The original batch size is 2: one real image and one random image, which is in fact one pair. If we want to use multiple GPUs for faster training, the batch size should be larger, but simply changing the batch size causes an error. Is it easy to modify the code for this purpose?

Is the training available now?

If yes, please let us know how to.

question about the loss function

Dear authors,
thanks for your amazing work.

I have a question about the value of the lambdas in the loss function. I understood that they define a certain balance between converging to several images close to the original B, and diverging a bit from the original B.

By default, you chose:

lambda_L1=10.0
lambda_GAN=1.0
lambda_GAN2=1.0
lambda_z=0.5
lambda_kl=0.01

I trained the bicycle_gan model with the default parameters. My dataset consists of 90,000 images. While some results are impressive, the majority remain a bit blurry (they don't follow the same probability distribution as my original dataset). I guess lambda_L1 is not high enough to force the network to generate images that follow the original probability distribution. What results would you get with lambda_L1=100, as in pix2pix?
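
To get a feel for what changing lambda_L1 does, it can help to look at how much of the generator objective the L1 term accounts for. The sketch below assumes the objective is a plain weighted sum of the five terms with the lambdas listed above. The raw loss values are hypothetical numbers picked for illustration, not measurements from any run.

```python
DEFAULT_LAMBDAS = {"L1": 10.0, "GAN": 1.0, "GAN2": 1.0, "z": 0.5, "kl": 0.01}


def weighted_terms(raw_losses, lambdas):
    """Multiply each raw loss by its lambda."""
    return {name: lambdas[name] * value for name, value in raw_losses.items()}


def l1_share(raw_losses, lambdas):
    """Fraction of the weighted total contributed by the L1 term."""
    terms = weighted_terms(raw_losses, lambdas)
    return terms["L1"] / sum(terms.values())


# Hypothetical unweighted loss values.
raw = {"L1": 0.1, "GAN": 1.0, "GAN2": 1.0, "z": 0.5, "kl": 5.0}

share_default = l1_share(raw, DEFAULT_LAMBDAS)
share_pix2pix = l1_share(raw, dict(DEFAULT_LAMBDAS, L1=100.0))
print(round(share_default, 3), round(share_pix2pix, 3))
```

A larger share for L1 pulls outputs toward the ground-truth image, so it is not obvious that it would reduce blur. That trade-off has to be checked experimentally.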
