junyanz / bicyclegan

1.5K 48.0 253.0 30.85 MB

Toward Multimodal Image-to-Image Translation

Home Page: https://junyanz.github.io/BicycleGAN/

License: Other

Python 90.34% Shell 8.06% TeX 1.60%
pytorch pix2pix gans generative-adversarial-network deep-learning

bicyclegan's People

Contributors

cuihaoleo, junyanz, ploth, richzhang, yenchenlin

bicyclegan's Issues

About conditioning the discriminator

Hi,

In Section 4 Implementation Details

Training details ...
We also find that not conditioning the discriminator D on input A leads to
better results (also discussed in [34]), ...

Does this mean that the discriminator has no information to ensure that the generated image is conditioned on image A?
Say we are generating shoes from edges. Because it is not conditioned on the input image (edges), the discriminator should only be able to tell whether the generated shoes are real or fake, not whether they match the condition (edges)?
Or is some other mechanism handling the conditioning?

Looking forward to your reply, thanks! Nice work!

Training on Piano Roll data

Hey Jun-Yan, thanks for putting this repo together.
I'm trying to train it on piano roll data and have been seeing unexpected behavior: the generator outputs the same image even though the conditions, i.e. noise vector and real A, change.

Any thoughts on what it could be? I've added the loss log, options, output images during training and output images during inference.

loss_log.txt
opt.txt

Model outputs during training (fake_b_encoded, fake_b_random, real_a_encoded, real_b_encoded):
[images: fake_b_encoded, fake_b_random, real_a_encoded, real_b_encoded]

model.set_input(data)
encode = False
z_samples = model.get_z_random(1, opt.nz)
real_A, fake_B, real_B = model.test(z_samples, encode=encode)

Model outputs during inference (real_a, real_b, fake_b):

[image: model output during inference]

About disentangled representation of latent code z

Hi, thanks for your amazing work.
I have trained the model on my own dataset and generated multi-style samples for the same input using different latent codes z.

However, I find it difficult to generate a sample in a specific style, and working out the mapping from latent code z to style is painful.
Is it possible to disentangle the latent code z by adding a mutual information term? Or is there another method to solve this?

Error loading state_dict for G_Unet_add_all

I'm getting a size mismatch when trying to load a saved model. Training works fine.
I compared training and test OPTs and can't find any difference that would produce the error below.

model = create_model(opt)
#Loading model bicycle_gan...
#initialize network with xavier
#initialize network with xavier
#model [BiCycleGANModel] was created
model.setup(opt)
RuntimeError: Error(s) in loading state_dict for G_Unet_add_all:
	While copying the parameter named "model.down.0.weight", whose dimensions in the model are torch.Size([64, 9, 4, 4]) and whose dimensions in the checkpoint are torch.Size([64, 49, 4, 4]).
	While copying the parameter named "model.submodule.down.1.weight", whose dimensions in the model are torch.Size([128, 72, 4, 4]) and whose dimensions in the checkpoint are torch.Size([128, 112, 4, 4]).
...

train_opt.txt
test_opt.txt
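
The shapes in the error message hint at which option differs. A minimal sketch, assuming (as the name G_Unet_add_all and the concatenation discussed in the UnetBlock_with_z issue suggest) that the first down convolution sees input_nc + nz channels and the next one sees ngf + nz channels. The helper name infer_nz_and_input_nc is hypothetical, and ngf=64 is read from the first dimension of the first weight:

```python
def infer_nz_and_input_nc(first_in, second_in, ngf=64):
    """Recover (nz, input_nc) from the in_channels of the first two down convs.

    Assumption: first_in = input_nc + nz and second_in = ngf + nz.
    """
    nz = second_in - ngf
    input_nc = first_in - nz
    return nz, input_nc


# in_channels taken from the error message (second dimension of each weight)
model_nz, model_input_nc = infer_nz_and_input_nc(9, 72)
ckpt_nz, ckpt_input_nc = infer_nz_and_input_nc(49, 112)

print("model:      nz=%d, input_nc=%d" % (model_nz, model_input_nc))
print("checkpoint: nz=%d, input_nc=%d" % (ckpt_nz, ckpt_input_nc))
```

Under that assumption the checkpoint was trained with a 48-dimensional latent code while the test model was built with an 8-dimensional one, so the latent-size option would be the first thing to compare between the two option files.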

why "self.real_B_random = self.real_B[half_size:]" but not the first half

For the line below, the batch size is 2, so the two images in self.real_B are different.

self.real_B_random = self.real_B[half_size:]

self.loss_D2, self.losses_D2 = self.backward_D(self.netD2, self.real_data_random, self.fake_data_random)

It seems that D tries to distinguish between a fake encoded B and a real image, but the real image comes from a different image file in the code above.

So why not use self.real_B_random = self.real_B[0:half_size], which is the same image, instead?

If the two images are very different, e.g. the first (encoded) image is a "sneaker" and the second (real, random) image is a "high heel", does this help to improve D?
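
The slicing being asked about can be sketched with plain lists. This is only an illustration of how a batch splits into an "encoded" half and a "random" half; the file names are made up and split_batch is a hypothetical helper, not a function from the repository:

```python
def split_batch(batch):
    """Split a batch into an 'encoded' half and a 'random' half."""
    half_size = len(batch) // 2
    encoded = batch[0:half_size]  # paired with a latent code from the encoder
    random_half = batch[half_size:]  # paired with a randomly sampled latent code
    return encoded, random_half


real_B = ["sneaker.jpg", "high_heel.jpg"]  # hypothetical batch of size 2
real_B_encoded, real_B_random = split_batch(real_B)
print(real_B_encoded, real_B_random)

# With a batch of 1, half_size is 0 and the 'encoded' half is empty.
single = split_batch(["only.jpg"])
print(single)
```

With a batch size of 2 the two halves always hold different images, which is the situation the question describes.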

Adding information to BicycleGAN network

As far as I understood the paper, it is possible to add images and additional information to G and D in BicycleGAN. Is it possible to achieve this by using the add_to_input function? If so, where do I set the number and the path of the additional information? Otherwise, which function should be used for this purpose?

Thank you in advance.

Odd batch sizes

I tried to classify some images with pix2pix + CycleGAN and with BicycleGAN. In pix2pix the best results occurred with a batch size of 1. However, in BicycleGAN I get an error saying that the batch size should be even. When I remove the corresponding assertion, I get a cuDNN error due to a bad parameter. Is the assertion necessary for aligned datasets?

cuDNN error

I am using a training dataset of 203 samples and a validation set of 70 samples.
Is this error due to the small number of samples?
[screenshot from 2019-01-09 12-50-09]

Installation issue

We are trying to install your package but ran into a couple of problems.
We installed VS Code and the Python plugins, and made sure it runs on conda.
It could not process the bash command ('bash' is not recognized as an internal or external command, operable program or batch file.).
So we installed Cygwin, enabled the Linux subsystem feature, enabled bash, downloaded libusbwin32, and ran C:\WINDOWS\system32>lxrun /install in the command prompt.
Now when I try to bash install_conda, VS Code prompts me with "/bin/bash: /scripts/install_conda.sh: No such file or directory", even though I'm sure I'm giving the correct path for the aforementioned script.

Multiple inputs network

Hi,
Thanks for sharing this amazing work!
I have applied BicycleGAN to MRI image translation tasks, and it works well!
Now I am trying to change the network into a multiple-input network. My idea is that, given multiple corresponding inputs, the output will be more realistic and accurate.
Do you think this is possible to achieve with BicycleGAN?

A resize_ error

Hello! When I run test_edges2handbags.sh, I get an error. What's the problem?

model [BiCycleGANModel] was created
Loading model bicycle_gan
process input image 000/010
Traceback (most recent call last):
File "./test.py", line 41, in
model.set_input(data)
File "/home/jaheimlee/gitrep/BicycleGAN/models/base_model.py", line 193, in set_input
self.input_A.resize_(input_A.size()).copy_(input_A)
RuntimeError: calling resize_ on a tensor that has non-resizable storage. Clone it first or create a new tensor instead.

Question about backward_G_alone

In this line, could you explain why it uses torch.mean(torch.abs(self.mu2 - self.z_random)) rather than torch.mean(torch.abs(self.z_predict - self.z_random))?

[Question] The bidirectional cycle-consistency losses advantage compared to pix2pix?

Hi, I am really impressed by the awesome work you have done, but I am still confused about how BicycleGAN's results differ from those of pix2pix and CycleGAN.

  1. Does BicycleGAN give better results on edges2shoes than pix2pix with the same training set and number of epochs?
  2. Does BicycleGAN train faster by using bidirectional cycle-consistency losses?
  3. Under what conditions is BicycleGAN better than the others, and when is it not?

Running BicycleGAN on CPU

I have an issue regarding the execution of BicycleGAN on the CPU. As far as I understood the code, the parameter gpu_ids has to be set to '-1'. How can I achieve this by running the script train_edges2shoes.sh?

In other words, is there a command similar to: "bash ./scripts/train_edges2shoes.sh --gpu_ids=-1" (without quotation marks) which allows us to run BicycleGAN on the CPU?

Thank you in advance.
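
How a "-1" value could select the CPU can be sketched as follows. This is a minimal illustration, assuming the option is a comma-separated string and that negative ids are simply dropped so an empty list means CPU; parse_gpu_ids is a hypothetical helper, and the sketch does not settle whether the shell script forwards extra arguments to train.py:

```python
import argparse


def parse_gpu_ids(argv):
    """Parse --gpu_ids as a comma-separated string; negative ids are dropped."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu_ids", type=str, default="0")
    opt = parser.parse_args(argv)
    return [int(s) for s in opt.gpu_ids.split(",") if int(s) >= 0]


cpu_ids = parse_gpu_ids(["--gpu_ids", "-1"])   # empty list: run on CPU
multi_ids = parse_gpu_ids(["--gpu_ids", "0,1"])  # two GPUs
print(cpu_ids, multi_ids)
```

Both "--gpu_ids -1" and "--gpu_ids=-1" parse the same way with argparse, so either spelling would work on a command line that reaches such a parser.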

Why use parallel_forward

I have a question about the discriminator code.
The class D_NLayersMulti uses parallel_forward.

But I don't understand why parallel_forward is used. Also, the paper says it uses a PatchGAN, but I could not find any code that creates a PatchGAN.
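
On the PatchGAN part: a PatchGAN is often not a separate class but a plain stack of strided convolutions whose output is a grid of real/fake scores, each covering a limited patch of the input. A minimal sketch of the receptive-field arithmetic, assuming the layer configuration of the 70x70 PatchGAN from pix2pix (this configuration is an assumption here, not read from this repository):

```python
def receptive_field(layers):
    """Receptive field of one output unit for a stack of (kernel, stride) convs."""
    rf = 1
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


# Assumed configuration: three stride-2 convs followed by two stride-1 convs,
# all with 4x4 kernels.
patchgan_layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
patch_size = receptive_field(patchgan_layers)
print(patch_size)
```

Each score in the output grid then judges one patch of that size, which is what makes an ordinary N-layer convolutional discriminator a patch-based one.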

RuntimeError: size mismatch , when using my own 512x512 dataset

python train.py --dataroot ./datasets/maps --name maps_bicyclegan --model bicycle_gan --display_id 0 --nThread 0 --loadSize 512 --fineSize 512 --display_winsize 512 --which_direction 'AtoB' --use_dropout --gpu_ids 0

model [BiCycleGANModel] was created
create web directory ./checkpoints/maps_bicyclegan/web...
Traceback (most recent call last):
  File "train.py", line 25, in <module>
    model.update_D(data)
  File "/workdir/BicycleGAN/models/bicycle_gan_model.py", line 120, in update_D
    self.forward()
  File "/workdir/BicycleGAN/models/bicycle_gan_model.py", line 43, in forward
    self.mu, self.logvar = self.netE.forward(self.real_B_encoded)
  File "/workdir/BicycleGAN/models/networks.py", line 697, in forward
    output = self.fc(conv_flat)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/linear.py", line 55, in forward
    return F.linear(input, self.weight, self.bias)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 835, in linear
    return torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch at /pytorch/torch/lib/THC/generic/THCTensorMathBlas.cu:243

I tested 256x256 and it works fine, but any other size raises the exception above.
I have not been able to fix this on my own after hours of trying ...
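
The traceback ends in the encoder's final linear layer, which expects a fixed number of input features. A minimal sketch of why a larger image could break it, assuming the encoder halves the resolution a fixed number of times and then applies a fixed-size average pool before the linear layer. The numbers (5 halvings, an 8x8 pool, 256 channels) are illustrative assumptions, not values read from the repository:

```python
def flattened_features(image_size, n_halvings=5, pool=8, channels=256):
    """Features reaching the linear layer for a square input of image_size."""
    side = image_size // (2 ** n_halvings)  # spatial side after the stride-2 stages
    side = side // pool                     # fixed-size average pooling
    return channels * side * side


expected = flattened_features(256)  # what the linear layer would be sized for
for size in (256, 512):
    got = flattened_features(size)
    print(size, got, "ok" if got == expected else "size mismatch")
```

Under these assumptions a 512x512 input leaves a 2x2 feature map instead of 1x1, so the flattened vector is four times longer than the linear layer expects.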

How to reproduce the data of LPIPS distance?

Your work achieves impressive results, and I would like to reproduce the LPIPS distances that you list in Figure 6. Given one input image, you sample 19 outputs. For every input (map), do you calculate the LPIPS distance between the given image (map) and each of the corresponding 19 samples (satellite)? Do you then average those 19 values? And is it done the same way for the other 99 input images in your experiment? I'm confused about this and look forward to your reply. Thank you!

Differences between released model and training script

First, thanks for releasing clean code and a well-trained model for your paper!
I tried retraining the edges2shoes model using the "scripts/train_edges2shoes.sh" script. However, the validation results generated by my retrained model look clearly inferior to the results from your released trained model. So, were any extra tricks used for the released model (e.g. longer training, different hyperparameters or loss settings, etc.)?
Details: I trained using a single Quadro P6000 GPU (with cuda 9.0 and python 2.7) for the full 60 epochs without changing your code/scripts.

out of memory

When I run the command bash ./scripts/train_edges2shoes.sh
I get the following error:
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/THC/generic/THCStorage.cu:58
How can I solve this problem?

LeakyReLU object has no attribute conv

When testing my own network trained with BicycleGAN, sometimes there appears an error termed "LeakyReLU object has no attribute conv". What is the reason for this behaviour and which part of the code should be adapted to alleviate this error? I suppose that there might be a mistake when defining the paths in the testing script.

Additionally, how should the testing script be adapted when the input images are 512x512 pixels?

I have some questions

Is this also "unsupervised or unpaired image-to-image translation", like CycleGAN?
When I skimmed the paper, I thought it was supervised.

But when I read the code, it looks unsupervised ...

What does this mean?

[image]

Is this the reason I can't train or use a pre-trained model? Has anyone fixed it?

About Encoder

Lines 682 and 683 in class E_ResNet in models/networks.py:
input_ndf = ndf * min(max_ndf, n) # 2**(n-1)
output_ndf = ndf * min(max_ndf, n+1) # 2**n

The code and the code comments are not consistent. The multiplier applied to the base number of filters grows linearly (up to max_ndf), not exponentially as the comments say.
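
The mismatch is easy to see by listing both multipliers side by side. A minimal sketch; the value of max_ndf is a hypothetical cap chosen for illustration:

```python
max_ndf = 4  # hypothetical cap, for illustration only

# Multiplier the code computes for the input filters of block n
code_multipliers = [min(max_ndf, n) for n in range(1, 5)]
# Multiplier the comment describes for the same blocks
comment_multipliers = [2 ** (n - 1) for n in range(1, 5)]

print("code:   ", code_multipliers)
print("comment:", comment_multipliers)
```

The two sequences agree for the first two blocks and diverge from the third block on.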

day-night

Hello junyanz,
Could you share day-night with me?
Thank you very much.

Unexpected pdb debugger trace statement in commit

Hi, in your commit
219b3f9
there are lots of import pdb; pdb.set_trace() debugger statements in the code:

     def forward(self):
         # get real images
-        half_size = self.opt.batchSize // 2
+        half_size = self.opt.batch_size // 2
+        import pdb; pdb.set_trace()
         # A1, B1 for encoded; A2, B2 for random

I am not sure whether this is intended or whether they were simply left in after debugging.

About EOFError: Ran out of input

Hello, sorry to bother you. Could you please help me fix this problem? When I run the command bash ./scripts/test_edges2shoes.sh
I get the following error:

File "/home/ysgx/.local/lib/python3.6/site-packages/torch/serialization.py", line 368, in load
return _load(f, map_location, pickle_module)
File "/home/ysgx/.local/lib/python3.6/site-packages/torch/serialization.py", line 532, in _load
magic_number = pickle_module.load(f)
EOFError: Ran out of input

[image]
Does that mean there is something wrong with my installation of Pytorch?
I look forward to receiving your reply as soon as possible.
Thank you very much.
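
The traceback shows the error being raised while unpickling the checkpoint. One common cause (an assumption here, not confirmed in this thread) is an empty or truncated checkpoint file rather than a broken PyTorch installation. A minimal sketch that reproduces the message with the standard library and adds a cheap size check; looks_loadable is a hypothetical helper:

```python
import io
import os
import pickle
import tempfile


def looks_loadable(path):
    """Cheap sanity check before handing a checkpoint path to a loader."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


# Unpickling an empty stream raises the same EOFError as in the traceback.
try:
    pickle.load(io.BytesIO(b""))
    message = None
except EOFError as err:
    message = str(err)
print(message)

# A zero-byte file, e.g. one left behind by an interrupted download.
with tempfile.NamedTemporaryFile(suffix=".pth", delete=False) as f:
    empty_path = f.name
empty_ok = looks_loadable(empty_path)
os.remove(empty_path)
print(empty_ok)
```

If the check fails for the pretrained model file, downloading it again would be the first thing to try.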

skip this point data_size = 1

Dear sir, when I run the script bash ./scripts/train_edges2shoes.sh, the following RuntimeError occurs.
(epoch: 1, iters: 49400, time: 0.311) , z_encoded_mag: 0.577, G_total: 4.293, G_L1_encoded: 2.367, z_L1: 0.259, KL: 0.076, G_GAN: 1.002, D_GAN: 0.498, G_GAN2: 0.589, D_GAN2: 0.988
(epoch: 1, iters: 49600, time: 0.322) , z_encoded_mag: 0.409, G_total: 2.001, G_L1_encoded: 0.385, z_L1: 0.302, KL: 0.069, G_GAN: 0.794, D_GAN: 0.960, G_GAN2: 0.450, D_GAN2: 1.137
(epoch: 1, iters: 49800, time: 0.311) , z_encoded_mag: 0.939, G_total: 3.441, G_L1_encoded: 1.597, z_L1: 0.373, KL: 0.079, G_GAN: 0.833, D_GAN: 0.774, G_GAN2: 0.560, D_GAN2: 1.015
skip this point data_size = 1
Traceback (most recent call last):
File "./train.py", line 28, in
model.update_G()
File "/home/rharad/junyanz/BicycleGAN/models/bicycle_gan_model.py", line 148, in update_G
self.backward_EG()
File "/home/rharad/junyanz/BicycleGAN/models/bicycle_gan_model.py", line 114, in backward_EG
self.loss_G.backward(retain_graph=True)
File "/home/rharad/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
File "/home/rharad/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
variables, grad_variables, retain_graph)
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.

How can I solve this problem?

narrow tensor to zero elements fails

When I ran the code on a single GPU, I got an error:

...
RuntimeError: start (0) + length (0) exceeds dimension size (0). (narrow at /opt/conda/conda-bld/pytorch_1533672544752/work/aten/src/ATen/native/TensorShape.cpp:157)
frame #0: at::Type::narrow(at::Tensor const&, long, long, long) const + 0x80 (0x7f5e4df3de80 in /data/yahui/anaconda2/envs/pytorch-CycleGAN-and-pix2pix/lib/python3.5/site-packages/torch/lib/libcaffe2.so)
...

I have checked the installation and configuration, and both are correct.
So, do you have any idea what causes the problem?

Issue while running test.py

Hi, I am getting the following error when running the code. I am using Python 3.6 and have installed the latest PyTorch version (1.0).
[screenshot from 2019-03-01 19-35-51]

Could you suggest what the issue might be?

Training cLR-GAN or cVAE-GAN

Hi,
Thank you very much for sharing this fantastic work!
I would like to train a cLR-GAN or cVAE-GAN model on its own.
I think the default setting is MODEL='bicycle_gan'.
Is there a way to train each model separately?

Thank you!

About val data

Hi, thanks for your amazing work.
I trained the network on my own dataset, and it works well.
However, I don't have much training data (only about 700 images).
My questions are:

  1. How much data should I put in the val folder? Or is the val folder just for testing the network?
  2. Besides the paired data, I also have some unpaired data (for example, many B images but few A images; training direction: A2B). Let me know if you are considering an algorithm for such a semi-supervised setting.

Thanks.

Is there any reason why relu is the default option?

I wonder why you use ReLU as the default non-linearity. Many models use leaky ReLU, so is there a reason why ReLU is the default option? Did experiments show better results?

An error in `UnetBlock_with_z`

The way you inject noise in UnetBlock_with_z is possibly ineffective.

Let's consider an example:

import torch
import torch.nn as nn

s = 16
x = torch.rand(1, 10, s, s)
p = 0

# your downsampling block
down = nn.Sequential(
    nn.LeakyReLU(0.2),
    nn.Conv2d(18, 10, kernel_size=4, stride=2, padding=p),
    nn.InstanceNorm2d(10)
)

z = torch.rand(1, 8)  # noise
z_img = z.view(z.size(0), z.size(1), 1, 1).expand(z.size(0), z.size(1), x.size(2), x.size(3))

x_and_z = torch.cat([x, z_img], 1)
y = down(x_and_z)

y doesn't depend on z if p is equal to zero.
If p is equal to 1 and s is large (~128) then y changes in some weird way.

I am talking about this part of your code:
https://github.com/junyanz/BicycleGAN/blob/master/models/networks.py#L729
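
The p = 0 case can be explained without PyTorch. With no padding, every output position of the convolution sees the same tiled z values, so z adds one constant per output channel, and instance normalization subtracts the per-channel mean. A minimal sketch of that shift invariance on a single channel; the numbers are made up and instance_norm is a plain-Python stand-in for what nn.InstanceNorm2d computes per channel:

```python
import math


def instance_norm(values, eps=1e-5):
    """Normalize one channel to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


channel = [0.2, -1.0, 0.7, 1.5]          # conv response to x alone (made-up numbers)
offset = 3.0                             # constant contribution of the tiled z
shifted = [v + offset for v in channel]  # conv response to x and z together

same = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(instance_norm(channel), instance_norm(shifted))
)
print(same)
```

With p = 1 the zero padding also covers the z channels, so border positions receive a smaller z contribution than interior ones. The offset is then no longer constant across the channel and survives the normalization only through that border effect, which may be the odd behaviour described for large s.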

about multi-gpu support

Hi, thanks for your amazing work. I have a problem with the multi-GPU support. The original batch size is 2: one image for the encoded path and another for the random path, so in effect it is one pair. If we want to use multiple GPUs for faster training, the batch size should be larger, but simply changing the batch size causes an error. Is it easy to modify the code for this purpose?

question about the loss function

Dear authors,
thanks for your amazing work.

I have a question about the value of the lambdas in the loss function. I understood that they define a certain balance between converging to several images close to the original B, and diverging a bit from the original B.

By default, you chose:

lambda_L1=10.0
lambda_GAN=1.0
lambda_GAN2=1.0
lambda_z=0.5
lambda_kl=0.01

I trained the BicycleGAN model with the default parameters. My dataset consists of 90,000 images. While some results are impressive, the majority remain a bit blurry (they don't follow the same probability distribution as my original dataset). I guess lambda_L1 is not high enough to force the network to generate images that follow the original probability distribution. What results would you get with lambda_L1=100, as in pix2pix?
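
How the weights shift the balance can be sketched numerically. This is a minimal illustration, assuming the generator objective is a weighted sum of the individual terms; the raw loss values are made up and generator_total is a hypothetical helper, so the numbers say nothing about actual training behaviour:

```python
DEFAULT_LAMBDAS = {"L1": 10.0, "GAN": 1.0, "GAN2": 1.0, "z": 0.5, "kl": 0.01}


def generator_total(raw, lambdas):
    """Weighted sum of raw (unweighted) loss terms."""
    return sum(lambdas[name] * raw[name] for name in lambdas)


# Made-up raw loss values, for illustration only.
raw = {"L1": 0.1, "GAN": 0.8, "GAN2": 0.5, "z": 0.6, "kl": 7.0}
pix2pix_like = dict(DEFAULT_LAMBDAS, L1=100.0)

total_default = generator_total(raw, DEFAULT_LAMBDAS)
total_pix2pix = generator_total(raw, pix2pix_like)

# Share of the total objective contributed by the L1 term in each setting.
l1_share_default = DEFAULT_LAMBDAS["L1"] * raw["L1"] / total_default
l1_share_pix2pix = pix2pix_like["L1"] * raw["L1"] / total_pix2pix
print(round(l1_share_default, 2), round(l1_share_pix2pix, 2))
```

Raising lambda_L1 makes the reconstruction term dominate the sum, so the adversarial and latent terms carry proportionally less weight in the gradient.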
