

CycleGAN-TensorFlow

An implementation of CycleGAN using TensorFlow (work in progress).

Original paper: https://arxiv.org/abs/1703.10593

Results on test data

apple -> orange

[Image table: three input/output pairs (apple2orange_1, apple2orange_2, apple2orange_3)]

orange -> apple

[Image table: three input/output pairs (orange2apple_1, orange2apple_2, orange2apple_3)]

Environment

  • TensorFlow 1.0.0
  • Python 3.6.0

Data preparation

  • First, download a dataset, e.g. apple2orange
$ bash download_dataset.sh apple2orange
  • Write the dataset to tfrecords
$ python3 build_data.py

Check $ python3 build_data.py --help for more details.
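For example, to write tfrecords from your own image folders (the flag names below are assumptions; confirm them against the --help output):

$ python3 build_data.py --X_input_dir data/apple2orange/trainA \
                        --Y_input_dir data/apple2orange/trainB \
                        --X_output_file data/tfrecords/apple.tfrecords \
                        --Y_output_file data/tfrecords/orange.tfrecords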

Training

$ python3 train.py

If you want to change some default settings, you can pass them on the command line, for example:

$ python3 train.py  \
    --X=data/tfrecords/horse.tfrecords \
    --Y=data/tfrecords/zebra.tfrecords

Here is the list of arguments:

usage: train.py [-h] [--batch_size BATCH_SIZE] [--image_size IMAGE_SIZE]
                [--use_lsgan [USE_LSGAN]] [--nouse_lsgan]
                [--norm NORM] [--lambda1 LAMBDA1] [--lambda2 LAMBDA2]
                [--learning_rate LEARNING_RATE] [--beta1 BETA1]
                [--pool_size POOL_SIZE] [--ngf NGF] [--X X] [--Y Y]
                [--load_model LOAD_MODEL]

optional arguments:
  -h, --help            show this help message and exit
  --batch_size BATCH_SIZE
                        batch size, default: 1
  --image_size IMAGE_SIZE
                        image size, default: 256
  --use_lsgan [USE_LSGAN]
                        use lsgan (mean squared error) or cross entropy loss,
                        default: True
  --nouse_lsgan
  --norm NORM           [instance, batch] use instance norm or batch norm,
                        default: instance
  --lambda1 LAMBDA1     weight for forward cycle loss (X->Y->X), default: 10.0
  --lambda2 LAMBDA2     weight for backward cycle loss (Y->X->Y), default:
                        10.0
  --learning_rate LEARNING_RATE
                        initial learning rate for Adam, default: 0.0002
  --beta1 BETA1         momentum term of Adam, default: 0.5
  --pool_size POOL_SIZE
                        size of image buffer that stores previously generated
                        images, default: 50
  --ngf NGF             number of gen filters in first conv layer, default: 64
  --X X                 X tfrecords file for training, default:
                        data/tfrecords/apple.tfrecords
  --Y Y                 Y tfrecords file for training, default:
                        data/tfrecords/orange.tfrecords
  --load_model LOAD_MODEL
                        folder of saved model that you wish to continue
                        training (e.g. 20170602-1936), default: None

Check TensorBoard to see training progress and generated images.

$ tensorboard --logdir checkpoints/${datetime}

If you halted the training process and want to resume, set the load_model parameter like this:

$ python3 train.py  \
    --load_model 20170602-1936

Here are some funny screenshots from TensorBoard when training orange -> apple:

[TensorBoard screenshot: train_screenshot]

Notes

  • If high contrast background colors between input and generated images are observed (e.g. black becomes white), you should restart your training!
  • Train several times to get the best models.

Export model

You can export from a checkpoint to a standalone GraphDef file as follows:

$ python3 export_graph.py --checkpoint_dir checkpoints/${datetime} \
                          --XtoY_model apple2orange.pb \
                          --YtoX_model orange2apple.pb \
                          --image_size 256

Inference

After exporting the model, you can use it for inference. For example:

$ python3 inference.py --model pretrained/apple2orange.pb \
                       --input input_sample.jpg \
                       --output output_sample.jpg \
                       --image_size 256

Pretrained models

My pretrained models are available at https://github.com/vanhuyz/CycleGAN-TensorFlow/releases

Contributing

Please open an issue if you have any trouble or find anything incorrect in my code :)

License

This project is licensed under the MIT License - see the LICENSE file for details.


cyclegan-tensorflow's Issues

Is the data used to update the G and D nets from the same batch?

In a classic GAN, the data used to update the G net and the data used to update the D net come from two different batches (here I just consider one of the two GAN nets). It's my first time trying TFRecord, so I have one question about it: is the data here from the same batch? If not, I guess a new batch is drawn each time either G or D is optimized?

ngf=64?

Hi, shouldn't the default be ngf=32, since you're trying to get 32 feature maps after the 1st convolution?

Why does my training get stuck at the beginning?

The model ran smoothly when I used the test images you provided. But when I use my own images for training, after typing 'python train.py' the model gives no response, nothing but a flashing cursor. How did that happen, and how can I fix it?

How can I use GPU to speed up the process?

When I train the model, the lines below show up:
2017-10-25 21:03:30.854048: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854067: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854071: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854074: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854077: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
But I don't want to use the CPU to do the computations. How can I change this?

The effect of ImagePool

Hello,
What is the effect of using ImagePool? Does it make the training of the discriminator more stable?
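For context, an image pool (in the sense of Shrivastava et al., which the CycleGAN paper follows) keeps a buffer of previously generated images, so the discriminator trains on a mix of fresh and historical fakes; this tends to damp oscillations between G and D. A minimal sketch in plain Python (illustrative, not this repo's exact class):

import random

class ImagePool:
    """Buffer of past generated images for discriminator updates."""
    def __init__(self, pool_size=50):
        self.pool_size = pool_size
        self.images = []

    def query(self, image):
        # With no budget, always pass the fresh image through.
        if self.pool_size == 0:
            return image
        # Fill the buffer first.
        if len(self.images) < self.pool_size:
            self.images.append(image)
            return image
        # Afterwards, half the time swap the fresh image for a stored one.
        if random.random() > 0.5:
            idx = random.randrange(self.pool_size)
            old = self.images[idx]
            self.images[idx] = image
            return old
        return image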

Nan in summary histogram for: D_X/fake

Hello,

I get an error after ~1900 steps:

File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1226, in init
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Nan in summary histogram for: D_X/fake
[[Node: D_X/fake = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](D_X/fake/tag, D_X_4/output/add/_1361)]]

Loading the checkpoint leads to the same problem every time.
I've read somewhere that it could mean the model is not converging.
Here are details of the training:

INFO:root:-----------Step 1400:-------------
INFO:root: G_loss : 4.85528945923
INFO:root: D_Y_loss : 0.0743376016617
INFO:root: F_loss : 4.97825098038
INFO:root: D_X_loss : 0.0959008038044

INFO:root:-----------Step 1900:-------------
INFO:root: G_loss : 4.92536830902
INFO:root: D_Y_loss : 0.184448152781
INFO:root: F_loss : 4.78839635849
INFO:root: D_X_loss : 0.231521636248

INFO:root:-----------Step 2000:-------------
INFO:root: G_loss : 5.98963975906
INFO:root: D_Y_loss : 0.09952506423
INFO:root: F_loss : 5.93069791794
INFO:root: D_X_loss : 0.215163201094

Any advice would be much appreciated!

Speed benchmarks

Hi there, thanks for putting this repo together! I'm wondering what kind of training throughput people are seeing. I'm getting about 1 iteration every 3 seconds with a batch size of 1, which seems a bit slow to me. What are other people getting with this implementation?

tf.variable_scope with tf.AUTO_REUSE

What if I change:
with tf.variable_scope(self.name, reuse=self.reuse)
to:
with tf.variable_scope(self.name, reuse=tf.AUTO_REUSE)
Will it give the same result?

ValueError

ValueError: Attempted to map inputs that were not found in graph_def: [input_image:0]

running error

when I run "python train.py ", it warns “ E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:484] The graph couldn't be sorted in topological order.”x2 This message doesn't influence program going.I want to confirm whether this issue is just ok?

tf.gfile.FastGFile does not work on 'b' mode

Here, tf.gfile.FastGFile uses 'b' mode. With Python 3.4 and Python 3.5 this returns an error like: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte.
Using 'rb' mode works for me.
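A minimal sketch of the fix, assuming image_path points at a JPEG file:

import tensorflow as tf

# 'rb' returns raw bytes; a text mode would try to decode the JPEG as UTF-8
# and fail with the UnicodeDecodeError above.
with tf.gfile.FastGFile(image_path, 'rb') as f:
    encoded_jpeg = f.read()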

Question about the implementation of the residual block

The implementation of the residual block, the function RK in ops.py, returns:

...........    
output = input+normalized2
return output

but in the original paper, the addition should be followed by a ReLU activation function.
I think it should be like this:

...........    
output = input+normalized2
return tf.nn.relu(output)

tfrecords file: I used this code to make train.tfrecords, but it does not work. Why?

import tensorflow as tf
import os.path
import matplotlib.image as mpimg
from PIL import Image

# Raw strings keep backslashes in Windows paths from being read as escapes
# (e.g. '\t' in '\trainB.txt' is a tab character in a normal string literal).
SAVE_PATH = r"C:\CycleGAN-TensorFlow-master\datasetB_new.tfrecords"

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def load_data(datafile, width, high, method=0, save=False):
    train_list = open(datafile, 'r')
    writer = tf.python_io.TFRecordWriter(SAVE_PATH)
    with tf.Session() as sess:
        label = 0
        for line in train_list:
            tmp = line.strip().split(' ')
            img_path = tmp[0]
            image = tf.gfile.FastGFile(img_path, 'rb').read()
            image = tf.image.decode_jpeg(image)
            image = tf.image.convert_image_dtype(image, dtype=tf.float32)
            image = sess.run(image)
            image_raw = image.tostring()
            example = tf.train.Example(features=tf.train.Features(feature={
                'image_raw': _bytes_feature(image_raw),
                'label': _int64_feature(label),
            }))
            label = label + 1
            writer.write(example.SerializeToString())

    writer.close()

load_data(r'C:\CycleGAN-TensorFlow-master\samples\monet2photo\monet2photo\trainB.txt', 256, 256)

Why do you set the true label as 0.9?

Hi, first of all, thanks for this great work!!

But I'm curious why you set the true label to 0.9.

I can't find any description of this in the CycleGAN paper.

Is there any problem with setting the true label to 1.0?

Thanks for your explanation :)
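For context (the author doesn't confirm this here), training the discriminator toward 0.9 instead of 1.0 for real samples is one-sided label smoothing, a common GAN stabilization trick. A hedged LSGAN-style sketch, where D, y, and fake_y are assumed names:

import tensorflow as tf

REAL_LABEL = 0.9  # smoothed target for real samples, instead of 1.0

def discriminator_loss(D, y, fake_y):
    # LSGAN: regress D's raw outputs toward the (smoothed) labels with MSE.
    error_real = tf.reduce_mean(tf.squared_difference(D(y), REAL_LABEL))
    error_fake = tf.reduce_mean(tf.square(D(fake_y)))  # target for fakes is 0
    return (error_real + error_fake) / 2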

ResourceExhaustedError

Hello, when I run your project, it raises an error: ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape [1,64,64,256]
[[Node: G_6/R256_2/layer2/instance_norm/moments/sufficient_statistics/Sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](G_6/R256_2/layer2/Conv2D, G_6/R256_2/layer2/instance_norm/moments/StopGradient)]]
[[Node: add_1/_497 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_84861_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]
Can you help me?

Sorry if it is irrelevant to this project: this TensorFlow binary was not compiled to use: AVX

It may be a problem with my environment, but I really have no idea how to handle it. I don't even understand what it means.

$a@a >python inference.py --model pretrained/man2woman.pb --input data/test.jpg --output data/output.jpg --image_size 256
2017-09-06 16:39:42.586351: I C:\tf_jenkins\home\workspace\nightly-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX

Is it caused by the model or the platform?

Reader._preprocess needs to call reshape before convert_dtype

Hi, in continuation of my earlier point about the conversion to [-1, 1] not being correct in utils, I have uncovered a bit of odd behavior in TensorFlow that is relevant.

The tf.image.resize_images function takes in an image and resizes it using some interpolation function (the default is bilinear). By default this converts your image to float, since it needs to interpolate scalars.

The tf.image.convert_image_dtype function will convert your uint8 image in range [0, 255] to the floating range [0, 1], but only if it is actually uint8. If you pass an already floating-point image to tf.image.convert_image_dtype(image, dtype=tf.float32), the function does nothing.

So the current code is (in reader):

def _preprocess(self, image):
    image = tf.image.resize_images(image, size=(self.image_size, self.image_size))
    image = utils.convert2float(image)
    image.set_shape([self.image_size, self.image_size, 3])
    return image

The first part implicitly converts the image to float, which means the call to convert_image_dtype in utils.convert2float does nothing. This is why the current code sort of works, even though it is wrong. When I changed the /127.5 part in utils to *2, I got very large values, and the order of the calls in _preprocess is why.

In summary,

  1. In my other issue I raised that you shouldn't divide by 127.5 in utils.convert2float; rather, you should multiply by 2.
  2. After that is fixed, the order of the resizing and conversion needs to be reversed, so that you convert to float and then resize (see the sketch below).
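A sketch of the fixed _preprocess, assuming convert2float has been changed to multiply by 2 as described in point 1:

def _preprocess(self, image):
    # Convert while the image is still uint8, so convert_image_dtype
    # actually rescales it; then resize the float image.
    image = utils.convert2float(image)
    image = tf.image.resize_images(image, size=(self.image_size, self.image_size))
    image.set_shape([self.image_size, self.image_size, 3])
    return image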

Cannot call export_graph with image_size different from default

If your image_size is different from 256, calling export_graph won't work. E.g., if image_size < 256, errors are thrown saying certain weights and biases could not be found in the checkpoint.

Solution:
image_size needs to be passed to the CycleGAN constructor, otherwise the network will have the wrong structure. Adding image_size=FLAGS.image_size to the parameters fixed this for me.

Batch inference?

For example, I have a lot of unpaired images from two domains, A and B.
After training, what should I do to transfer A to B with the model?

tensorflow error raised when calculating "error_fake"

Hi,
I tried training on my own dataset (with images that are not square), but when TF tries to calculate the discriminator's "error_fake", for some reason the input and the weights belong to different graphs, and I get the following error:

"%s must be from the same graph as %s." % (item, original_item))
ValueError: Tensor("Placeholder_1:0", shape=(1, 300, 640, 3), dtype=float32) must be from the same graph as Tensor("D_Y/C64/weights:0", shape=(4, 4, 3, 64), dtype=float32_ref).

It's weird to me, since it's very similar to calculating "error_real" (same D); only here D works on fake_y and not on y.

Any idea where this is coming from?
Thanks

Faces become unrecognizable when changing to a smiling face

Hi, I'm trying to make faces smile, using common faces as X and genki4k as Y.
X includes 600 pics, Y includes 500 pics, all resized to 96x96.
After running about 41k steps, I tried to generate Y from X, but the faces are very hard to recognize.
Please advise.

INFO:root:-----------Step 41300:-------------
INFO:root: G_loss : 2.0845248699188232
INFO:root: D_Y_loss : 0.11265528202056885
INFO:root: F_loss : 2.6741137504577637
INFO:root: D_X_loss : 0.1405256986618042

How to process images of different sizes?

This repository processes fixed-size images; however, I need to input and output images of different sizes and make sure the outputs keep the same size as the originals.
Can it be modified to support this, and how?
Thank you very much.

Batch normalization arguments

Thank you for your wonderful code! Should the updates_collections argument in batch normalization be set to None, as suggested here?

Sigmoid for lsgan

I do not understand why a sigmoid is not used in the discriminator's last layer for LSGAN.

How to reuse the pretrained model with different datasets?

I trained a model on a small dataset.
However, I now have a much bigger and better dataset.
Does anyone know how to reuse the pretrained model with the new dataset?

I replaced the former .tfrecords file with the new one and used
$ tensorboard --logdir checkpoints/20180410-1445
to reload the model, but it didn't work.
It still shows the previous images in TensorBoard.

Does anyone know how to reuse the pretrained model with different datasets?
Thank you.
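Note that tensorboard only visualizes existing logs; per the README above, resuming from a checkpoint is done with train.py's --load_model flag, e.g. (tfrecords paths assumed):

$ python3 train.py --load_model 20180410-1445 \
    --X=data/tfrecords/new_x.tfrecords \
    --Y=data/tfrecords/new_y.tfrecords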

Hi, I trained your model but the results are not very good

I ran apple -> orange, but when transferring apple to orange the result is very poor, using the default image size. I also trained the original PyTorch version, and it generates quite nice pictures. Am I not training for enough epochs?

Model Collapse?

Hi there,

Thanks for your great work! As you mentioned when

high contrast background colors between input and generated images are observed (e.g. black becomes white), you should restart your training!

I have actually observed this problem through TensorBoard at around the 15th epoch (see images below). Is it due to insufficient training, or has the model already collapsed? The cycle loss still seems to have a really slow decreasing trend.

[TensorBoard screenshots from 2018-02-04 16-31-27 and 16-31-09]

Thanks,

Computing forward&backward cycle loss twice

I see that the cycle_loss function already computes both forward & backward loss. Was there any performance boost by including it twice in Gan Loss for both x & y? The original paper & implementation seems to be including this only once
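For reference, the paper's cycle-consistency term is an L1 penalty over both directions, weighted by lambda1 and lambda2 as in the train.py flags above; a sketch with assumed names (G: X->Y, F: Y->X):

import tensorflow as tf

def cycle_consistency_loss(G, F, x, y, lambda1=10.0, lambda2=10.0):
    forward_loss = tf.reduce_mean(tf.abs(F(G(x)) - x))   # X -> Y -> X
    backward_loss = tf.reduce_mean(tf.abs(G(F(y)) - y))  # Y -> X -> Y
    return lambda1 * forward_loss + lambda2 * backward_loss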

Grayscale?

Will it work with grayscale images? What do I need to change?

Thanks.

Training on my own dataset

What should I do to train on my own dataset? Just replace the pictures in the data folder, or should I do something else?

identity mapping loss

Hi,
Thank you for sharing this implementation.
Do you plan on adding support for the identity mapping loss described in the original paper?
Thanks
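For reference, the identity mapping loss from the paper (Sec. 5.2) regularizes each generator to be near the identity on images that are already from its output domain; a sketch with assumed names (G: X->Y, F: Y->X, lambda_idt is a new hyperparameter):

import tensorflow as tf

def identity_mapping_loss(G, F, x, y, lambda_idt=5.0):
    # G should leave real Y-domain images unchanged, and likewise F for X.
    loss_g = tf.reduce_mean(tf.abs(G(y) - y))
    loss_f = tf.reduce_mean(tf.abs(F(x) - x))
    return lambda_idt * (loss_g + loss_f)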

How should I stop training?

After 10000 steps of training, the training process just continued. How should I stop it, or what should I change in the code to change this?

utils.convert2float incorrect

Hi, I think the utility function convert2float, called in the _preprocess function of reader, is incorrect. I believe the desired functionality is to convert the image from [0, 255] int format to [-1, 1] float format.

The code is:

def convert2float(image):
    """ Transform from int image ([0,255]) to float tensor ([-1.,1.]) """
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    return (image/127.5) - 1.0

The issue is dividing the image by 127.5 after scaling. The tf.image.convert_image_dtype function already scales images to [0, 1] floats, so you actually need to multiply by 2 and then subtract 1, not divide by 127.5.
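A minimal sketch of the proposed fix:

import tensorflow as tf

def convert2float(image):
    """ Transform from int image ([0,255]) to float tensor ([-1.,1.]) """
    # convert_image_dtype already rescales uint8 images to [0,1] floats ...
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    # ... so map [0,1] to [-1,1] by multiplying by 2, not dividing by 127.5.
    return (image * 2.0) - 1.0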

Padding=2 in ResNet blocks?

Thanks for your good code! May I ask why the padding is 2 in your ResNet block (and is shaved off later)? Does it give better results?
