

CycleGAN-TensorFlow

An implementation of CycleGAN using TensorFlow (work in progress).

Original paper: https://arxiv.org/abs/1703.10593

Results on test data

apple -> orange

[Image table: three input/output pairs (apple2orange_1, apple2orange_2, apple2orange_3)]

orange -> apple

[Image table: three input/output pairs (orange2apple_1, orange2apple_2, orange2apple_3)]

Environment

  • TensorFlow 1.0.0
  • Python 3.6.0

Data preparation

  • First, download a dataset, e.g. apple2orange
$ bash download_dataset.sh apple2orange
  • Write the dataset to tfrecords
$ python3 build_data.py

Check $ python3 build_data.py --help for more details.
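For example, to write tfrecords from your own image folders (the flag names below are assumptions; confirm them against the --help output):

$ python3 build_data.py --X_input_dir data/apple2orange/trainA \
                        --Y_input_dir data/apple2orange/trainB \
                        --X_output_file data/tfrecords/apple.tfrecords \
                        --Y_output_file data/tfrecords/orange.tfrecords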

Training

$ python3 train.py

If you want to change some default settings, you can pass them on the command line, for example:

$ python3 train.py  \
    --X=data/tfrecords/horse.tfrecords \
    --Y=data/tfrecords/zebra.tfrecords

Here is the list of arguments:

usage: train.py [-h] [--batch_size BATCH_SIZE] [--image_size IMAGE_SIZE]
                [--use_lsgan [USE_LSGAN]] [--nouse_lsgan]
                [--norm NORM] [--lambda1 LAMBDA1] [--lambda2 LAMBDA2]
                [--learning_rate LEARNING_RATE] [--beta1 BETA1]
                [--pool_size POOL_SIZE] [--ngf NGF] [--X X] [--Y Y]
                [--load_model LOAD_MODEL]

optional arguments:
  -h, --help            show this help message and exit
  --batch_size BATCH_SIZE
                        batch size, default: 1
  --image_size IMAGE_SIZE
                        image size, default: 256
  --use_lsgan [USE_LSGAN]
                        use lsgan (mean squared error) or cross entropy loss,
                        default: True
  --nouse_lsgan
  --norm NORM           [instance, batch] use instance norm or batch norm,
                        default: instance
  --lambda1 LAMBDA1     weight for forward cycle loss (X->Y->X), default: 10.0
  --lambda2 LAMBDA2     weight for backward cycle loss (Y->X->Y), default:
                        10.0
  --learning_rate LEARNING_RATE
                        initial learning rate for Adam, default: 0.0002
  --beta1 BETA1         momentum term of Adam, default: 0.5
  --pool_size POOL_SIZE
                        size of image buffer that stores previously generated
                        images, default: 50
  --ngf NGF             number of gen filters in first conv layer, default: 64
  --X X                 X tfrecords file for training, default:
                        data/tfrecords/apple.tfrecords
  --Y Y                 Y tfrecords file for training, default:
                        data/tfrecords/orange.tfrecords
  --load_model LOAD_MODEL
                        folder of saved model that you wish to continue
                        training (e.g. 20170602-1936), default: None

Check TensorBoard to see training progress and generated images.

$ tensorboard --logdir checkpoints/${datetime}

If you halted the training process and want to resume, set the load_model parameter like this:

$ python3 train.py  \
    --load_model 20170602-1936

Here are some funny screenshots from TensorBoard when training orange -> apple:

[TensorBoard screenshot: train_screenshot]

Notes

  • If high contrast background colors between input and generated images are observed (e.g. black becomes white), you should restart your training!
  • Train several times to get the best models.

Export model

You can export from a checkpoint to a standalone GraphDef file as follows:

$ python3 export_graph.py --checkpoint_dir checkpoints/${datetime} \
                          --XtoY_model apple2orange.pb \
                          --YtoX_model orange2apple.pb \
                          --image_size 256

Inference

After exporting the model, you can use it for inference. For example:

$ python3 inference.py --model pretrained/apple2orange.pb \
                       --input input_sample.jpg \
                       --output output_sample.jpg \
                       --image_size 256

Pretrained models

My pretrained models are available at https://github.com/vanhuyz/CycleGAN-TensorFlow/releases

Contributing

Please open an issue if you have any trouble or find anything incorrect in my code :)

License

This project is licensed under the MIT License - see the LICENSE file for details.


cyclegan-tensorflow's Issues

Is the data used to update the G and D nets from the same batch?

In a classic GAN, the data used to update the G net and the data used to update the D net come from two different batches (here I just consider one of the two GAN nets). It's my first time trying TFRecord, so I have one question about it: is the data here from the same batch? If not, I guess a new batch is drawn each time either G or D is optimized?

ngf=64?

Hi, shouldn't the default be ngf=32, since you're trying to get 32 feature maps after the 1st convolution?

Why does my training get stuck at the beginning?

The model ran smoothly when I used the test images you provided. But when I use my own images for training, after typing 'python train.py' the model gives no response, nothing but a flashing cursor. How did that happen, and how can I fix it?

How can I use GPU to speed up the process?

When I train the model, the lines below show up:
2017-10-25 21:03:30.854048: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854067: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854071: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854074: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854077: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
But I don't want to use the CPU to do the computations. How can I change this?

The effect of ImagePool

Hello,
What is the effect of using ImagePool? Does it make the training of the discriminator more stable?
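For context, an image pool (in the sense of Shrivastava et al., which the CycleGAN paper follows) keeps a buffer of previously generated images, so the discriminator trains on a mix of fresh and historical fakes; this tends to damp oscillations between G and D. A minimal sketch in plain Python (illustrative, not this repo's exact class):

import random

class ImagePool:
    """Buffer of past generated images for discriminator updates."""
    def __init__(self, pool_size=50):
        self.pool_size = pool_size
        self.images = []

    def query(self, image):
        # With no budget, always pass the fresh image through.
        if self.pool_size == 0:
            return image
        # Fill the buffer first.
        if len(self.images) < self.pool_size:
            self.images.append(image)
            return image
        # Afterwards, half the time swap the fresh image for a stored one.
        if random.random() > 0.5:
            idx = random.randrange(self.pool_size)
            old = self.images[idx]
            self.images[idx] = image
            return old
        return image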

Nan in summary histogram for: D_X/fake

Hello,

I get an error after ~1900 steps:

File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1226, in init
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Nan in summary histogram for: D_X/fake
[[Node: D_X/fake = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](D_X/fake/tag, D_X_4/output/add/_1361)]]

Loading the checkpoint leads to the same problem every time.
I've read somewhere that it could mean the model is not converging.
Here are details of the training:

INFO:root:-----------Step 1400:-------------
INFO:root: G_loss : 4.85528945923
INFO:root: D_Y_loss : 0.0743376016617
INFO:root: F_loss : 4.97825098038
INFO:root: D_X_loss : 0.0959008038044

INFO:root:-----------Step 1900:-------------
INFO:root: G_loss : 4.92536830902
INFO:root: D_Y_loss : 0.184448152781
INFO:root: F_loss : 4.78839635849
INFO:root: D_X_loss : 0.231521636248

INFO:root:-----------Step 2000:-------------
INFO:root: G_loss : 5.98963975906
INFO:root: D_Y_loss : 0.09952506423
INFO:root: F_loss : 5.93069791794
INFO:root: D_X_loss : 0.215163201094

Any advice would be much appreciated!

Speed benchmarks

Hi there, thanks for putting this repo together! I'm wondering what kind of training throughput people are seeing. I'm getting about 1 iteration every 3 seconds with a batch size of 1, which seems a bit slow to me. What are other people getting with this implementation?

tf.variable_scope with tf.AUTO_REUSE

What if I change:
with tf.variable_scope(self.name, reuse=self.reuse)
to:
with tf.variable_scope(self.name, reuse=tf.AUTO_REUSE)
Will it give the same result?

ValueError

ValueError: Attempted to map inputs that were not found in graph_def: [input_image:0]

running error

when I run "python train.py ", it warns “ E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:484] The graph couldn't be sorted in topological order.”x2 This message doesn't influence program going.I want to confirm whether this issue is just ok?

tf.gfile.FastGFile does not work on 'b' mode

Here, tf.gfile.FastGFile uses 'b' mode. With Python 3.4 and Python 3.5 this returns an error like: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte.
Using 'rb' mode works for me.
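A minimal sketch of the fix, assuming image_path points at a JPEG file:

import tensorflow as tf

# 'rb' returns raw bytes; a text mode would try to decode the JPEG as UTF-8
# and fail with the UnicodeDecodeError above.
with tf.gfile.FastGFile(image_path, 'rb') as f:
    encoded_jpeg = f.read()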

Question about the implementation of the residual block

The implementation of the residual block, the function RK in ops.py, returns:

...........    
output = input+normalized2
return output

but in the original paper, the addition should be followed by a ReLU activation function.
I think it should be like this:

...........    
output = input+normalized2
return tf.nn.relu(output)

tfrecords file: I used this code to make train.tfrecords, but it does not work. Why?

import tensorflow as tf
import os.path
import matplotlib.image as mpimg
from PIL import Image

# Raw strings keep backslashes in Windows paths from being read as escapes
# (e.g. '\t' in '\trainB.txt' is a tab character in a normal string literal).
SAVE_PATH = r"C:\CycleGAN-TensorFlow-master\datasetB_new.tfrecords"

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def load_data(datafile, width, high, method=0, save=False):
    train_list = open(datafile, 'r')
    writer = tf.python_io.TFRecordWriter(SAVE_PATH)
    with tf.Session() as sess:
        label = 0
        for line in train_list:
            tmp = line.strip().split(' ')
            img_path = tmp[0]
            image = tf.gfile.FastGFile(img_path, 'rb').read()
            image = tf.image.decode_jpeg(image)
            image = tf.image.convert_image_dtype(image, dtype=tf.float32)
            image = sess.run(image)
            image_raw = image.tostring()
            example = tf.train.Example(features=tf.train.Features(feature={
                'image_raw': _bytes_feature(image_raw),
                'label': _int64_feature(label),
            }))
            label = label + 1
            writer.write(example.SerializeToString())

    writer.close()

load_data(r'C:\CycleGAN-TensorFlow-master\samples\monet2photo\monet2photo\trainB.txt', 256, 256)

Why do you set the true label as 0.9?

Hi, first of all, thanks for this great work!!

But I'm curious why you set the true label to 0.9.

I can't find any description of this in the CycleGAN paper.

Is there any problem with setting the true label to 1.0?

Thanks for your explanation :)
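For context (the author doesn't confirm this here), training the discriminator toward 0.9 instead of 1.0 for real samples is one-sided label smoothing, a common GAN stabilization trick. A hedged LSGAN-style sketch, where D, y, and fake_y are assumed names:

import tensorflow as tf

REAL_LABEL = 0.9  # smoothed target for real samples, instead of 1.0

def discriminator_loss(D, y, fake_y):
    # LSGAN: regress D's raw outputs toward the (smoothed) labels with MSE.
    error_real = tf.reduce_mean(tf.squared_difference(D(y), REAL_LABEL))
    error_fake = tf.reduce_mean(tf.square(D(fake_y)))  # target for fakes is 0
    return (error_real + error_fake) / 2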

ResourceExhaustedError

Hello, when I run your project, it raises an error: ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape [1,64,64,256]
[[Node: G_6/R256_2/layer2/instance_norm/moments/sufficient_statistics/Sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](G_6/R256_2/layer2/Conv2D, G_6/R256_2/layer2/instance_norm/moments/StopGradient)]]
[[Node: add_1/_497 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_84861_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]
Can you help me?

Sorry if it is irrelevant to this project: this TensorFlow binary was not compiled to use: AVX

It may be a problem with my environment, but I really have no idea how to handle it. I don't even understand what it means.

$a@a >python inference.py --model pretrained/man2woman.pb --input data/test.jpg --output data/output.jpg --image_size 256
2017-09-06 16:39:42.586351: I C:\tf_jenkins\home\workspace\nightly-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX

Is it caused by the model or the platform?

Reader._preprocess needs to call reshape before convert_dtype

Hi, in continuation of my earlier point about the conversion to [-1, 1] not being correct in utils, I have uncovered a bit of odd behavior in TensorFlow that is relevant.

The tf.image.resize_images function takes in an image and resizes it using some interpolation function (the default is bilinear). By default this converts your image to float, since it needs to interpolate scalars.

The tf.image.convert_image_dtype function will convert your uint8 image in range [0, 255] to the floating range [0, 1], but only if it is actually uint8. If you pass an already floating-point image to tf.image.convert_image_dtype(image, dtype=tf.float32), the function does nothing.

So the current code is (in reader):

def _preprocess(self, image):
    image = tf.image.resize_images(image, size=(self.image_size, self.image_size))
    image = utils.convert2float(image)
    image.set_shape([self.image_size, self.image_size, 3])
    return image

The first part implicitly converts the image to float, which means the call to convert_image_dtype in utils.convert2float does nothing. This is why the current code sort of works, even though it is wrong. When I changed the /127.5 part in utils to *2, I got very large values, and the order of the calls in _preprocess is why.

In summary,

  1. In my other issue I raised that you shouldn't divide by 127.5 in utils.convert2float; rather, you should multiply by 2.
  2. After that is fixed, the order of the resizing and conversion needs to be reversed, so that you convert to float and then resize (see the sketch below).
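A sketch of the fixed _preprocess, assuming convert2float has been changed to multiply by 2 as described in point 1:

def _preprocess(self, image):
    # Convert while the image is still uint8, so convert_image_dtype
    # actually rescales it; then resize the float image.
    image = utils.convert2float(image)
    image = tf.image.resize_images(image, size=(self.image_size, self.image_size))
    image.set_shape([self.image_size, self.image_size, 3])
    return image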

Cannot call export_graph with image_size different from default

If your image_size is different from 256, calling export_graph won't work. E.g., if image_size < 256, errors are thrown saying certain weights and biases could not be found in the checkpoint.

Solution:
image_size needs to be passed to the CycleGAN constructor, otherwise the network will have the wrong structure. Adding image_size=FLAGS.image_size to the parameters fixed this for me.

Batch inference?

For example, I have a lot of unpaired images from two domains, A and B.
After training, what should I do to transfer A to B with the model?

tensorflow error raised when calculating "error_fake"

Hi,
I tried training on my own dataset (with images that are not square), but when TF tries to calculate the discriminator's "error_fake", for some reason the input and the weights belong to different graphs, and I get the following error:

"%s must be from the same graph as %s." % (item, original_item))
ValueError: Tensor("Placeholder_1:0", shape=(1, 300, 640, 3), dtype=float32) must be from the same graph as Tensor("D_Y/C64/weights:0", shape=(4, 4, 3, 64), dtype=float32_ref).

It's weird to me, since it's very similar to calculating "error_real" (same D); only here D works on fake_y and not on y.

Any idea where this is coming from?
Thanks

Faces become unrecognizable when changing to a smiling face

Hi, I'm trying to make faces smile, using common faces as X and genki4k as Y.
X includes 600 pics, Y includes 500 pics, all resized to 96x96.
After running about 41k steps, I tried to generate Y from X, but the faces are very hard to recognize.
Please advise.

INFO:root:-----------Step 41300:-------------
INFO:root: G_loss : 2.0845248699188232
INFO:root: D_Y_loss : 0.11265528202056885
INFO:root: F_loss : 2.6741137504577637
INFO:root: D_X_loss : 0.1405256986618042

How to process images of different sizes?

This repository processes fixed-size images; however, I need to input and output images of different sizes and make sure the outputs keep the same size as the originals.
Can it be modified to support this, and how?
Thank you very much.

Batch normalization arguments

Thank you for your wonderful code! Should the updates_collections argument in batch normalization be set to None, as suggested here?

Sigmoid for lsgan

I do not understand why a sigmoid is not used in the discriminator's last layer for LSGAN.

How to reuse the pretrained model with different datasets?

I trained a model on a small dataset.
However, I now have a much bigger and better dataset.
Does anyone know how to reuse the pretrained model with the new dataset?

I replaced the former .tfrecords file with the new one and used
$ tensorboard --logdir checkpoints/20180410-1445
to reload the model, but it didn't work.
It still shows the previous images in TensorBoard.

Does anyone know how to reuse the pretrained model with different datasets?
Thank you.
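Note that tensorboard only visualizes existing logs; per the README above, resuming from a checkpoint is done with train.py's --load_model flag, e.g. (tfrecords paths assumed):

$ python3 train.py --load_model 20180410-1445 \
    --X=data/tfrecords/new_x.tfrecords \
    --Y=data/tfrecords/new_y.tfrecords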

Hi, I trained your model but the results are not very good

I ran apple -> orange, but when transferring apple to orange the result is very poor, using the default image size. I also trained the original PyTorch version, and it generates quite nice pictures. Am I not training for enough epochs?

Model Collapse?

Hi there,

Thanks for your great work! As you mentioned when

high contrast background colors between input and generated images are observed (e.g. black becomes white), you should restart your training!

I have actually observed this problem through TensorBoard at around the 15th epoch (see images below). Is it due to insufficient training, or has the model already collapsed? The cycle loss still seems to have a really slow decreasing trend.

[TensorBoard screenshots from 2018-02-04 16-31-27 and 16-31-09]

Thanks,

Computing forward&backward cycle loss twice

I see that the cycle_loss function already computes both forward & backward loss. Was there any performance boost by including it twice in Gan Loss for both x & y? The original paper & implementation seems to be including this only once
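For reference, the paper's cycle-consistency term is an L1 penalty over both directions, weighted by lambda1 and lambda2 as in the train.py flags above; a sketch with assumed names (G: X->Y, F: Y->X):

import tensorflow as tf

def cycle_consistency_loss(G, F, x, y, lambda1=10.0, lambda2=10.0):
    forward_loss = tf.reduce_mean(tf.abs(F(G(x)) - x))   # X -> Y -> X
    backward_loss = tf.reduce_mean(tf.abs(G(F(y)) - y))  # Y -> X -> Y
    return lambda1 * forward_loss + lambda2 * backward_loss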

Grayscale?

Will it work with grayscale images? What do I need to change?

Thanks.

Training on my own dataset

What should I do to train on my own dataset? Just replace the pictures in the data folder, or should I do something else?

identity mapping loss

Hi,
Thank you for sharing this implementation.
Do you plan on adding support for the identity mapping loss described in the original paper?
Thanks
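For reference, the identity mapping loss from the paper (Sec. 5.2) regularizes each generator to be near the identity on images that are already from its output domain; a sketch with assumed names (G: X->Y, F: Y->X, lambda_idt is a new hyperparameter):

import tensorflow as tf

def identity_mapping_loss(G, F, x, y, lambda_idt=5.0):
    # G should leave real Y-domain images unchanged, and likewise F for X.
    loss_g = tf.reduce_mean(tf.abs(G(y) - y))
    loss_f = tf.reduce_mean(tf.abs(F(x) - x))
    return lambda_idt * (loss_g + loss_f)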

How should I stop training?

After 10000 steps of training, the training process just continued. How should I stop it, or what should I change in the code to change this?

utils.convert2float incorrect

Hi, I think the utility function convert2float, called in the _preprocess function of reader, is incorrect. I believe the desired functionality is to convert the image from [0, 255] int format to [-1, 1] float format.

The code is:

def convert2float(image):
    """ Transform from int image ([0,255]) to float tensor ([-1.,1.]) """
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    return (image/127.5) - 1.0

The issue is dividing the image by 127.5 after scaling. The tf.image.convert_image_dtype function already scales images to [0, 1] floats, so you actually need to multiply by 2 and then subtract 1, not divide by 127.5.
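A minimal sketch of the proposed fix:

import tensorflow as tf

def convert2float(image):
    """ Transform from int image ([0,255]) to float tensor ([-1.,1.]) """
    # convert_image_dtype already rescales uint8 images to [0,1] floats ...
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    # ... so map [0,1] to [-1,1] by multiplying by 2, not dividing by 127.5.
    return (image * 2.0) - 1.0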

Padding=2 in ResNet blocks?

Thanks for your good code! May I ask why the padding is 2 in your ResNet block (and is shaved off later)? Does it give better results?
