
dcgan-completion.tensorflow's Introduction

Image Completion with Deep Learning in TensorFlow

Citations

Please consider citing this project in your publications if it helps your research. The following is a BibTeX and plaintext reference. The BibTeX entry requires the url LaTeX package.

@misc{amos2016image,
    title        = {{Image Completion with Deep Learning in TensorFlow}},
    author       = {Amos, Brandon},
    howpublished = {\url{http://bamos.github.io/2016/08/09/deep-completion}},
    note         = {Accessed: [Insert date here]}
}

Brandon Amos. Image Completion with Deep Learning in TensorFlow.
http://bamos.github.io/2016/08/09/deep-completion.
Accessed: [Insert date here]

dcgan-completion.tensorflow's People

Contributors

bjanssen, idannn98, krukon, md5wasp, richardjdavies, scott-vsi, wedjaa, xycyx

dcgan-completion.tensorflow's Issues

Error while running ./train-dcgan.py --dataset ./data/your-dataset/aligned --epoch 20

I am trying to run this part of the code, but I am getting the following error:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970M, pci bus id: 0000:01:00.0)
Traceback (most recent call last):
File "./train-dcgan.py", line 39, in
is_crop=False, checkpoint_dir=FLAGS.checkpoint_dir)
File "/home/samuel/Desktop/DL_Final_Project/dcgan-completion.tensorflow/model.py", line 65, in init
self.build_model()
File "/home/samuel/Desktop/DL_Final_Project/dcgan-completion.tensorflow/model.py", line 75, in build_model
self.z_sum = tf.histogram_summary("z", self.z)
AttributeError: 'module' object has no attribute 'histogram_summary'

Did anyone have the same problem? If so, how do I solve it?
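For reference, tf.histogram_summary was removed in TensorFlow 1.0, where the summary ops moved under tf.summary (tf.histogram_summary became tf.summary.histogram, tf.scalar_summary became tf.summary.scalar, and so on). A minimal compatibility shim, assuming you want to keep the rest of model.py unchanged rather than rename every call site, could look like this:

import tensorflow as tf

def histogram_summary(name, values):
    """Use whichever summary op the installed TensorFlow provides."""
    if hasattr(tf, "summary") and hasattr(tf.summary, "histogram"):
        return tf.summary.histogram(name, values)   # TensorFlow >= 1.0
    return tf.histogram_summary(name, values)       # TensorFlow 0.x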

ValueError: Shapes (64, 41, 21, 64) and (64, 32, 32, 64) are not compatible

I'm trying to train the code on my own data, which has shape [82 x 41 x 1]. I adapted the size of the fully connected layer in the discriminator accordingly and fixed the data-reading part to cope with single-channel data. However, the program still crashes with the error above.
It crashes either on the line that computes the gradients in def build_model(self):

self.grad_complete_loss = tf.gradients(self.complete_loss, self.z)

and if I comment that line out, it crashes with the same error in def train(self, config) on this line:

tf.global_variables_initializer().run()

Any idea where the problem might come from?
I thought it might be due to the weight sharing between the four networks (generator, sampler, discriminator and discriminator_), although I made sure that the dimensions agree there as well:

def discriminator(self, image, reuse=False, output_size = 9216):
    if reuse:
        tf.get_variable_scope().reuse_variables()

    h0 = lrelu(conv2d(image, self.df_dim, name='d_h0_conv'))
    h1 = lrelu(self.d_bn1(conv2d(h0, self.df_dim*2, name='d_h1_conv')))
    h2 = lrelu(self.d_bn2(conv2d(h1, self.df_dim*4, name='d_h2_conv')))
    h3 = lrelu(self.d_bn3(conv2d(h2, self.df_dim*8, name='d_h3_conv')))
    h4 = linear(tf.reshape(h3, [-1, output_size]), 1, 'd_h3_lin')   # 64x64 --> 4x4x512 = 8192
    return tf.nn.sigmoid(h4), h4

def generator(self, z, c_dim, output_size = (64,64)):
    self.z_, self.h0_w, self.h0_b = linear(z, self.gf_dim*8*4*4, 'g_h0_lin', with_w=True) # 64x8x4x4 = 512x4x4

    self.h0 = tf.reshape(self.z_, [-1, 4, 4, self.gf_dim * 8]) # 512x4x4
    h0 = tf.nn.relu(self.g_bn0(self.h0))

    self.h1, self.h1_w, self.h1_b = conv2d_transpose(h0,
        [self.batch_size, 8, 8, self.gf_dim*4], name='g_h1', with_w=True) # 256x8x8
    h1 = tf.nn.relu(self.g_bn1(self.h1))

    h2, self.h2_w, self.h2_b = conv2d_transpose(h1,
        [self.batch_size, 16, 16, self.gf_dim*2], name='g_h2', with_w=True) # 128x16x16
    h2 = tf.nn.relu(self.g_bn2(h2))

    h3, self.h3_w, self.h3_b = conv2d_transpose(h2,
        [self.batch_size, 32, 32, self.gf_dim*1], name='g_h3', with_w=True) # 64x32x32
    h3 = tf.nn.relu(self.g_bn3(h3))

    h4, self.h4_w, self.h4_b = conv2d_transpose(h3,
        [self.batch_size, output_size[0], output_size[1], c_dim], name='g_h4', with_w=True) # 1x64x64
        
    return tf.nn.tanh(h4)

def sampler(self, z, cdim, y=None, output_size = (64,64)):
    tf.get_variable_scope().reuse_variables()

    h0 = tf.reshape(linear(z, self.gf_dim*8*4*4, 'g_h0_lin'),
                    [-1, 4, 4, self.gf_dim * 8])
    h0 = tf.nn.relu(self.g_bn0(h0, train=False))

    h1 = conv2d_transpose(h0, [self.batch_size, 8, 8, self.gf_dim*4], name='g_h1')
    h1 = tf.nn.relu(self.g_bn1(h1, train=False))

    h2 = conv2d_transpose(h1, [self.batch_size, 16, 16, self.gf_dim*2], name='g_h2')
    h2 = tf.nn.relu(self.g_bn2(h2, train=False))

    h3 = conv2d_transpose(h2, [self.batch_size, 32, 32, self.gf_dim*1], name='g_h3')
    h3 = tf.nn.relu(self.g_bn3(h3, train=False))

    h4 = conv2d_transpose(h3, [self.batch_size, output_size[0], output_size[1], cdim], name='g_h4')    
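One thing worth checking, sketched here under the assumption that the 8/16/32/64 spatial sizes above are still hardcoded: a stride-2 'SAME' convolution produces ceil(n/2) outputs, so real 82x41 inputs reach the discriminator's first layer as 41x21 feature maps, while the generator still builds its output up through 8x8, 16x16, 32x32 to 64x64. The discriminator therefore sees 41x21 for real images but 32x32 for generated ones, which would explain shapes like (64, 41, 21, 64) colliding with (64, 32, 32, 64). A quick helper to print the ladder a given input size needs:

def conv_ladder(h, w, n_layers=4):
    """Spatial sizes after each stride-2 'SAME' convolution: ceil(n / 2) per layer."""
    sizes = [(h, w)]
    for _ in range(n_layers):
        h, w = (h + 1) // 2, (w + 1) // 2
        sizes.append((h, w))
    return sizes

print(conv_ladder(82, 41))  # [(82, 41), (41, 21), (21, 11), (11, 6), (6, 3)]

The 6 x 3 x 512 = 9216 figure matches the output_size used in the discriminator above; the transposed convolutions in the generator and sampler would need the same 3/6/11/21/41 ladder in reverse instead of the hardcoded 4/8/16/32/64.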

Cannot reproduce the demo from the blog post

I ran the demo from the blog post with 3 images, but I cannot reproduce the result. Instead I get something that does not look reasonable at all:
[Image: completion_newiter2500]

I even ran it for 2500 iterations. The loss does not go down; it keeps oscillating. Checkpoints are here:
https://drive.google.com/open?id=0B86WKpvkt66BeEkxeHdmcEVOTVU

Even a non-adaptive optimization technique should give at least reasonable results.
I did not train any model (a trained model is provided); I only ran the code:

./openface/util/align-dlib.py data/dcgan-completion.tensorflow/data/your-dataset/raw align innerEyesAndBottomLip data/dcgan-completion.tensorflow/data/your-dataset/aligned --size 64

./complete.py ./data/your-test-data/aligned/* --outDir outputImages

cd outputImages
convert -delay 10 -loop 0 completed/*.png completion.gif

Did I miss a step in the blog post?

I use TensorFlow 0.11 on a CPU on an AWS EC2 c4.4xlarge.

error: the following arguments are required: imgs

usage: complete.py [-h] [--approach {adam,hmc}] [--lr LR] [--beta1 BETA1]
[--beta2 BETA2] [--eps EPS] [--hmcBeta HMCBETA]
[--hmcEps HMCEPS] [--hmcL HMCL] [--hmcAnneal HMCANNEAL]
[--nIter NITER] [--imgSize IMGSIZE] [--lam LAM]
[--checkpointDir CHECKPOINTDIR] [--outDir OUTDIR]
[--outInterval OUTINTERVAL]
[--maskType {random,center,left,full,grid,lowres}]
[--centerScale CENTERSCALE]
imgs [imgs ...]
I am a college student trying to learn this material, and I ran into this problem when running the program. I hope you can help me. Thank you very much!
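For what it's worth, imgs is the positional argument listing the images to complete, so complete.py has to be given at least one image path. Using the aligned test images from the blog post's workflow, an invocation looks like:

./complete.py ./data/your-test-data/aligned/* --outDir outputImages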

some problem

Sorry to trouble you again. I am using 64*64 grayscale images of Chinese characters and trying to complete (repair) them, but the completed and hats_imgs outputs hardly change.
During training, g_loss becomes much larger than d_loss, and when I use your method to complete images the loss stays very high (around 7500), even after changing the learning rate, lam, and momentum. Can you help me? I hope you can understand what I mean; my English is not good.

Getting an error while running the code as directed using TensorFlow r0.12.1

Traceback (most recent call last):

File "./complete.py", line 33, in
dcgan = DCGAN(sess, image_size=args.imgSize,checkpoint_dir=args.checkpointDir, lam=args.lam)
File "/home/anshit/Downloads/dcgan-completion.tensorflow-master/dcgan-completion.tensorflow-master/model.py", line 65, in init
self.build_model()
File "/home/anshit/Downloads/dcgan-completion.tensorflow-master/dcgan-completion.tensorflow-master/model.py", line 81, in build_model
self.D_, self.D_logits_ = self.discriminator(self.G, reuse=True)
File "/home/anshit/Downloads/dcgan-completion.tensorflow-master/dcgan-completion.tensorflow-master/model.py", line 312, in discriminator
h1 = lrelu(self.d_bn1(conv2d(h0, self.df_dim*2, name='d_h1_conv')))
File "/home/anshit/Downloads/dcgan-completion.tensorflow-master/dcgan-completion.tensorflow-master/ops.py", line 34, in call
ema_apply_op = self.ema.apply([batch_mean, batch_var])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/moving_averages.py", line 391, in apply
self._averages[var], var, decay, zero_debias=zero_debias))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/moving_averages.py", line 70, in assign_moving_average
update_delta = _zero_debias(variable, value, decay)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/moving_averages.py", line 177, in _zero_debias
trainable=False)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 1024, in get_variable
custom_getter=custom_getter)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 850, in get_variable
custom_getter=custom_getter)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 346, in get_variable
validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 331, in _true_getter
caching_device=caching_device, validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 650, in _get_single_variable
"VarScope?" % name)

ValueError: Variable d_bn1/d_bn1_2/d_bn1_2/moments/moments_1/mean/ExponentialMovingAverage/biased does not exist, or was not created with tf.get_variable().
Did you mean to set reuse=None in VarScope?

Data loss error after training a small data set

I trained the model on a smaller dataset because I am only using a CPU, and saved the model files from the generated logs into the checkpoint directory. I get the error below when trying to complete images:

W tensorflow/core/framework/op_kernel.cc:975] Data loss: Unable to open table file checkpoint/DCGAN.model-106502: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
Traceback (most recent call last):
File "./complete.py", line 34, in
dcgan.complete(args)
File "/home/anshit/Downloads/dcgan-completion.tensorflow-master/dcgan-completion.tensorflow-master/model.py", line 225, in complete
isLoaded = self.load(self.checkpoint_dir)
File "/home/anshit/Downloads/dcgan-completion.tensorflow-master/dcgan-completion.tensorflow-master/model.py", line 375, in load
self.saver.restore(self.sess, ckpt.model_checkpoint_path)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1388, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file checkpoint/DCGAN.model-106502: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2_5 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_5/tensor_names, save/RestoreV2_5/shape_and_slices)]]

Caused by op u'save/RestoreV2_5', defined at:
File "./complete.py", line 33, in
dcgan = DCGAN(sess, image_size=args.imgSize,checkpoint_dir=args.checkpointDir, lam=args.lam)
File "/home/anshit/Downloads/dcgan-completion.tensorflow-master/dcgan-completion.tensorflow-master/model.py", line 65, in init
self.build_model()
File "/home/anshit/Downloads/dcgan-completion.tensorflow-master/dcgan-completion.tensorflow-master/model.py", line 110, in build_model
self.saver = tf.train.Saver(max_to_keep=1)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1000, in init
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1030, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 624, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 361, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 200, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 441, in restore_v2
dtypes=dtypes, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1128, in init
self._traceback = _extract_stack()

DataLossError (see above for traceback): Unable to open table file checkpoint/DCGAN.model-106502: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2_5 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_5/tensor_names, save/RestoreV2_5/shape_and_slices)]]
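A hedged guess at the cause: with the V2 saver (TensorFlow 0.12 and later) a checkpoint is a prefix plus several files (.index, .data-00000-of-00001, .meta), and saver.restore expects the prefix, not one of the individual files. If the files were copied around by hand, the small text file literally named checkpoint inside the directory may also still point at the old paths. A quick sanity check:

import tensorflow as tf

# Prints what the "checkpoint" index file says should be restored; None means the
# directory has no usable index. The model_checkpoint_path it reports (for example
# checkpoint/DCGAN.model-106502) is the prefix saver.restore expects, not a .data file.
print(tf.train.get_checkpoint_state("checkpoint"))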

Error for training the DCGAN

First of all, thank you for your outstanding work.
But I have a problem.
When I execute this command: python ./train-dcgan.py --dataset ../../data/yelu_dataset/dcgan-completion.tensorflow/aligned --epoch 20
I get this error:
2017-11-08 19:30:37.367157: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key discriminator/d_bn0/moving_variance not found in checkpoint
2017-11-08 19:30:37.367189: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key discriminator/d_bn0/beta not found in checkpoint
2017-11-08 19:30:37.367417: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key discriminator/d_h4_lin/Matrix not found in checkpoint
2017-11-08 19:30:37.367494: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key discriminator/d_bn0/moving_variance not found in checkpoint
[[Node: save/RestoreV2_3 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_3/tensor_names, save/RestoreV2_3/shape_and_slices)]]
2017-11-08 19:30:37.367736: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key discriminator/d_bn0/moving_mean not found in checkpoint
2017-11-08 19:30:37.367764: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key discriminator/d_bn0/moving_variance not found in checkpoint
[[Node: save/RestoreV2_3 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_3/tensor_names, save/RestoreV2_3/shape_and_slices)]]

Thank you so much~~

Have you implemented Poisson blending in this project?

Hello,

I'm a newcomer of DCGAN.

I could not understand how Poisson blending is used in this situation when I read Yeh's paper.

I noticed that your article also describes using Poisson blending to smooth the reconstructed image.

I thought that reviewing this project might help, but I'm not sure which parts of it implement Poisson blending.

many thanks

Regards
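In case it helps while waiting for an answer: Poisson blending can also be applied as a pure post-processing step on the completed output, independently of this repository's code. Below is a minimal sketch using OpenCV's seamlessClone, assuming a completed image and its original of the same size, plus a mask marking the filled-in region; the file names and the centered-square mask are illustrative assumptions, not something this project ships.

import cv2
import numpy as np

completed = cv2.imread("outputImages/completed/0000.png")   # hypothetical output file
original  = cv2.imread("data/your-test-data/aligned/0000.png")

# 8-bit mask of the region that was generated; adjust it to the mask you actually used.
# This example marks a centered square, roughly what --maskType center produces.
mask = np.zeros(original.shape[:2], dtype=np.uint8)
h, w = mask.shape
mask[h // 4: 3 * h // 4, w // 4: 3 * w // 4] = 255

blended = cv2.seamlessClone(completed, original, mask, (w // 2, h // 2), cv2.NORMAL_CLONE)
cv2.imwrite("blended.png", blended)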

single channel images? seems not to like c_dim = 1

Any ideas regarding running the code with single channel PNGs?

2017-10-16 12:35:54.068937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-10-16 12:35:54.068969: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-10-16 12:35:54.069003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:87:00.0)
Traceback (most recent call last):
File "/opt/conda/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py", line 671, in _call_cpp_shape_fn_impl
input_tensors_as_shapes, status)
File "/opt/conda/lib/python3.5/contextlib.py", line 66, in exit
next(self.gen)
File "/opt/conda/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot reshape a tensor with 786432 elements to shape [64,8,8,8,8,1] (262144 elements) for 'Reshape_1' (op: 'Reshape') with input shapes: [64,64,64,3], [6] and with input tensors computed as partial shapes: input[1] = [64,8,8,8,8,1].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./train-dcgan.py", line 39, in
is_crop=False, checkpoint_dir=FLAGS.checkpoint_dir)
File "/root/mydocs/VMWork2016/Tests/dcgan-completion.tensorflow/model.py", line 81, in init
self.build_model()
File "/root/mydocs/VMWork2016/Tests/dcgan-completion.tensorflow/model.py", line 98, in build_model
self.lowres_size, self.lowres, self.c_dim]), [2, 4])
File "/opt/conda/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2451, in reshape
name=name)
File "/opt/conda/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/opt/conda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2508, in create_op
set_shapes_for_outputs(ret)
File "/opt/conda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1873, in set_shapes_for_outputs
shapes = shape_func(op)
File "/opt/conda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1823, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/opt/conda/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "/opt/conda/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Cannot reshape a tensor with 786432 elements to shape [64,8,8,8,8,1] (262144 elements) for 'Reshape_1' (op: 'Reshape') with input shapes: [64,64,64,3], [6] and with input tensors computed as partial shapes: input[1] = [64,8,8,8,8,1].
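For what it's worth, the element counts in the error already hint at the cause: 786432 = 64 x 64 x 64 x 3, while the target shape only holds 262144 = 64 x 64 x 64 x 1 elements. In other words, the batch being fed still has three channels even though c_dim was set to 1, so the image-loading path is most likely still reading the PNGs as RGB.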

makedirs() got an unexpected keyword argument 'exist_ok'

Hi, I have completed the training process, but when running complete.py I got this error:

File "./complete.py", line 35, in
dcgan.complete(args)
File "/home/mohammed/dcgan-completion.tensorflow/model.py", line 220, in complete
os.makedirs(os.path.join(config.outDir, 'hats_imgs'), exist_ok=True)
TypeError: makedirs() got an unexpected keyword argument 'exist_ok'

Please help me out. Thanks.
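This usually means complete.py is being run under Python 2: the exist_ok keyword was only added to os.makedirs in Python 3.2. Either run the script with Python 3, or, as a sketch of a local workaround, replace those calls with a small helper:

import errno
import os

def makedirs_p(path):
    """Like os.makedirs(path, exist_ok=True), but also works on Python 2."""
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise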

Changing batch size throws error

If I try to run the training with a smaller batch size, let's say 4, I get this error:

InvalidArgumentError (see above for traceback): Conv2DSlowBackpropInput: input and out_backprop must have the same batch size

Can the batch size be changed? Should I change any other variables?
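The batch size appears in more than one place: the generator's conv2d_transpose calls spell out output shapes whose first entry is self.batch_size, and Conv2DSlowBackpropInput complains when that constant no longer matches the batch actually being fed. Assuming that is what is happening here, one sketch of a fix is to derive the batch dimension from the input tensor at run time instead of hardcoding it:

import tensorflow as tf

def deconv_output_shape(x, height, width, channels):
    """Output shape for conv2d_transpose with the batch size taken from x at run time."""
    return tf.stack([tf.shape(x)[0], height, width, channels])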

about python version

Hi, can you tell me which Python version your code runs with? There are so many strange bugs under my Python 2.7. Also, can you tell me which TensorFlow version it targets: r0.10, 0.11, or 0.12?
Thanks very much.

Poor results

I am sorry to bother you, but I have the same problem as previous posters.
[Image: before]
I tested on the first 4 images of celebA, but after 4100 iterations I can only get this:
[Image: 4100]
I use the checkpoint file you provide, with lr=0.0001, momentum=0.9, and lam=0.1.

The log of loss is here:
(0, 5438.4629)
(50, 2176.6125)
(100, 1832.1901)
(150, 1647.3655)
(200, 1559.5751)
(250, 1565.9722)
(300, 1507.9558)
(350, 1586.2904)
(400, 1487.1926)
(450, 1490.8186)
(500, 1456.236)
(550, 1440.8938)
(600, 1424.374)
(650, 1498.3505)
(700, 1465.35)
(750, 1462.46)
(800, 1405.9574)
(850, 1421.2158)
(900, 1497.1049)
(950, 1448.3206)
(1000, 1482.5293)
(1050, 1445.6565)
(1100, 1425.0696)
(1150, 1528.2311)
(1200, 1545.6395)
(1250, 1394.381)
(1300, 1532.6442)
(1350, 1491.2181)
(1400, 1401.6718)
(1450, 1382.696)
(1500, 1414.021)
(1550, 1593.165)
(1600, 1423.8958)
(1650, 1464.3767)
(1700, 1390.084)
(1750, 1476.6337)
(1800, 1416.7273)
(1850, 1529.7141)
(1900, 1451.7915)
(1950, 1529.489)
(2000, 1481.5079)
(2050, 1436.3569)
(2100, 1358.0055)
(2150, 1358.7209)
(2200, 1506.1733)
(2250, 1393.1555)
(2300, 1408.2441)
(2350, 1483.0356)
(2400, 1377.6108)
(2450, 1516.509)
(2500, 1486.5796)
(2550, 1420.6681)
(2600, 1368.9692)
(2650, 1395.9053)
(2700, 1456.8112)
(2750, 1511.428)
(2800, 1358.7203)
(2850, 1498.2517)
(2900, 1416.1931)
(2950, 1419.6449)
(3000, 1402.9338)
(3050, 1506.2152)
(3100, 1469.8186)
(3150, 1427.0465)
(3200, 1401.9683)
(3250, 1452.6249)
(3300, 1466.2008)
(3350, 1347.4689)
(3400, 1382.7289)
(3450, 1494.2131)
(3500, 1429.3224)
(3550, 1384.4293)
(3600, 1468.8229)
(3650, 1356.1833)
(3700, 1413.0737)
(3750, 1358.0089)
(3800, 1429.6335)
(3850, 1473.816)
(3900, 1482.1053)
(3950, 1367.022)
(4000, 1408.011)
(4050, 1411.2065)
(4100, 1369.4713)

Training Error

Traceback (most recent call last):
File "./train-dcgan.py", line 41, in
dcgan.train(FLAGS)
File "/data4/hao/dcgan-completion.tensorflow/model.py", line 169, in train
sample_images = np.array(sample).astype(np.float32)
ValueError: setting an array element with a sequence.
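np.array(sample).astype(np.float32) fails with exactly this message when the images it is asked to stack do not all share the same shape (mixed sizes or channel counts), so that is the first thing worth checking. A quick diagnostic, assuming the aligned dataset location from the blog post:

import glob
import scipy.misc

shapes = set()
for path in glob.glob('data/your-dataset/aligned/*.png'):
    shapes.add(scipy.misc.imread(path).shape)
print(shapes)  # more than one entry means the sample batch mixes image shapes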

error while loading the pretrained model

When I run the demo using the pretrained model, errors happen. The log says the graph does not include d_bn1/moving_mean; I checked the graph with "print graph.get_operations()" and could not find d_bn1. I am not sure what happened... please help me.

Traceback (most recent call last):
File "complete.py", line 35, in
dcgan.complete(args)
File "/home/oeasy/pobei/dcgan-completion.tensorflow-master/model.py", line 225, in complete
isLoaded = self.load(self.checkpoint_dir)
File "/home/oeasy/pobei/dcgan-completion.tensorflow-master/model.py", line 376, in load
self.saver.restore(self.sess, ckpt.model_checkpoint_path)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1388, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Tensor name "d_bn1/moving_variance" not found in checkpoint files checkpoint/DCGAN.model-106502
[[Node: save/RestoreV2_3 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_3/tensor_names, save/RestoreV2_3/shape_and_slices)]]

Caused by op u'save/RestoreV2_3', defined at:
File "complete.py", line 34, in
checkpoint_dir=args.checkpointDir, lam=args.lam)
File "/home/oeasy/pobei/dcgan-completion.tensorflow-master/model.py", line 65, in init
self.build_model()
File "/home/oeasy/pobei/dcgan-completion.tensorflow-master/model.py", line 110, in build_model
self.saver = tf.train.Saver(max_to_keep=1)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1000, in init
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1030, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 624, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 361, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 200, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 441, in restore_v2
dtypes=dtypes, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1128, in init
self._traceback = _extract_stack()

NotFoundError (see above for traceback): Tensor name "d_bn1/moving_variance" not found in checkpoint files checkpoint/DCGAN.model-106502
[[Node: save/RestoreV2_3 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_3/tensor_names, save/RestoreV2_3/shape_and_slices)]]

Getting "in complete assert(isLoaded) AssertionError" when running complete.py on my own checkpoints directory

Hello Guys,

I have trained the model and saved the .index, .meta and .data files in a different directory. However, when running complete.py I am getting this error:
File "/home/hannahbrahman/ADV ML CMPS 290/GANS/dcgan-completion.tensorflow/model.py", line 262, in complete
assert(isLoaded)
AssertionError

It works fine with the cloned 'checkpoints' directory, though.
Did anyone have this problem before? I would appreciate any help.

By the way, this is the command I used: ./complete.py ./data/selection/* --outDir outputImages --checkpointDir mycheckpoints
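One hedged thing to check: the load step appears to go through tf.train.get_checkpoint_state, which reads the small text file literally named checkpoint inside the directory. If that file was not copied into mycheckpoints along with the .index/.meta/.data files, or it still lists the old paths, the load finds nothing and assert(isLoaded) fires.

import tensorflow as tf

# None here means no usable checkpoint index was found in the directory.
print(tf.train.get_checkpoint_state("mycheckpoints"))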

NotFoundError when restore model from the checkpoint

I met this error; does anyone know how to solve it?

File "./complete.py", line 41, in <module>
dcgan.complete(args)
File "./ImageCompletion/dcgan-completion.tensorflow/model.py", line 228, in complete
isLoaded = self.load(self.checkpoint_dir)
File "./ImageCompletion/dcgan-completion.tensorflow/model.py", line 384, in load
self.saver.restore(self.sess, ckpt.model_checkpoint_path)
File "./anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1388, in restore
{self.saver_def.filename_tensor_name: save_path})
NotFoundError (see above for traceback): Tensor name "d_bn3_1/scratch/d_bn3_1/scratch/moments/moments_1/variance/ExponentialMovingAverage" not found in checkpoint files checkpoint/DCGAN.model-106502
[[Node: save/RestoreV2_15 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_15/tensor_names, save/RestoreV2_15/shape_and_slices)]]
[[Node: save/RestoreV2_37/_141 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_16_save/RestoreV2_37", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Sorry, I don't know how to start a new training run

I deleted the checkpoint directory and ran:
./train-dcgan.py --dataset ./data/your-dataset/aligned --epoch 20

but every time it tells me:
"An existing model was not found in the checkpoint directory. Initializing a new one"

So how do I start training a new model, or how do I initialize a new one?

Can you help me?

Can this repo be used for denoising?

I'm trying to use this repo as a denoising tool, but I noticed that in model.py's complete method, the generated output is still compared against the original input from the aligned folder. Does that mean the method cannot detect where the noise is on its own, and the noisy region has to be pointed out by the mask you define?

Inpainting for different missing areas?

Can this be modified so it can inpaint an image that has multiple missing blocks? E.g., inpaint everything that is white (0xFFFFFF) in the image?

I mean, how can I use a custom mask image? Can each test input have its own mask, or is the same mask used for the entire test set?
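The built-in --maskType options are constructed as NumPy arrays in complete.py, as far as I can tell, so a custom, per-image mask mostly comes down to building that array from your own image instead. A sketch, assuming the same convention as the built-in masks (1.0 = pixel is known and kept, 0.0 = pixel to inpaint) and a hypothetical mask PNG, already sized to the model input, in which white marks the missing region:

import numpy as np
import scipy.misc

def mask_from_image(path, img_size=64):
    """Per-image completion mask from a white-means-missing PNG (hypothetical helper)."""
    m = scipy.misc.imread(path, mode='L').astype(np.float32) / 255.0
    mask = np.ones((img_size, img_size, 3), dtype=np.float32)
    mask[m > 0.5] = 0.0   # white pixels in the mask image -> region to inpaint
    return mask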

Where are the imgs files?

Hi, thanks for your work.
I came across the error: the following arguments are required: imgs.
So, where are the imgs files? If they need to be created by hand, what images should be used?
Just common face images?
Thanks.

ValueError: Variable d_bn1/d_bn1_2/moments/moments_1/mean/ExponentialMovingAverage/ does not exist

When I train the model, errors happen... please help me.

Traceback (most recent call last):
File "./train-dcgan.py", line 39, in
is_crop=False, checkpoint_dir=FLAGS.checkpoint_dir)
File "/home/page/wp/dcgan-completion.tensorflow/model.py", line 65, in init
self.build_model()
File "/home/page/wp/dcgan-completion.tensorflow/model.py", line 81, in build_model
self.D_, self.D_logits_ = self.discriminator(self.G, reuse=False)
File "/home/page/wp/dcgan-completion.tensorflow/model.py", line 312, in discriminator
h1 = lrelu(self.d_bn1(conv2d(h0, self.df_dim*2, name='d_h1_conv')))
File "/home/page/wp/dcgan-completion.tensorflow/ops.py", line 34, in call
ema_apply_op = self.ema.apply([batch_mean, batch_var])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/moving_averages.py", line 375, in apply
colocate_with_primary=(var.op.type in ["Variable", "VariableV2"]))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 123, in create_zeros_slot
colocate_with_primary=colocate_with_primary)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 103, in create_slot
return _create_slot_var(primary, val, '')
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 55, in _create_slot_var
slot = variable_scope.get_variable(scope, initializer=val, trainable=False)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 988, in get_variable
custom_getter=custom_getter)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 890, in get_variable
custom_getter=custom_getter)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 348, in get_variable
validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 333, in _true_getter
caching_device=caching_device, validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 657, in _get_single_variable
"VarScope?" % name)
ValueError: Variable d_bn1/d_bn1_2/moments/moments_1/mean/ExponentialMovingAverage/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

In model.py, line 420, the dimension differs from the one in the original paper

Hi, Bamos,

In line 420 of model.py, self.z_, self.h0_w, self.h0_b = linear(z, self.gf_dim * 8 * 4 * 4, 'g_h0_lin', with_w=True)

self.gf_dim is 64, so the dimension of the output is 64 * 8 * 4 * 4, i.e. 512 * 4 * 4. In the DCGAN paper the first layer is 1024 * 4 * 4. Is there a reason you changed it from 1024 to 512?

Thanks

Adapt the code for custom input format (different input size / channels)

First of all. Thanks a lot for sharing your code!

Do you have a suggestion for which parts of the code I would need to adapt to make it work with 1-channel images that are larger than 64x64 pixels?
I tried just changing the image_size parameter, but that produces an error unless I set is_crop to True, which in turn reduces the input back to 64x64.

I assume I have to modify the get_image function?

def imread(path,c_dim):
    if c_dim == 3:
        im  = scipy.misc.imread(path, mode='RGB').astype(np.float)
    elif c_dim == 1:
        im  = scipy.misc.imread(path, mode='L').astype(np.float)
    else:
        print('Error: c_dim must be either 3 or 1')
            
    return im

but it still gives me an error:

ValueError: Cannot feed value of shape (64, 64, 128) for Tensor u'gt_labels:0', which has shape '(?, 64, 128, 1)'

Maybe I need to add a singleton dimension for the channels...
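That guess looks right: the feed has shape (64, 64, 128) while the placeholder expects (?, 64, 128, 1), so the grayscale images are simply missing their trailing channel axis. A minimal sketch of the adjustment, keeping the imread above otherwise the same:

import numpy as np
import scipy.misc

def imread_gray(path):
    """Grayscale image as float with an explicit channel axis: (H, W, 1)."""
    im = scipy.misc.imread(path, mode='L').astype(np.float)
    return im[:, :, None]   # equivalently: np.expand_dims(im, axis=-1)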

Image completion is incorrectly using batch norm in training mode

A minor bug, the easiest fix is to use the sampler instead of the generator, which sets train=False in the batch norm. A better solution would be to use tflearn's batch normalization code with their is_training variable so that there doesn't need to be a generator and sampler.
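A sketch of that second option, using tf.contrib.layers.batch_norm with an is_training placeholder instead of tflearn (my substitution, not the repository's actual ops.py); completion would then feed is_training=False so the moving statistics are used:

import tensorflow as tf

is_training = tf.placeholder(tf.bool, name='is_training')

def batch_norm(x, name):
    """Batch norm that switches to moving statistics when is_training is fed as False."""
    return tf.contrib.layers.batch_norm(
        x, decay=0.9, epsilon=1e-5, scale=True,
        is_training=is_training, scope=name)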

error while loading the pretrained model

When I run the demo using the pretrained model, errors happen. The log says the graph does not include d_bn1/moving_variance; I checked the graph with "print graph.get_operations()" and could not find d_bn1/moving_variance. I am not sure what happened... please help me.

Traceback (most recent call last):
File "complete.py", line 35, in
dcgan.complete(args)
File "/home/oeasy/pobei/dcgan-completion.tensorflow-master/model.py", line 225, in complete
isLoaded = self.load(self.checkpoint_dir)
File "/home/oeasy/pobei/dcgan-completion.tensorflow-master/model.py", line 376, in load
self.saver.restore(self.sess, ckpt.model_checkpoint_path)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1388, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Tensor name "d_bn1/moving_variance" not found in checkpoint files checkpoint/DCGAN.model-106502
[[Node: save/RestoreV2_3 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_3/tensor_names, save/RestoreV2_3/shape_and_slices)]]

Caused by op u'save/RestoreV2_3', defined at:
File "complete.py", line 34, in
checkpoint_dir=args.checkpointDir, lam=args.lam)
File "/home/oeasy/pobei/dcgan-completion.tensorflow-master/model.py", line 65, in init
self.build_model()
File "/home/oeasy/pobei/dcgan-completion.tensorflow-master/model.py", line 110, in build_model
self.saver = tf.train.Saver(max_to_keep=1)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1000, in init
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1030, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 624, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 361, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 200, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 441, in restore_v2
dtypes=dtypes, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1128, in init
self._traceback = _extract_stack()

NotFoundError (see above for traceback): Tensor name "d_bn1/moving_variance" not found in checkpoint files checkpoint/DCGAN.model-106502
[[Node: save/RestoreV2_3 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_3/tensor_names, save/RestoreV2_3/shape_and_slices)]]

error while using complete.py

[Screenshots: 2018-03-29 17-39-14 and 2018-03-29 17-39-22]
Thanks for sharing your work.
I had this error while trying to use complete.py on the eval images of the celebA dataset.
Your help is really appreciated.

Changing image_size to 256 triggers error

I can't seem to change the image size or the number of initial filters; if I do, I get this error:

F C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\stream_executor\cuda\cuda_dnn.cc:430] could not convert BatchDescriptor {count: 4 feature_map_count: 0 spatial: 129 129 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX} to cudnn tensor descriptor: CUDNN_STATUS_BAD_PARAM

Disappointing result for the demo with 3 images.

I ran the demo from your blog post with 3 images, but the result is disappointing:

[Image: completion_problem]

My input images are:
[Images: face3, face2, obama]

I did not re-train the model; I directly ran the code:

./openface/util/align-dlib.py data/dcgan-completion.tensorflow/data/your-dataset/raw align innerEyesAndBottomLip data/dcgan-completion.tensorflow/data/your-dataset/aligned --size 64

./complete.py ./data/your-test-data/aligned/* --outDir outputImages

cd outputImages
convert -delay 10 -loop 0 completed/*.png completion.gif

error with tf 0.12

ValueError: Variable discriminator/d_bn1/discriminator_1/d_bn1/discriminator_1/d_bn1/moments/moments_1/mean/ExponentialMovingAverage/biased does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

Error when running the DCGAN training following the blog http://bamos.github.io/2016/08/09/deep-completion/

Hello, thanks for your awesome tutorial at http://bamos.github.io/2016/08/09/deep-completion/.

I tried to follow the "Running DCGAN on your images" section, but I ran into a problem.

I am working with Python 2 and TensorFlow 1.1.

1. When I try to run the alignment step (screenshot attached), I cannot see any result in the aligned folder; it is empty. It is important to mention that I have installed all the required libraries.

2. The DCGAN training works perfectly.

Thank you so much for your help

Regards :)

Key generator/g_h5/w not found in checkpoint

When I run complete.py on some chosen 64x64 pictures, it raises a NotFoundError: Key generator/g_h5/w not found in checkpoint. Does anyone know what is wrong with it? Please help me out.
Addition: I found I cannot train on a new dataset with train-dcgan.py, which is very weird.
