
glow's Introduction

Status: Archive (code is provided as-is, no updates expected)

Glow

Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"

To use the pretrained CelebA-HQ model, make your own manipulation vectors, or run our interactive demo, check the demo folder.

Requirements

  • Tensorflow (tested with v1.8.0)
  • Horovod (tested with v0.13.8) and (Open)MPI

Run

pip install -r requirements.txt

To set up (Open)MPI, check the instructions on the Horovod GitHub page.

Download datasets

For small scale experiments, use MNIST/CIFAR-10 (directly downloaded by train.py using keras)

For larger scale experiments, the datasets used are available at https://openaipublic.azureedge.net/glow-demo/data/{dataset_name}-tfr.tar. The dataset_names are listed below; we mention the exact preprocessing / downsampling method for a correct comparison of likelihood.

Quantitative results

  • imagenet-oord - 20GB. Unconditional ImageNet 32x32 and 64x64, as described in PixelRNN/RealNVP papers (we downloaded this processed version).
  • lsun_realnvp - 140GB. LSUN 96x96. Random 64x64 crops taken at processing time, as described in RealNVP.

Qualitative results

  • celeba - 4GB. CelebA-HQ 256x256 dataset, as described in Progressive growing of GAN's. For 1024x1024 version (120GB), use celeba-full-tfr.tar while downloading.
  • imagenet - 20GB. ImageNet 32x32 and 64x64 with class labels. Centre cropped, area downsampled.
  • lsun - 700GB. LSUN 256x256. Centre cropped, area downsampled.
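The URL pattern and dataset names above can be combined mechanically. The following is a minimal sketch, assuming the pattern applies unchanged to every listed name; `dataset_url` is a hypothetical helper, and whether each resulting URL is still served has not been checked.

```python
# Sketch: expand the URL pattern from the text for each listed dataset name.
# dataset_url is a hypothetical helper, not part of the repository.
BASE_URL = "https://openaipublic.azureedge.net/glow-demo/data/{dataset_name}-tfr.tar"

# Names exactly as they appear in the two lists above.
DATASET_NAMES = ["imagenet-oord", "lsun_realnvp", "celeba", "imagenet", "lsun"]


def dataset_url(dataset_name):
    """Fill the {dataset_name} placeholder of the documented pattern."""
    return BASE_URL.format(dataset_name=dataset_name)


for name in DATASET_NAMES:
    print(dataset_url(name))
```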

To download and extract celeba for example, run

wget https://openaipublic.azureedge.net/glow-demo/data/celeba-tfr.tar
tar -xvf celeba-tfr.tar

Change hps.data_dir in train.py file to point to the above folder (or use the --data_dir flag when you run train.py)

For lsun, since download can be quite big, you can instead follow the instructions in data_loaders/generate_tfr/lsun.py to generate the tfr file directly from LSUN images. church_outdoor will be the smallest category.

Simple Train with 1 GPU

Run with small depth to test

CUDA_VISIBLE_DEVICES=0 python train.py --depth 1

Train with multiple GPUs using MPI and Horovod

Run default training script with 8 GPUs:

mpiexec -n 8 python train.py

Ablation experiments

mpiexec -n 8 python train.py --problem cifar10 --image_size 32 --n_level 3 --depth 32 --flow_permutation [0/1/2] --flow_coupling [0/1] --seed [0/1/2] --learntop --lr 0.001
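The bracketed values in the ablation command appear to denote alternatives, one per run. A minimal sketch of enumerating the full grid, assuming every combination of the three bracketed flags is intended (`ablation_commands` is a hypothetical helper):

```python
import itertools

# Command template copied from the README; only the bracketed flags vary.
BASE = ("mpiexec -n 8 python train.py --problem cifar10 --image_size 32 "
        "--n_level 3 --depth 32 --flow_permutation {perm} "
        "--flow_coupling {coupling} --seed {seed} --learntop --lr 0.001")


def ablation_commands():
    """Return one command string per (permutation, coupling, seed) choice."""
    grid = itertools.product([0, 1, 2], [0, 1], [0, 1, 2])
    return [BASE.format(perm=p, coupling=c, seed=s) for p, c, s in grid]


for command in ablation_commands():
    print(command)
```

This only prints the commands; running them is left to the shell or a job scheduler.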

Pretrained models, logs and samples

wget https://openaipublic.azureedge.net/glow-demo/logs/abl-[reverse/shuffle/1x1]-[add/aff].tar

CIFAR-10 Quantitative result

mpiexec -n 8 python train.py --problem cifar10 --image_size 32 --n_level 3 --depth 32 --flow_permutation 2 --flow_coupling 1 --seed 0 --learntop --lr 0.001 --n_bits_x 8

ImageNet 32x32 Quantitative result

mpiexec -n 8 python train.py --problem imagenet-oord --image_size 32 --n_level 3 --depth 48 --flow_permutation 2 --flow_coupling 1 --seed 0 --learntop --lr 0.001 --n_bits_x 8

ImageNet 64x64 Quantitative result

mpiexec -n 8 python train.py --problem imagenet-oord --image_size 64 --n_level 4 --depth 48 --flow_permutation 2 --flow_coupling 1 --seed 0 --learntop --lr 0.001 --n_bits_x 8

LSUN 64x64 Quantitative result

mpiexec -n 8 python train.py --problem lsun_realnvp --category [bedroom/church_outdoor/tower] --image_size 64 --n_level 3 --depth 48 --flow_permutation 2 --flow_coupling 1 --seed 0 --learntop --lr 0.001 --n_bits_x 8

Pretrained models, logs and samples

wget https://openaipublic.azureedge.net/glow-demo/logs/lsun-rnvp-[bdr/crh/twr].tar

CelebA-HQ 256x256 Qualitative result

mpiexec -n 40 python train.py --problem celeba --image_size 256 --n_level 6 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5

LSUN 96x96 and 128x128 Qualitative result

mpiexec -n 40 python train.py --problem lsun --category [bedroom/church_outdoor/tower] --image_size [96/128] --n_level 5 --depth 64 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5

Logs and samples

wget https://openaipublic.azureedge.net/glow-demo/logs/lsun-bdr-[96/128].tar

Conditional CIFAR-10 Qualitative result

mpiexec -n 8 python train.py --problem cifar10 --image_size 32 --n_level 3 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5 --ycond --weight_y=0.01

Conditional ImageNet 32x32 Qualitative result

mpiexec -n 8 python train.py --problem imagenet --image_size 32 --n_level 3 --depth 48 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5 --ycond --weight_y=0.01

glow's People

Contributors

1jsingh, christopherhesse, dpkingma, prafullasd


glow's Issues

load graph_optimized.pb error no EdgeBias

I installed blocksparse with pip, but I still get this error:
raise ValueError('No op named %s in defined operations.' % node.op)
ValueError: No op named EdgeBias in defined operations.

Log determinant term off by constant for 5-bit images

Hi,

I understand that this might be a small issue and that you guys don't report bits/dim for celebA-HQ (256x256x3), but I think it's worth noting that the log-determinant might be off by a constant since we have an extra division in the preprocessing step (glow/model.py, line 156 in 654ddd0):

x = tf.floor(x / 2 ** (8 - hps.n_bits_x))

If I'm not mistaken, this term is not added to the objective later on. Could you verify this? This might be of interest to future work that intends to report the numbers for celebA-HQ (256x256x3).

Thanks,
Chen
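To make the size of the constant in question concrete: if the division by 2 ** (8 - n_bits_x) is treated as a per-dimension rescaling, the log-determinant of that rescaling has magnitude (8 - n_bits_x) * log(2) nats per dimension. The sketch below only computes that magnitude; whether and with which sign it belongs in the objective is exactly what the post asks, and both function names are hypothetical.

```python
import math


def rescaling_constant_nats(n_bits_x, n_dims):
    """Magnitude of log|det| for dividing n_dims values by 2**(8 - n_bits_x)."""
    return n_dims * (8 - n_bits_x) * math.log(2)


def rescaling_constant_bits_per_dim(n_bits_x, n_dims):
    """Same constant expressed in bits per dimension."""
    return rescaling_constant_nats(n_bits_x, n_dims) / (n_dims * math.log(2))


# celebA-HQ 256x256x3 with 5-bit inputs, the case discussed in the post.
n_dims = 256 * 256 * 3
print(rescaling_constant_bits_per_dim(5, n_dims))
```

For 8-bit inputs the divisor is 1 and the constant vanishes, which is consistent with the post singling out the 5-bit case.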

How many epochs does it take to train celeba?

Hi, I'm trying to train a model on the celeba dataset with your code. I trained the model with the command below:
mpiexec -n 8 python train.py --problem celeba --image_size 256 --n_level 6 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5

I only changed the number of GPUs to 8 because my server has only 8 cards. I have now trained for 4 days and 1300 epochs, but the results do not seem good when I check the example images generated by the code. Could you tell me how many epochs it took when you trained on the celeba dataset? @prafullasd

python model.py File "model.py", line 146 feps[:, index: index+np.prod(shape)], (bs, *shape))) ^ SyntaxError: invalid syntax (base) bash-3.2$ python3 model.py bash: python3: command not found (base) bash-3.2$ python model.py File "model.py", line 146 feps[:, index: index+np.prod(shape)], (bs, *shape))) ^ SyntaxError: invalid syntax

python model.py
File "model.py", line 146
feps[:, index: index+np.prod(shape)], (bs, *shape)))
^
SyntaxError: invalid syntax
(base) bash-3.2$ python3 model.py
bash: python3: command not found
(base) bash-3.2$ python model.py
File "model.py", line 146
feps[:, index: index+np.prod(shape)], (bs, *shape)))
^
SyntaxError: invalid syntax

Typo in the source code

It seems

if arg_scope([get_variable_ddi], trainable=trainable):

should be

with arg_scope([get_variable_ddi], trainable=trainable):

glow/tfops.py

Line 72 in 654ddd0

if arg_scope([get_variable_ddi], trainable=trainable):
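The difference between the two spellings matters because `if` only tests the truthiness of the context-manager object and never enters it. A minimal sketch using a stand-in context manager (`scope` is hypothetical and is not TensorFlow's arg_scope):

```python
from contextlib import contextmanager

events = []


@contextmanager
def scope(name):
    # Stand-in for a scope-like context manager; records enter and exit.
    events.append("enter " + name)
    try:
        yield
    finally:
        events.append("exit " + name)


def run_with_if():
    if scope("if"):          # creates the manager object but never enters it
        events.append("body if")


def run_with_with():
    with scope("with"):      # enters, runs the body, then exits
        events.append("body with")


run_with_if()
run_with_with()
print(events)
```

Under `if`, the body still runs because the manager object is truthy, but the scope's setup and teardown are skipped, so any defaults the scope was meant to apply would not take effect.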

The purpose of the logscale_factor=3. in the actnorm function

Hello, I would like to ask what is the purpose of the logscale_factor in the actnorm function here?
I couldn't find any reference in the paper which would explain the reason for this variance modification. As far as I understand this implementation, we recover the paper description by setting logscale_factor=1. It is also clear to me that it just affects the initialization step, but it is interesting to know if this is some kind of trick which helped you or something else. Thanks for feedback.
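One way to reason about the question is to write out what a multiplicative factor on a log-scale parameter does. The sketch below assumes the effective log-scale is `param * logscale_factor` and that the initial `param` is divided by the same factor; this parameterization is an assumption made for illustration and is not confirmed by the text above. All function names are hypothetical.

```python
import math


def effective_logs(param, logscale_factor):
    """Assumed parameterization: the stored parameter is multiplied by the factor."""
    return param * logscale_factor


def init_param(target_scale, logscale_factor):
    """Initial stored value so that exp(effective logs) equals target_scale."""
    return math.log(target_scale) / logscale_factor


def sgd_change_in_logs(grad_logs, lr, logscale_factor):
    """Change of the effective logs after one plain gradient-descent step."""
    grad_param = logscale_factor * grad_logs   # chain rule through the multiply
    delta_param = -lr * grad_param
    return delta_param * logscale_factor


# Same initial effective value for either factor ...
print(effective_logs(init_param(0.5, 3.0), 3.0), effective_logs(init_param(0.5, 1.0), 1.0))
# ... but a different step size in the effective logs afterwards.
print(sgd_change_in_logs(1.0, 0.1, 3.0), sgd_change_in_logs(1.0, 0.1, 1.0))
```

Under these assumptions the factor leaves the initial output unchanged and rescales later updates; with plain gradient descent the rescaling is quadratic in the factor, and an adaptive optimizer would behave differently.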

How to solve 'Input is not Invertible error'?

I am trying to train a GLOW mapping on a custom dataset. However while training, I frequently receive a tensorflow.python.framework.errors_impl.InvalidArgumentError: Input is not invertible error. Upon seeing the logs, I see that the training/validation stats have reached either inf or nan.

I then tried to just reproduce your results for celeba 256x256 Qualitatively. However, I still face such issues. I am lost as to how to debug. I downloaded the celeba-tfr dataset locally.

Command:

python train.py --problem celeba --image_size 256 --n_level 6 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5 --data_dir=./celeba-tfr --verbose --epochs_full_valid=1 --epochs_full_sample=1 --n_train=30 --n_test=30

Namespace:

Namespace(anchor_size=32, beta1=0.9, category='', dal=1, data_dir='./celeba-tfr', depth=32, direct_iterator=True, epochs=1000000, epochs_full_sample=1, epochs_full_valid=1, epochs_warmup=10, flow_coupling=0, flow_permutation=2, fmap=1, full_test_its=30, gradient_checkpointing=1, image_size=256, inference=False, learntop=True, local_batch_init=4, local_batch_test=1, local_batch_train=1, logdir='./logs', lr=0.001, n_batch_init=256, n_batch_test=50, n_batch_train=64, n_bins=32.0, n_bits_x=5, n_levels=6, n_sample=1, n_test=30, n_train=30, n_y=1, optimizer='adamax', pmap=16, polyak_epochs=1, problem='celeba', restore_path='', rnd_crop=False, seed=0, test_its=1, top_shape=[4, 4, 384], train_its=1, verbose=True, weight_decay=1.0, weight_y=0.0, width=512, ycond=False)

Trace:

Starting training. Logging to /home/ubuntu/glow_/logs/
epoch n_processed n_images ips dtrain dtest dsample dtot train_results test_results msg
0 179.9140625 [2.5411766 2.5411766 0.        1.       ]
1 64 1 0.0 179.9 88.8 177.1 445.7 [2.5411766 2.5411766 0.        1.       ] [2.7737396 2.7737396 0.        1.       ]  *
64 5.25806736946106 [2.6743338 2.6743338 0.        1.       ]
2 128 2 0.2 5.3 36.1 161.6 203.0 [2.6743338 2.6743338 0.        1.       ] [nan nan  0.  1.]
128 4.962073087692261 [nan nan  0.  1.]
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input is not invertible.
         [[Node: model_3/1/28/invconv/MatrixInverse = MatrixInverse[T=DT_FLOAT, adjoint=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model/1/28/invconv/W/read)]]
         [[Node: model_3/5/6/f1/l_1/Shape/_79621 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_9830_model_3/5/6/f1/l_1/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 414, in <module>
    main(hps)
  File "train.py", line 163, in main
    train(sess, model, hps, logdir, visualise)
  File "train.py", line 274, in train
    visualise(epoch)
  File "train.py", line 50, in draw_samples
    x_samples.append(sample_batch(y, [.0]*n_batch))
  File "train.py", line 33, in sample_batch
    y[i*n_batch:i*n_batch + n_batch], eps[i*n_batch:i*n_batch + n_batch]))
  File "/home/ubuntu/glow_/model.py", line 242, in sample
    return m.sess.run(x_sampled, {Y: _y, m.eps_std: _eps_std})
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input is not invertible.
         [[Node: model_3/1/28/invconv/MatrixInverse = MatrixInverse[T=DT_FLOAT, adjoint=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model/1/28/invconv/W/read)]]
         [[Node: model_3/5/6/f1/l_1/Shape/_79621 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_9830_model_3/5/6/f1/l_1/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'model_3/1/28/invconv/MatrixInverse', defined at:
  File "train.py", line 414, in <module>
    main(hps)
  File "train.py", line 156, in main
    model = model.model(sess, hps, train_iterator, test_iterator, data_init)
  File "/home/ubuntu/glow_/model.py", line 239, in model
    x_sampled = f_sample(Y, m.eps_std)
  File "/home/ubuntu/glow_/model.py", line 232, in f_sample
    z = decoder(z, eps_std=eps_std)
  File "/home/ubuntu/glow_/model.py", line 97, in decoder
    z, _ = revnet2d(str(i), z, 0, hps, reverse=True)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "/home/ubuntu/glow_/model.py", line 342, in revnet2d
    z, logdet = revnet2d_step(str(i), z, logdet, hps, reverse)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "/home/ubuntu/glow_/model.py", line 411, in revnet2d_step
    "invconv", z, logdet, reverse=True)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "/home/ubuntu/glow_/model.py", line 467, in invertible_1x1_conv
    _w = tf.matrix_inverse(w)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/gen_linalg_ops.py", line 1049, in matrix_inverse
    "MatrixInverse", input=input, adjoint=adjoint, name=name)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Input is not invertible.
         [[Node: model_3/1/28/invconv/MatrixInverse = MatrixInverse[T=DT_FLOAT, adjoint=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model/1/28/invconv/W/read)]]
         [[Node: model_3/5/6/f1/l_1/Shape/_79621 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_9830_model_3/5/6/f1/l_1/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

I suspected that a bad learning rate might be making the kernel non-invertible, so I tried lower learning rates, but that did not help.
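In the trace above the test results already read nan at epoch 2, before the MatrixInverse failure is raised during sampling. A minimal sketch of a guard that reports the first epoch with non-finite statistics, which may help narrow down where the values diverge; `first_non_finite` is a hypothetical helper and the sample values are copied from the trace.

```python
import math


def first_non_finite(stats_by_epoch):
    """Return the first epoch whose stats contain nan or inf, else None."""
    for epoch, stats in stats_by_epoch:
        if not all(math.isfinite(value) for value in stats):
            return epoch
    return None


# (epoch, train_results + test_results) as printed in the trace above.
trace = [
    (1, [2.5411766, 2.5411766, 0.0, 1.0, 2.7737396, 2.7737396, 0.0, 1.0]),
    (2, [2.6743338, 2.6743338, 0.0, 1.0, float("nan"), float("nan"), 0.0, 1.0]),
]

print(first_non_finite(trace))
```

Stopping at that epoch, rather than continuing to the sampling step, would surface the divergence before the matrix inverse is attempted.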

[Question] Training Time

How much time should I expect Glow to train for on ImageNet (224x224) using 1 machine with 8 V100s? Are there any time benchmarks available?

Implementation vs paper

Hi,

First of all, thanks for this amazing work, it has been a pleasure to dive in the paper !

Now, more precisely: when looking at the implementation of the actnorm module, I can't understand the choice made, considering the paper.
In the paper you state that you used an affine transformation of the activations with parameters s and b.

But the implementation seems to first add the bias b: x = x + b (with actnorm_center), then multiply by s: x = s * (x + b) (with actnorm_scale).

You reverse the order when reverse = True, but I feel this might be the opposite.

I surely miss something as you manage to train the model but I am curious about this choice.

Do I miss something?

Thank you in advance for your help !

Victor
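The two orderings discussed above describe the same family of affine maps, since s * (x + b) = s * x + s * b. A minimal scalar sketch, assuming s is the final multiplicative scale (all function names are hypothetical):

```python
def center_then_scale(x, s, b):
    """Forward pass as described for the implementation: s * (x + b)."""
    return s * (x + b)


def unscale_then_uncenter(y, s, b):
    """Inverse of center_then_scale: undo the scale first, then the bias."""
    return y / s - b


def scale_then_shift(x, s, b_paper):
    """Forward pass in the paper's form: s * x + b."""
    return s * x + b_paper


x, s, b = 1.5, 2.0, 0.25
y = center_then_scale(x, s, b)
print(y, scale_then_shift(x, s, s * b), unscale_then_uncenter(y, s, b))
```

With b_paper = s * b the two forward passes agree, and the log-determinant depends only on s in either form, so the choice of ordering is a reparameterization of the bias rather than a different transformation.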

Image domain adaptation

Hey, I've read your paper and the results are awesome.
Just out of curiosity, will this model be suited for tasks like image domain adaptation? Or maybe segmentation adaptation?

Syntax Error

When I use the following command on the Cifar10 python-oriented dataset:

mpiexec -n 1 python train.py --problem cifar10 --image_size 32 --n_level 3 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5 --ycond --weight_y=0.01

I just get the following error:

File "train.py", line 23
print(*args, **kwargs)
^
SyntaxError: invalid syntax

Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.


mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[25954,1],0]
Exit code: 1

Any ideas as to what could be happening? I've not modified the code in any way, shape, or form.
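A SyntaxError on `print(*args, **kwargs)` is what a Python 2 interpreter reports, because there `print` is a statement unless the print function is imported from `__future__`. A minimal sketch of a check to run with the same `python` that mpiexec launches; it is written to parse under both Python 2 and 3, and `is_python3` is a hypothetical helper.

```python
import sys


def is_python3(version_info):
    """True when the interpreter treats print as a function by default."""
    return version_info[0] >= 3


# sys.stdout.write is used so this file itself parses on Python 2.
sys.stdout.write("Python %d.%d\n" % (sys.version_info[0], sys.version_info[1]))
sys.stdout.write("print is a function: %s\n" % is_python3(sys.version_info))
```

If this reports Python 2, pointing the command at a Python 3 interpreter is the likely remedy.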

How long does it take to train a cifar-10 glow model from scratch?

Hi,

I am trying to train the glow model on CIFAR-10.
I only have one gpu.
It seems that it takes me one day to train only 50 epochs.
Is that normal?
I notice that in the paper, the CIFAR-10 results shown are at epoch 1800
(which means it may take me a month to train the glow model :( !).
I would like to know whether it is possible for me to reproduce this result with only a single GPU.

Any comments would be appreciated. Thanks very much!
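The estimate in the post can be written out directly. The sketch below uses only the figures quoted above (50 epochs per day, 1800 epochs); the `n_gpus` argument assumes ideal linear speedup, which is an optimistic assumption and not a measured result.

```python
def days_to_train(target_epochs, epochs_per_day, n_gpus=1):
    """Days needed, assuming throughput scales linearly with n_gpus."""
    return target_epochs / float(epochs_per_day * n_gpus)


print(days_to_train(1800, 50))            # single GPU, figures from the post
print(days_to_train(1800, 50, n_gpus=8))  # hypothetical ideal 8-GPU scaling
```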

Training CelebA with attributes

I am trying to train celebA with attributes, but it looks like training with attributes is not well supported in the code. I made the appropriate changes (per the comments) to function parse_tfrecord_tf in get_data.py to retrieve and return attr, and also modified train.py function infer like so:

        if hps.direct_iterator:
            # replace with x, y, attr if you're getting CelebA attributes...
            x, y, attr = sess.run(iterator)
        else:
            x, y, attr = iterator()

It appears these changes are not sufficient, as I get errors in model.py's make_batch function. I fix that by retrieving attr from sess.run(data) and modifying the function output, but that only kicks the error to f_loss, so I make what I think are the appropriate extensions to accommodate the attributes, but a new error shows up somewhere else, and with each change I'm making, the more uncertain I am that I'm doing the right thing.

Am I doing something wrong? Can someone advise on how to train with attributes, or is this functionality just not fully supported?

Thanks!

Training stops without an error

I was running training for 2 days and it stopped without an error. I had reached epoch 31 for the celeba problem.

nohup python -u train.py --problem celeba --image_size 256 --n_level 6 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5 --data_dir /mnt/celeba/mnt/host/celeba-reshard-tfr/ &

Updated os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1', maybe it will be more verbose this time.

cat logs/train.txt
{"n_batch_init": 256, "flow_coupling": 0, "weight_y": 0.0, "restore_path": "", "verbose": false, "n_batch_test": 50, "n_batch_train": 64, "anchor_size": 32, "epochs": 1000000, "epochs_warmup": 10, "n_bits_x": 5, "rnd_crop": false, "category": "", "depth": 32, "learntop": true, "n_bins": 32.0, "beta1": 0.9, "n_train": 50000, "n_test": 3000, "seed": 0, "logdir": "./logs", "lr": 0.001, "full_test_its": 3000, "ycond": false, "n_levels": 6, "fmap": 1, "dal": 1, "top_shape": [4, 4, 384], "local_batch_test": 1, "local_batch_train": 1, "optimizer": "adamax", "polyak_epochs": 1, "pmap": 16, "weight_decay": 1.0, "n_sample": 1, "test_its": 47, "image_size": 256, "data_dir": "/mnt/celeba/mnt/host/celeba-reshard-tfr/", "train_its": 782, "epochs_full_sample": 50, "n_y": 1, "width": 512, "problem": "celeba", "epochs_full_valid": 50, "local_batch_init": 4, "direct_iterator": true, "gradient_checkpointing": 1, "flow_permutation": 2}
{"pred_loss": "1.0000", "train_time": 4076, "bits_x": "2.0117", "n_processed": 50048, "loss": "2.0117", "epoch": 1, "bits_y": "0.0000", "n_images": 782}
{"pred_loss": "1.0000", "train_time": 7960, "bits_x": "1.4431", "n_processed": 100096, "loss": "1.4431", "epoch": 2, "bits_y": "0.0000", "n_images": 1564}
{"pred_loss": "1.0000", "train_time": 11833, "bits_x": "1.3894", "n_processed": 150144, "loss": "1.3894", "epoch": 3, "bits_y": "0.0000", "n_images": 2346}
{"pred_loss": "1.0000", "train_time": 15698, "bits_x": "1.3369", "n_processed": 200192, "loss": "1.3369", "epoch": 4, "bits_y": "0.0000", "n_images": 3128}
{"pred_loss": "1.0000", "train_time": 19555, "bits_x": "1.3023", "n_processed": 250240, "loss": "1.3023", "epoch": 5, "bits_y": "0.0000", "n_images": 3910}
{"pred_loss": "1.0000", "train_time": 23406, "bits_x": "1.2827", "n_processed": 300288, "loss": "1.2827", "epoch": 6, "bits_y": "0.0000", "n_images": 4692}
{"pred_loss": "1.0000", "train_time": 27373, "bits_x": "1.2652", "n_processed": 350336, "loss": "1.2652", "epoch": 7, "bits_y": "0.0000", "n_images": 5474}
{"pred_loss": "1.0000", "train_time": 31235, "bits_x": "1.2522", "n_processed": 400384, "loss": "1.2522", "epoch": 8, "bits_y": "0.0000", "n_images": 6256}
{"pred_loss": "1.0000", "train_time": 35093, "bits_x": "1.2383", "n_processed": 450432, "loss": "1.2383", "epoch": 9, "bits_y": "0.0000", "n_images": 7038}
{"pred_loss": "1.0000", "train_time": 38955, "bits_x": "1.2361", "n_processed": 500480, "loss": "1.2361", "epoch": 10, "bits_y": "0.0000", "n_images": 7820}
{"pred_loss": "1.0000", "train_time": 42828, "bits_x": "1.2206", "n_processed": 550528, "loss": "1.2206", "epoch": 11, "bits_y": "0.0000", "n_images": 8602}
{"pred_loss": "1.0000", "train_time": 46698, "bits_x": "1.2128", "n_processed": 600576, "loss": "1.2128", "epoch": 12, "bits_y": "0.0000", "n_images": 9384}
{"pred_loss": "1.0000", "train_time": 50564, "bits_x": "1.1951", "n_processed": 650624, "loss": "1.1951", "epoch": 13, "bits_y": "0.0000", "n_images": 10166}
{"pred_loss": "1.0000", "train_time": 54421, "bits_x": "1.1983", "n_processed": 700672, "loss": "1.1983", "epoch": 14, "bits_y": "0.0000", "n_images": 10948}
{"pred_loss": "1.0000", "train_time": 58295, "bits_x": "1.1887", "n_processed": 750720, "loss": "1.1887", "epoch": 15, "bits_y": "0.0000", "n_images": 11730}
{"pred_loss": "1.0000", "train_time": 62163, "bits_x": "1.1754", "n_processed": 800768, "loss": "1.1754", "epoch": 16, "bits_y": "0.0000", "n_images": 12512}
{"pred_loss": "1.0000", "train_time": 66025, "bits_x": "1.1826", "n_processed": 850816, "loss": "1.1826", "epoch": 17, "bits_y": "0.0000", "n_images": 13294}
{"pred_loss": "1.0000", "train_time": 69890, "bits_x": "1.1680", "n_processed": 900864, "loss": "1.1680", "epoch": 18, "bits_y": "0.0000", "n_images": 14076}
{"pred_loss": "1.0000", "train_time": 73756, "bits_x": "1.1749", "n_processed": 950912, "loss": "1.1749", "epoch": 19, "bits_y": "0.0000", "n_images": 14858}
{"pred_loss": "1.0000", "train_time": 77620, "bits_x": "1.1742", "n_processed": 1000960, "loss": "1.1742", "epoch": 20, "bits_y": "0.0000", "n_images": 15640}
{"pred_loss": "1.0000", "train_time": 81488, "bits_x": "1.1676", "n_processed": 1051008, "loss": "1.1676", "epoch": 21, "bits_y": "0.0000", "n_images": 16422}
{"pred_loss": "1.0000", "train_time": 85357, "bits_x": "1.1604", "n_processed": 1101056, "loss": "1.1604", "epoch": 22, "bits_y": "0.0000", "n_images": 17204}
{"pred_loss": "1.0000", "train_time": 89222, "bits_x": "1.1595", "n_processed": 1151104, "loss": "1.1595", "epoch": 23, "bits_y": "0.0000", "n_images": 17986}
{"pred_loss": "1.0000", "train_time": 93085, "bits_x": "1.1667", "n_processed": 1201152, "loss": "1.1667", "epoch": 24, "bits_y": "0.0000", "n_images": 18768}
{"pred_loss": "1.0000", "train_time": 96944, "bits_x": "1.1598", "n_processed": 1251200, "loss": "1.1598", "epoch": 25, "bits_y": "0.0000", "n_images": 19550}
{"pred_loss": "1.0000", "train_time": 100799, "bits_x": "1.1596", "n_processed": 1301248, "loss": "1.1596", "epoch": 26, "bits_y": "0.0000", "n_images": 20332}
{"pred_loss": "1.0000", "train_time": 104652, "bits_x": "1.1489", "n_processed": 1351296, "loss": "1.1489", "epoch": 27, "bits_y": "0.0000", "n_images": 21114}
{"pred_loss": "1.0000", "train_time": 108512, "bits_x": "1.1517", "n_processed": 1401344, "loss": "1.1517", "epoch": 28, "bits_y": "0.0000", "n_images": 21896}
{"pred_loss": "1.0000", "train_time": 112365, "bits_x": "1.1525", "n_processed": 1451392, "loss": "1.1525", "epoch": 29, "bits_y": "0.0000", "n_images": 22678}
{"pred_loss": "1.0000", "train_time": 116231, "bits_x": "1.1481", "n_processed": 1501440, "loss": "1.1481", "epoch": 30, "bits_y": "0.0000", "n_images": 23460}
{"pred_loss": "1.0000", "train_time": 120128, "bits_x": "1.1306", "n_processed": 1551488, "loss": "1.1306", "epoch": 31, "bits_y": "0.0000", "n_images": 24242}
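The log above is one JSON object per line, so per-epoch timing can be read off mechanically. A minimal sketch, assuming `train_time` is cumulative seconds; the sample records are trimmed to three fields from the first two and the last entries above, and `seconds_per_epoch` is a hypothetical helper.

```python
import json

# Trimmed copies of the epoch 1, 2 and 31 records from the log above.
SAMPLE_LOG = """\
{"train_time": 4076, "bits_x": "2.0117", "epoch": 1}
{"train_time": 7960, "bits_x": "1.4431", "epoch": 2}
{"train_time": 120128, "bits_x": "1.1306", "epoch": 31}
"""


def seconds_per_epoch(lines):
    """Average seconds per epoch between the first and last epoch records."""
    records = [json.loads(line) for line in lines if line.strip()]
    # The first line of the real file holds hyperparameters and has no "epoch" key.
    records = [r for r in records if "epoch" in r and "train_time" in r]
    first, last = records[0], records[-1]
    elapsed = last["train_time"] - first["train_time"]
    return elapsed / float(last["epoch"] - first["epoch"])


print(seconds_per_epoch(SAMPLE_LOG.splitlines()))
```

To use it on the real file, pass `open("logs/train.txt")` instead of the sample lines.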

Access denied when extracting datasets

curl https://storage.googleapis.com/glow-demo/celeb-tfr.tar

Got the following permissions error:

<Error>
<Code>AccessDenied</Code>
<Message>Access denied.</Message>
<Details>
Anonymous caller does not have storage.objects.get access to glow-demo/celeb-tfr.tar.
</Details>
</Error>

How do I use this demo on CPU?

Traceback (most recent call last):
File "/home/hule/myself/project/workspace/glow/demo/server.py", line 1, in <module>
from demo import model
File "/home/hule/myself/project/workspace/glow/demo/model.py", line 85, in <module>
graph_def_optimized.ParseFromString(f.read())
google.protobuf.message.DecodeError: Error parsing message

run 'python model.py' in the demo folder

There is an error:
File "model.py", line 88, in <module>
tf.import_graph_def(graph_def_optimized)
File "/home/lhx/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
return func(*args, **kwargs)
File "/home/lhx/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 489, in import_graph_def
graph._c_graph, serialized, options) # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.NotFoundError: Op type not registered 'EdgeBias' in binary running on lhx-910. Make sure the Op and Kernel are registered in the binary running in this process.

ERROR 403 when running wget to get pre-trained model

Hello @prafullasd ,
I tried running the following command from the README to download the pre-trained model:

wget https://storage.googleapis.com/glow-demo/logs/lsun-bdr-[96/128].tar

but it shows:

HTTP request sent, awaiting response... 403 Forbidden
2018-09-01 23:14:13 ERROR 403: Forbidden.

Would you please check the server? Thanks in advance!
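Separately from the 403 response, the bracketed part of the README URL appears to be a list of alternatives rather than a literal file name. A minimal sketch that expands such a template into concrete URLs; `expand_choices` is a hypothetical helper and this does not address the server-side permission error.

```python
import itertools
import re


def expand_choices(template):
    """Expand every [a/b/c] group in template into one string per choice."""
    # re.split with a capturing group alternates plain text and group contents.
    parts = re.split(r"\[([^\]]+)\]", template)
    options = [part.split("/") if index % 2 else [part]
               for index, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*options)]


for url in expand_choices(
        "https://storage.googleapis.com/glow-demo/logs/lsun-bdr-[96/128].tar"):
    print(url)
```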

How should I upload an image to the server?

Hi
I run the code on a server that I access via SSH.
When I open demo/web/index.html locally and try to upload an image, the upload never seems to finish.
So how should I upload an image?
I also tried running python -m http.server 9090 in the demo folder and accessing it from my local computer via ip:9090/web. The problem still exists.

python model.py error

There is an error.

File "model.py", line 146
feps[:, index: index+np.prod(shape)], (bs, *shape)))

centos 7
python 2.7
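The expression `(bs, *shape)` uses unpacking inside a tuple display, which Python accepts from version 3.5 onward, so a SyntaxError at that position is consistent with the Python 2.7 reported here. A minimal sketch of an equivalent form that also parses on Python 2.7; `target_shape` is a hypothetical helper, and running the code under Python 3 would be the alternative.

```python
def target_shape(bs, shape):
    """Build (bs, d0, d1, ...) without star-unpacking inside the tuple."""
    return (bs,) + tuple(shape)


print(target_shape(4, [32, 32, 3]))
```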

Manipulation of vectors

Which file would I need to modify if, let's say, I want to change the face to a crying person or a sleeping person?

Preprocessing

Could you please upload the scripts for preprocessing images?

load graph_optimized.pb error no EdgeBias???

tf.import_graph_def(graph_def_optimized) load graph_optimized.pb error
##########################################################################
tensorflow 1.8.0 error
hortatech@1080Ti:/projects/glow/demo$ python server.py
2018-07-24 18:11:15.266967: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-07-24 18:11:15.348810: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-07-24 18:11:15.349242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
totalMemory: 10.92GiB freeMemory: 10.46GiB
2018-07-24 18:11:15.349256: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-07-24 18:11:15.529259: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-24 18:11:15.529301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-07-24 18:11:15.529308: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-07-24 18:11:15.529496: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10119 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
File "server.py", line 1, in <module>
import model
File "/home/hortatech/projects/glow/demo/model.py", line 88, in <module>
tf.import_graph_def(graph_def_optimized)
File "/home/hortatech/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
return func(*args, **kwargs)
File "/home/hortatech/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 489, in import_graph_def
graph._c_graph, serialized, options) # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.NotFoundError: Op type not registered 'EdgeBias' in binary running on 1080Ti. Make sure the Op and Kernel are registered in the binary running in this process.
#############################################################
tensorflow 1.7.0
hortatech@1080Ti:/projects/glow/demo$ python server.py
2018-07-24 19:07:14.947274: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-07-24 19:07:15.041920: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-07-24 19:07:15.044762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
totalMemory: 10.92GiB freeMemory: 10.46GiB
2018-07-24 19:07:15.044780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-07-24 19:07:15.217369: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-24 19:07:15.217395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-07-24 19:07:15.217402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-07-24 19:07:15.217590: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10121 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "server.py", line 1, in <module>
    import model
  File "/home/hortatech/projects/glow/demo/model.py", line 88, in <module>
    tf.import_graph_def(graph_def_optimized)
  File "/home/hortatech/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
    return func(*args, **kwargs)
  File "/home/hortatech/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 570, in import_graph_def
    raise ValueError('No op named %s in defined operations.' % node.op)
ValueError: No op named EdgeBias in defined operations.

ModuleNotFoundError: No module named 'zeus.tfops'; 'zeus' is not a package

I tried running the training script with:

mpiexec -n 4 python train.py --data_dir imagenet-tfr --problem imagenet --image_size=32

Which results in the following error:

Traceback (most recent call last):
  File "train.py", line 289, in <module>
    main(hps)
  File "train.py", line 123, in main
    import model
  File "/home/edrick/glow/model.py", line 4, in <module>
    import optim
  File "/home/edrick/glow/optim.py", line 2, in <module>
    import zeus.tfops as Z
ModuleNotFoundError: No module named 'zeus.tfops'; 'zeus' is not a package
Traceback (most recent call last):
  File "train.py", line 289, in <module>
    main(hps)
  File "train.py", line 123, in main
    import model
  File "/home/edrick/glow/model.py", line 4, in <module>
    import optim
  File "/home/edrick/glow/optim.py", line 2, in <module>
    import zeus.tfops as Z
ModuleNotFoundError: No module named 'zeus.tfops'; 'zeus' is not a package
50000 50 4
Train epoch size: 50176
Traceback (most recent call last):
  File "train.py", line 289, in <module>
    main(hps)
  File "train.py", line 123, in main
    import model
  File "/home/edrick/glow/model.py", line 4, in <module>
    import optim
  File "/home/edrick/glow/optim.py", line 2, in <module>
    import zeus.tfops as Z
ModuleNotFoundError: No module named 'zeus.tfops'; 'zeus' is not a package

I installed zeus using pip install zeus which gives me zeus==0.1.1

--n_batch_train 2 fails

Trying to use a bigger batch size, but it fails with

Traceback (most recent call last):
  File "train.py", line 495, in <module>
    main(hps)
  File "train.py", line 226, in main
    train_iterator, test_iterator, data_init = get_data(hps, sess)
  File "train.py", line 100, in get_data
    hps.local_batch_train]  # round down to closest divisor of 50
KeyError: 0
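
The `KeyError: 0` indicates that `hps.local_batch_train` evaluated to 0 before it was used as a dictionary key. Below is a minimal sketch of how that can happen, assuming the per-process batch size is obtained by floor division that scales the requested batch size down for larger images. The function name, the anchor size of 32, the image size of 256 and the lookup table are illustrative assumptions, not taken from the traceback.

```python
# Hypothetical reconstruction: all names and numbers here are assumptions.
ASSUMED_TEST_BATCH = {64: 50, 32: 25, 16: 10, 8: 5, 4: 2, 2: 2, 1: 1}


def local_batch(n_batch_train, image_size, anchor_size=32):
    """Scale the requested batch size by (anchor_size / image_size)^2 with floor division."""
    return n_batch_train * anchor_size * anchor_size // (image_size * image_size)


small = local_batch(n_batch_train=2, image_size=256)    # 2 * 1024 // 65536 = 0
large = local_batch(n_batch_train=64, image_size=256)   # 64 * 1024 // 65536 = 1

try:
    ASSUMED_TEST_BATCH[small]
    lookup_failed = False
except KeyError:
    lookup_failed = True  # key 0 is missing from the table, as in the traceback
```

Under these assumptions, any combination where the scaled value rounds down to 0 hits the same error, so a larger `--n_batch_train` or a smaller `--image_size` would avoid it.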

Any plan to make a pytorch version in the future?

Hi,
I've just read your paper and I'm impressed by the results you got, especially for a non-adversarial model.
Do you plan to release a PyTorch port in the coming weeks?
I imagine several things that you implemented are not yet available in PyTorch, so doing the port myself would not be an easy task.

about actnorm

The data after applying actnorm and then reverse actnorm is very different from the original data:

##my code begin
import numpy as np
import tensorflow as tf
from tfops import actnorm  # assumed: actnorm as defined in the repo's tfops.py

a = tf.constant(np.random.randn(3, 4, 4, 3), dtype=tf.float32, name="a")
net = a

# Forward pass: stack 30 actnorm layers.
with tf.variable_scope("scope") as scope:
    for i in range(30):
        net = actnorm("act" + str(i), net)

# Reverse pass: reuse the same variables and undo the layers in reverse order.
with tf.variable_scope("scope") as scope:
    scope.reuse_variables()
    for j in reversed(range(30)):
        net = actnorm("act" + str(j), net, reverse=True)

with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    _a, _net = sess.run([a, net])

print("_a - _net is", _a - _net)

##end

##output begin
"_a - _net is"......
[[ 1.5863475e+05 -8.5023642e-02 1.2175249e+04]
[ 1.5863484e+05 4.7517967e-01 1.2174989e+04]
[ 1.5863616e+05 5.8058143e-02 1.2174503e+04]
[ 1.5863489e+05 -5.7814121e-02 1.2174427e+04]]

[[ 1.5863278e+05 1.6660476e-01 1.2175197e+04]
[ 1.5863495e+05 4.9495590e-01 1.2174029e+04]
[ 1.5863642e+05 -2.7934742e-01 1.2174598e+04]
[ 1.5863430e+05 4.6225822e-01 1.2173828e+04]]

[[ 1.5863350e+05 9.5161140e-02 1.2175784e+04]
[ 1.5863411e+05 9.7796828e-02 1.2175356e+04]
[ 1.5863566e+05 1.8912268e-01 1.2171764e+04]
[ 1.5863397e+05 1.6332370e-01 1.2173685e+04]]]]
......
##output end
I would like to know how to solve this problem. Can the reverse operation not fully invert the forward pass?
Thanks.

HELP! I got syntaxError

I got SyntaxError.

File "*****/glow-master/model.py", line 311
eps.append(np.reshape(feps[:, index: index+np.prod(shape)], (bs, *shape)))

Is it because I use python 2.7? If I do not want to upgrade to python 3.x, is there any solution?

Thanks in advance!
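
Star-unpacking inside a tuple literal, as in `(bs, *shape)`, is only accepted by Python 3.5 and later, so Python 2.7 rejects the line at parse time. Below is a sketch of an equivalent expression that parses on both versions, using plain tuple concatenation. The values of `bs` and `shape` are made up for illustration.

```python
bs = 4              # hypothetical batch size
shape = [8, 8, 3]   # hypothetical per-sample shape

# Python 3.5+ only: new_shape = (bs, *shape)
# Portable equivalent: build the tuple by concatenation.
new_shape = (bs,) + tuple(shape)

print(new_shape)  # (4, 8, 8, 3)
```

The resulting tuple can be passed to `np.reshape` in place of `(bs, *shape)`. Other Python 3 only constructs may remain elsewhere in the code, so this change alone may not be enough to run under 2.7.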

how to use demo/align_image.py?

OLD USAGE
python align_faces.py --shape-predictor shape_predictor_68_face_landmarks.dat --image images/example_01.jpg

It is useless

How do we use custom data to learn GLOW mappings?

The code at github.com/openai/glow is only for demo and is intended to be reproducible for a class of fixed problems like mnist, cifar, celeba, lsun, etc.

How do I feed it my own dataset? What are the constraints on the tfrecords, such as the number of shards? How do we create custom tfrecords using generate.py in data_loaders/?

Op type not registered 'EdgeBias' in binary running on ...

System:
Ubuntu 16.04
Anaconda Python 3.6
Tensorflow 1.8.0 built from source
GPU Compute Capability 5.0
Cuda 9.2, CuDNN 7.1

After installing the required libraries, I tried one of the demos with:

python videos.py 

in demo.

This is the traceback.

Traceback (most recent call last):
  File "videos.py", line 1, in <module>
    from model import encode, manipulate_range, mix_range
  File "/home/karen/workspace/blog/glow/glow/demo/model.py", line 88, in <module>
    tf.import_graph_def(graph_def_optimized)
  File "/home/karen/anaconda3/envs/glow/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
    return func(*args, **kwargs)
  File "/home/karen/anaconda3/envs/glow/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 489, in import_graph_def
    graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.NotFoundError: Op type not registered 'EdgeBias' in binary running on

python model.py

When I run model.py in the demo folder, I get this error:
tensorflow.python.framework.errors_impl.NotFoundError: /home/lhx/miniconda3/lib/python3.6/site-packages/blocksparse/blocksparse_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumES3_S3_S3_

What is the reason?

Failed to install blocksparse

pip install blocksparse-1.0.0-py2.py3-none-any.whl
Processing ./blocksparse-1.0.0-py2.py3-none-any.whl
Requirement already satisfied: numpy in /home/user/anaconda3/lib/python3.5/site-packages (from blocksparse==1.0.0) (1.14.3)
Requirement already satisfied: scipy in /home/user/anaconda3/lib/python3.5/site-packages (from blocksparse==1.0.0) (1.1.0)
Installing collected packages: blocksparse
Successfully installed blocksparse-1.0.0

python
Python 3.5.5 |Anaconda, Inc.| (default, May 13 2018, 21:12:35)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import blocksparse
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/anaconda3/lib/python3.5/site-packages/blocksparse/__init__.py", line 6, in <module>
    bs_module = load_op_library(os.path.join(get_data_files_path(), 'blocksparse_ops.so'))
  File "/home/user/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /home/user/anaconda3/lib/python3.5/site-packages/blocksparse/blocksparse_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumES3_S3_S3_

TypeError:unbound method <lambda>() must be called with o instance as first argument (got Tensor instance instead)

When I ran this code, I got the error listed as below.

mpiexec -n 1 --allow-run-as-root python train.py --problem celeba --image_size 256 --n_level 6 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5
/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Rank 0 Batch sizes Train 1 Test 1 Init 4
(3000, 1, 1)
Train epoch size: 49984
('Creating pad', '1_1
[130, 130]')
Traceback (most recent call last):
  File "train.py", line 386, in <module>
    main(hps)
  File "train.py", line 162, in main
    model = model.model(sess, hps, train_iterator, test_iterator, data_init)
  File "/data/home/ervinchen/share/glow-master/model.py", line 238, in model
    test_iterator, data_init, lr, f_loss)
  File "/data/home/ervinchen/share/glow-master/model.py", line 25, in abstract_model_xy
    loss_train, stats_train = f_loss(train_iterator, True)
  File "/data/home/ervinchen/share/glow-master/model.py", line 226, in f_loss
    bits_x, bits_y, pred_loss = _f_loss(x, y, is_training, reuse)
  File "/data/home/ervinchen/share/glow-master/model.py", line 171, in _f_loss
    z, objective = encoder(z, objective)
  File "/data/home/ervinchen/share/glow-master/model.py", line 89, in encoder
    z, objective = split2d("pool"+str(i), z, objective=objective)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "/data/home/ervinchen/share/glow-master/model.py", line 486, in split2d
    objective += pz.logp(z2)
TypeError: unbound method <lambda>() must be called with o instance as first argument (got Tensor instance instead)
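
The paths in the traceback point at Python 2.7, and the message is the one Python 2 raises when a function stored as a class attribute is called through the class with a non-instance argument. Below is a minimal sketch of that pattern and a version-independent workaround. The class name `o` comes from the error message, while the attribute names and the lambda body are hypothetical.

```python
class o(object):
    pass


# A plain function stored on the class. Python 3 returns it unchanged on
# attribute access; Python 2 wraps it as an unbound method that requires an
# `o` instance as its first argument.
o.logp = lambda x: x * 2

# staticmethod disables the method wrapping, so the call works on both versions.
o.logp_portable = staticmethod(lambda x: x * 2)

result = o.logp_portable(21)
print(result)  # 42
```

Alternatively, running the script under Python 3, as the other tracebacks on this page do, sidesteps the wrapping entirely.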

Segmentation fault

When I train the model, I get the following error:
mpiexec -n 2 python train.py --problem imagenet-oord --depth 8 --data_dir /home/lhx/lhx/new_paper/glow/data/mnt/host/imagenet-oord-tfr --epochs 20000 --fmap 16 --epochs_full_valid 10 --n_batch_train 128

[lhx-910:19474] *** Process received signal ***
[lhx-910:19474] Signal: Segmentation fault (11)
[lhx-910:19474] Signal code: Address not mapped (1)
[lhx-910:19474] Failing at address: (nil)
[lhx-910:19474] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f13ce25a390]
[lhx-910:19474] [ 1] /home/lhx/miniconda3/lib/python3.6/site-packages/horovod/common/mpi_lib.cpython-36m-x86_64-linux-gnu.so(+0x22603)[0x7f13cc796603]
[lhx-910:19474] [ 2] /home/lhx/miniconda3/lib/python3.6/site-packages/horovod/common/mpi_lib.cpython-36m-x86_64-linux-gnu.so(+0x24bfa)[0x7f13cc798bfa]
[lhx-910:19474] [ 3] /home/lhx/miniconda3/lib/libstdc++.so.6(+0xafc5c)[0x7f13cbff8c5c]
[lhx-910:19474] [ 4] /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f13ce2506ba]
[lhx-910:19474] [ 5] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f13cdf8641d]
[lhx-910:19474] *** End of error message ***

mpiexec noticed that process rank 0 with PID 19474 on node lhx-910 exited on signal 11 (Segmentation fault).

What is the reason?

Segmentation fault (core dumped) on get_manipulators.py

python get_manipulators.py
2018-08-30 19:52:35.634696: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-08-30 19:52:36.314295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties:
name: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076
pciBusID: 0000:01:00.0
totalMemory: 11.92GiB freeMemory: 11.81GiB
2018-08-30 19:52:36.314326: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-08-30 19:52:36.512652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-30 19:52:36.512693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0
2018-08-30 19:52:36.512699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N
2018-08-30 19:52:36.512930: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11431 MB memory) -> physical GPU (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:01:00.0, compute capability: 5.2)
Loaded model
Warm started tf model
Segmentation fault (core dumped)

About manipulating the "Eyeglasses" attribute

I have been trying to use the provided "z_manipulate.npy" vector to add eyeglasses to an image, but the results I get are really bad. Has anybody tried this out? Is it supposed to be this bad, or am I doing it wrong?
