fullfanta / multimodal_transfer

TensorFlow implementation of 'Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer'

Python 98.87% Shell 1.13%

tensorflow resolution style-transfer

multimodal_transfer's Introduction

Style transfer

This is a TensorFlow implementation of 'Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer', which generates stylized images at high resolution, such as 1024 pixels.

Download program

$ git clone https://github.com/fullfanta/multimodal_style_transfer.git

Train

To train the network, I use the MS COCO dataset.

$ cd multimodal_style_transfer
$ bash get_coco.sh

The downloaded images are in 'data/train2014'.

For stylization, a pretrained VGG16 is required.

$ bash get_vgg16.sh

Then training is SIMPLE.

$ python train.py

If you have multiple GPU cards, use CUDA_VISIBLE_DEVICES to specify which GPU to use.
The trained model is saved in 'summary'.
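
As a minimal sketch of the GPU selection described above, the variable can also be set from Python, provided this happens before TensorFlow is imported. The helper name `select_gpus` is hypothetical and not part of this repository.

```python
import os


def select_gpus(device_ids):
    """Restrict CUDA to the given GPU indices via CUDA_VISIBLE_DEVICES.

    Call this before TensorFlow (or any other CUDA library) is imported,
    otherwise the setting has no effect on the already-initialized process.
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Example: expose only the GPU with index 1 to this process.
select_gpus([1])
```

From the shell, the equivalent is to prefix the command, for example `CUDA_VISIBLE_DEVICES=1 python train.py`.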

During training, you can view generated images in TensorBoard.

$ tensorboard --logdir=summary

Freeze model

$ sh freeze.sh 10000

The parameter is the iteration number of one of the saved checkpoint files.
It generates a pb file which contains the weights as constants.
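
To see which iteration numbers are available before freezing, the checkpoint file names can be parsed. The sketch below assumes the usual TensorFlow naming of `<prefix>-<step>.index`. The prefix `model.ckpt` in the example is hypothetical, since the actual names depend on how train.py saves checkpoints.

```python
import re


def checkpoint_iterations(filenames):
    """Return the sorted iteration numbers found in checkpoint index file names."""
    # Assumption: checkpoints are saved as "<prefix>-<step>.index".
    pattern = re.compile(r"-(\d+)\.index$")
    steps = set()
    for name in filenames:
        match = pattern.search(name)
        if match:
            steps.add(int(match.group(1)))
    return sorted(steps)


# Hypothetical directory listing, used only for illustration.
example = [
    "model.ckpt-5000.index",
    "model.ckpt-10000.index",
    "model.ckpt-10000.meta",
    "checkpoint",
]
print(checkpoint_iterations(example))
```

In practice the list would come from something like `os.listdir("summary")`, and any returned number can be passed to freeze.sh.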

Test

$ python stylize.py --model=models/starry_night.pb --input_image=test_images/jolie.jpg

It generates hierarchical stylized images and saves them to 'test_images/jolie_output_1.jpg', 'test_images/jolie_output_2.jpg', and 'test_images/jolie_output_3.jpg'. Their short edges are 256, 512 and 1024 pixels.
Parameters:

--model : frozen model path
--input_image : image file path to stylize
--hierarchical_short_edges : three short edge lengths for the generated images (default is 256, 512, 1024)
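
The following sketch illustrates what the three short edge lengths and the output file names mean for a given input. It is not the code in stylize.py: the function names are hypothetical, and the exact rounding the script uses may differ.

```python
import os


def resize_to_short_edge(width, height, short_edge):
    """Scale (width, height) so the shorter side equals short_edge, keeping aspect ratio."""
    scale = short_edge / min(width, height)
    return round(width * scale), round(height * scale)


def output_paths(input_path, count=3):
    """Build '<name>_output_<i><ext>' paths next to the input image."""
    root, ext = os.path.splitext(input_path)
    return [f"{root}_output_{i}{ext}" for i in range(1, count + 1)]


# Hypothetical 630x1200 input, resized for each default short edge.
for edge in (256, 512, 1024):
    print(edge, resize_to_short_edge(630, 1200, edge))

print(output_paths("test_images/jolie.jpg"))
```

Each level keeps the aspect ratio of the input, so only the shorter side is fixed by the parameter.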

Examples

[Image table: columns Input, Output (256px), Output (512px), Output (1024px); rows Angelina Jolie, Dinosaur, Ryan, Cheez, Herb]

Acknowledgement

For instance normalization, I referred to 'https://github.com/ghwatson/faststyle'.
For the pretrained VGG16 network, I referred to 'https://www.cs.toronto.edu/~frossard/post/vgg16/'.

multimodal_transfer's Issues

How to get training images of 1024x1024?

In the paper, the 512x512 training images are resized from MSCOCO images with w,h>=480, but how do you get 1024x1024 images, since MSCOCO has no images of that resolution?

style problem

Hi, I tried to run your code and found that it outputs 3 images (256, 512, 1024). Are they Singular Transfer results? How can I get the Multimodal Transfer result described in the paper?
Thank you!

Exception loading graph

Hi, I wanted to try this algorithm to see results on large images.

Running into the following exception when loading the graph:

ValueError: Tensor("Placeholder:0", shape=(1, 1200, 630, 3), dtype=float32) must be from the same graph as name: "style_subnet/Shape_1"

google.protobuf.message.DecodeError: Error parsing message with type 'tensorflow.GraphDef'

I trained the model and saved it, and now I am trying to load it but am unable to. I have looked at the previous posts as well, but the reference links are not working.

Code snippet:

#load model

with tf.io.gfile.GFile(args.model, "rb") as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())

# with tf.Graph().as_default() as graph:
generated_image_1, generated_image_2, generated_image_3, = tf.graph_util.import_graph_def(
        graph_def, 
        input_map={'input_image' : input_tensor, 'short_edge_1' : short_edge_1, 'short_edge_2' : short_edge_2, 'short_edge_3' : short_edge_3}, 
        return_elements=['style_subnet/conv-block/resize_conv_1/output:0', 'enhance_subnet/resize_conv_1/output:0', 'refine_subnet/resize_conv_1/output:0'],  
        producer_op_list=None
    )

Error

Traceback (most recent call last):

  File "stylize.py", line 97, in <module>
    main()
  File "stylize.py", line 57, in main
    graph_def.ParseFromString(f.read())
google.protobuf.message.DecodeError: Error parsing message with type 'tensorflow.GraphDef'

Note: If more information is needed, I will add it here. Let me know.

node 'Placeholder' in input_map does not exist in graph

Traceback (most recent call last):
  File "stylize.py", line 86, in <module>
    main()
  File "stylize.py", line 54, in main
    producer_op_list=None
  File "C:\Users\shubham\myenv\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\shubham\myenv\lib\site-packages\tensorflow_core\python\framework\importer.py", line 405, in import_graph_def
    producer_op_list=producer_op_list)
  File "C:\Users\shubham\myenv\lib\site-packages\tensorflow_core\python\framework\importer.py", line 505, in _import_graph_def_internal
    raise ValueError(str(e))
ValueError: node 'Placeholder' in input_map does not exist in graph (input_map entry: input_image:0->Placeholder:0)

Hi, I just wonder how long it takes to complete a training session. I want to check my code against a reference training time, so it would be nice if you could tell me how long training takes on CPU and on GPU respectively. Thanks a lot!

Check failed: CUDA_SUCCESS == dynload::cuCtxSetCurrent(cuda_context->context()) (0 vs. 4)

Hi, thanks for your great code! However, when I run train.py, I get: Check failed: CUDA_SUCCESS == dynload::cuCtxSetCurrent(cuda_context->context()) (0 vs. 4)

Could you help me?