fullfanta / multimodal_transfer

TensorFlow implementation of 'Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer'

Python 98.87% Shell 1.13%
tensorflow resolution style-transfer

multimodal_transfer's Introduction

Style transfer

This is a TensorFlow implementation of 'Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer', which generates stylized images at high resolution, such as 1024 pixels.

Download program

$ git clone https://github.com/fullfanta/multimodal_style_transfer.git

Train

To train the network, I use the MS COCO dataset.

$ cd multimodal_style_transfer
$ bash get_coco.sh
  • The downloaded images are in 'data/train2014'.

For stylization, a pretrained VGG16 model is required.

$ bash get_vgg16.sh

Then training is SIMPLE.

$ python train.py
  • If you have multiple GPU cards, use CUDA_VISIBLE_DEVICES to specify which GPU card to use.
  • The trained model is saved in 'summary'.
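
A minimal sketch of the GPU selection mentioned above, assuming training is launched from a Python wrapper rather than directly from the shell. `CUDA_VISIBLE_DEVICES` is the standard CUDA environment variable; the helper name `select_gpus` is hypothetical and not part of this repository. Setting the variable before `tensorflow` is imported is the safe choice, since it must be in place before the GPU devices are initialized.

```python
import os


def select_gpus(gpu_ids):
    """Restrict CUDA to the given device indices (hypothetical helper).

    Call this before importing TensorFlow so the variable is already set
    when the GPU devices are initialized.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


if __name__ == "__main__":
    # Expose only the second physical GPU; the process then sees it as device 0.
    print(select_gpus([1]))
```

From the shell, the equivalent is to prefix the command with the variable, for example `CUDA_VISIBLE_DEVICES=1 python train.py`.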

During training, you can view the generated images in TensorBoard.

$ tensorboard --logdir=summary

Freeze model

$ sh freeze.sh 10000
  • The parameter is the iteration number of one of the saved checkpoint files.
  • It generates a pb file which contains the weights as constants.
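
To pick the iteration number, it can help to list which checkpoints exist. This is a minimal sketch, assuming checkpoints are written to the 'summary' directory with TensorFlow's usual `<prefix>-<step>.index` naming (for example `model.ckpt-10000.index`). The prefix, the file layout, and the helper `list_checkpoint_steps` are assumptions, not taken from this repository.

```python
import os
import re


def list_checkpoint_steps(checkpoint_dir):
    """Return the sorted iteration numbers found in checkpoint_dir.

    Assumes files named '<prefix>-<step>.index', the naming produced by
    TensorFlow's Saver when a global step is passed to save().
    """
    pattern = re.compile(r"-(\d+)\.index$")
    steps = set()
    for name in os.listdir(checkpoint_dir):
        match = pattern.search(name)
        if match:
            steps.add(int(match.group(1)))
    return sorted(steps)


if __name__ == "__main__":
    # 'summary' is where the README says the trained model is stored.
    if os.path.isdir("summary"):
        print(list_checkpoint_steps("summary"))
```

Any number printed here would be a candidate argument for `freeze.sh`.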

Test

$ python stylize.py --model=models/starry_night.pb --input_image=test_images/jolie.jpg
  • It generates hierarchical stylized images and saves them to 'test_images/jolie_output_1.jpg', 'test_images/jolie_output_2.jpg', and 'test_images/jolie_output_3.jpg'. Their short edges are 256, 512, and 1024 pixels.
  • Parameters:
--model : path to the frozen model
--input_image : path to the image file to stylize
--hierarchical_short_edges : three short-edge lengths for the generated images (default: 256, 512, 1024)
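
One way to read the short-edge sizes: each output is scaled so that its shorter side equals the requested length. The sketch below works through that arithmetic, assuming the aspect ratio is preserved and the longer side is rounded to the nearest pixel. The helper `size_for_short_edge` is hypothetical, and the actual resizing in `stylize.py` may round differently.

```python
def size_for_short_edge(height, width, short_edge):
    """Scale (height, width) so the shorter side equals short_edge.

    Hypothetical helper: keeps the aspect ratio (an assumption) and
    rounds the longer side to the nearest pixel.
    """
    if height <= 0 or width <= 0 or short_edge <= 0:
        raise ValueError("all sizes must be positive")
    scale = short_edge / min(height, width)
    if height <= width:
        return short_edge, round(width * scale)
    return round(height * scale), short_edge


if __name__ == "__main__":
    # A 1200 x 630 (height x width) input at the three default short edges.
    for edge in (256, 512, 1024):
        print(edge, size_for_short_edge(1200, 630, edge))
```

Under these assumptions, a 1200 x 630 input yields outputs of 488 x 256, 975 x 512, and 1950 x 1024 (height x width).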

Examples

Each example shows the input image and its outputs at 256px, 512px, and 1024px:
  • Angelina Jolie
  • Dinosaur
  • Ryan
  • Cheez
  • Herb

Acknowledgement

multimodal_transfer's People

Contributors

fullfanta

multimodal_transfer's Issues

How to get training images of 1024x1024?

In the paper, the 512x512 training images are resized from MSCOCO images with w, h >= 480. But how do you get 1024x1024 training images, since MSCOCO has no images at that resolution?

style problem

Hi, I tried running your code and found that it outputs 3 images (256, 512, 1024). Are these Singular Transfer results? How can I get the Multimodal Transfer result described in your paper?
Thank you!

Exception loading graph

Hi, I wanted to try this algorithm to see its results on large images.

Running into the following exception when loading the graph:

ValueError: Tensor("Placeholder:0", shape=(1, 1200, 630, 3), dtype=float32) must be from the same graph as name: "style_subnet/Shape_1"

google.protobuf.message.DecodeError: Error parsing message with type 'tensorflow.GraphDef'

I trained the model and saved it. Now I am trying to load it but am unable to. I have looked at the previous posts as well, but the reference links are not working.

Code snippet:

# Load the model

with tf.io.gfile.GFile(args.model, "rb") as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())

# with tf.Graph().as_default() as graph:
generated_image_1, generated_image_2, generated_image_3 = tf.graph_util.import_graph_def(
    graph_def,
    input_map={
        'input_image': input_tensor,
        'short_edge_1': short_edge_1,
        'short_edge_2': short_edge_2,
        'short_edge_3': short_edge_3,
    },
    return_elements=[
        'style_subnet/conv-block/resize_conv_1/output:0',
        'enhance_subnet/resize_conv_1/output:0',
        'refine_subnet/resize_conv_1/output:0',
    ],
    producer_op_list=None,
)

Error

Traceback (most recent call last):

  File "stylize.py", line 97, in <module>
    main()
  File "stylize.py", line 57, in main
    graph_def.ParseFromString(f.read())
google.protobuf.message.DecodeError: Error parsing message with type 'tensorflow.GraphDef'

Note: If you need more information about this, let me know and I will add it here.

node 'Placeholder' in input_map does not exist in graph

Traceback (most recent call last):
  File "stylize.py", line 86, in <module>
    main()
  File "stylize.py", line 54, in main
    producer_op_list=None
  File "C:\Users\shubham\myenv\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\shubham\myenv\lib\site-packages\tensorflow_core\python\framework\importer.py", line 405, in import_graph_def
    producer_op_list=producer_op_list)
  File "C:\Users\shubham\myenv\lib\site-packages\tensorflow_core\python\framework\importer.py", line 505, in _import_graph_def_internal
    raise ValueError(str(e))
ValueError: node 'Placeholder' in input_map does not exist in graph (input_map entry: input_image:0->Placeholder:0)

training time

Hi, I just wonder how long it takes to complete a training session. I want to check my code against a reference training time, so it would be nice if you could tell me how long training takes on the CPU and on the GPU, respectively. Thanks a lot!
