Pretrained Show and Tell model: Introduction

(tl;dr)
2M-iteration fine-tuned checkpoint file | Released under the MIT License

1M-iteration checkpoint file | Released under the MIT License

word_counts.txt (in this repository)

model.ckpt-2000000.index (in this repository; place it in the same folder as the 2M-iteration checkpoint file)

model.ckpt-1000000.index (in this repository; place it in the same folder as the 1M-iteration checkpoint file)

Show and Tell: A Neural Image Caption Generator

Pretrained model for the TensorFlow implementation (im2txt, found at tensorflow/models) of the image-to-text model described in:

"Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge."

Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan.

Full text available at: http://arxiv.org/abs/1609.06647

Contact

Kranthi Kiran GV (GitHub: KranthiGV)

Generating Captions

Steps

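  1. Follow the steps at im2txt to clone the tensorflow/models repository, install Bazel, and complete the rest of the setup.

  2. Download the desired model checkpoint:
    2M-iteration fine-tuned checkpoint file | Released under the MIT License
    1M-iteration checkpoint file | Released under the MIT License

  3. Clone this repository, which provides word_counts.txt and the .index files: git clone https://github.com/KranthiGV/Pretrained-Show-and-Tell-model.git

  4. From the im2txt directory, set the paths below, then build and run inference: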

# Path to the checkpoint, given as a prefix.
# Do not include the .data-00000-of-00001 suffix in CHECKPOINT_PATH.
# Place model.ckpt-2000000.index (from the cloned repository) in the same
# folder as model.ckpt-2000000.data-00000-of-00001.
# The 1M-iteration checkpoint works the same way with model.ckpt-1000000.
CHECKPOINT_PATH="/path/to/model.ckpt-2000000"

# Vocabulary file generated by the preprocessing script.
# Your tokenizer version may differ from the one used in training,
# so use the word_counts.txt file supplied in this repository.
VOCAB_FILE="/path/to/word_counts.txt"

# JPEG image file to caption.
IMAGE_FILE="/path/to/image.jpeg"

# Build the inference binary.
bazel build -c opt im2txt/run_inference

# Run inference to generate captions.
bazel-bin/im2txt/run_inference \
  --checkpoint_path="${CHECKPOINT_PATH}" \
  --vocab_file="${VOCAB_FILE}" \
  --input_files="${IMAGE_FILE}"
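
Because CHECKPOINT_PATH is a prefix rather than a real file, a misplaced .index or .data file is easy to miss until inference fails. The sketch below is one possible pre-flight check, not part of im2txt: check_checkpoint is a hypothetical helper name, and it assumes the checkpoint consists of exactly the two files named in the comments above (.index and .data-00000-of-00001).

```shell
# Hypothetical helper (not part of im2txt): verify that both files of a
# checkpoint sit next to each other under the given prefix,
# e.g. /path/to/model.ckpt-2000000.
# The function body runs in a subshell so its variables do not leak.
check_checkpoint() (
  prefix="$1"
  status=0
  for suffix in index data-00000-of-00001; do
    if [ ! -f "${prefix}.${suffix}" ]; then
      # Report every missing file instead of stopping at the first one.
      echo "missing: ${prefix}.${suffix}" >&2
      status=1
    fi
  done
  exit "${status}"
)

# Usage:
#   check_checkpoint "${CHECKPOINT_PATH}" && echo "checkpoint files found"
```

The function returns a non-zero status when either file is absent, so it can gate the bazel-bin/im2txt/run_inference call in a script.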

Extras

  1. Graph.pbtxt is uploaded on request.
  2. Training stats are uploaded for use with TensorBoard:
    tensorboard --logdir="./extras/tensorboard/"
