murari023 / neuraltalk2-tensorflow

This project forked from yiyang92/neuraltalk2-tensorflow

Inspired by Andrej Karpathy's neuraltalk2 VGG16-LSTM image captioning model

Python 18.50% Jupyter Notebook 81.50%

Neuraltalk2 in tensorflow

Overview

A TensorFlow implementation of an image captioning model, inspired by Andrej Karpathy's neuraltalk2. It also adds an option to use prepared attribute vectors during training.

Usage

Training:

First download the ImageNet weights for VGG16: https://yadi.sk/d/V6Rfzfei3TdKCH

Specify your MSCOCO directory in utils/parameters.py and launch:

python main.py --gpu 'your gpu'

Note: the train/validation split can be changed by setting the gen_val_captions parameter. The default is 4000, which leaves ~120000 in the training set.
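The arithmetic behind the default split can be sketched as follows. This is only an illustration, not the repository's code: it assumes the combined MSCOCO 2014 train and val image sets (82,783 + 40,504 images) and that gen_val_captions counts held-out images. The `split_ids` helper is hypothetical.

```python
import random

# Assumption: MSCOCO 2014 train + val image counts; adjust for your copy of the dataset.
COCO_TRAIN2014 = 82783
COCO_VAL2014 = 40504


def split_ids(image_ids, gen_val_captions=4000, seed=0):
    """Hypothetical helper: hold out `gen_val_captions` ids for validation."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # deterministic shuffle for a repeatable split
    return ids[gen_val_captions:], ids[:gen_val_captions]


train_ids, val_ids = split_ids(range(COCO_TRAIN2014 + COCO_VAL2014))
print(len(train_ids), len(val_ids))  # 119287 4000
```

Under these assumptions, 123,287 images minus 4,000 held out gives 119,287, which matches the "~120000" figure.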

Note 2: Run the preprocess.py script first to build the images HDF5 file. This speeds up image loading while fine-tuning the model.

Parameters

Parameters can be set directly in utils/parameters.py or passed as command-line arguments. For example:

python main.py --gpu 0 --embed_dim 256 --dec_hid 512 --epochs 50 --temperature 0.6 --gen_name test --dec_lstm_drop 0.7 --lr 0.001 --checkpoint test1 --coco_dir "/home/username/mscoco/coco/" --optimizer Adam --sample_gen greedy
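One way a defaults class and command-line overrides can work together is sketched below. This is an assumption about the general pattern, not the contents of utils/parameters.py: the `Parameters` class here is a stand-in, the `parse_args` helper is hypothetical, and the default values are illustrative (copied from the example command, not from the repository).

```python
import argparse


class Parameters:
    """Hypothetical stand-in for the defaults kept in utils/parameters.py."""
    embed_dim = 256
    dec_hid = 512
    epochs = 50
    lr = 0.001
    optimizer = "Adam"


def parse_args(argv=None):
    """Build a parser whose defaults come from Parameters, then apply overrides."""
    params = Parameters()
    parser = argparse.ArgumentParser()
    for name in ("embed_dim", "dec_hid", "epochs", "lr", "optimizer"):
        default = getattr(params, name)
        # Reuse the default's type (int, float, str) to convert the flag value.
        parser.add_argument("--" + name, type=type(default), default=default)
    args = parser.parse_args(argv)
    for name, value in vars(args).items():
        setattr(params, name, value)
    return params


params = parse_args(["--embed_dim", "128", "--lr", "0.01"])
print(params.embed_dim, params.dec_hid, params.lr)  # 128 512 0.01
```

Flags that are not passed keep the value from the class, so the file remains the single place to look up defaults.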

Generation

Two options:

  1. Using main.py

After some training just launch:

python main.py --gpu 'your gpu' --mode inference

If you trained with fine-tuning, add --fine_tune to the parameters:

python main.py --gpu 'your gpu' --mode inference --fine_tune

This produces a JSON file ready to use with the MSCOCO evaluation tool.
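For reference, the MSCOCO caption evaluation code reads a results file that is a JSON list of objects with `image_id` and `caption` keys. The sketch below writes such a file. The `write_coco_results` helper, the output file name, and the caption text are assumptions for illustration and are not taken from this repository.

```python
import json
import os
import tempfile


def write_coco_results(captions, path):
    """Write {image_id: caption} as a COCO-style caption results list."""
    results = [
        {"image_id": int(image_id), "caption": caption}
        for image_id, caption in sorted(captions.items())
    ]
    with open(path, "w") as f:
        json.dump(results, f)
    return results


# Placeholder caption for illustration only; it is not model output.
out_path = os.path.join(tempfile.mkdtemp(), "captions_results.json")
results = write_coco_results({233527: "a placeholder caption"}, out_path)
print(results)
```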

  2. Using the separate gen_caption.py script, which can generate captions for any image.

For the list of required parameters:

python gen_caption.py -h

For example:

python -i gen_caption.py --img_path ./images/COCO_val2014_000000233527.jpg --checkpoint ./checkpoints/gaussian_nocv.ckpt --params_path ./pickles/params_Normal_False_gaussian_nocv_False

Where:

  • --params_path: path to a saved Parameters class, which can be created by calling main.py --save_params
  • --checkpoint: path to a saved checkpoint
  • --img_path: path to the image
  • -i: launches Python in interactive mode, so captions can be generated by calling generator.generate_caption(img_path). This can also be used in an IPython notebook
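In an interactive session, captioning several images comes down to calling generate_caption once per path. A minimal sketch, assuming only that the `generator` object exposes `generate_caption(img_path)` as described above: the `caption_images` helper and the `EchoGenerator` stand-in are hypothetical and exist so the example runs without a checkpoint.

```python
def caption_images(generator, img_paths):
    """Return {path: caption} using any object with generate_caption(img_path)."""
    return {path: generator.generate_caption(path) for path in img_paths}


class EchoGenerator:
    """Stand-in for the real generator; it does not run a model."""

    def generate_caption(self, img_path):
        return "caption for " + img_path


captions = caption_images(EchoGenerator(), ["./images/a.jpg", "./images/b.jpg"])
for path, caption in captions.items():
    print(path, "->", caption)
```

With the real script loaded via python -i, the `generator` object from the session would be passed in place of `EchoGenerator()`.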

Specific requirements

  • tensorflow >= 1.4.1

Other files

  • prepare_cluster_vectors_train_val.ipynb - takes the MSCOCO dataset JSON files and generates cluster vectors
  • prepare_test_vectors.ipynb - takes the test set cluster vector file, prepared using the tf.models API, and generates cluster vectors
  • gen_caption_example.ipynb - generates a caption for a photo (without cluster vector inputs)
