SS-VideoCaptioning

This repository contains the TensorFlow implementation of our model, "Semantically Sensible Video Captioning (SSVC)".
[Code] [Paper] [ArXiv]

Main Model

Authors

Md. Mushfiqur Rahman, Thasin Abedin, Khondokar S. S. Prottoy, Ayana Moshruba, Fazlul Hasan Siddiqui

Requirements

Install the following dependencies before running the model:

  • TensorFlow 2.0 (see the official install instructions)
  • tqdm: pip install tqdm
  • sklearn (scikit-learn): pip install -U scikit-learn
  • nltk: pip install nltk
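
Before opening the notebooks, it can help to confirm that the packages listed above are importable. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical and not part of this repository. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names for the dependencies listed above
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["tensorflow", "tqdm", "sklearn", "nltk"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

This only checks that each package can be found. It does not check versions, so the TensorFlow 2.0 requirement still has to be verified separately.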

Directory structure

-root
  -glove.6B.100d.txt
  -MSVD_captions.csv
  -models_and_utils
    -models.py
    -utils.py
  -data_picle
    -train
      -filename1.pkl
      -filename2.pkl
      ...
    -test
      -filename1.pkl
      -filename2.pkl
      ...
    -validation
      -filename1.pkl
      -filename2.pkl
      ...
    -train.csv
    -test.csv
    -validation.csv
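
The layout above can be checked programmatically before training. The sketch below is an illustration, not code from this repository: the helper names (`expected_paths`, `missing_paths`, `list_split_pickles`) are hypothetical, and the data directory name is copied exactly as it is spelled in the tree.

```python
from pathlib import Path

# Directory name copied verbatim from the tree above.
DATA_DIR_NAME = "data_picle"
SPLITS = ("train", "test", "validation")


def expected_paths(root):
    """List the files and directories the tree above says should exist."""
    root = Path(root)
    paths = [
        root / "glove.6B.100d.txt",
        root / "MSVD_captions.csv",
        root / "models_and_utils" / "models.py",
        root / "models_and_utils" / "utils.py",
    ]
    for split in SPLITS:
        paths.append(root / DATA_DIR_NAME / split)           # directory of .pkl files
        paths.append(root / DATA_DIR_NAME / f"{split}.csv")  # per-split CSV
    return paths


def missing_paths(root):
    """Return the expected paths that are not present under root."""
    return [path for path in expected_paths(root) if not path.exists()]


def list_split_pickles(root, split):
    """Return the sorted .pkl files for one split (train, test, or validation)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return sorted((Path(root) / DATA_DIR_NAME / split).glob("*.pkl"))
```

For example, `missing_paths(".")` run from the repository root returns an empty list when everything is in place, and `list_split_pickles(".", "train")` gives the training pickles in a stable order.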

Train and Evaluate

  • Download and extract 'glove.6B.100d.txt'.
  • Download the MSVD dataset and create the corresponding pickle files using vid2frames.ipynb. Split the data into train, test, and validation sets.

    Alternate step: Download and extract 'data_pickle.zip'. This compressed file already contains the pickle files of the MSVD dataset.

  • Run the train.ipynb file.

    This file has a detailed list of options. Change the options to adjust the model to your requirements.

  • The training and evaluation code is inside the Python notebook.
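
To make the role of the GloVe file more concrete: 'glove.6B.100d.txt' is a plain-text file in which each line holds a token followed by 100 space-separated vector components. The sketch below shows one way such a file could be read into a lookup table and turned into an embedding matrix. It is an assumption about typical usage, not the code in train.ipynb, and the names `load_glove` and `build_embedding_matrix` are hypothetical.

```python
def load_glove(path, dim=100):
    """Read a GloVe text file into a dict mapping token -> list of floats."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip("\n").split(" ")
            if len(parts) != dim + 1:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def build_embedding_matrix(vocabulary, vectors, dim=100):
    """Build one row per vocabulary word, in vocabulary order.

    Words without a GloVe vector get an all-zero row (an illustrative
    choice; other initializations are possible).
    """
    return [list(vectors.get(word, [0.0] * dim)) for word in vocabulary]
```

A call such as `load_glove("glove.6B.100d.txt")` returns the lookup table. The resulting matrix can then be converted to an array and used to initialize an embedding layer, if that is how the model is configured.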

Sample Outputs


SSVC: "A woman is cutting a piece of meat"
GT: "a woman is cutting into the fatty areas of a pork chop"
SS score: 1.0, BLEU1: 1.0, BLEU2: 1.0, BLEU3: 1.0, BLEU4: 1.0


SSVC: "A person is slicing tomato"
GT: "Someone wearing blue rubber gloves is slicing a tomato with a large knife"
SS score: 0.825, BLEU1: 1.0, BLEU2: 1.0, BLEU3: 1.0, BLEU4: 1.0


SSVC: "A woman is cutting a piece of meat"
GT: "a woman is cutting into the fatty areas of a pork chop"
SS score: 0.94, BLEU1: 1.0, BLEU2: 0.84, BLEU3: 0.61, BLEU4: 0.0
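
For readers unfamiliar with the BLEU1 to BLEU4 columns, the sketch below illustrates cumulative BLEU-n for a single reference caption: clipped n-gram precisions for n = 1..N, combined by a geometric mean and multiplied by a brevity penalty. It is a simplified, standard-library illustration and not the evaluation code from the notebook. The function names are hypothetical, and it does not cover the SS score. The values it produces are not expected to match the scores listed above, which may have been computed against more reference captions than the single GT shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(reference, hypothesis, n):
    """Clipped n-gram precision of the hypothesis against one reference."""
    hyp_counts = ngram_counts(hypothesis, n)
    ref_counts = ngram_counts(reference, n)
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return clipped / total


def bleu(reference, hypothesis, max_n):
    """Cumulative BLEU-max_n with uniform weights and no smoothing."""
    precisions = [modified_precision(reference, hypothesis, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    if len(hypothesis) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - len(reference) / len(hypothesis))
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


if __name__ == "__main__":
    reference = tokenize("a person is slicing a tomato")
    hypothesis = tokenize("A person is slicing tomato")
    for n in range(1, 5):
        print(f"BLEU{n}: {bleu(reference, hypothesis, n):.3f}")
```

In practice, nltk (listed under Requirements) provides `nltk.translate.bleu_score.sentence_bleu`, which accepts several references per caption and optional smoothing.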
