tvseg

This repository contains the GSoC 2018 project TV Show Segmentation, done in collaboration with Red Hen Lab: http://www.redhenlab.org/home/the-cognitive-core-research-topics-in-red-hen/the-barnyard/tv-show-segmentation

Requirements

  • Install Python 2.7
  • Install pandas
  • Install Keras version 2.1.2 with the TensorFlow backend
  • Install xlrd
  • Install PyTorch
  • Install lmdb
  • Install tqdm

First, put all videos to be segmented into the "videos" folder.

Run sample_frames.py

This script samples boundary frames and randomly samples non-boundary frames.

It puts boundary frames into the "images/boundaries" folder and non-boundary frames into the "images/[show_name]" folder.

You then need to manually discard any frames in "images/boundaries" that are not exact boundaries.

$ python sample_frames.py
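
As a rough illustration of the random non-boundary sampling described above, the sketch below picks frame numbers that stay clear of known boundaries. The function name, the `margin` parameter and the frame counts are hypothetical and are not taken from sample_frames.py.

```python
import random


def sample_non_boundary_frames(total_frames, boundary_frames, num_samples,
                               margin=500, seed=0):
    """Pick random frame numbers more than `margin` frames from every boundary.

    `margin` is an assumed safety distance, not a value read from the project.
    """
    rng = random.Random(seed)
    candidates = [
        f for f in range(total_frames)
        if all(abs(f - b) > margin for b in boundary_frames)
    ]
    # Never ask for more samples than there are candidate frames.
    count = min(num_samples, len(candidates))
    return sorted(rng.sample(candidates, count))


# Illustrative values only: a 10000-frame video with two annotated boundaries.
picked = sample_non_boundary_frames(10000, [2000, 7000], num_samples=5)
print(picked)
```

Fixing the seed makes the sample reproducible, which helps when the manual clean-up of "images/boundaries" has to be repeated.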

Run create_dataset.py

This creates NumPy arrays in the format needed to train the Siamese network by randomly pairing sampled frames.

It saves these arrays to the files train_X_64.npy and train_Y_64.npy.

Note: You need 12GB of memory to run this. If you don't have enough memory, modify create_dataset.py to create a smaller dataset.

$ python create_dataset.py
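
The random pairing can be sketched as follows. This is a minimal illustration under stated assumptions, not the logic of create_dataset.py: it assumes a pair is labelled 1 when both frames come from the same group and 0 otherwise, and the names `make_pairs` and `frames_by_label` are hypothetical. Plain lists stand in for the NumPy arrays.

```python
import random


def make_pairs(frames_by_label, num_pairs, seed=0):
    """Build (frame_a, frame_b) pairs with target 1 (same group) or 0 (different).

    Assumes at least two groups, one of which holds two or more frames.
    """
    rng = random.Random(seed)
    labels = sorted(frames_by_label)
    pairable = [l for l in labels if len(frames_by_label[l]) >= 2]
    pairs, targets = [], []
    for i in range(num_pairs):
        if i % 2 == 0:
            # Positive pair: two distinct frames from the same group.
            label = rng.choice(pairable)
            a, b = rng.sample(frames_by_label[label], 2)
            targets.append(1)
        else:
            # Negative pair: one frame from each of two different groups.
            label_a, label_b = rng.sample(labels, 2)
            a = rng.choice(frames_by_label[label_a])
            b = rng.choice(frames_by_label[label_b])
            targets.append(0)
        pairs.append((a, b))
    return pairs, targets


# Frame identifiers here are placeholders for image data.
groups = {"show_a": ["a1", "a2", "a3"], "show_b": ["b1", "b2"]}
pairs, targets = make_pairs(groups, num_pairs=6)
print(list(zip(pairs, targets)))
```

Alternating positive and negative pairs keeps the two targets balanced; lowering `num_pairs` is one way to build the smaller dataset mentioned in the note.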

Run siamese_net.py

It trains for 5 epochs and saves the trained model to model_test.h5.

It reaches around 97% accuracy on the training and validation data.

$ python siamese_net.py

To test the network on videos: run boundary_detector.py

It runs through all the videos in the "videos" folder and puts boundary images into the "detected_boundaries/[video_name]" folder. Boundary images are named by their frame numbers.

$ python boundary_detector.py
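
Because the detected boundary images are named by frame number, it can help to convert a frame number into a position in the video. The helper below is a hypothetical sketch and not part of boundary_detector.py; the default frame rate is an assumption and should be replaced with the real frame rate of the video.

```python
def frame_to_timestamp(frame_no, fps=30.0):
    """Convert a frame number to an HH:MM:SS string, assuming a constant frame rate."""
    total_seconds = int(frame_no / fps)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "%02d:%02d:%02d" % (hours, minutes, seconds)


# Frame 54000 at an assumed 30 frames per second is 1800 seconds into the video.
print(frame_to_timestamp(54000, fps=30.0))
```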

Test only the provided model, model_test.h5

If you don't want to create the dataset and train the network, you can use the provided model_test.h5 file to detect boundaries for videos in the "videos" folder.

$ mkdir images
$ mv boundaries images/.
$ python boundary_detector.py

Validated boundary detection

To test boundary detection around pre-annotated boundary frames, i.e., in the region from boundary-500 to boundary+500, run boundary_detector_quick_check.py. This puts boundary frames in the "detected_boundaries/[video_name]/[show_name]/[begin/end]" folder.

$ python boundary_detector_quick_check.py

Building a binary classifier

I also tried a 2-class classifier that predicts whether a given frame is a boundary frame or not. As expected, it didn't work, no matter which classifier is used. The reason is severe class imbalance: there are very few boundary frames and a very large set of non-boundary frames. This type of problem is not learnable, as corroborated by the paper "Severe Class Imbalance: Why Better Algorithms Aren’t the Answer", https://webdocs.cs.ualberta.ca/~holte/Publications/ecml05.pdf

To test this, run the scripts below; you will see that the classifier does no better than majority-class prediction.

$ python create_classifier_dataset.py
$ python classifier.py
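
The effect of the imbalance can be seen with a small, self-contained sketch. The label counts below are illustrative and are not measurements from this project. A baseline that always predicts the majority class scores very high accuracy while finding no boundaries at all, which is why accuracy alone is misleading here.

```python
def majority_baseline(labels):
    """Return (accuracy, recall on class 1) for always predicting the majority class."""
    majority = max(set(labels), key=labels.count)
    predictions = [majority] * len(labels)
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    accuracy = correct / float(len(labels))
    positives = sum(1 for y in labels if y == 1)
    true_positives = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    recall = true_positives / float(positives) if positives else 0.0
    return accuracy, recall


# Illustrative imbalance: 10 boundary frames (1) among 10000 frames in total.
labels = [1] * 10 + [0] * 9990
accuracy, recall = majority_baseline(labels)
print("accuracy=%.3f recall=%.3f" % (accuracy, recall))
```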

Run YOLO to create annotations

This step creates annotations by running YOLO (from https://github.com/marvis/pytorch-yolo2.git) on the videos and writes them to the "yolo_annotations/[video_file_name].txt" file in the format [frame_no, label, box_co-ordinates{4 numbers}, prediction_confidence]

Install YOLO

$ git clone https://github.com/sverneka/pytorch-yolo2.git
$ cd pytorch-yolo2

Download pre-trained YOLO weights - 80 class detection

$ wget http://pjreddie.com/media/files/yolo.weights

Copy yolo_annotate.py file from root to pytorch-yolo2

$ cp ../yolo_annotate.py .

Run yolo_annotate.py

$ python yolo_annotate.py
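
A minimal sketch of reading one annotation line back in, following the field order given above (frame number, label, four box numbers, confidence). The comma delimiter, the optional surrounding brackets and the name `parse_annotation_line` are assumptions; check them against the files that yolo_annotate.py actually writes.

```python
from collections import namedtuple

Detection = namedtuple("Detection", ["frame_no", "label", "box", "confidence"])


def parse_annotation_line(line):
    """Parse 'frame_no, label, b1, b2, b3, b4, confidence' into a Detection."""
    # Assumed layout: comma-separated fields, optionally wrapped in brackets.
    parts = [p.strip() for p in line.strip().strip("[]").split(",")]
    if len(parts) != 7:
        raise ValueError("expected 7 fields, got %d" % len(parts))
    frame_no = int(parts[0])
    label = parts[1]
    box = tuple(float(v) for v in parts[2:6])
    confidence = float(parts[6])
    return Detection(frame_no, label, box, confidence)


# The sample line is made up for illustration.
detection = parse_annotation_line("120, person, 0.5, 0.4, 0.2, 0.3, 0.91")
print(detection.frame_no, detection.label, detection.confidence)
```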

Run video text detector and recognizer

This step extracts on-screen text from every 10th frame of the videos and puts the corresponding .txt files in the VidTextExtraction folder. Source code modified from: https://github.com/sravya8/VideoText

Since there are two models, one for detection and the other for recognition, I run them separately, as you can't create two TensorFlow sessions on a single GPU from the same process.

First, detect text in the videos and store the bounding boxes in pickle format in the VidTextExtraction folder.

$ cd VideoText
$ python videotextdetect.py

Then run videotextrecognize.py to read the bounding boxes from the pickle files, recognize the text in the videos, and write it to .txt files in the VidTextExtraction folder in the format [frame_number, text].

$ python videotextrecognize.py
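
The resulting text files can be read back with a small helper like the one below. It is a hypothetical sketch that assumes each line holds a frame number, a comma, and then the recognized text; the exact layout written by videotextrecognize.py may differ.

```python
def parse_text_line(line):
    """Split 'frame_number, text' into (int, str), keeping commas inside the text."""
    # Split only on the first comma, since recognized text may contain commas.
    frame_str, text = line.rstrip("\n").split(",", 1)
    return int(frame_str), text.strip()


# The sample line is made up for illustration.
frame_no, text = parse_text_line("30, BREAKING NEWS, LIVE\n")
print(frame_no, text)
```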
