mconverti / ibm-watson-video-indexer

Proof-of-concept sample application that uses IBM Watson and IBM Bluemix services to automatically index video files by analyzing video frames and audio speech.

License: Apache License 2.0

Shell 1.07% JavaScript 48.51% TypeScript 27.90% HTML 19.39% CSS 3.13%
nodejs typescript angular2 ibm-bluemix ibm-watson-services ibm-openwhisk elasticsearch s3-storage video-processing visual-recognition speech-recognition audio-processing ffmpeg indexing-engine media javascript

IBM Watson Video Indexer

This is a proof-of-concept sample application that uses IBM Bluemix services to automatically index video files by analyzing the video frames (tag classification, face and identity detection, and text recognition) and the audio track (speech-to-text recognition).

You can run full-text search queries on the metadata (title, description, tags summary and identities summary) to find a video in your library.
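A full-text query over those four metadata fields could be sketched as an Elasticsearch `multi_match` request body. The field names below (`title`, `description`, `tagsSummary`, `identitiesSummary`) are hypothetical, since this README lists the metadata categories but not the index mapping; treat this as an illustration rather than the application's actual query.

```typescript
// Hypothetical field names: the README names the metadata categories,
// not the actual fields of the Elasticsearch index.
const SEARCHABLE_FIELDS = ["title", "description", "tagsSummary", "identitiesSummary"];

interface MultiMatchBody {
  query: {
    multi_match: {
      query: string;
      fields: string[];
    };
  };
}

// Builds a request body that matches the search text against every metadata field.
function buildVideoSearchBody(text: string): MultiMatchBody {
  return {
    query: {
      multi_match: {
        query: text.trim(),
        fields: SEARCHABLE_FIELDS.slice(),
      },
    },
  };
}

console.log(JSON.stringify(buildVideoSearchBody("  keynote speech "), null, 2));
```

The resulting object can be sent as the body of a search request against whichever index holds the video metadata.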

You can click a particular video to navigate to its details page and then find insights 'inside' the video content, leveraging the metadata generated by IBM Watson services (tag classification, faces and identities, OCR and audio speech).

Or you can click Find Insights in the navigation bar to find insights 'inside' the content across all your videos.

To upload a new video, drag and drop the file into the drop zone. In the Uploading file dialog, enter a Title and Description (optional), click Upload and then wait until the process completes.

Note: OpenWhisk currently has a system limit of 5 minutes for the maximum action timeout. Because of this, the speech and visual analyzer processors might time out if your video is too long. To avoid this issue, upload only videos that are up to 5 minutes long.
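A client-side guard for this limit could look like the minimal sketch below. The function name and the idea of checking the duration before upload are assumptions for illustration; the application itself may not perform such a check.

```typescript
// 5 minutes in milliseconds, matching the OpenWhisk action timeout described above.
const MAX_ACTION_TIMEOUT_MS = 5 * 60 * 1000;

// Hypothetical helper: returns true when a video of the given duration
// is no longer than the 5-minute guideline.
function fitsWithinActionTimeout(durationSeconds: number): boolean {
  if (!isFinite(durationSeconds) || durationSeconds < 0) {
    throw new RangeError("durationSeconds must be a non-negative finite number");
  }
  return durationSeconds * 1000 <= MAX_ACTION_TIMEOUT_MS;
}

console.log(fitsWithinActionTimeout(240)); // a 4-minute video is within the guideline
```

Video length is only a proxy here: actual processing time depends on the analyzers, so a video under 5 minutes is a guideline, not a guarantee.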

Overview

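Built using IBM Bluemix, the application uses:

  • Compose for Elasticsearch
  • Watson Visual Recognition
  • Watson Speech to Text
  • Cloud Object Storage (S3 API)
  • OpenWhisk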

Application Requirements

  • IBM Bluemix account. Sign up for Bluemix, or use an existing account.
  • Docker Hub account. Sign up for Docker Hub, or use an existing account.
  • Node.js >= 6.7.0

Preparing the environment

1. Create the Bluemix Services

Note: if you have existing instances of these services, you don't need to create new instances. You can simply reuse the existing ones.

  1. Open the IBM Bluemix console.

  2. Create a Compose for Elasticsearch service instance.

  3. Create a Watson Visual Recognition service instance.

  4. Create a Watson Speech to Text service instance.

  5. Create a Cloud Object Storage (S3 API) service instance.

  6. Go to the Cloud Object Storage (S3 API) details page and create a new bucket called media.

2. Configure the Bluemix services credentials

  1. Change to the web/server/lib directory.

  2. Replace the placeholders in the config-elasticsearch-credentials.json file with the Compose for Elasticsearch service instance credentials.

  3. Replace the placeholders in the config-visual-recognition-credentials.json file with the Watson Visual Recognition service instance credentials.

  4. Replace the placeholders in the config-speech-to-text-credentials.json file with the Watson Speech to Text service instance credentials.

  5. Replace the placeholders in the config-s3-credentials.json file with the Cloud Object Storage (S3 API) service instance credentials.

  6. Replace the placeholders in the config-openwhisk-credentials.json file with the OpenWhisk CLI credentials.
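Before deploying, it can help to confirm that no placeholder was left behind in these files. The sketch below walks a parsed JSON value and reports string values that still look like placeholders. It assumes placeholders use the `%name%` shape seen in the build commands of this README; the actual credential files may use a different convention, and the sample keys (`url`, `port`, `auth`, `apiKey`) are hypothetical.

```typescript
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };

// Assumed placeholder shape: %name%. Adjust the pattern to match the real files.
const PLACEHOLDER_PATTERN = /%[A-Za-z0-9_]+%/;

// Returns the paths of all string values that still contain a placeholder.
function findUnreplacedPlaceholders(value: JsonValue, path: string = "$"): string[] {
  if (typeof value === "string") {
    return PLACEHOLDER_PATTERN.test(value) ? [path] : [];
  }
  const found: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, index) => {
      found.push(...findUnreplacedPlaceholders(item, `${path}[${index}]`));
    });
  } else if (value !== null && typeof value === "object") {
    for (const key of Object.keys(value)) {
      const child = value[key];
      if (child !== undefined) {
        found.push(...findUnreplacedPlaceholders(child, `${path}.${key}`));
      }
    }
  }
  return found;
}

// Hypothetical credentials content; only "url" still holds a placeholder.
const sampleConfig: JsonValue = { url: "%elasticsearchurl%", port: 9200, auth: { apiKey: "abc123" } };
console.log(findUnreplacedPlaceholders(sampleConfig)); // reports the path $.url
```

Running the same check over each of the five config files after `JSON.parse` gives a quick list of values that still need to be filled in.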

3. Build the Docker images for the visual and speech analyzers

These analyzers require ffmpeg to extract audio, frames and metadata from the video. ffmpeg is not available to an OpenWhisk action written in JavaScript or Swift. Fortunately, OpenWhisk allows you to write an action as a Docker image and can retrieve this image from Docker Hub.
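To make the ffmpeg dependency concrete, the sketch below builds argument lists for the two kinds of extraction mentioned above. It is not the analyzers' actual invocation: the function names, the 16 kHz mono WAV target, the frame interval and the output file pattern are all assumptions chosen for illustration. The flags themselves (`-i`, `-vn`, `-ac`, `-ar`, `-vf fps=...`) are standard ffmpeg options.

```typescript
// Arguments to extract the audio track only (-vn drops video) as mono, 16 kHz audio.
// The sample rate and channel count are assumed values, not the project's settings.
function buildAudioExtractionArgs(inputPath: string, outputWavPath: string): string[] {
  return ["-i", inputPath, "-vn", "-ac", "1", "-ar", "16000", outputWavPath];
}

// Arguments to save one frame every `intervalSeconds` seconds as numbered PNG files.
function buildFrameExtractionArgs(inputPath: string, outputDir: string, intervalSeconds: number): string[] {
  if (!(intervalSeconds > 0)) {
    throw new RangeError("intervalSeconds must be greater than zero");
  }
  return ["-i", inputPath, "-vf", `fps=1/${intervalSeconds}`, `${outputDir}/frame-%04d.png`];
}

// Hypothetical paths, shown as the command lines they would produce.
console.log(["ffmpeg"].concat(buildAudioExtractionArgs("video.mp4", "audio.wav")).join(" "));
console.log(["ffmpeg"].concat(buildFrameExtractionArgs("video.mp4", "frames", 5)).join(" "));
```

Keeping the arguments as arrays rather than one string makes them safe to pass to a process-spawning API without shell quoting issues.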

To build the images, follow these steps:

  1. Change to the processors directory.

  2. Ensure your Docker environment works and that you are logged in to Docker Hub.

  3. Run the following commands:

./buildAndPushVisualAnalyzer.sh %youruserid%/%yourvisualanalyzerimagename%
./buildAndPushSpeechAnalyzer.sh %youruserid%/%yourspeechanalyzerimagename%

Note: On some systems these commands need to be run with sudo.

  4. After a while, your images will be available in Docker Hub, ready for OpenWhisk.

4. Deploy OpenWhisk actions

  1. Ensure your OpenWhisk command line interface is properly configured with:
wsk list

This shows the packages, actions, triggers and rules currently deployed in your OpenWhisk namespace.

  2. Create the visualAnalyzer action.
wsk action create -t 300000 -m 512 --docker visualAnalyzer %youruserid%/%yourvisualanalyzerimagename%
  3. Create the speechAnalyzer action.
wsk action create -t 300000 -m 512 --docker speechAnalyzer %youruserid%/%yourspeechanalyzerimagename%
  4. Create the videoAnalyzer sequence.
wsk action create -t 300000 -m 512 videoAnalyzer --sequence /%yourorganization%/visualAnalyzer,/%yourorganization%/speechAnalyzer
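An OpenWhisk sequence runs its actions in order and passes each action's result as the input of the next one. The sketch below models that behavior locally with plain functions. The two stand-in actions and their `visual` and `speech` result fields are hypothetical; the real analyzers run as Docker actions and return their own payloads.

```typescript
type Params = { [key: string]: unknown };
type Action = (params: Params) => Promise<Params>;

// Runs the actions in order, feeding each result into the next action.
async function runSequence(actions: Action[], input: Params): Promise<Params> {
  let current = input;
  for (const action of actions) {
    current = await action(current);
  }
  return current;
}

// Hypothetical stand-ins for the two Docker actions created above.
const visualAnalyzer: Action = async (params) => ({ ...params, visual: "done" });
const speechAnalyzer: Action = async (params) => ({ ...params, speech: "done" });

runSequence([visualAnalyzer, speechAnalyzer], { videoId: "example" })
  .then((result) => console.log(result));
```

Because the output of visualAnalyzer becomes the input of speechAnalyzer, any parameter the second action needs (such as the video identifier) has to be present in the first action's result.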

Deploy the Web application

This Web application is used to upload videos, monitor the processing progress, visualize the results and perform full-text search queries to find insights inside the content.

  1. Change to the web directory.

  2. Get the dependencies and build the application:

npm install && npm run build
  3. Push the application to Bluemix:
cf push

That's it! Use the deployed Web application to upload videos, monitor the processing progress, view the results and find insights inside your content!
