saifrahmed / scanner

This project forked from scanner-research/scanner

Efficient video analysis at scale

Home Page: http://scanner.run

License: Apache License 2.0

GDB 0.01% CMake 2.83% Shell 1.59% Python 10.83% C++ 82.59% Cuda 1.02% C 1.13%

scanner's Introduction

Scanner: Efficient Video Analysis at Scale

Scanner is a system for developing applications that efficiently process large video datasets. Scanner has been used for both video analysis and video synthesis tasks, such as:

  • Labeling and data mining large video collections: Scanner is in use at Stanford University as the compute engine for visual data mining applications that detect people, commercials, human poses, etc. in datasets as big as 70,000 hours of TV news (12 billion frames, 20 TB) or 600 feature length movies (106 million frames). We've used Scanner to run these tasks on hundreds of GPUs or thousands of CPUs on Google Compute Engine.
  • VR video synthesis: scaling the Surround 360 VR video stitching software to hundreds of CPUs. This application processes fourteen 2048x2048 input videos to produce 8K omnidirectional stereo video output for VR display.

To learn more about Scanner, see the documentation below, check out the various example applications, or read the SIGGRAPH 2018 Technical Paper: "Scanner: Efficient Video Analysis at Scale".

Key Features

Scanner's key features include:

  • Video processing computations as dataflow graphs: Like many modern ML frameworks, Scanner structures video analysis tasks as dataflow graphs whose nodes produce and consume sequences of per-frame data. Scanner's embodiment of the dataflow model includes operators useful for video processing tasks such as sparse frame sampling (e.g., "frames known to contain a face"), sliding window frame access (e.g., stencils for temporal smoothing), and stateful processing across frames (e.g., tracking).

  • Videos as logical tables: To simplify the management of and access to large numbers of videos, Scanner represents video collections and the pixel-level products of video frame analysis (e.g., flow fields, depth maps, activations) as tables in a data store. Scanner's data store features first-class support for video frame column types to facilitate key performance optimizations, such as storing video in compressed form and providing fast access to sparse lists of video frames.

  • First-class support for GPU acceleration: Since many video processing algorithms benefit from GPU acceleration, Scanner provides first-class support for writing dataflow graph operations that utilize GPU execution. Scanner also leverages specialized GPU hardware for video decoding when available.

  • Fault tolerant, distributed execution: Scanner applications can be run on the cores of a single machine, on a multi-GPU server, or scaled to hundreds of machines (potentially with heterogeneous numbers of GPUs), without significant source-level change. Scanner also provides fault tolerance, so your applications can not only utilize many machines, but use cheaper preemptible machines on cloud computing platforms.
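The three access patterns named in the dataflow bullet above (sparse sampling, sliding-window stencils, and stateful processing) can be sketched in plain Python. This is only an illustration of the concepts, not Scanner's API or implementation: the function names `stride`, `stencil`, and `running_max` are hypothetical, frames are stood in for by plain numbers, and clamping window indices at the sequence boundaries is an assumption made for this sketch.

```python
from typing import Callable, Iterable, List, Sequence, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def stride(frames: Sequence[T], step: int) -> List[T]:
    """Sparse sampling: keep every `step`-th element, starting at index 0."""
    return list(frames[::step])


def stencil(
    frames: Sequence[T],
    offsets: Sequence[int],
    fn: Callable[[List[T]], U],
) -> List[U]:
    """Sliding-window access: for each position, gather the elements at the
    given relative offsets and apply `fn` to that window.
    Out-of-range indices are clamped to the first/last element (an assumption
    of this sketch)."""
    last = len(frames) - 1
    out: List[U] = []
    for i in range(len(frames)):
        window = [frames[min(max(i + o, 0), last)] for o in offsets]
        out.append(fn(window))
    return out


def running_max(values: Iterable[float]) -> List[float]:
    """Stateful processing: each output depends on state carried over from
    all earlier elements (here, the largest value seen so far)."""
    out: List[float] = []
    best = float("-inf")
    for v in values:
        best = max(best, v)
        out.append(best)
    return out


def mean(window: List[float]) -> float:
    """Average of a window, used as a simple temporal-smoothing kernel."""
    return sum(window) / len(window)


# Numbers stand in for per-frame data.
frames = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0]
sampled = stride(frames, 3)
smoothed = stencil(frames, [-1, 0, 1], mean)
peaks = running_max([1.0, 5.0, 2.0, 7.0, 3.0])
```

In a real Scanner graph these roles are played by graph operators over table columns rather than by Python list functions; the sketch only shows what each pattern computes.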

What Scanner is not:

Scanner is not a system for implementing new high-performance image and video processing kernels from scratch. However, Scanner can be used to create scalable video processing applications by composing kernels that already exist as part of popular libraries such as OpenCV, Caffe, TensorFlow, etc. or have been implemented in popular performance-oriented languages like CUDA or Halide. Yes, you can write your dataflow graph operations in Python or C++ too!

Documentation

Scanner's documentation is hosted at scanner.run.

Example code

Scanner applications are written using the Python API. Here's an example application that resizes every third frame from a video and then saves the result as an mp4 video (the Quickstart walks through this example in more detail):

from scannerpy import Database, Job

# Ingest a video into the database (create a table with a row per video frame)
db = Database()
db.ingest_videos([('example_table', 'example.mp4')])

# Define a Computation Graph
frame = db.sources.FrameColumn()                                    # Read input frames from database
sampled_frame = db.streams.Stride(input=frame, stride=3)            # Select every third frame
resized = db.ops.Resize(frame=sampled_frame, width=640, height=480) # Resize input frames
output_frame = db.sinks.Column(columns={'frame': resized})          # Save resized frames as new video

# Set parameters of computation graph ops
job = Job(op_args={
    frame: db.table('example_table').column('frame'), # Column to read input frames from
    output_frame: 'resized_example'                   # Table name for computation output
})

# Execute the computation graph and return a handle to the newly produced tables
output_tables = db.run(output=output_frame, jobs=[job], force=True)

# Save the resized video as an mp4 file
output_tables[0].column('frame').save_mp4('resized_video')
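As a rough mental model of what the example above does to its tables, the following plain-Python sketch treats a table as a dictionary of columns with one row per frame. It does not use Scanner at all: `run_resize_job` is a hypothetical helper, a frame is represented only by its `(width, height)`, and the sketch assumes that striding starts at row 0.

```python
from typing import Dict, List, Tuple

Frame = Tuple[int, int]         # stand-in for a frame: just (width, height)
Table = Dict[str, List[Frame]]  # column name -> one row per frame


def run_resize_job(
    tables: Dict[str, Table],
    src: str,
    dst: str,
    stride: int = 3,
    width: int = 640,
    height: int = 480,
) -> Table:
    """Read the 'frame' column of table `src`, keep every `stride`-th row,
    'resize' each kept frame, and store the result as a new table `dst`."""
    frames = tables[src]["frame"]
    sampled = frames[::stride]                    # rows 0, stride, 2*stride, ...
    resized = [(width, height) for _ in sampled]  # only the dimensions change here
    tables[dst] = {"frame": resized}
    return tables[dst]


# A 10-frame input table of 1920x1080 frames.
tables: Dict[str, Table] = {"example_table": {"frame": [(1920, 1080)] * 10}}
output = run_resize_job(tables, "example_table", "resized_example")
```

With ten input rows and a stride of 3, the output table holds four rows (input rows 0, 3, 6, and 9), and the input table is left unchanged.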

If you'd like to see other example applications written with Scanner, check out the Examples directory in this repository.

Contributing

If you'd like to contribute to the development of Scanner, you should first build Scanner from source.

Please submit a pull request rebased against the most recent version of the master branch, and we will review your changes before merging. Thanks for contributing!

Running tests

You can run the full suite of tests by executing make test in the directory you used to build Scanner. This will run both the C++ tests and the end-to-end tests that verify the Python API.

About

Scanner is an active research project, part of a collaboration between Stanford and Carnegie Mellon University. Please contact Alex Poms and Will Crichton with questions.

Scanner was developed with the support of the NSF (IIS-1539069), the Intel Corporation (through the Intel Science and Technology Center for Visual Cloud Computing and the NSF/Intel VEC program), and by Google.

Paper citation

Scanner will appear in the proceedings of SIGGRAPH 2018 as "Scanner: Efficient Video Analysis at Scale" by Poms, Crichton, Hanrahan, and Fatahalian. If you use Scanner in your research, we'd appreciate it if you cite the paper.

scanner's People

Contributors

fpoms, willcrichton, kayvonf, satyaprateek1994, swjz, qianl15, jremmons, sdulloor
