zhanghengdev / awesome-video-object-detection

Awesome Video-Object-Detection

Intro

This is a list of awesome articles about object detection from video.

Datasets

ImageNet VID Challenge

VisDrone Challenge

Paper list

2016

Seq-NMS for Video Object Detection

[Arxiv]

  • Date: Feb 2016
  • Motivation: Smoothing the final bounding box predictions across time.
  • Summary: Constructing a temporal graph from overlapping bounding box detections across the adjacent frames, and using dynamic programming to select bounding box sequences with the highest overall detection score.
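The sequence-selection step described above can be sketched as a small dynamic program. This is a minimal illustration, not the authors' implementation: the `best_sequence` name, the `(box, score)` data layout, and the 0.5 IoU linking threshold are assumptions made for the example, and the rescoring and suppression steps of the full method are omitted.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def best_sequence(frames: List[List[Detection]],
                  link_iou: float = 0.5) -> List[Tuple[int, int]]:
    """Return [(frame_idx, det_idx), ...] for the linked sequence with the
    highest summed score. Boxes in adjacent frames are linked when their
    IoU reaches link_iou (threshold chosen for illustration)."""
    # best[t][j]: best total score of any sequence ending at detection j of frame t
    best = [[score for _, score in dets] for dets in frames]
    # prev[t][j]: index of the predecessor in frame t - 1, or -1 if the sequence starts here
    prev = [[-1] * len(dets) for dets in frames]

    for t in range(1, len(frames)):
        for j, (box_j, score_j) in enumerate(frames[t]):
            for i, (box_i, _) in enumerate(frames[t - 1]):
                candidate = best[t - 1][i] + score_j
                if iou(box_i, box_j) >= link_iou and candidate > best[t][j]:
                    best[t][j] = candidate
                    prev[t][j] = i

    ends = [(t, j) for t in range(len(frames)) for j in range(len(frames[t]))]
    if not ends:
        return []
    t, j = max(ends, key=lambda tj: best[tj[0]][tj[1]])

    # Walk the back-pointers from the best end point to the start of the sequence.
    path = []
    while j != -1:
        path.append((t, j))
        t, j = t - 1, prev[t][j]
    return path[::-1]
```

In this sketch a slowly moving box with moderate scores in every frame beats an isolated high-scoring detection, because the sequence score is the sum over linked frames.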

T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos

[Arxiv] [Code]

  • Date: Apr 2016
  • Summary: Using a video object detection pipeline that involves predicting optical flow first, then propagating image level predictions according to the flow, and finally using a tracking algorithm to select temporally consistent high confidence detections.
  • Performance: 73.8% mAP on ImageNet VID validation.
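As a rough illustration of the "propagating image level predictions according to the flow" step, the sketch below shifts a box by the mean optical-flow vector inside it. The mean-shift rule, the `propagate_box` name, and the nested-list flow layout are assumptions for the example; the paper's pipeline may differ in detail.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]         # (x1, y1, x2, y2) in pixels, x2/y2 exclusive
Flow = List[List[Tuple[float, float]]]  # flow[y][x] = (dx, dy) towards the next frame


def propagate_box(box: Box, flow: Flow) -> Tuple[float, float, float, float]:
    """Move a detection into the next frame by the average flow inside the box."""
    x1, y1, x2, y2 = box
    vectors = [flow[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    if not vectors:
        # Empty box: nothing to average, so leave it where it is.
        return (float(x1), float(y1), float(x2), float(y2))
    mean_dx = sum(dx for dx, _ in vectors) / len(vectors)
    mean_dy = sum(dy for _, dy in vectors) / len(vectors)
    return (x1 + mean_dx, y1 + mean_dy, x2 + mean_dx, y2 + mean_dy)
```

The propagated box keeps the score of the original detection in this sketch, so a confident detection in one frame can fill in for a missed detection in its neighbours.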

Object Detection from Video Tubelets with Convolutional Neural Networks

[Arxiv] [Code]

  • Date: Apr 2016

Deep Feature Flow for Video Recognition

[Arxiv] [Code]

  • Date: Nov 2016
  • Performance: 73.0% mAP on ImageNet VID validation at 29 fps on a Titan X GPU.

2017

Object Detection in Videos with Tubelet Proposal Networks

[Arxiv]

  • Date: Feb 2017

Flow-Guided Feature Aggregation for Video Object Detection

[Arxiv] [Code]

  • Date: Mar 2017
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 76.3% mAP at 1.4 fps or 78.4% (combined with Seq-NMS) at 1.1 fps on ImageNet VID validation on a Titan X GPU.

Detect to Track and Track to Detect

[Arxiv] [Summary] [Code]

  • Date: Oct 2017
  • Motivation: Smoothing the final bounding box predictions across time.
  • Summary: Proposing a ConvNet architecture that solves detection and tracking problems jointly and applying a Viterbi algorithm to link the detections across time.
  • Performance: 79.8% mAP on ImageNet VID validation.
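The linking step can be illustrated with a generic Viterbi pass that picks one detection per frame. This is only a sketch under assumptions: `viterbi_link`, the `unary` score lists, and the `pairwise` link function are hypothetical names, and the link score used in the paper is not reproduced here.

```python
from typing import Callable, List


def viterbi_link(unary: List[List[float]],
                 pairwise: Callable[[int, int, int], float]) -> List[int]:
    """Choose one detection index per frame so that the sum of per-frame
    scores (unary) and frame-to-frame link scores (pairwise) is maximal.
    pairwise(t, i, j) scores linking detection i in frame t to detection j
    in frame t + 1."""
    if not unary or any(len(u) == 0 for u in unary):
        return []

    score = list(unary[0])  # best path score ending at each detection of frame 0
    back = []               # back[t - 1][j]: best predecessor of detection j in frame t

    for t in range(1, len(unary)):
        new_score, ptr = [], []
        for j, u in enumerate(unary[t]):
            cands = [score[i] + pairwise(t - 1, i, j) for i in range(len(score))]
            best_i = max(range(len(cands)), key=cands.__getitem__)
            new_score.append(cands[best_i] + u)
            ptr.append(best_i)
        score = new_score
        back.append(ptr)

    # Backtrack from the best final detection.
    j = max(range(len(score)), key=score.__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    return path[::-1]
```

With a link score that rewards staying on the same track, the chosen path can differ from the per-frame argmax: a frame where the tracked object scores poorly is still assigned to it when the neighbouring frames agree.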

Towards High Performance Video Object Detection

[Arxiv]

  • Date: Nov 2017
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 78.6% mAP on ImageNet VID validation at 13 fps on a Titan X GPU.

Video Object Detection with an Aligned Spatial-Temporal Memory

[Arxiv] [Summary] [Code] [Demo]

  • Date: Dec 2017
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 80.5% mAP on ImageNet VID validation.

2018

Object Detection in Videos by High Quality Object Linking

[Arxiv]

  • Date: Jan 2018

Towards High Performance Video Object Detection for Mobiles

[Arxiv]

  • Date: Apr 2018
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 60.2% mAP on ImageNet VID validation at 25.6 fps on mobiles.

Optimizing Video Object Detection via a Scale-Time Lattice

[Arxiv] [Summary] [Code]

  • Date: Apr 2018
  • Performance: 79.6% mAP at 20 fps or 79.0% at 62 fps on ImageNet VID validation on a Titan X GPU.

Object Detection in Video with Spatiotemporal Sampling Networks

[Arxiv] [Summary]

  • Date: Mar 2018
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 78.9% mAP or 80.4% (combined with Seq-NMS) on ImageNet VID validation.

Fully Motion-Aware Network for Video Object Detection

[Paper] [Summary]

  • Date: Sep 2018
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 78.1% mAP or 80.3% (combined with Seq-NMS) on ImageNet VID validation.

Integrated Object Detection and Tracking with Tracklet-Conditioned Detection

[Arxiv] [Summary]

  • Date: Nov 2018
  • Motivation: Smoothing the final bounding box predictions across time.
  • Performance: 83.5% mAP with FGFA and Deformable ConvNets v2 on ImageNet VID validation.

2019

AdaScale: Towards Real-time Video Object Detection Using Adaptive Scaling

[Arxiv]

  • Date: Feb 2019
  • Motivation: Adaptively rescaling the input image resolution to improve both accuracy and speed for video object detection.
  • Performance: 75.5% mAP on ImageNet VID validation with multi-scale training at 4 scales (600, 480, 360, 240).

Improving Video Object Detection by Seq-Bbox Matching

[pdf]

  • Date: Feb 2019
  • Motivation: Smoothing the final bounding box predictions across time (box-level method).
  • Performance: 80.9% mAP (offline detection) and 78.2% mAP (online detection), both at 38 fps on a Titan X GPU.

Comparison table

| Paper | Date | Base detector | Backbone | Tracking? | Optical flow? | Online? | mAP (%) | FPS (Titan X) |
|---|---|---|---|---|---|---|---|---|
| Seq-NMS | Feb 2016 | R-FCN | ResNet101 | no | no | no | 76.8 | 2.3 |
| T-CNN | Apr 2016 | RCNN | DeepIDNet+CRAFT | yes | no | no | 73.8 | - |
| DFF | Nov 2016 | R-FCN | ResNet101 | no | yes | yes | 73.0 | 29 |
| TPN | Feb 2017 | TPN | GoogLeNet | yes | no | no | 68.4 | - |
| FGFA | Mar 2017 | R-FCN | ResNet101 | no | yes | yes | 76.3 | 1.4 |
| FGFA + Seq-NMS | 29 Mar 2017 | R-FCN | ResNet101 | no | yes | no | 78.4 | 1.14 |
| D&T | Oct 2017 | R-FCN (15 anchors) | ResNet101 | yes | no | no | 79.8 | 7.09 |
| STMN | Dec 2017 | R-FCN | ResNet101 | no | no | no | 80.5 | - |
| Scale-time-lattice | 16 Apr 2018 | Faster RCNN (15 anchors) | ResNet101 | no | no | no | 79.6 | 20 |
| Scale-time-lattice | Apr 2018 | Faster RCNN (15 anchors) | ResNet101 | no | no | no | 79.0 | 62 |
| SSN (per-frame baseline for STSN) | Mar 2018 | R-FCN | Deformable ResNet101 | no | no | yes | 76.0 | - |
| STSN | Mar 2018 | R-FCN | Deformable ResNet101 | no | no | yes | 78.9 | - |
| STSN + Seq-NMS | Mar 2018 | R-FCN | Deformable ResNet101 | no | no | no | 80.4 | - |
| MANet | Sep 2018 | R-FCN | ResNet101 | no | yes | yes | 78.1 | 5 |
| MANet + Seq-NMS | Sep 2018 | R-FCN | ResNet101 | no | yes | no | 80.3 | - |
| Tracklet-Conditioned Detection | Nov 2018 | R-FCN | ResNet101 | yes | no | yes | 78.1 | - |
| Tracklet-Conditioned Detection + DCNv2 | Nov 2018 | R-FCN | ResNet101 | yes | no | yes | 82.0 | - |
| Tracklet-Conditioned Detection + DCNv2 + FGFA | Nov 2018 | R-FCN | ResNet101 | yes | no | yes | 83.5 | - |
| Seq-Bbox Matching | Feb 2019 | YOLOv3 | darknet53 | no | no | no | 80.9 | 38 |
| Seq-Bbox Matching | Feb 2019 | YOLOv3 | darknet53 | no | no | yes | 78.2 | 38 |
