ACAR-Net

Actor-Context-Actor Relation Network for Spatio-temporal Action Localization - 1st place solution in the AVA-Kinetics Crossover Challenge 2020.

Code and model will come soon!

Junting Pan, Siyu Chen, Zheng Shou, Jing Shao, Hongsheng Li

Abstract

Localizing persons and recognizing their actions from videos is a challenging task towards high-level video understanding. Recent advances have been achieved by modeling either “actor-actor” or “actor-context” relations. However, such direct first-order relations are not sufficient for localizing actions in complicated scenes. Some actors might be indirectly related via objects or background context in the scene. Such indirect relations are crucial for determining the action labels but are mostly ignored by existing work. In this paper, we propose to explicitly model the Actor-Context-Actor Relation, which can capture indirect high-order supportive information for effectively reasoning about actors’ actions in complex scenes. To this end, we design an Actor-Context-Actor Relation Network (ACAR-Net) which builds upon a novel High-order Relation Reasoning Operator to model indirect relations for spatio-temporal action localization. Moreover, to allow utilizing more temporal context, we extend our framework with an Actor-Context Feature Bank for reasoning about long-range high-order relations. Extensive experiments on the AVA dataset validate the effectiveness of our ACAR-Net. Ablation studies show the advantages of modeling high-order relations over existing first-order relation reasoning methods. The proposed ACAR-Net is also the core module of our 1st place solution in the AVA-Kinetics Crossover Challenge 2020.
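
To make the distinction between first-order and high-order relations concrete, here is a minimal NumPy sketch. It is an illustration only, not the released ACAR-Net code: the function names (`first_order_relations`, `high_order_relations`), the random weights, the tensor shapes, and the choice of dot-product attention across actors at each spatial location are all assumptions made for this example.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along one axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def first_order_relations(actors, context, w):
    # Hypothetical helper: pair every actor with every context location.
    # actors: (N, C), context: (H, W, C), w: (2C, D) -> relations: (N, H, W, D)
    n, c = actors.shape
    h, wd, _ = context.shape
    a = np.broadcast_to(actors[:, None, None, :], (n, h, wd, c))
    x = np.broadcast_to(context[None], (n, h, wd, c))
    pair = np.concatenate([a, x], axis=-1)
    return np.maximum(pair @ w, 0.0)  # linear projection followed by ReLU

def high_order_relations(rel, wq, wk, wv):
    # Hypothetical helper: at each location, every actor's actor-context
    # relation attends to the relations of all actors at that same location,
    # so two actors can interact indirectly through shared context.
    q, k, v = rel @ wq, rel @ wk, rel @ wv
    scores = np.einsum("ihwd,jhwd->hwij", q, k) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)  # (H, W, N, N), normalized over actor j
    mixed = np.einsum("hwij,jhwd->ihwd", attn, v)
    return rel + mixed, attn  # residual connection

# Toy sizes, chosen arbitrarily for the example.
rng = np.random.default_rng(0)
N, H, W, C, D = 3, 4, 4, 8, 16
actors = rng.normal(size=(N, C))
context = rng.normal(size=(H, W, C))
rel = first_order_relations(actors, context, rng.normal(size=(2 * C, D)))
out, attn = high_order_relations(rel, *(rng.normal(size=(D, D)) for _ in range(3)))
actor_repr = out.mean(axis=(1, 2))  # one pooled feature vector per actor
```

In this sketch, `rel` holds only direct actor-context relations, while `out` additionally mixes in what other actors relate to at the same location. The actual operator described in the paper may differ in its layers and normalization.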

CVPR 2020 AVA-Kinetics Challenge

Find the slides and video presentation of our winning solution on [Google Slides], [YouTube Video], and [Bilibili Video] (starting from 18:20).

Preprint

Find our work on arXiv.

Please cite with the following BibTeX entry:

@article{pan2020actorcontextactor,
  title={Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization},
  author={Junting Pan and Siyu Chen and Zheng Shou and Jing Shao and Hongsheng Li},
  journal={arXiv preprint arXiv:2006.07976},
  year={2020}
}

You may also want to refer to our publication with the more human-friendly Chicago style:

Junting Pan, Siyu Chen, Zheng Shou, Jing Shao, Hongsheng Li. "Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization." arXiv, 2020.

Models

[Figure: ACAR-Net architecture]

Contact

If you have general questions about our work or code that may be of interest to other researchers, please use the public issues section of this GitHub repo. Alternatively, drop us an e-mail at [email protected] or [email protected].
