csma-net's Introduction

Authors: Yi Zhang, Wassim Hamidouche, Olivier Deforges


Introduction


Figure 1: The architecture of our CSMA-Net. The short names in the figure are detailed as follows: CSMA = the proposed channel-spatial mutual attention module. E2C/C2E = the projection interaction module which transforms the equirectangular (ER) image/cube maps to cube maps/ER image, respectively. ASPP = atrous spatial pyramid pooling module. Enc.ER = the hybrid-ViT-based encoder for ER image. Enc.CM = the Res2Net-based encoder for cube maps. Dec. = the decoder from RCRNet.
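The E2C direction mentioned in the caption is, at its core, a change of coordinates between a cube face and the equirectangular grid. The sketch below illustrates that geometry for a single face only. It is not taken from the CSMA-Net code: the function name `cube_front_to_er`, the choice of the front face at z = 1, and the axis and pixel conventions are all assumptions made for illustration.

```python
import math

def cube_front_to_er(u, v, er_width, er_height):
    """Map a point (u, v) in [-1, 1]^2 on the front cube face to ER pixel coordinates.

    Assumed convention (hypothetical, not from the repository): the front face
    lies on the plane z = 1, u grows to the right, and v grows downward.
    """
    # 3D viewing direction through the face point; flip v so that y points up.
    x, y, z = u, -v, 1.0
    # Longitude in [-pi, pi] and latitude in [-pi/2, pi/2].
    lon = math.atan2(x, z)
    lat = math.atan2(y, math.hypot(x, z))
    # Longitude spans the ER width, latitude spans the ER height (top = +pi/2).
    col = (lon / (2 * math.pi) + 0.5) * er_width
    row = (0.5 - lat / math.pi) * er_height
    return col, row
```

Under these assumptions the centre of the front face lands at the centre of the ER image, and sampling the ER image at the returned coordinates for every face pixel would yield that cube face. The C2E direction inverts the same mapping.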

In this work, we perform 360° panoramic salient object detection (SOD) by exploiting both the global and the local visual cues of 360° images with a novel channel-spatial mutual attention network (CSMA-Net). The key component of CSMA-Net is the proposed CSMA module, which cascades channel-weighting-based and spatial-weighting-based mutual attentions. The CSMA module refines and fuses the bottleneck features from two separate encoders that take different planar representations of the 360° panorama as inputs, i.e., the equirectangular image and the cube maps. CSMA-Net outperforms 10 state-of-the-art segmentation methods on the proposed 360° SOD benchmark, in which multiple fine-tuning and testing strategies are applied to widely used 360° datasets. Extensive experimental results illustrate the effectiveness and robustness of the proposed CSMA-Net.
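To make the idea of cascaded channel and spatial cross-weighting concrete, here is a deliberately simplified sketch in plain Python. It is not the CSMA module itself: the real module presumably uses learned layers, whereas this version uses parameter-free sigmoid gates, and the function names, the list-of-lists feature layout (channels by flattened positions), and the sum-based fusion are all assumptions made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_gate(feat):
    """One weight per channel: sigmoid of that channel's spatial mean."""
    return [sigmoid(sum(ch) / len(ch)) for ch in feat]

def spatial_gate(feat):
    """One weight per position: sigmoid of the mean across channels."""
    n_ch = len(feat)
    return [sigmoid(sum(col) / n_ch) for col in zip(*feat)]

def mutual_attention(feat_er, feat_cm):
    """Cross-weight two feature maps (channels x positions), then fuse them.

    feat_er / feat_cm stand in for the bottleneck features of the ER and
    cube-map encoders; both must have the same shape (an assumption here).
    """
    # Channel stage: each branch is reweighted by the OTHER branch's channel gates.
    g_er, g_cm = channel_gate(feat_er), channel_gate(feat_cm)
    er = [[g * v for v in ch] for g, ch in zip(g_cm, feat_er)]
    cm = [[g * v for v in ch] for g, ch in zip(g_er, feat_cm)]
    # Spatial stage: each branch is reweighted by the OTHER branch's spatial gates.
    s_er, s_cm = spatial_gate(er), spatial_gate(cm)
    er = [[s * v for s, v in zip(s_cm, ch)] for ch in er]
    cm = [[s * v for s, v in zip(s_er, ch)] for ch in cm]
    # Fusion by element-wise sum (a placeholder for whatever the model uses).
    return [[a + b for a, b in zip(ca, cb)] for ca, cb in zip(er, cm)]
```

For example, `mutual_attention([[1.0, 1.0]], [[0.0, 0.0]])` returns a fused map with the same shape as the inputs, one channel by two positions.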


Performance


Figure 2: Performance comparison between CSMA-Net and the SOTAs.


Figure 3: Visual results of CSMA-Net and SOTAs. See "Implementation" for the complete set of visual results.


Implementation

The source code is available at codes.

The pretrained models of our CSMA-Net can be downloaded at CSMA-Net-models.

The results of our CSMA-Net on 360-SOD and 360-SSOD can be downloaded at CSMA-Net-results.


Contact

E-mail address: [email protected]
