[ECCV 2022] Domain Adaptive Video Segmentation via Temporal Pseudo Supervision

Highlights

  • TPS trains 3x faster than the previous DA-VSN while achieving state-of-the-art results on the domain adaptive video segmentation task.

Abstract

Video semantic segmentation has achieved great progress under the supervision of large amounts of labelled training data. However, domain adaptive video segmentation, which can mitigate data labelling constraint by adapting from a labelled source domain toward an unlabelled target domain, is largely neglected. We design temporal pseudo supervision (TPS), a simple and effective method that explores the idea of consistency training for learning effective representations from unlabelled target videos. Unlike traditional consistency training that builds consistency in spatial space, we explore consistency training in spatiotemporal space by enforcing model consistency across augmented video frames which helps learn from more diverse target data. Specifically, we design cross-frame pseudo labelling to provide pseudo supervision from previous video frames while learning from the augmented current video frames. The cross-frame pseudo labelling encourages the network to produce high-certainty predictions which facilitates consistency training with cross-frame augmentation effectively. Extensive experiments over multiple public datasets show that TPS is simpler to implement, much more stable to train, and achieves superior video segmentation accuracy as compared with the state-of-the-art.
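
The cross-frame pseudo labelling described above can be illustrated with a toy sketch. It is not the repository's implementation. It assumes integer optical flow stored as (dy, dx) offsets from each current-frame pixel to its location in the previous frame, and a photometric-only augmentation, so pixel positions are unchanged. The names `warp_previous_prediction`, `pseudo_labels`, `pseudo_supervision_loss` and `IGNORE` are hypothetical. The previous frame's prediction is warped to the current frame, converted to hard labels, and used as the cross-entropy target for the prediction on the augmented current frame.

```python
import math

IGNORE = -1  # hypothetical marker for pixels without a valid pseudo label


def warp_previous_prediction(prev_probs, flow):
    """Align previous-frame class probabilities with the current frame.

    prev_probs[y][x] is a list of class probabilities for the previous frame.
    flow[y][x] is an integer (dy, dx) offset from a current-frame pixel to its
    location in the previous frame (an assumed convention for this sketch).
    Pixels whose source falls outside the frame are returned as None.
    """
    height, width = len(prev_probs), len(prev_probs[0])
    warped = []
    for y in range(height):
        row = []
        for x in range(width):
            dy, dx = flow[y][x]
            sy, sx = y + dy, x + dx
            inside = 0 <= sy < height and 0 <= sx < width
            row.append(prev_probs[sy][sx] if inside else None)
        warped.append(row)
    return warped


def pseudo_labels(warped_probs):
    """Take the most likely class per pixel; IGNORE where warping failed."""
    return [
        [IGNORE if p is None else max(range(len(p)), key=p.__getitem__) for p in row]
        for row in warped_probs
    ]


def pseudo_supervision_loss(cur_probs_aug, labels):
    """Mean cross-entropy of the augmented current-frame prediction against
    the cross-frame pseudo labels, skipping ignored pixels."""
    losses = [
        -math.log(max(probs[label], 1e-12))
        for prob_row, label_row in zip(cur_probs_aug, labels)
        for probs, label in zip(prob_row, label_row)
        if label != IGNORE
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Toy 1x2 frame with two classes.
prev_probs = [[[0.9, 0.1], [0.2, 0.8]]]
flow = [[(0, 1), (0, 1)]]  # content moved one pixel to the left between frames
labels = pseudo_labels(warp_previous_prediction(prev_probs, flow))
cur_probs_aug = [[[0.3, 0.7], [0.5, 0.5]]]  # prediction on the augmented current frame
loss = pseudo_supervision_loss(cur_probs_aug, labels)
```

In this toy example the left pixel receives label 1 from its neighbour in the previous frame, while the right pixel has no valid source and is skipped, so the loss reduces to -log(0.7).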

Main Results

SYNTHIA-Seq => Cityscapes-Seq

Methods   road   side.  buil.  pole   light  sign   vege.  sky    per.   rider  car    mIoU
Source    56.3   26.6   75.6   25.5   5.7    15.6   71.0   58.5   41.7   17.1   27.9   38.3
DA-VSN    89.4   31.0   77.4   26.1   9.1    20.4   75.4   74.6   42.9   16.1   82.4   49.5
PixMatch  90.2   49.9   75.1   23.1   17.4   34.2   67.1   49.9   55.8   14.0   84.3   51.0
TPS       91.2   53.7   74.9   24.6   17.9   39.3   68.1   59.7   57.2   20.3   84.5   53.8

VIPER => Cityscapes-Seq

Methods   road   side.  buil.  fence  light  sign   vege.  terr.  sky    per.   car    truck  bus    motor  bike   mIoU
Source    56.7   18.7   78.7   6.0    22.0   15.6   81.6   18.3   80.4   59.9   66.3   4.5    16.8   20.4   10.3   37.1
PixMatch  79.4   26.1   84.6   16.6   28.7   23.0   85.0   30.1   83.7   58.6   75.8   34.2   45.7   16.6   12.4   46.7
DA-VSN    86.8   36.7   83.5   22.9   30.2   27.7   83.6   26.7   80.3   60.0   79.1   20.3   47.2   21.2   11.4   47.8
TPS       82.4   36.9   79.5   9.0    26.3   29.4   78.5   28.2   81.8   61.2   80.2   39.8   40.3   28.5   31.7   48.9

Note: PixMatch is reproduced by replacing its image segmentation backbone with a video segmentation one.
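
As a reading aid for the tables above, the sketch below recomputes the mIoU column for the TPS rows. It assumes mIoU is the unweighted mean of the listed per-class IoU values, which the reported numbers are consistent with. The helper name `mean_iou` is hypothetical.

```python
# Per-class IoU (%) for the TPS rows, copied from the two tables above.
TPS_SYNTHIA = [91.2, 53.7, 74.9, 24.6, 17.9, 39.3, 68.1, 59.7, 57.2, 20.3, 84.5]
TPS_VIPER = [82.4, 36.9, 79.5, 9.0, 26.3, 29.4, 78.5, 28.2, 81.8, 61.2,
             80.2, 39.8, 40.3, 28.5, 31.7]


def mean_iou(class_ious):
    """Average the per-class IoU values, rounded to one decimal like the tables."""
    return round(sum(class_ious) / len(class_ious), 1)


synthia_miou = mean_iou(TPS_SYNTHIA)  # 53.8, matching the SYNTHIA-Seq table
viper_miou = mean_iou(TPS_VIPER)      # 48.9, matching the VIPER table
```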

Installation

  1. Create a conda environment
     conda create -n TPS python=3.6
     conda activate TPS
     conda install -c menpo opencv
     pip install torch==1.2.0 torchvision==0.4.0
  2. Clone the ADVENT repo
     git clone https://github.com/valeoai/ADVENT
     pip install -e ./ADVENT
  3. Clone the current repo
     git clone https://github.com/xing0047/TPS.git
     pip install -r ./TPS/requirements.txt
  4. Build and install the resample2d dependency
     python ./TPS/tps/utils/resample2d_package/setup.py build
     python ./TPS/tps/utils/resample2d_package/setup.py install
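
After the installation steps, a short script can confirm that the packages import and that the pinned versions match. This is only a sketch: the import names `cv2` (OpenCV) and `advent` (the ADVENT package) are assumptions about how the installed packages are exposed, and `installed_version`, `environment_report` and `version_mismatches` are hypothetical helpers, not part of the repository.

```python
import importlib
import sys

# Versions pinned by the pip command in the installation steps.
PINNED = {"torch": "1.2.0", "torchvision": "0.4.0"}


def installed_version(module_name):
    """Return the module's __version__ string, "unknown" if it has none,
    or None when the import fails for any reason."""
    try:
        module = importlib.import_module(module_name)
    except Exception:  # broken installs can raise more than ImportError
        return None
    return getattr(module, "__version__", "unknown")


def environment_report(module_names=("torch", "torchvision", "cv2", "advent")):
    """Map each module name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in module_names}


def version_mismatches(report, pinned=PINNED):
    """Return {name: (found, wanted)} for pinned modules whose version differs."""
    return {
        name: (report.get(name), wanted)
        for name, wanted in pinned.items()
        if not str(report.get(name)).startswith(wanted)
    }


if __name__ == "__main__":
    report = environment_report()
    print("python", sys.version.split()[0])
    for name, version in report.items():
        print(name, version or "MISSING")
    for name, (found, wanted) in version_mismatches(report).items():
        print("expected {} {}, found {}".format(name, wanted, found))
```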

Data Preparation

  1. Cityscapes-Seq
     TPS/data/Cityscapes/
     TPS/data/Cityscapes/leftImg8bit_sequence/
     TPS/data/Cityscapes/gtFine/
  2. VIPER
     TPS/data/Viper/
     TPS/data/Viper/train/img/
     TPS/data/Viper/train/cls/
  3. Synthia-Seq
     TPS/data/SynthiaSeq/
     TPS/data/SynthiaSeq/SEQS-04-DAWN/
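
The layout above can be checked before training with a small script. This is a sketch that assumes it is run from the directory containing the TPS checkout. It only tests that the listed directories exist, not their contents, and `missing_dirs` is a hypothetical helper.

```python
from pathlib import Path

# Directories listed in the Data Preparation section.
EXPECTED_DIRS = [
    "TPS/data/Cityscapes/leftImg8bit_sequence",
    "TPS/data/Cityscapes/gtFine",
    "TPS/data/Viper/train/img",
    "TPS/data/Viper/train/cls",
    "TPS/data/SynthiaSeq/SEQS-04-DAWN",
]


def missing_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    for rel in missing:
        print("missing:", rel)
    print("layout OK" if not missing else "{} directories missing".format(len(missing)))
```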

Pretrained Models

Download the pretrained models here and put them under pretrained_models.

Optical Flow Estimation

For quick preparation, please download the estimated optical flow of all datasets here.

Train and Test

  • Train
  cd tps/scripts
  CUDA_VISIBLE_DEVICES=0 python train.py --cfg configs/tps_syn2city.yml
  CUDA_VISIBLE_DEVICES=0 python train.py --cfg configs/tps_viper2city.yml
  • Test (may run in parallel with Train)
  cd tps/scripts
  CUDA_VISIBLE_DEVICES=1 python test.py --cfg configs/tps_syn2city.yml
  CUDA_VISIBLE_DEVICES=1 python test.py --cfg configs/tps_viper2city.yml
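
The commands above can also be launched from Python, for example to start training and testing on separate GPUs from one script. The sketch below mirrors the shell commands and assumes it is run from the repository root, so that `tps/scripts` exists. The helpers `build_command` and `launch` are hypothetical and not part of the repository.

```python
import os
import subprocess


def build_command(script, config, gpu):
    """Return (argv, env) mirroring
    `CUDA_VISIBLE_DEVICES=<gpu> python <script> --cfg <config>`."""
    argv = ["python", script, "--cfg", config]
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    return argv, env


def launch(script, config, gpu, cwd="tps/scripts"):
    """Start the script without waiting, so two runs can overlap."""
    argv, env = build_command(script, config, gpu)
    return subprocess.Popen(argv, env=env, cwd=cwd)


# Building the command has no side effects; launch() actually starts the process.
argv, env = build_command("train.py", "configs/tps_syn2city.yml", gpu=0)
```

Calling `launch("train.py", "configs/tps_syn2city.yml", gpu=0)` followed by `launch("test.py", "configs/tps_syn2city.yml", gpu=1)` would start both processes; each returned `Popen` object can then be waited on with `.wait()`.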

Acknowledgement

This codebase borrows heavily from DA-VSN.

Contact

If you have any questions, feel free to contact: [email protected] or [email protected].

Contributors

xing0047, dayan-guan
