
tps's Introduction

[ECCV 2022] Domain Adaptive Video Segmentation via Temporal Pseudo Supervision

Highlights

  • TPS trains 3x faster than the previous DA-VSN while achieving state-of-the-art accuracy on domain adaptive video segmentation.

Abstract

Video semantic segmentation has achieved great progress under the supervision of large amounts of labelled training data. However, domain adaptive video segmentation, which can mitigate the data labelling constraint by adapting from a labelled source domain to an unlabelled target domain, is largely neglected. We design temporal pseudo supervision (TPS), a simple and effective method that explores the idea of consistency training for learning effective representations from unlabelled target videos. Unlike traditional consistency training that builds consistency in the spatial space, we explore consistency training in the spatiotemporal space by enforcing model consistency across augmented video frames, which helps the model learn from more diverse target data. Specifically, we design cross-frame pseudo labelling to provide pseudo supervision from previous video frames while learning from augmented current video frames. Cross-frame pseudo labelling encourages the network to produce high-certainty predictions, which facilitates effective consistency training with cross-frame augmentation. Extensive experiments over multiple public datasets show that TPS is simpler to implement, much more stable to train, and achieves superior video segmentation accuracy compared with the state of the art.
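For illustration, here is a minimal PyTorch-style sketch of the cross-frame pseudo-labelling step described above. It is a simplification under stated assumptions, not the repository's implementation: the real model (an ACCEL DeepLab-v2) takes frame pairs and flow jointly and returns several outputs, and the names warp_bilinear, weak_aug, and strong_aug are borrowed from or modelled on identifiers that appear in the issues below.

```python
import torch
import torch.nn.functional as F

def tps_target_step(model, warp_bilinear, frame_prev, frame_cur, flow,
                    weak_aug, strong_aug):
    """One illustrative TPS update on an unlabelled target clip.

    frame_prev, frame_cur: consecutive video frames, [B, 3, H, W]
    flow: optical flow from frame_prev to frame_cur, [B, 2, H, W]
    model: here, a single-frame segmenter returning logits [B, C, H, W]
    """
    # 1. Predict on the weakly augmented previous frame; no gradients,
    #    since it only provides pseudo supervision.
    with torch.no_grad():
        prob_prev = F.softmax(model(weak_aug(frame_prev)), dim=1)
        # 2. Warp the prediction to the current frame via optical flow,
        #    yielding cross-frame pseudo labels.
        prob_warp = warp_bilinear(prob_prev, flow)
        pseudo_label = prob_warp.argmax(dim=1)  # [B, H, W]

    # 3. Predict on the strongly augmented current frame and enforce
    #    consistency against the warped pseudo labels.
    logits_cur = model(strong_aug(frame_cur))
    return F.cross_entropy(logits_cur, pseudo_label)
```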

Main Results

SYNTHIA-Seq => Cityscapes-Seq

| Method   | road | side. | buil. | pole | light | sign | vege. | sky  | per. | rider | car  | mIoU |
|----------|------|-------|-------|------|-------|------|-------|------|------|-------|------|------|
| Source   | 56.3 | 26.6  | 75.6  | 25.5 | 5.7   | 15.6 | 71.0  | 58.5 | 41.7 | 17.1  | 27.9 | 38.3 |
| DA-VSN   | 89.4 | 31.0  | 77.4  | 26.1 | 9.1   | 20.4 | 75.4  | 74.6 | 42.9 | 16.1  | 82.4 | 49.5 |
| PixMatch | 90.2 | 49.9  | 75.1  | 23.1 | 17.4  | 34.2 | 67.1  | 49.9 | 55.8 | 14.0  | 84.3 | 51.0 |
| TPS      | 91.2 | 53.7  | 74.9  | 24.6 | 17.9  | 39.3 | 68.1  | 59.7 | 57.2 | 20.3  | 84.5 | 53.8 |

VIPER => Cityscapes-Seq

| Method   | road | side. | buil. | fence | light | sign | vege. | terr. | sky  | per. | car  | truck | bus  | motor | bike | mIoU |
|----------|------|-------|-------|-------|-------|------|-------|-------|------|------|------|-------|------|-------|------|------|
| Source   | 56.7 | 18.7  | 78.7  | 6.0   | 22.0  | 15.6 | 81.6  | 18.3  | 80.4 | 59.9 | 66.3 | 4.5   | 16.8 | 20.4  | 10.3 | 37.1 |
| PixMatch | 79.4 | 26.1  | 84.6  | 16.6  | 28.7  | 23.0 | 85.0  | 30.1  | 83.7 | 58.6 | 75.8 | 34.2  | 45.7 | 16.6  | 12.4 | 46.7 |
| DA-VSN   | 86.8 | 36.7  | 83.5  | 22.9  | 30.2  | 27.7 | 83.6  | 26.7  | 80.3 | 60.0 | 79.1 | 20.3  | 47.2 | 21.2  | 11.4 | 47.8 |
| TPS      | 82.4 | 36.9  | 79.5  | 9.0   | 26.3  | 29.4 | 78.5  | 28.2  | 81.8 | 61.2 | 80.2 | 39.8  | 40.3 | 28.5  | 31.7 | 48.9 |

Note: PixMatch is reproduced by replacing its image segmentation backbone with a video segmentation one.

Installation

  1. Create the conda environment:
conda create -n TPS python=3.6
conda activate TPS
conda install -c menpo opencv
pip install torch==1.2.0 torchvision==0.4.0
  2. Clone the ADVENT repo:
git clone https://github.com/valeoai/ADVENT
pip install -e ./ADVENT
  3. Clone the current repo:
git clone https://github.com/xing0047/TPS.git
pip install -r ./TPS/requirements.txt
  4. Build the resample2d dependency:
python ./TPS/tps/utils/resample2d_package/setup.py build
python ./TPS/tps/utils/resample2d_package/setup.py install

Note: if gcc reports "resample2d_cuda.cc: No such file or directory", run the two setup.py commands from inside TPS/tps/utils/resample2d_package instead (see Issues below).

Data Preparation

  1. Cityscapes-Seq
TPS/data/Cityscapes/
TPS/data/Cityscapes/leftImg8bit_sequence/
TPS/data/Cityscapes/gtFine/
  2. VIPER
TPS/data/Viper/
TPS/data/Viper/train/img/
TPS/data/Viper/train/cls/
  3. Synthia-Seq
TPS/data/SynthiaSeq/
TPS/data/SynthiaSeq/SEQS-04-DAWN/
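Before training, it can help to verify that the layout above is in place. A small sketch (a hypothetical helper, not part of the repo):

```python
from pathlib import Path

# Expected dataset roots, as listed above.
EXPECTED = [
    "data/Cityscapes/leftImg8bit_sequence",
    "data/Cityscapes/gtFine",
    "data/Viper/train/img",
    "data/Viper/train/cls",
    "data/SynthiaSeq/SEQS-04-DAWN",
]

def check_layout(tps_root="TPS"):
    for rel in EXPECTED:
        path = Path(tps_root) / rel
        print(f"{'ok' if path.is_dir() else 'MISSING':7s} {path}")

check_layout()
```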

Pretrained Models

Download here and put them under pretrained_models.

Optical Flow Estimation

For quick preparation, please download the estimated optical flow of all datasets here.
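If the downloaded flows follow the common Middlebury .flo format (an assumption; the repository does not document the storage format), they can be read like this:

```python
import numpy as np

def read_flo(path):
    """Read a Middlebury .flo optical flow file into an (H, W, 2) array."""
    with open(path, "rb") as f:
        magic = np.fromfile(f, np.float32, count=1)[0]
        assert magic == 202021.25, "invalid .flo file"
        w = int(np.fromfile(f, np.int32, count=1)[0])
        h = int(np.fromfile(f, np.int32, count=1)[0])
        data = np.fromfile(f, np.float32, count=2 * w * h)
    return data.reshape(h, w, 2)  # channels: (u, v) displacement
```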

Train and Test

  • Train
  cd tps/scripts
  CUDA_VISIBLE_DEVICES=0 python train.py --cfg configs/tps_syn2city.yml
  CUDA_VISIBLE_DEVICES=0 python train.py --cfg configs/tps_viper2city.yml
  • Test (may run in parallel with Train)
  cd tps/scripts
  CUDA_VISIBLE_DEVICES=1 python test.py --cfg configs/tps_syn2city.yml
  CUDA_VISIBLE_DEVICES=1 python test.py --cfg configs/tps_viper2city.yml

Acknowledgement

This codebase borrows heavily from DA-VSN.

Contact

If you have any questions, feel free to contact: [email protected] or [email protected].

tps's People

Contributors

dayan-guan, xing0047


tps's Issues

About the estimated optical flow

Hello @xing0047 , thank you for open-sourcing your work!

I initially tried to run optical flow estimation myself to produce the flow files needed for training, but it seemed complicated.
I would like to experiment with the pre-estimated optical flow first, but unfortunately I can't access your links:
https://drive.google.com/file/d/18q6KH-beoBp5jSr1Pl1lMiEcb2te2vxq/view?usp=sharing
https://drive.google.com/file/d/1aOeyBLECPSW_ujMBE9RXKjVhTbhw4L2O/view?usp=sharing
https://drive.google.com/file/d/193uZifde7WiuImwAgshkPTt1Z6zgE3z8/view?usp=sharing
https://drive.google.com/file/d/1USizndlUewVb8Eqh4SV6uNuLCEfV9vzU/view?usp=sharing
These links are all inaccessible. Could you provide a new access link?

RuntimeError: cuda runtime error (77) : an illegal memory access was encountered

I really enjoy reading your work!!
At the same time, I encountered a problem when running:
CUDA_VISIBLE_DEVICES=0 python train.py --cfg configs/tps_viper2city.yml

Have you ever encountered such a problem?
Could you please kindly tell me how to solve this problem? Thank you!
@xing0047 @Dayan-Guan

Error:
Traceback (most recent call last):
File "./tps/scripts/train.py", line 160, in <module>
main()
File "./tps/scripts/train.py", line 157, in main
train_domain_adaptation(model, source_loader, target_loader, cfg)
File "/home/customer/Desktop/ZZ/FMFSemi/TransferLearning/examples/domain_adaptation/video_seg/TPS/tps/domain_adaptation/train_video_UDA.py", line 30, in train_domain_adaptation
train_TPS(model, source_loader, target_loader, cfg)
File "/home/customer/Desktop/ZZ/FMFSemi/TransferLearning/examples/domain_adaptation/video_seg/TPS/tps/domain_adaptation/train_video_UDA.py", line 179, in train_TPS
src_pred_aux, src_pred, src_pred_cf_aux, src_pred_cf, src_pred_kf_aux, src_pred_kf = model(src_img_cf.cuda(device), src_img_kf.cuda(device), src_flow, device)
File "/opt/software/anaconda3/envs/TPS/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/home/customer/Desktop/ZZ/FMFSemi/TransferLearning/examples/domain_adaptation/video_seg/TPS/tps/model/accel_deeplabv2.py", line 153, in forward
pred_aux = self.sf_layer(torch.cat((cf_aux, self.warp_bilinear(kf_aux, flow_cf)), dim=1))
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/THCGeneral.cpp:371

Compatible with Higher CUDA Versions?

Hi @xing0047 , thanks for open-sourcing your work!

Is the current TPS codebase compatible with higher CUDA versions (for newer PyTorch versions), for example, 11.3?

The current configuration requires torch==1.2.0 torchvision==0.4.0, which is only viable for lower CUDA versions like CUDA 10.0 and 9.2.

I have tried to compile the resample2d package under higher torch versions, but without success; it seems tightly tied to torch==1.2.0 and torchvision==0.4.0.

Looking forward to your help.

A problem when running the train.py

Hi,

Thanks for your great work!

I met a problem when running train.py:

CUDA_VISIBLE_DEVICES=0 python train.py --cfg configs/tps_syn2city.yml
Traceback (most recent call last):
File "train.py", line 11, in <module>
from tps.model.accel_deeplabv2 import get_accel_deeplab_v2
ModuleNotFoundError: No module named 'tps'

Could you please give any suggestions?

Also, the links for downloading the optical flow are not accessible; could you please update them?

Thank you
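A generic workaround, assuming the error is only a Python path issue (this is not from the thread): the tps package must be importable from the repository root, e.g.

cd tps/scripts
PYTHONPATH=../..:$PYTHONPATH CUDA_VISIBLE_DEVICES=0 python train.py --cfg configs/tps_syn2city.yml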

In train_video_UDA.py, line 251 (trg_prob_warp = warp_bilinear(trg_prob, trg_flow_warp)), the image flips but the optical flow does not

Hello!
I really enjoy reading your work!!
At the same time, I encountered a problem in train_video_UDA.py.

In line 251, trg_prob_warp = warp_bilinear(trg_prob, trg_flow_warp). The variable trg_prob is the segmentation prediction for trg_img_b_wk, and trg_img_b_wk is obtained from trg_img_b by flipping it with a certain probability, but trg_flow_warp does not seem to be flipped. Consider the following situation: if trg_img_b_wk is flipped while trg_flow_warp is not, then trg_prob_warp and trg_img_d_st do not seem semantically consistent, because the image is flipped but the optical flow is not, even though trg_pl is flipped in lines 256-258.
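For reference, a sketch of how the flow could be flipped consistently with the image, assuming flow tensors of shape [B, 2, H, W] with channel 0 the horizontal component (this is a suggestion, not the repository's code):

```python
import torch

def flip_flow_horizontal(flow):
    """Horizontally flip an optical flow field of shape [B, 2, H, W].

    Mirroring the image mirrors pixel positions, so the flow must be
    mirrored spatially AND its horizontal component negated.
    """
    flipped = torch.flip(flow, dims=[3])  # mirror along the width axis
    flipped[:, 0] = -flipped[:, 0]        # negate the u (horizontal) component
    return flipped
```

Applying such a flip to trg_flow_warp whenever trg_img_b is flipped would keep the image and the flow geometrically consistent.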

Cannot find 'resample2d_cuda.cc' while installing

In step 4 of installation:

resample2d dependency:
python ./TPS/tps/utils/resample2d_package/setup.py build
python ./TPS/tps/utils/resample2d_package/setup.py install

It reports the error:

gcc: error: resample2d_cuda.cc: No such file or directory

This can be fixed by entering the directory TPS/tps/utils/resample2d_package and running:

python setup.py build
python setup.py install

A question about the training time and GPU

Hi,
As the highlights claim, "TPS trains 3x faster than the previous DA-VSN while achieving state-of-the-art accuracy on domain adaptive video segmentation."

I want to ask: what type of GPU is used for TPS and DA-VSN training, how many GPUs are used, and how long does each method take to train?
Thank you!

import resample2d_cuda error occurred

ImportError: /xxxx/TPS/tps/utils/resample2d_package/resample2d_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E

How can I fix it?
The environment:
torch==1.2.0
torchvision==0.4.0
python 3.6
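A generic note, not from the thread: this undefined caffe2 symbol usually indicates that the compiled resample2d_cuda extension was built against a different torch build than the one currently installed. Rebuilding the extension in the active environment often resolves it:

cd TPS/tps/utils/resample2d_package
rm -rf build *.so
python setup.py build
python setup.py install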

Question about the dataset SEQS-04-DAWN

A very nice job! But I encountered a problem with the data: the SEQS-04-DAWN folder I downloaded has two subfolders, RGB and GT, and the RGB folder has two views, Stereo_Right and Stereo_Left.

The folders of both views contain four subfolders: Omni_B, Omni_F, Omni_L, and Omni_R. Which of these files are used?

In addition, each folder only contains 850 images, which does not seem to correspond to the 8,000 frames mentioned in the paper. I would like to ask for your advice, thank you very much!
