matnet's Introduction

Motion-Attentive Transition for Zero-Shot Video Object Segmentation

UPDATES:

  • [2021/04/17] Our MATNet achieves state-of-the-art results (64.2 Mean J) on the MoCA dataset, as reported in "Self-supervised Video Object Segmentation by Motion Grouping" by Charig Yang, Hala Lamdouar, Erika Lu, Andrew Zisserman, and Weidi Xie. Thanks to Charig Yang for providing the segmentation results (Google Drive).
  • [2020/06/15] Update results for DAVIS-17 test-dev set!
  • [2020/03/04] Update results for DAVIS-17 validation set!
  • [2019/11/17] Codes released!

This is a PyTorch implementation of our MATNet for unsupervised video object segmentation.

Motion-Attentive Transition for Zero-Shot Video Object Segmentation. [Arxiv] [TIP]

Prerequisites

The training and testing experiments are conducted with PyTorch 1.0.1 on a single GeForce RTX 2080 Ti GPU with 11 GB of memory.

Other minor Python modules can be installed by running

pip install -r requirements.txt

Train

Clone

git clone --recursive https://github.com/tfzhou/MATNet.git

Download Datasets

In the paper, we use the following two publicly available datasets for training. Here are some steps to prepare the data:

  • DAVIS-17: we use all the data in the train subset of DAVIS-16. However, please download DAVIS-17 so the paths match the code; it will automatically select the DAVIS-16 subset for training.

  • YoutubeVOS-2018: we sample the training data every 10 frames from YoutubeVOS-2018, using the 6 fps version of the dataset rather than the 30 fps one (see the sampling sketch after this list).

  • Create soft links:

    cd data; ln -s your/davis17/path DAVIS2017; ln -s your/youtubevos/path YouTubeVOS_2018;
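
As noted above, training uses every 10th frame of the 6 fps YouTubeVOS-2018 sequences. Below is a minimal sketch of that sampling; it is an illustration only, the actual loader may subsample internally, and the train/JPEGImages/<video_id> layout is the standard YouTubeVOS-2018 structure assumed here.

    # Sketch of the 10-frame sampling described above (illustration only; the
    # training loader may already do this internally).
    import glob
    import os

    root = 'data/YouTubeVOS_2018/train/JPEGImages'  # assumed standard layout
    for video in sorted(os.listdir(root)):
        frames = sorted(glob.glob(os.path.join(root, video, '*.jpg')))
        sampled = frames[::10]  # keep every 10th frame of the 6 fps sequence
        print(video, len(frames), '->', len(sampled))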

Prepare Edge Annotations

I have provided MATLAB scripts to generate edge annotations from the masks. Please run data/run_davis2017.m and data/run_youtube.m.
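
If MATLAB is not available, edges can also be derived directly from the binary masks in Python. The sketch below uses a morphological gradient and is only an approximation of the MATLAB scripts; the edge width and the Edges output folder are assumptions, not the repository's exact conventions.

    # Rough Python alternative to run_davis2017.m (a sketch, not the exact
    # MATLAB pipeline; edge width and output folder are assumptions).
    import glob
    import os

    import cv2
    import numpy as np

    def mask_to_edge(mask_path, edge_path, width=1):
        mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
        binary = (mask > 0).astype(np.uint8)
        kernel = np.ones((2 * width + 1, 2 * width + 1), np.uint8)
        # Morphological gradient (dilation minus erosion) marks the object boundary.
        edge = cv2.morphologyEx(binary, cv2.MORPH_GRADIENT, kernel)
        cv2.imwrite(edge_path, edge * 255)

    for mask_path in glob.glob('data/DAVIS2017/Annotations/480p/*/*.png'):
        edge_path = mask_path.replace('Annotations', 'Edges')  # hypothetical output dir
        os.makedirs(os.path.dirname(edge_path), exist_ok=True)
        mask_to_edge(mask_path, edge_path)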

Prepare HED Results

I have provided PyTorch code to generate HED results for the two datasets (see 3rdparty/pytorch-hed). Please run run_davis.py and run_youtube.py.

The code is borrowed from https://github.com/sniklaus/pytorch-hed.
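
For reference, a single-image sketch of this step is shown below. The estimate() helper, its BGR input layout, and the output shape are assumptions based on the upstream sniklaus/pytorch-hed run.py; check the scripts in 3rdparty/pytorch-hed for the interface actually used here.

    # Rough single-frame HED sketch (interface is an assumption based on the
    # upstream pytorch-hed run.py; run from inside 3rdparty/pytorch-hed or put
    # that folder on PYTHONPATH).
    import numpy as np
    import torch
    from PIL import Image

    from run import estimate  # provided by 3rdparty/pytorch-hed

    image = np.array(Image.open('frame.jpg'))[:, :, ::-1].copy()  # RGB -> BGR, contiguous
    tensor = torch.from_numpy(image.transpose(2, 0, 1).astype(np.float32) / 255.0)
    with torch.no_grad():
        edge = estimate(tensor)  # 1 x H x W tensor of edge probabilities in [0, 1]
    Image.fromarray((edge.clamp(0.0, 1.0).numpy()[0] * 255.0).astype(np.uint8)).save('frame_hed.png')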

Prepare Optical Flow

I have provided PyTorch code to generate optical flow for the two datasets (see 3rdparty/pytorch-pwc). Please run run_davis_flow.py and run_youtubevos_flow.py.

The code is borrowed from https://github.com/sniklaus/pytorch-pwc. Please follow its setup section to install CuPy.

Warning: the optical flow results for Youtube-VOS take up more than 30 GB in total.
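
For reference, here is a sketch of computing frame-to-frame flow for one DAVIS sequence. The estimate() helper and its two-tensor signature are assumptions based on the upstream sniklaus/pytorch-pwc run.py, and the .npy output naming is purely illustrative; run_davis_flow.py and run_youtubevos_flow.py define the actual storage format.

    # Rough flow sketch for one sequence (interface is an assumption based on
    # the upstream pytorch-pwc run.py; run from inside 3rdparty/pytorch-pwc or
    # put that folder on PYTHONPATH).
    import glob

    import numpy as np
    import torch
    from PIL import Image

    from run import estimate  # provided by 3rdparty/pytorch-pwc

    def to_tensor(path):
        # Load a frame as a 3 x H x W float tensor in [0, 1], BGR channel order.
        image = np.array(Image.open(path))[:, :, ::-1].copy()
        return torch.from_numpy(image.transpose(2, 0, 1).astype(np.float32) / 255.0)

    frames = sorted(glob.glob('data/DAVIS2017/JPEGImages/480p/bear/*.jpg'))
    for first, second in zip(frames[:-1], frames[1:]):
        flow = estimate(to_tensor(first), to_tensor(second))  # 2 x H x W (dx, dy)
        np.save(first.replace('.jpg', '_flow.npy'), flow.numpy())  # illustrative naming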

Train

Once all data is prepared, please run python train_MATNet.py for training.

Test

  1. Run python test_MATNet.py to obtain the saliency results on DAVIS-16 val set.
  2. Run python apply_densecrf_davis.py for binary segmentation results (a rough sketch of this CRF step is shown below).
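
The sketch below illustrates the CRF refinement in step 2. It uses pydensecrf with illustrative parameters and file names only; apply_densecrf_davis.py defines the actual settings.

    # Rough dense-CRF sketch for one frame (parameters and file names are
    # illustrative; see apply_densecrf_davis.py for the actual ones).
    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_softmax
    from PIL import Image

    image = np.ascontiguousarray(np.array(Image.open('frame.jpg')))               # H x W x 3, uint8
    prob = np.array(Image.open('frame_saliency.png'), dtype=np.float32) / 255.0   # H x W in [0, 1]

    softmax = np.stack([1.0 - prob, prob])                      # 2 x H x W: background, foreground
    crf = dcrf.DenseCRF2D(image.shape[1], image.shape[0], 2)
    crf.setUnaryEnergy(unary_from_softmax(softmax))
    crf.addPairwiseGaussian(sxy=3, compat=3)                         # smoothness term
    crf.addPairwiseBilateral(sxy=60, srgb=5, rgbim=image, compat=5)  # appearance term

    labels = np.argmax(crf.inference(5), axis=0).reshape(prob.shape)
    Image.fromarray((labels * 255).astype(np.uint8)).save('frame_binary.png')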

Segmentation Results

  1. The segmentation results on DAVIS-16 and Youtube-objects can be downloaded from Google Drive.
  2. The segmentation results on DAVIS-17 val can be downloaded from Google Drive. We achieved 58.6 in terms of Mean J&F.
  3. The segmentation results on DAVIS-17 test-dev can be downloaded from Google Drive. We achieved 59.8 in terms of Mean J&F. The method also achieved second place in the DAVIS 2020 unsupervised object segmentation challenge. Please refer to the paper for more details of our challenge solution.

Pretrained Models

The pre-trained model can be downloaded from Google Drive.

Citation

If you find MATNet useful for your research, please consider citing the following papers:

@inproceedings{zhou2020motion,
  title={Motion-Attentive Transition for Zero-Shot Video Object Segmentation},
  author={Zhou, Tianfei and Wang, Shunzhou and Zhou, Yi and Yao, Yazhou and Li, Jianwu and Shao, Ling},
  booktitle={Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI)},
  year={2020},
  pages={13066--13073}
}

@article{zhou2020matnet,
  title={MATNet: Motion-Attentive Transition Network for Zero-Shot Video Object Segmentation},
  author={Zhou, Tianfei and Li, Jianwu and Wang, Shunzhou and Tao, Ran and Shen, Jianbing},
  journal={IEEE Transactions on Image Processing},
  volume={29},
  pages={8326--8338},
  year={2020}
}

@inproceedings{zhou2021unsupervised,
  author = {Zhou, Tianfei and Li, Jianwu and Li, Xueyi and Shao, Ling},
  title = {Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation},
  booktitle = {CVPR},
  year = {2021}
}

matnet's Issues

About the maps under the Youtube-object link you provided

Hello, I downloaded the maps from the Youtube-object link you provided. They look visibly poor to the naked eye, and the scores I measured do not match the results in your paper. Did you upload the wrong maps? Please check. Thank you...

data list

Hi, in your paper you say you use DAVIS16 with 4K frames and YoutubeVOS with 12K frames. However, I found that the DAVIS16 training set of 30 videos only has about 2K frames, while the DAVIS17 training set has about 4K frames. So may I confirm whether you used DAVIS16 or DAVIS17 for training? And for YoutubeVOS, even when sampled every 10 frames, it still has about 50K frames. May I know which videos you selected from YoutubeVOS for training? Or could you share your training list files with me? Thanks very much.

run_youtube.m

Thank you for your code. Do you have any python code to replace run_youtube.m?

Can't clone the 3rdparty code.

After I run the command:

git clone --recursive https://github.com/tfzhou/MATNet.git

the pytorch-pwc and pytorch-hed folders inside 3rdparty are empty.

How can I get this code, especially the file run_davis2017.py?

Thanks!

Other datasets

Thank you for your great work!
Can I train with my own dataset? If I have video data where no targets appear at the beginning and targets appear in the middle frames, can the model accurately identify when targets start to appear? Or will the model identify the foreground from just the beginning of the video (even if it's not the target I'm expecting)?

run_davis.py and run_youtube.py

Sorry, I cannot find the two files run_davis.py and run_youtube.py in your repo. Are you sure you've uploaded them? I have opened '/3rdparty', but it is empty and I cannot click into these folders...

Was the optical flow trained on the DAVIS dataset?

Hello, I would like to ask about the data preparation step: are the optical flow maps generated directly with pytorch-pwc weights trained on other optical flow datasets? Did you also train it on the DAVIS dataset, and if so, could you provide that pre-trained model?

Why are my results much worse than yours?

Dear author:
I have used the code you provided to train the model for UVOS. Slightly different from your setup, my run uses a more advanced optical flow network, RAFT, to generate the flow maps. However, my Jaccard is 0.762, much lower than your 0.8-odd. Other than RAFT, my configuration is fully consistent with what you published. Could you please tell me what is wrong with my settings? I look forward to your reply.

No module named run

Hi, I appreciate you sharing this great repository.

However, I got a "No module named run" error when I tried to execute python run_davis.py in the MATNet/3rdparty directory.
I cannot find any module or library named "run".
Any help would be appreciated.
Thanks

Jiseong Heo

Question about the number of detected objects

Hi Tianfei, can the MATNet network produce segmentations of multiple moving objects from a single video? Thank you.

Loader Issues

Hi, I was training the code and bumped into this issue, and I am not sure why it is happening.

In davis2017_youtubevos_ehem.py line 74 the foreground is set to 255, but the random augmentations in custome_transforms.py line 47 check if ((tmp == 0) | (tmp == 1)): before using nearest interpolation. Should this be 255 instead of 1?

Because when I debugged the loader, the output ground-truth mask does not contain only 0/255 values.

Thanks for your help in advance
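
For reference, one possible workaround sketch (an assumption, not a fix confirmed by the author) is to keep the loaded mask in {0, 1} so the binary check in custome_transforms.py selects nearest-neighbour interpolation:

    # Possible workaround sketch (an assumption, not the author's confirmed
    # fix): keep the ground-truth mask binary in {0, 1} inside the loader so
    # the ((tmp == 0) | (tmp == 1)) check picks nearest interpolation, and
    # rescale to 255 only when writing results to disk.
    import numpy as np
    from PIL import Image

    mask_path = 'data/DAVIS2017/Annotations/480p/bear/00000.png'  # hypothetical example
    mask = np.array(Image.open(mask_path), dtype=np.float32)
    mask = (mask > 0).astype(np.float32)  # foreground -> 1 instead of 255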

How to evaluate FBMS

@tfzhou Thank you for sharing your excellent work. I was wondering how to test on the FBMS dataset, since the images and ground truth are not in one-to-one correspondence.

Question about multi-object training

Hello, I see you recently updated the results on DAVIS-17. Could you release the code for multi-object training and testing? Thank you very much.

about FBMS dataset

Hi, would you provide your FBMS dataset? Because it contains multiple objects, I would like to know what operations or additional processing you apply to FBMS. It would be nice if you could provide the test data you are using. Thanks!!!

About Optical Flow

The first frame does not have an optical flow image; how do you deal with this?

Missing Files

Hello, thanks for your code.

When I try to train MATNet after performing all the steps for dataset preparation, I find that the code is still looking for these two JSON files. Can you please share them?
"train-train-meta.json"
"train-val-meta.json"

Thanks

Youtube-object

Hello. For Youtube-object, is the optical flow you use computed from the high-resolution images of the detection dataset? The resolution of the results you provide matches the detection dataset's resolution. (Is there a difference compared with the low-resolution images in the Youtube-Objects mask dataset? The flow I generate there is basically all noise.) Also, is your optical flow computed on the detection dataset frames (00001-00002) or on the mask dataset frames (00001-00011)?
Using the DAVIS-based test code, my DAVIS-16 results match the paper (ignoring the first and last frames), but there is a huge discrepancy when testing on the Youtube-object dataset. Would you mind releasing the test code? (For example, the ground truth of car0001 contains frame 00201, but since 00201 has no optical flow, your released results do not include frame 201.)
In addition, I have emailed you some private questions and look forward to your reply.
