matnet's Introduction

Motion-Attentive Transition for Zero-Shot Video Object Segmentation

UPDATES:

  • [2021/04/17] Our MATNet achieves state-of-the-art results (64.2 Mean J) on the MoCA dataset, as reported in "Self-supervised Video Object Segmentation by Motion Grouping" by Charig Yang, Hala Lamdouar, Erika Lu, Andrew Zisserman, and Weidi Xie. Thanks to Charig Yang for providing the segmentation results (Google Drive).
  • [2020/06/15] Update results for DAVIS-17 test-dev set!
  • [2020/03/04] Update results for DAVIS-17 validation set!
  • [2019/11/17] Codes released!

This is a PyTorch implementation of our MATNet for unsupervised video object segmentation.

Motion-Attentive Transition for Zero-Shot Video Object Segmentation. [Arxiv] [TIP]

Prerequisites

The training and testing experiments are conducted with PyTorch 1.0.1 on a single GeForce RTX 2080 Ti GPU with 11 GB of memory.

Other minor Python modules can be installed by running

pip install -r requirements.txt

Train

Clone

git clone --recursive https://github.com/tfzhou/MATNet.git

Download Datasets

In the paper, we use the following two publicly available datasets for training. Here are some steps to prepare the data:

  • DAVIS-17: we use all the data in the train subset of DAVIS-16. However, please download DAVIS-17 so the paths match the code; it will automatically select the DAVIS-16 subset for training.

  • YoutubeVOS-2018: we sample the training data every 10 frames from YoutubeVOS-2018, using the 6 fps version of the dataset rather than the 30 fps one (see the sampling sketch after this list).

  • Create soft links:

    cd data; ln -s your/davis17/path DAVIS2017; ln -s your/youtubevos/path YouTubeVOS_2018;
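
As noted above, training uses every 10th frame of the 6 fps YouTubeVOS-2018 sequences. Below is a minimal sketch of that sampling; it is an illustration only, the actual loader may subsample internally, and the train/JPEGImages/<video_id> layout is the standard YouTubeVOS-2018 structure assumed here.

    # Sketch of the 10-frame sampling described above (illustration only; the
    # training loader may already do this internally).
    import glob
    import os

    root = 'data/YouTubeVOS_2018/train/JPEGImages'  # assumed standard layout
    for video in sorted(os.listdir(root)):
        frames = sorted(glob.glob(os.path.join(root, video, '*.jpg')))
        sampled = frames[::10]  # keep every 10th frame of the 6 fps sequence
        print(video, len(frames), '->', len(sampled))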

Prepare Edge Annotations

I have provided MATLAB scripts to generate edge annotations from the masks. Please run data/run_davis2017.m and data/run_youtube.m.
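
If MATLAB is not available, edges can also be derived directly from the binary masks in Python. The sketch below uses a morphological gradient and is only an approximation of the MATLAB scripts; the edge width and the Edges output folder are assumptions, not the repository's exact conventions.

    # Rough Python alternative to run_davis2017.m (a sketch, not the exact
    # MATLAB pipeline; edge width and output folder are assumptions).
    import glob
    import os

    import cv2
    import numpy as np

    def mask_to_edge(mask_path, edge_path, width=1):
        mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
        binary = (mask > 0).astype(np.uint8)
        kernel = np.ones((2 * width + 1, 2 * width + 1), np.uint8)
        # Morphological gradient (dilation minus erosion) marks the object boundary.
        edge = cv2.morphologyEx(binary, cv2.MORPH_GRADIENT, kernel)
        cv2.imwrite(edge_path, edge * 255)

    for mask_path in glob.glob('data/DAVIS2017/Annotations/480p/*/*.png'):
        edge_path = mask_path.replace('Annotations', 'Edges')  # hypothetical output dir
        os.makedirs(os.path.dirname(edge_path), exist_ok=True)
        mask_to_edge(mask_path, edge_path)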

Prepare HED Results

I have provided PyTorch code to generate HED results for the two datasets (see 3rdparty/pytorch-hed). Please run run_davis.py and run_youtube.py.

The code is borrowed from https://github.com/sniklaus/pytorch-hed.
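
For reference, a single-image sketch of this step is shown below. The estimate() helper, its BGR input layout, and the output shape are assumptions based on the upstream sniklaus/pytorch-hed run.py; check the scripts in 3rdparty/pytorch-hed for the interface actually used here.

    # Rough single-frame HED sketch (interface is an assumption based on the
    # upstream pytorch-hed run.py; run from inside 3rdparty/pytorch-hed or put
    # that folder on PYTHONPATH).
    import numpy as np
    import torch
    from PIL import Image

    from run import estimate  # provided by 3rdparty/pytorch-hed

    image = np.array(Image.open('frame.jpg'))[:, :, ::-1].copy()  # RGB -> BGR, contiguous
    tensor = torch.from_numpy(image.transpose(2, 0, 1).astype(np.float32) / 255.0)
    with torch.no_grad():
        edge = estimate(tensor)  # 1 x H x W tensor of edge probabilities in [0, 1]
    Image.fromarray((edge.clamp(0.0, 1.0).numpy()[0] * 255.0).astype(np.uint8)).save('frame_hed.png')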

Prepare Optical Flow

I have provided PyTorch code to generate optical flow for the two datasets (see 3rdparty/pytorch-pwc). Please run run_davis_flow.py and run_youtubevos_flow.py.

The code is borrowed from https://github.com/sniklaus/pytorch-pwc. Please follow its setup section to install CuPy.

Warning: the optical flow results for Youtube-VOS take up more than 30 GB in total.
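
For reference, here is a sketch of computing frame-to-frame flow for one DAVIS sequence. The estimate() helper and its two-tensor signature are assumptions based on the upstream sniklaus/pytorch-pwc run.py, and the .npy output naming is purely illustrative; run_davis_flow.py and run_youtubevos_flow.py define the actual storage format.

    # Rough flow sketch for one sequence (interface is an assumption based on
    # the upstream pytorch-pwc run.py; run from inside 3rdparty/pytorch-pwc or
    # put that folder on PYTHONPATH).
    import glob

    import numpy as np
    import torch
    from PIL import Image

    from run import estimate  # provided by 3rdparty/pytorch-pwc

    def to_tensor(path):
        # Load a frame as a 3 x H x W float tensor in [0, 1], BGR channel order.
        image = np.array(Image.open(path))[:, :, ::-1].copy()
        return torch.from_numpy(image.transpose(2, 0, 1).astype(np.float32) / 255.0)

    frames = sorted(glob.glob('data/DAVIS2017/JPEGImages/480p/bear/*.jpg'))
    for first, second in zip(frames[:-1], frames[1:]):
        flow = estimate(to_tensor(first), to_tensor(second))  # 2 x H x W (dx, dy)
        np.save(first.replace('.jpg', '_flow.npy'), flow.numpy())  # illustrative naming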

Train

Once all data is prepared, please run python train_MATNet.py for training.

Test

  1. Run python test_MATNet.py to obtain the saliency results on DAVIS-16 val set.
  2. Run python apply_densecrf_davis.py for binary segmentation results (a rough sketch of this CRF step is shown below).
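
The sketch below illustrates the CRF refinement in step 2. It uses pydensecrf with illustrative parameters and file names only; apply_densecrf_davis.py defines the actual settings.

    # Rough dense-CRF sketch for one frame (parameters and file names are
    # illustrative; see apply_densecrf_davis.py for the actual ones).
    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_softmax
    from PIL import Image

    image = np.ascontiguousarray(np.array(Image.open('frame.jpg')))               # H x W x 3, uint8
    prob = np.array(Image.open('frame_saliency.png'), dtype=np.float32) / 255.0   # H x W in [0, 1]

    softmax = np.stack([1.0 - prob, prob])                      # 2 x H x W: background, foreground
    crf = dcrf.DenseCRF2D(image.shape[1], image.shape[0], 2)
    crf.setUnaryEnergy(unary_from_softmax(softmax))
    crf.addPairwiseGaussian(sxy=3, compat=3)                         # smoothness term
    crf.addPairwiseBilateral(sxy=60, srgb=5, rgbim=image, compat=5)  # appearance term

    labels = np.argmax(crf.inference(5), axis=0).reshape(prob.shape)
    Image.fromarray((labels * 255).astype(np.uint8)).save('frame_binary.png')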

Segmentation Results

  1. The segmentation results on DAVIS-16 and Youtube-objects can be downloaded from Google Drive.
  2. The segmentation results on DAVIS-17 val can be downloaded from Google Drive. We achieved 58.6 in terms of Mean J&F.
  3. The segmentation results on DAVIS-17 test-dev can be downloaded from Google Drive. We achieved 59.8 in terms of Mean J&F. The method also achieved second place in the DAVIS 2020 unsupervised object segmentation challenge. Please refer to the paper for more details of our challenge solution.

Pretrained Models

The pre-trained model can be downloaded from Google Drive.

Citation

If you find MATNet useful for your research, please consider citing the following papers:

@inproceedings{zhou2020motion,
  title={Motion-Attentive Transition for Zero-Shot Video Object Segmentation},
  author={Zhou, Tianfei and Wang, Shunzhou and Zhou, Yi and Yao, Yazhou and Li, Jianwu and Shao, Ling},
  booktitle={Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI)},
  year={2020},
  pages={13066--13073}
}

@article{zhou2020matnet,
  title={MATNet: Motion-Attentive Transition Network for Zero-Shot Video Object Segmentation},
  author={Zhou, Tianfei and Li, Jianwu and Wang, Shunzhou and Tao, Ran and Shen, Jianbing},
  journal={IEEE Transactions on Image Processing},
  volume={29},
  pages={8326--8338},
  year={2020}
}

@inproceedings{zhou2021unsupervised,
  author = {Zhou, Tianfei and Li, Jianwu and Li, Xueyi and Shao, Ling},
  title = {Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation},
  booktitle = {CVPR},
  year = {2021}
}

matnet's Issues

About the maps under the Youtube-object link you provided

Hello, I downloaded the maps from the Youtube-object link you provided. They look visibly poor to the naked eye, and the scores I measured do not match the results in your paper. Did you upload the wrong maps? Please check. Thank you...

data list

Hi, in your paper you say you use DAVIS16 with 4K frames and YoutubeVOS with 12K frames. However, I found that the DAVIS16 training set of 30 videos only has about 2K frames, while the DAVIS17 training set has about 4K frames. So may I confirm whether you used DAVIS16 or DAVIS17 for training? And for YoutubeVOS, even when sampled every 10 frames, it still has about 50K frames. May I know which videos you selected from YoutubeVOS for training? Or could you share your training list files with me? Thanks very much.

run_youtube.m

Thank you for your code. Do you have any python code to replace run_youtube.m?

Can't clone the 3rdparty code.

After I run the command:

git clone --recursive https://github.com/tfzhou/MATNet.git

the pytorch-pwc and pytorch-hed folders inside 3rdparty are empty.

How can I get this code, especially the file run_davis2017.py?

Thanks!

Other datasets

Thank you for your great work!
Can I train with my own dataset? If I have video data where no targets appear at the beginning and targets appear in the middle frames, can the model accurately identify when targets start to appear? Or will the model identify the foreground from just the beginning of the video (even if it's not the target I'm expecting)?

run_davis.py and run_youtube.py

Sorry, I cannot find the two files run_davis.py and run_youtube.py in your repo. Are you sure you've uploaded them? I have opened '/3rdparty', but it is empty and I cannot click into these folders...

Was the optical flow trained on the DAVIS dataset?

Hello, I would like to ask about the data preparation step: are the optical flow maps generated directly with pytorch-pwc weights trained on other optical flow datasets? Did you also train it on the DAVIS dataset, and if so, could you provide that pre-trained model?

Why are my results much worse than yours?

Dear author:
I have used the code you provided to train the model for UVOS. Slightly different from your setup, my run uses a more advanced optical flow network, RAFT, to generate the flow maps. However, my Jaccard is 0.762, much lower than your 0.8-odd. Other than RAFT, my configuration is fully consistent with what you published. Could you please tell me what is wrong with my settings? I look forward to your reply.

No module named run

Hi, I appreciate you sharing this great repository.

However, I got a "No module named run" error when I tried to execute python run_davis.py in the MATNet/3rdparty directory.
I cannot find any module or library named "run".
Any help would be appreciated.
Thanks

Jiseong Heo

Question about the number of detected objects

Hi Tianfei, can the MATNet network produce segmentations of multiple moving objects from a single video? Thank you.

Loader Issues

Hi, I was training the code and bumped into this issue, and I am not sure why it is happening.

In davis2017_youtubevos_ehem.py line 74 the foreground is set to 255, but the random augmentations in custome_transforms.py line 47 check if ((tmp == 0) | (tmp == 1)): before using nearest interpolation. Should this be 255 instead of 1?

Because when I debugged the loader, the output ground-truth mask does not contain only 0/255 values.

Thanks for your help in advance
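
For reference, one possible workaround sketch (an assumption, not a fix confirmed by the author) is to keep the loaded mask in {0, 1} so the binary check in custome_transforms.py selects nearest-neighbour interpolation:

    # Possible workaround sketch (an assumption, not the author's confirmed
    # fix): keep the ground-truth mask binary in {0, 1} inside the loader so
    # the ((tmp == 0) | (tmp == 1)) check picks nearest interpolation, and
    # rescale to 255 only when writing results to disk.
    import numpy as np
    from PIL import Image

    mask_path = 'data/DAVIS2017/Annotations/480p/bear/00000.png'  # hypothetical example
    mask = np.array(Image.open(mask_path), dtype=np.float32)
    mask = (mask > 0).astype(np.float32)  # foreground -> 1 instead of 255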

How to evaluate FBMS

@tfzhou Thank you for sharing your excellent work. I was wondering how to test on the FBMS dataset, since the images and ground truth are not in one-to-one correspondence.

Question about multi-object training

Hello, I see you recently updated the results on DAVIS-17. Could you release the code for multi-object training and testing? Thank you very much.

about FBMS dataset

Hi, would you provide your FBMS dataset? Because it contains multiple objects, I would like to know what operations or additional processing you apply to FBMS. It would be nice if you could provide the test data you are using. Thanks!!!

About Optical Flow

The first frame does not have an optical flow image; how do you deal with this?

Missing Files

Hello, thanks for your code.

When I try to train MATNet after performing all the steps for dataset preparation, I find that the code is still looking for these two JSON files. Can you please share them?
"train-train-meta.json"
"train-val-meta.json"

Thanks

Youtube-object

Hello. For Youtube-object, is the optical flow you use computed from the high-resolution images of the detection dataset? The resolution of the results you provide matches the detection dataset's resolution. (Is there a difference compared with the low-resolution images in the Youtube-Objects mask dataset? The flow I generate there is basically all noise.) Also, is your optical flow computed on the detection dataset frames (00001-00002) or on the mask dataset frames (00001-00011)?
Using the DAVIS-based test code, my DAVIS-16 results match the paper (ignoring the first and last frames), but there is a huge discrepancy when testing on the Youtube-object dataset. Would you mind releasing the test code? (For example, the ground truth of car0001 contains frame 00201, but since 00201 has no optical flow, your released results do not include frame 201.)
In addition, I have emailed you some private questions and look forward to your reply.
