SwiftNet

The official PyTorch implementation of SwiftNet: Real-time Video Object Segmentation, which was accepted at CVPR 2021.

Requirements

  • Python >= 3.6
  • PyTorch 1.5
  • Numpy
  • Pillow
  • opencv-python
  • scipy
  • tqdm

Training

  • The training pipeline of SwiftNet is similar to that of STM, which can be found in our reproduced STM training code.

Inference

Usage

python eval.py -g 0 -y 17 -s val -D 'path to davis'

Performance

Performance on the DAVIS-17 val set.

backbone   J&F   J     F     FPS  weights
resnet-18  77.6  75.5  79.7  65   link
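As a reading aid for the table columns, here is a minimal sketch of how these scores relate, assuming the usual DAVIS conventions: J is the region similarity (Jaccard index, i.e. IoU, of predicted and ground-truth masks) and J&F is the mean of J and F. The function names `region_similarity` and `j_and_f` are hypothetical and are not taken from this repository; the boundary measure F is not reimplemented here.

```python
import numpy as np


def region_similarity(pred, gt):
    """Jaccard index (IoU) between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


def j_and_f(j_mean, f_mean):
    """Overall score as the plain average of the J and F means."""
    return (j_mean + f_mean) / 2.0


# Toy masks: 2 pixels overlap, 4 pixels are covered by either mask.
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt = np.array([[1, 0, 0], [0, 1, 1]])
iou = region_similarity(pred, gt)

# The table row above is consistent with this average.
overall = j_and_f(75.5, 79.7)
print(round(iou, 3), round(overall, 1))
```

With the values from the table, the average of J = 75.5 and F = 79.7 rounds to the reported J&F of 77.6.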

Note: FPS was measured on a single P100 GPU and excludes the time spent on image loading and evaluation.
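A timing loop of the kind this note describes might look like the sketch below: only the segmentation call is timed, while frame loading stays outside the timer. This is an illustration under stated assumptions, not the script used for the reported number; `measure_fps` and `segment_fn` are hypothetical names, and the stand-in workload is plain Python rather than the model.

```python
import time


def measure_fps(frames, segment_fn):
    """Return frames per second, counting only time spent inside segment_fn."""
    elapsed = 0.0
    count = 0
    for frame in frames:  # loading / iteration happens outside the timed region
        start = time.perf_counter()
        segment_fn(frame)
        # On a GPU one would call torch.cuda.synchronize() here before reading
        # the clock, since CUDA kernels run asynchronously.
        elapsed += time.perf_counter() - start
        count += 1
    return count / elapsed if elapsed > 0 else float("inf")


# Stand-in workload so the sketch runs without a model or a GPU.
fps = measure_fps(range(50), lambda frame: sum(range(1000)))
print(f"{fps:.1f} frames per second")
```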

Acknowledgement

This repository is partially based on the official STM repository.

Citation

If you find this repository helpful and want to cite SwiftNet in your own projects, please use the following citation info.

@inproceedings{wang2021swiftnet,
  title={SwiftNet: Real-time Video Object Segmentation},
  author={Wang, Haochen and Jiang, Xiaolong and Ren, Haibing and Hu, Yao and Bai, Song},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1296--1305},
  year={2021}
}

swiftnet's People

Contributors

haochenheheda, trasperj

swiftnet's Issues

About the motion model to generate the simulated images in pre-training

Thanks for sharing your amazing work! I have some questions about the pre-training stage.
In Sec. 4.2.1 of your paper, you mention that 'we maintain an implicit motion model to generate clips with length of 5.' This seems different from the pre-training of STM. What is the motion model, and how does it work?

Resnet18 weights

Hi, thank you for your paper! Your work is very nice.
I have a question: the link that refers to 'swiftnet_resnet18_old.pth' is empty. Could you please add the file?

Evaluation on Custom Dataset

Hello,

I am working on a project that requires a VOS model. Does the eval.py code work on custom datasets?

Thanks in advance.

About the key-value pairs stored in the memory

The work is very interesting and amazing.
I have a question about how many frames' key-value pairs are stored in the memory and used as reference.
STM stores the key-value pairs of all past frames (or of every 5th frame) as reference.
Does SwiftNet need only one key-value pair, updated only when a variation is detected?

Variation-Aware Trigger?

Hi, thank you for the code & paper.
I am looking through the code and cannot seem to find the implementation of the VAT. It appears to me that the memory scheduling is fixed:

to_memorize = [int(i) for i in np.arange(0, num_frames, step=Mem_every)]

Am I missing something?
Also, does y_t in the paper refer to the binarized "hard" mask or soft mask?
Thanks!
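For reference, the fixed schedule quoted in this issue can be reproduced in isolation as below. This is only a sketch of what that one line computes: the wrapper `fixed_memory_schedule` is a hypothetical name, and the values 12 and 5 are example inputs rather than settings taken from the repository.

```python
import numpy as np


def fixed_memory_schedule(num_frames, mem_every):
    """Indices of frames written to memory at a fixed interval."""
    return [int(i) for i in np.arange(0, num_frames, step=mem_every)]


# Every 5th frame of a 12-frame clip is memorized, starting at frame 0.
schedule = fixed_memory_schedule(12, 5)
print(schedule)  # [0, 5, 10]
```

A schedule like this depends only on the frame index, not on frame content, which is the difference from a variation-driven trigger that the question points at.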
