ange233 / segment-and-track-anything

This project forked from z-x-yang/segment-and-track-anything

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.

License: GNU Affero General Public License v3.0

Shell 0.02% Python 3.97% Jupyter Notebook 96.01%

Segment and Track Anything (SAM-Track)

Segment and Track Anything is an open-source project for segmenting and tracking any objects in videos, either automatically or interactively. It relies on two primary algorithms: SAM (Segment Anything Model) for automatic/interactive key-frame segmentation, and AOT (Associating Objects with Transformers) for efficient multi-object tracking and propagation. In the SAM-Track pipeline, SAM dynamically detects and segments new objects, while AOT tracks all objects identified so far.
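The division of labor described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: `run_pipeline`, `fake_segment`, and `fake_track` are hypothetical names, objects are reduced to integer ids, and key frames are assumed to fall on every `sam_gap`-th frame starting at frame 0.

```python
from typing import Callable, Dict, List, Set

# Hypothetical signatures: each callable takes a frame and the ids known so far.
Segmenter = Callable[[str, Set[int]], Set[int]]
Tracker = Callable[[str, Set[int]], Set[int]]


def run_pipeline(frames: List[str], segment_new: Segmenter, track: Tracker,
                 sam_gap: int) -> Dict[int, Set[int]]:
    """Return the set of tracked object ids for every frame index."""
    tracked: Set[int] = set()
    results: Dict[int, Set[int]] = {}
    for idx, frame in enumerate(frames):
        if idx > 0:
            # The tracker (AOT's role) propagates known objects to this frame.
            tracked = track(frame, tracked)
        if idx % sam_gap == 0:
            # Key frame: the segmenter (SAM's role) may add new objects.
            tracked = tracked | segment_new(frame, tracked)
        results[idx] = set(tracked)
    return results


def fake_segment(frame: str, known: Set[int]) -> Set[int]:
    # Stand-in for SAM: pretend one new object appears on each key frame.
    return {len(known) + 1}


def fake_track(frame: str, known: Set[int]) -> Set[int]:
    # Stand-in for AOT: pretend every known object is still visible.
    return set(known)


demo = run_pipeline(["f0", "f1", "f2", "f3", "f4"], fake_segment, fake_track, sam_gap=2)
for idx in sorted(demo):
    print(idx, sorted(demo[idx]))
```

With these stand-ins, new ids appear only on frames 0, 2, and 4, and the tracker carries them through the frames in between.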

Demos

Segment-and-Track-Anything Versatile Demo

This video showcases the segmentation and tracking capabilities of SAM-Track in various scenarios, such as street views, AR, cells, animations, aerial shots, and more.

TODO

  • Colab notebook
  • 1.0-Version Interactive WebUI
    • We will develop a function that allows interactive modification of the mask for the first frame of the video based on the user's requirements. We demonstrate the interactive segmentation capabilities of Segment-and-Track-Anything in Demo1 and Demo2.

Demo1 showcases SAM-Track's ability to interactively segment and track individual objects. Here, the user directs SAM-Track to track a man playing street basketball.

Interactive Segment-and-Track-Anything Demo1

Demo2 showcases SAM-Track's ability to interactively add specified objects for tracking. Here, the user adds custom objects to be tracked on top of SAM-Track's segmentation of everything in the scene.

Interactive Segment-and-Track-Anything Demo2

Getting Started

Requirements

The Segment-Anything repository has been cloned and renamed as sam, and the aot-benchmark repository has been cloned and renamed as aot.

Please check the dependency requirements in SAM and AOT.

The implementation is tested under Python 3.9, with PyTorch 1.10 and torchvision 0.11. We recommend an equivalent or higher PyTorch version.
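As a quick sanity check, the installed versions can be compared against the tested ones. The sketch below is an assumption-laden helper, not part of the repository: `parse_version`, `meets_minimum`, and `check_environment` are hypothetical names, and it treats the tested versions as a baseline even though the project only states them as the tested configuration. It reads package metadata instead of importing torch, so it also runs when torch is missing.

```python
import re
import sys
from importlib import metadata

# Versions the README reports as tested; used here as a baseline (assumption).
TESTED = {"torch": "1.10", "torchvision": "0.11"}


def parse_version(text):
    """Extract (major, minor) from strings such as '1.10.0+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(installed, minimum):
    """Compare numerically, so that '1.10' counts as newer than '1.9'."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment():
    """Map each component to True if it is installed and at least the tested version."""
    report = {"python": sys.version_info[:2] >= (3, 9)}
    for package, minimum in TESTED.items():
        try:
            report[package] = meets_minimum(metadata.version(package), minimum)
        except metadata.PackageNotFoundError:
            report[package] = False  # not installed in this environment
    return report


print(check_environment())
```

Comparing versions as integer tuples avoids the string-comparison trap where "1.9" sorts after "1.10".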

To install SAM:

cd sam; pip install -e .

To install other libs:

pip install numpy opencv-python pycocotools matplotlib Pillow scikit-image

Installing Pytorch Correlation is recommended to accelerate AOT inference.

Model Preparation

Download the SAM model to ckpt. The default model is sam_vit_b_01ec64.pth.

Download the AOT model to ckpt. The default model is R50_DeAOTL_PRE_YTB_DAV.pth.

You can download the default weights using the command line as shown below.

bash script/download_ckpt.sh
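Before running the demo, it can help to confirm that both default weights are in place. The following is a minimal sketch, assuming the checkpoints live in a `ckpt` directory relative to the working directory. The helper name `missing_checkpoints` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Default file names taken from the Model Preparation section.
DEFAULT_CHECKPOINTS = ("sam_vit_b_01ec64.pth", "R50_DeAOTL_PRE_YTB_DAV.pth")


def missing_checkpoints(ckpt_dir="ckpt", names=DEFAULT_CHECKPOINTS):
    """Return the checkpoint file names that are not present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in names if not (root / name).is_file()]


missing = missing_checkpoints()
if missing:
    print("Missing checkpoints:", ", ".join(missing))
else:
    print("All default checkpoints found.")
```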

Run Demo

  • The video to be processed can be put in ./assets.
  • Then run demo.ipynb step by step to generate results.
  • The results will be saved as masks for each frame and a gif file for visualization.

The arguments for SAM-Track, AOT and SAM can be manually modified in model_args.py to use other models or to control the behavior of each model.

WebUI App

Our user-friendly visual interface lets you easily obtain the results of your experiments. Launch it from the command line:

python app.py

Users can upload a video directly in the UI and use Segtracker to track all objects within it. We use the depth-map video as an example.

Gradio

Parameters:

  • aot_model: used to select which version of AOT to use for tracking and propagation.
  • sam_gap: controls the frame interval at which SAM is run to add newly appearing objects. A larger value discovers new targets less often but significantly speeds up inference.
  • points_per_side: controls the number of points sampled along each side of the grid that SAM uses to generate masks. Increasing it improves the detection of small objects, but larger targets may be split into finer segments.
  • max_obj_num: limits the maximum number of objects that SAM-Track can detect and track. More objects require more memory; approximately 16GB of memory can handle up to 255 objects.
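The cost trade-offs behind sam_gap and points_per_side can be made concrete with a little arithmetic. The sketch below is illustrative only. The helper names are hypothetical, and it assumes SAM runs on frame 0 and on every sam_gap-th frame after it. The grid size follows from sampling points_per_side points along each side of the image.

```python
def sam_key_frames(num_frames, sam_gap):
    """Frame indices on which SAM would run, assuming frame 0 is a key frame."""
    if sam_gap < 1:
        raise ValueError("sam_gap must be at least 1")
    return [idx for idx in range(num_frames) if idx % sam_gap == 0]


def grid_point_count(points_per_side):
    """Total prompt points in a points_per_side x points_per_side grid."""
    return points_per_side ** 2


# Doubling sam_gap roughly halves the number of SAM invocations.
print(len(sam_key_frames(100, 5)), len(sam_key_frames(100, 10)))

# Doubling points_per_side quadruples the number of sampled points.
print(grid_point_count(16), grid_point_count(32))
```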

Usage:

  • Start the app and open the web link in your browser.
  • Click on the input-video window to upload a video.
  • Adjust SAM-Track parameters as needed.
  • Click Seg and Track to get the experiment results.

Credits

Licenses for borrowed code can be found in the licenses.md file.
