CLIP-flask: A Flask app leveraging CLIP for analyzing and visualizing text-image embedding alignments in videos with interactive Plotly plots.


CLIP Video Investigator

Overview

CLIP Video Investigator is a Flask-based web application designed to compare text and image embeddings using the CLIP model. The application integrates OpenCV for video processing and Plotly for data visualization to accomplish the following:

  1. Play a video in a web browser.
  2. Pause and resume video playback.
  3. Compare CLIP embeddings of video frames with text embeddings.
  4. Visualize the similarity between text and image embeddings in real time using a Plotly plot.
  5. Jump to specific frames by clicking on the Plotly plot.
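
The comparison in step 3 can be sketched as follows. This is a minimal illustration, assuming the similarity measure is cosine similarity (a common choice for CLIP embeddings; the app's actual measure is not stated here). The function names and the toy 3-dimensional vectors are hypothetical, and plain Python lists stand in for real embedding arrays.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def similarity_series(
    frame_embeddings: Sequence[Sequence[float]],
    text_embedding: Sequence[float],
) -> List[float]:
    """Score every frame embedding against one text embedding."""
    return [cosine_similarity(frame, text_embedding) for frame in frame_embeddings]


# Toy vectors standing in for real CLIP embeddings (assumption for illustration).
frames = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
text = [1.0, 0.0, 0.0]
scores = similarity_series(frames, text)
```

Each entry of `scores` corresponds to one frame, so the list can be plotted directly against the frame index.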

Why This is Useful

Understanding the relationship between text and image embeddings can provide insights into how well a model generalizes across modalities. By plotting these values in real time, researchers and engineers can:

  • Identify key frames where text and image embeddings are highly aligned or misaligned.
  • Debug and fine-tune the performance of multimodal models.
  • Gain insights into the temporal evolution of embeddings in video data.
  • Enable more effective search and retrieval tasks for video content.

Features

  • Video Playback: Uses OpenCV to read video frames and displays them in the web browser.

  • Play/Pause: Allows the user to start and stop video playback.

  • Data Visualization: Uses Plotly to plot the similarity between the video frames' image embeddings and the text embeddings.

  • Interactive Plot: Allows the user to click on the plot to jump to specific frames in the video.

  • Reset Functionality: Resets the application to its initial state.

  • Embedding Caching: Pickle files of the text and image frame embeddings are saved for each video in the /embeddings folder. This allows for quicker subsequent analysis by avoiding the need to regenerate these embeddings.
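
The caching behaviour described above can be sketched like this. It is only an illustration of the load-or-compute pattern with pickle; the helper names `load_or_compute` and `cache_path_for`, and the one-file-per-video naming scheme, are assumptions and not taken from the application's code.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def cache_path_for(video_path: str, embeddings_dir: str = "embeddings") -> Path:
    """Build a cache file path from the video's file name (hypothetical naming scheme)."""
    return Path(embeddings_dir) / (Path(video_path).stem + ".pkl")


def load_or_compute(cache_path: Path, compute: Callable[[], Any]) -> Any:
    """Return the cached object if the pickle exists; otherwise compute and save it."""
    if cache_path.exists():
        with cache_path.open("rb") as fh:
            return pickle.load(fh)
    result = compute()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as fh:
        pickle.dump(result, fh)
    return result
```

A call such as `load_or_compute(cache_path_for(video_path), compute_embeddings)` would run the hypothetical `compute_embeddings` only the first time a video is analyzed. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.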

Configuration

A config.yaml file is used to specify various settings for the application:

roboflow_api_key: ""  # Roboflow API key
video_path: ""  # Path to video file
CLIP:
  - wall
  - tile wall
  - large tile wall

  • roboflow_api_key: Your API key for Roboflow.
  • video_path: The path to the video file you want to analyze.
  • CLIP: A list of text inputs for which you want to generate CLIP embeddings.
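
As a rough sketch of how these settings might be checked after loading, the snippet below validates a dictionary of the shape that a YAML parser (for example PyYAML's `yaml.safe_load`) would return for the file above. The `AppConfig` type and `parse_config` function are hypothetical names introduced for illustration; the application may read its configuration differently.

```python
from typing import Any, Dict, List, NamedTuple


class AppConfig(NamedTuple):
    """Typed view of the three documented settings (hypothetical helper type)."""
    roboflow_api_key: str
    video_path: str
    prompts: List[str]


def parse_config(raw: Dict[str, Any]) -> AppConfig:
    """Validate the parsed config.yaml contents and return them as an AppConfig."""
    required = ("roboflow_api_key", "video_path", "CLIP")
    missing = [key for key in required if key not in raw]
    if missing:
        raise KeyError("config.yaml is missing keys: " + ", ".join(missing))
    prompts = raw["CLIP"]
    if not isinstance(prompts, list) or not prompts:
        raise ValueError("CLIP must be a non-empty list of text prompts")
    return AppConfig(
        roboflow_api_key=str(raw["roboflow_api_key"]),
        video_path=str(raw["video_path"]),
        prompts=[str(p) for p in prompts],
    )


# The dictionary a YAML parser would produce for the example file above.
raw = {
    "roboflow_api_key": "",
    "video_path": "",
    "CLIP": ["wall", "tile wall", "large tile wall"],
}
config = parse_config(raw)
```

Failing early on a missing key or an empty prompt list gives a clearer error than a failure later during embedding generation.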

Folder Layout

clip_investigator/
├── config.yaml
├── scripts/
│   ├── clip_app.py
│   └── clip_functions.py
├── embeddings/
│   └── example.pkl
├── static/
│   ├── css/
│   │   └── style.css
│   └── js/
│       └── main.js
└── templates/
    └── index.html

  • clip_app.py: The main Flask application file.
  • config.yaml: Configuration file for specifying settings.
  • embeddings/: Folder where pickle files of text and image embeddings are stored.
  • static/: Contains static files like CSS and JavaScript.
  • templates/: Contains HTML templates.

Installation

Prerequisites

  • Python 3.x
  • Virtualenv (optional but recommended)

Steps

  1. Clone the repository.

    git clone https://github.com/roboflow/clip_video_app.git
  2. Navigate to the project directory.

    cd clip_video_app
  3. (Optional) Create a virtual environment.

  4. Install the dependencies.

    pip install -r requirements.txt

Usage

The Roboflow inference server must also be running locally.

  1. Update the config.yaml file with your Roboflow API key and the path to your video file (or use the sample file in the /data folder).

  2. Start the Flask application.

    python scripts/clip_app.py
  3. Open a web browser and navigate to http://localhost:5000.

  4. Use the "Start" and "Stop" buttons to control video playback.

  5. View the real-time text-image similarity in the Plotly plot below the video.

  6. Click on the Plotly plot to jump to specific video frames.
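
The click-to-jump behaviour in step 6 could work along these lines. This is a sketch under the assumption that the plot's x-axis is the frame index; `clicked_frame` and `frame_to_seconds` are hypothetical helpers, not functions from the application.

```python
def clicked_frame(x_value: float, total_frames: int) -> int:
    """Map a clicked x-coordinate to a valid frame index, clamped to the video length."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    index = int(round(x_value))
    return max(0, min(index, total_frames - 1))


def frame_to_seconds(frame_index: int, fps: float) -> float:
    """Convert a frame index to a timestamp in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_index / fps


# Example: a click near x = 149.7 in a 300-frame video recorded at 30 fps.
target = clicked_frame(149.7, total_frames=300)
timestamp = frame_to_seconds(target, fps=30.0)
```

With OpenCV, the resulting index could then be passed to `cap.set(cv2.CAP_PROP_POS_FRAMES, target)` on a `cv2.VideoCapture` object to seek before reading the next frame.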

Troubleshooting

  • WebSocket Errors: If you encounter WebSocket errors, check the browser console for specific error messages. The application has built-in error handling to attempt reconnections.

  • Plotly Click Events: If click events are not detected on the Plotly plot after a reset, reload the page.
