object-detector's Introduction

Text Guided or Image Guided Object Detection

This repository contains the code for implementing text-guided object detection and image-guided object detection.

The text-guided object detector is implemented using GroundingDINO and OWL-ViT from Hugging Face Transformers.

The image-guided object detector is implemented using OWL-ViT from Hugging Face Transformers. It is also integrated with the text-guided object detector: BLIP from Hugging Face Transformers generates a text prompt from the query image, and that prompt drives the text-guided detection.
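
The hand-off from image to text described above can be sketched as a small function. This is an illustration only, not the repository's code: `detect_with_query_image`, `caption_fn`, and `detect_fn` are hypothetical names, with the two callables standing in for a BLIP captioner and a text-guided detector.

```python
from typing import Any, Callable, Tuple


def detect_with_query_image(
    target_image: Any,
    query_image: Any,
    caption_fn: Callable[[Any], str],
    detect_fn: Callable[[Any, str], Tuple[list, list, list]],
) -> Tuple[str, Tuple[list, list, list]]:
    """Caption the query image, then use the caption as the text prompt."""
    # Step 1: turn the query image into a text prompt (the role BLIP plays here).
    prompt = caption_fn(query_image).strip().lower()
    if not prompt:
        raise ValueError("captioner returned an empty prompt")
    # Step 2: run text-guided detection on the target image with that prompt.
    return prompt, detect_fn(target_image, prompt)
```

Passing the captioner and detector in as arguments keeps the sketch independent of any particular model, so either side can be swapped (for example, GroundingDINO or OWL-ViT for detection).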

Setup

  1. Clone the repository

    git clone https://github.com/haizadtarik/object-detector.git
    
  2. Install requirements

    pip install -r requirements.txt
    
  3. Install and set up GroundingDINO

    git clone https://github.com/IDEA-Research/GroundingDINO.git
    cd GroundingDINO/
    pip install -e .
    wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
    cd ..
    

Example Usage

  • Text Guided with GroundingDINO

    from detectors import textDetector
    import requests
    
    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    image_path = requests.get(url, stream=True).raw  # was query_url, which is undefined here
    text = "cat"
    detector = textDetector("dino")
    boxes, logits, labels, image_target = detector.run(image_path=image_path, text_prompt=text, threshold=0.2)
    image_target.show()
    
  • Text Guided with OWL-ViT

    from detectors import textDetector
    import requests
    
    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    image_path = requests.get(url, stream=True).raw  # was query_url, which is undefined here
    text = "cat"
    detector = textDetector("owl")
    boxes, logits, labels, image_target = detector.run(image_path=image_path, text_prompt=text, threshold=0.2)
    image_target.show()
    
  • Image Guided with OWL-ViT

    from detectors import imageDetector
    from PIL import Image  # needed for Image.open below
    import requests
    
    # Target image: the image to search in
    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    image_path = requests.get(url, stream=True).raw
    
    query_url = "http://images.cocodataset.org/val2017/000000524280.jpg"
    query_image_path = Image.open(requests.get(query_url, stream=True).raw)
    
    detector = imageDetector()
    boxes, logits, labels, image_target = detector.run(image_path=image_path, query_image_path=query_image_path)
    image_target.show()
    
  • Image Guided with BLIP and GroundingDINO

    from detectors import imageDetector
    from PIL import Image  # needed for Image.open below
    import requests
    
    # Target image: the image to search in
    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    target_image_path = requests.get(url, stream=True).raw
    
    query_url = "http://images.cocodataset.org/val2017/000000524280.jpg"
    query_image_path = Image.open(requests.get(query_url, stream=True).raw)
    
    detector = imageDetector()
    boxes, logits, labels, image_target = detector.run(target_image_path, query_image_path, threshold=0.2, model_name="dino")
    image_target.show()
    

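Each example above returns `boxes`, `logits`, and `labels`. Assuming these are equal-length sequences with one confidence score per box (an assumption about the return format, not something the source confirms), the results could be post-filtered with a sketch like the one below. `filter_detections` is a hypothetical helper, not part of the repository.

```python
from typing import List, Sequence, Tuple


def filter_detections(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    labels: Sequence[str],
    min_score: float = 0.5,
) -> List[Tuple[List[float], float, str]]:
    """Keep detections scoring at least min_score, highest score first."""
    if not (len(boxes) == len(scores) == len(labels)):
        raise ValueError("boxes, scores and labels must have the same length")
    kept = [
        (list(map(float, box)), float(score), label)
        for box, score, label in zip(boxes, scores, labels)
        if float(score) >= min_score
    ]
    # Sort by score, descending, so the most confident detection comes first.
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

For instance, `filter_detections(boxes, logits, labels, min_score=0.35)` would apply a stricter cutoff than the `threshold=0.2` used in the examples, without re-running the model.
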