
arhosseini77 / brand_attention


Brand Visibility in Packaging: A Deep Learning Approach for Logo Detection, Saliency-Map Prediction, and Logo Placement Analysis

License: Apache License 2.0

Python 100.00%



Brand Attention

This repo contains the official implementation for the paper: Brand Visibility in Packaging: A Deep Learning Approach for Logo Detection, Saliency-Map Prediction, and Logo Placement Analysis

Paper . Project Page . Dataset



Brand-Attention Module

Installation

Install PyTorch with:

pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116

Install the requirements with:

pip install -r requirements.txt

Download weights:

mkdir weights
cd weights
gdown 1p6LyrHIAYbz5M94eBTTOcduWF-TzeQ1D # Logo-Detection-Yolov8
gdown 1_0iA_pmC4-IvA7IaRuiJ-QHnRheIEmBe  # ECT-SAL weights
cd ..
cd saliency_prediction
mkdir pretrained_models
cd pretrained_models
gdown 1C9TgPQvFQHJP3byDtBL5X56cXeOl4JE2  # Resnet 50 pretrained weight
cd ..
cd ..
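Before running inference, it can help to confirm that the downloads landed where the commands in this README expect them. The following is a minimal sketch, not part of the repository: `missing_weights` is a hypothetical helper, and it assumes the downloaded files keep the names used in the inference commands (`weights/Logo_Detection_Yolov8.pt` and `weights/ECT_SAL.pth`). The ResNet-50 file name is not stated here, so it is not checked.

```python
from pathlib import Path

# Paths taken from the inference commands in this README.
# Assumption: gdown saves the files under exactly these names.
EXPECTED_WEIGHTS = (
    "weights/Logo_Detection_Yolov8.pt",
    "weights/ECT_SAL.pth",
)


def missing_weights(repo_root=".", expected=EXPECTED_WEIGHTS):
    """Return the expected weight files that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in expected:
        path = root / rel
        # An empty file usually means an interrupted or failed download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    absent = missing_weights()
    if absent:
        print("Missing weight files:", ", ".join(absent))
    else:
        print("All expected weight files are present.")
```

Run it from the repository root; any path it lists should be downloaded again before starting inference.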

Brand-Logo Detection

Description

This module focuses on detecting brand logos in images using the YOLOv8 model. It utilizes two datasets for training: FoodLogoDet-1500 and LogoDet-3K.

Inference

You can use the following command to run the brand logo detection code:

python main_detection_yolov8.py --model="weights/Logo_Detection_Yolov8.pt" --image="test_images/test.jpg" --save-result
  • If you want to visualize the detection results, include the --save-result flag in the command.
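To run detection over a whole folder rather than one image, the command above can be wrapped in a short script. This is a sketch under stated assumptions: `detection_command` and `run_on_folder` are hypothetical helpers, and only the flags shown above (`--model`, `--image`, `--save-result`) are used. It must be run from the repository root so that `main_detection_yolov8.py` is found.

```python
import subprocess
import sys
from pathlib import Path


def detection_command(image, model="weights/Logo_Detection_Yolov8.pt", save_result=True):
    """Build the argument list for one detection run."""
    cmd = [
        sys.executable,
        "main_detection_yolov8.py",
        f"--model={model}",
        f"--image={image}",
    ]
    if save_result:
        cmd.append("--save-result")
    return cmd


def run_on_folder(folder, pattern="*.jpg"):
    """Run detection on every matching image, stopping at the first failure."""
    for image in sorted(Path(folder).glob(pattern)):
        subprocess.run(detection_command(image.as_posix()), check=True)
```

For example, `run_on_folder("test_images")` would process each `.jpg` file in that folder in name order.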

Result

[Figure: original image and brand logo detection result]

ECT-SAL


Description

This module predicts saliency maps of images and is particularly suited to ads and packaging. The model is trained on the ECSAL dataset. You can find the dataset here.

Inference

  • For saliency map prediction, it is essential to provide a corresponding text map. Use the DBNET++ model available here to generate accurate text maps for improved saliency predictions.

Run the script:

python main_saliency_prediction.py --img_path path/to/your/image.jpg --weight_path "weights/ECT_SAL.pth" --tmap path/to/test_text_map_image.jpg --output_path path/to/output/directory

Training

To train the ECT-SAL model on your own dataset, follow the instructions provided in the ECT-SAL README.

Result

[Figure: original image and predicted saliency map]

Brand-Attention

The Brand Attention Module is a component designed to assess the visibility and attention of a brand within advertisement and packaging images. It combines logo detection and saliency map prediction techniques to quantify the presence and prominence of a brand in a given image.
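The exact scoring formula is not stated in this README. As an illustration only, one plausible way to combine the two outputs is to report the percentage of total saliency that falls inside the detected logo boxes. The sketch below assumes that definition; `attention_score` is a hypothetical function, not the repository's implementation.

```python
import numpy as np


def attention_score(saliency, boxes):
    """Percentage of total saliency inside the union of the given boxes.

    saliency: 2-D array of non-negative values, shape (height, width).
    boxes: iterable of (x1, y1, x2, y2) pixel corners, with x2 and y2 exclusive.
    """
    saliency = np.asarray(saliency, dtype=np.float64)
    total = saliency.sum()
    if total == 0:
        return 0.0

    height, width = saliency.shape
    mask = np.zeros(saliency.shape, dtype=bool)
    for x1, y1, x2, y2 in boxes:
        # Clip each box to the image so out-of-frame coordinates are ignored.
        x1, x2 = max(0, int(x1)), min(width, int(x2))
        y1, y2 = max(0, int(y1)), min(height, int(y2))
        if x2 > x1 and y2 > y1:
            mask[y1:y2, x1:x2] = True

    # A union mask counts overlapping boxes only once.
    return 100.0 * saliency[mask].sum() / total
```

With a uniform 4x4 map and a single box covering the top-left 2x2 pixels, this definition gives 25.0, because a quarter of the saliency lies inside the box.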

Inference

  • For saliency map prediction, it is essential to provide a corresponding text map. Use the DBNET++ model available here to generate accurate text maps for improved saliency predictions.

Run the script:

python main_brand_attention.py --img_path path/to/input_image.jpg --tmap path/to/text_map.jpg

  • If the detection is accurate and matches the brand logo, press 1 to confirm.

  • If the detection is wrong or incomplete, press 2 to refine it.

  • To refine, draw new bounding boxes (bboxes) directly on the image to localize the logo precisely.

Result

Input Image
Brand-Attention Score: 23.54

Advertisement Image Object Attention

This module assesses the visibility and attention of any object you choose within advertisement and packaging images. It uses saliency map prediction to quantify the presence and prominence of that object in a given image.

Inference

  • For saliency map prediction, it is essential to provide a corresponding text map. Use the DBNET++ model available here to generate accurate text maps for improved saliency predictions.

Run the script:

python main_object_attention.py --img_path path/to/input_image.jpg --tmap path/to/text_map.jpg

  • Draw bounding boxes around the objects of interest, then press Enter to calculate the attention score for the selected objects.
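How the script stores the drawn boxes is not described here. If a drawing tool reports rectangles as (x, y, width, height), as OpenCV's `selectROIs` does, they usually need converting to corner coordinates and clipping to the image before any region-based scoring. The sketch below shows that conversion; `rois_to_boxes` is a hypothetical helper, and the rectangle format is an assumption.

```python
def rois_to_boxes(rois, image_width, image_height):
    """Convert (x, y, w, h) rectangles to clipped (x1, y1, x2, y2) boxes.

    x2 and y2 are exclusive, so a box can be used directly as array slices.
    """
    boxes = []
    for x, y, w, h in rois:
        x1 = max(0, int(x))
        y1 = max(0, int(y))
        x2 = min(image_width, int(x) + int(w))
        y2 = min(image_height, int(y) + int(h))
        # Drop empty selections and boxes that lie fully outside the image.
        if x2 > x1 and y2 > y1:
            boxes.append((x1, y1, x2, y2))
    return boxes
```

A selection that extends past the image edge is trimmed rather than rejected, so a box drawn slightly outside the frame still counts for the part that overlaps the image.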

Result

[Figure: input image and selected bounding box]
Object Attention Score: 11.22%

Acknowledgement

We thank the authors of Transalnet, DBNET, and Efficient-Attention for their code repositories.

Citation

@misc{hosseini2024brand,
      title={Brand Visibility in Packaging: A Deep Learning Approach for Logo Detection, Saliency-Map Prediction, and Logo Placement Analysis}, 
      author={Alireza Hosseini and Kiana Hooshanfar and Pouria Omrani and Reza Toosi and Ramin Toosi and Zahra Ebrahimian and Mohammad Ali Akhaee},
      year={2024},
      eprint={2403.02336},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}


brand_attention's Issues

FileNotFoundError

I am trying to run the inference section in the Colab notebook, but when I test other images it raises this error, even though the file is present.

Text-Detector Code

Add the Text-Detector code, a README, and a Google Colab version, and update the project's main README.
