Giter Site home page Giter Site logo

capjamesg / cv-book-svg Goto Github PK

View Code? Open in Web Editor NEW
107.0 6.0 14.0 220 KB

Turn an image of a bookshelf into an interactive SVG.

Home Page: https://capjamesg.github.io/cv-book-svg/

License: MIT License

HTML 80.29% Python 19.71%
books computer-vision image-analysis library personal-library

cv-book-svg's Introduction

Make your bookshelf clickable

Use computer vision to generate an SVG that you can overlay onto a photo of your bookshelf that lets you click on each book to find out more information.

Demo

Try the demo

demo.mov

How it Works

This tool uses computer vision to identify and segment each book spine in an image of a bookshelf. Then, each book spine is sent to GPT-4 with Vision to read the book title and, if possible, the author.

This information is then sent to the Google Books API. The book ISBN, author name, and other meta information is retrieved from this API.

An SVG is then created using the segmented book spines. Each book is assigned a polygon which, when clicked, takes you to the Google Books page associated with a book.

This script uses the following vision tools:

  • Grounding DINO (zero-shot object detection model)
  • Segment Anything (image segmentation model)
  • GPT-4 with Vision API
  • OpenCV Python

It takes around 20 seconds to generate the polygons that map to the location of each book on an M1 Macbook Air. It then takes a few seconds to process each book with the OpenAI GPT-4 with Vision API.

For a bookshelf with 11 books, the script takes around one minute to run.

The script returns a HTML file with an SVG file that is overlaid onto the source image.

How to Use

First, clone this project and install the required dependencies:

git clone https://github.com/capjamesg/cv-book-svg
cd cv-book-svg
pip3 install -r requirements.txt

Then, run the main script:

python3 grounded.py --image=example.jpg --output=annotation.html

This script takes an image as input (PNG, JPEG) and outputs a HTML document.

Limitations

This system may:

  • Not identify all books on a bookshelf (thin books are more likely to not be identified).
  • Generate a link to the wrong Google Books URL (which will happen if a book is not available on Google Books, or if a book has a generic title like "Poems of Emily Dickinson", which could on its own refer to several publications).
  • Mis-identify some books.

Notes

  • video.py contains a work-in-progress system for identifying all unique books in a video.

License

This project is licensed under an MIT license.

Contributing

Found a bug? Have an idea that you'd like to see in the project? Open an Issue in this GitHub repository.

cv-book-svg's People

Contributors

capjamesg avatar drcursor avatar jzcruiser avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

cv-book-svg's Issues

Feature request: Ability to capture label from a bookshelf

As a book guy, I really love this!

I would love it if we could also read text from a labeled shelf. I have thousands of books. I have 51 shelves of books in my home office - and that's just the home office. Add a few dozen elsewhere in the house.

As a bibliophile,
I want the image processor to also pick up on an optional horizontal label that is affixed to the bookshelf,
so that I can store information about where a book is/belongs in addition to the list of books.

What you've already built is amazing - this is the feature that for me would likely make me take a day off and catalog my books. I'd like to say "and I'll take a shot at coding it", but my backlog of work is pretty large (possibly because I have too many books).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.