sitamgithub-msit / wiseeye

Home Page: https://huggingface.co/spaces/sitammeur/WiseEye

License: MIT License

Python 100.00%
artificial-intelligence blip gradio gradio-interface huggingface-spaces huggingface-transformers multimodal-data multimodal-deep-learning question-answering

WiseEye

Visual question answering (VQA) is the task of answering open-ended questions about an image. A VQA model takes an image and a natural-language question about its content and returns a natural-language answer. This project uses blip-vqa-base, a popular multimodal model from the Hugging Face model hub, for visual question answering. The model runs on both GPUs and CPUs.
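
As a rough illustration of what such a model call can look like, here is a minimal sketch using the Transformers BLIP classes. It is not taken from this repository's code: the function name `answer_question` is hypothetical, and the full model ID `Salesforce/blip-vqa-base` is an assumption about which hub checkpoint "blip-vqa-base" refers to. The sketch reloads the model on every call for simplicity; a real application would likely load it once at startup.

```python
def answer_question(image, question, model_id="Salesforce/blip-vqa-base"):
    """Answer a natural-language question about a PIL image with BLIP.

    The default model_id is an assumption; the README only names "blip-vqa-base".
    """
    # Heavy imports live inside the function so importing this file stays cheap.
    import torch
    from transformers import BlipForQuestionAnswering, BlipProcessor

    # Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForQuestionAnswering.from_pretrained(model_id).to(device)

    # The processor prepares both the image and the question text as tensors.
    inputs = processor(image, question, return_tensors="pt").to(device)

    # Generate answer token IDs without tracking gradients (inference only).
    with torch.no_grad():
        output_ids = model.generate(**inputs)

    return processor.decode(output_ids[0], skip_special_tokens=True)
```

To try it, open an image with Pillow, for example `Image.open("example.jpg").convert("RGB")` (the path is a placeholder), and pass it to `answer_question` together with a question string.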

Tech Stack

  • Python (for the programming language)
  • Hugging Face Transformers Library (for the visual question answering model)
  • Gradio (for the web application)
  • Hugging Face Spaces (for hosting the Gradio application)

Getting Started

To get started with this project, follow the steps below:

  1. Clone the repository: git clone https://github.com/sitamgithub-MSIT/WiseEye.git
  2. Create a virtual environment: python -m venv tutorial-env
  3. Activate the virtual environment: tutorial-env\Scripts\activate (on Windows) or source tutorial-env/bin/activate (on macOS/Linux)
  4. Install the required dependencies: pip install -r requirements.txt
  5. Run the Gradio application: python app.py

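Now open the local URL that Gradio prints in the terminal (http://127.0.0.1:7860 by default) and you should see the web application running. For more information, refer to the Gradio documentation. A live version of the application is hosted on Hugging Face Spaces at https://huggingface.co/spaces/sitammeur/WiseEye.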

Usage

The web application takes an image and a question as input, and the model generates an answer based on both. Visual question answering has several applications: it can help visually impaired people access web and real-world images, improve image retrieval by querying for specific characteristics, and support video search by locating snippets or timestamps that match a search query. It can also make learning more interactive in educational settings.
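
The image-and-question interface described above could be wired up in Gradio roughly as follows. This is a sketch under assumptions, not the repository's actual `app.py`: the names `make_handler` and `build_demo`, the validation messages, and the component labels are all hypothetical. `answer_fn` stands for any function that takes a PIL image and a question string and returns an answer string.

```python
def make_handler(answer_fn):
    """Wrap a VQA function with basic input checks for use as a Gradio callback."""

    def handler(image, question):
        # Gradio passes None when no image has been uploaded.
        if image is None:
            return "Please upload an image."
        # Reject empty or whitespace-only questions before calling the model.
        if not question or not question.strip():
            return "Please enter a question."
        return answer_fn(image, question.strip())

    return handler


def build_demo(answer_fn):
    """Build a Gradio interface with an image input and a question box."""
    import gradio as gr  # imported lazily so the handler can be tested without Gradio

    return gr.Interface(
        fn=make_handler(answer_fn),
        inputs=[
            gr.Image(type="pil", label="Image"),  # deliver uploads as PIL images
            gr.Textbox(label="Question"),
        ],
        outputs=gr.Textbox(label="Answer"),
        title="WiseEye",
    )
```

Calling `build_demo(answer_fn).launch()` starts the local web server. Keeping the input checks in a separate handler means they can be exercised with a stand-in function, without loading a model or starting Gradio.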

Contributing

Contributions are welcome! If you would like to contribute to this project, please raise an issue to discuss the changes you would like to make. Once the changes are approved, you can create a pull request.

License

This project is licensed under the MIT License.

Contact

If you have any questions or suggestions regarding the project, feel free to reach out to me on my GitHub profile.

Happy coding! 🚀
