
yangxuanyi / multi-agent-gpt


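Multi-Agent-GPT: a multimodal expert assistant GPT built on RAG and agents. It integrates tools for modalities such as text, images, and audio, and supports local deployment and private database construction.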

License: MIT License

Python 100.00%

multi-agent-gpt's Introduction


A multimodal expert assistant GPT platform built with RAG and agents. It integrates tools for modalities such as text, images, and audio, and it supports local deployment and private database construction.


Demo video: project_display.mp4

💡 1 RoadMap

1 Basic Function

  • Single/multi turn chat
  • Multimodal information display and interaction
  • Agent
  • Tools
    • Web searching
    • Image generation
    • Image caption
    • Audio-to-text
    • Text-to-audio
    • Video caption
  • RAG
    • Private database
    • Offline deployment

2 Supported Modalities

  • text
  • image
  • audio
  • video

3 Model Interface API

  • ChatGPT
  • DALL-E
  • Google-Search
  • BLIP

👨‍💻 2 Development

Project technology stack: Python + torch + langchain + gradio

⚡ 2.1 Installation

  1. Create a virtual environment with Anaconda:
conda create -n agent python=3.10
  2. Activate the virtual environment and install the dependencies:
conda activate agent
pip install -r ./requirements.txt
  3. Install the BLIP model locally: open the BLIP website and download all files to Models/BLIP.

  4. Follow the prompts in the .env file to configure the keys for the APIs you want to use.
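
As a rough illustration of how keys stored in a .env file can reach the program, the sketch below parses KEY=VALUE lines using only the standard library and copies them into the process environment. The helper name `load_env` is hypothetical, and the key name in the usage note is a placeholder; the project's own .env defines the actual names, and the project may load them differently (for example through a dotenv library).

```python
import os
from pathlib import Path


def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env-style file into os.environ.

    Blank lines, comment lines starting with '#', and lines without '='
    are skipped. Variables already set in the environment are not
    overwritten. Returns the key/value pairs found in the file.
    """
    loaded = {}
    env_file = Path(path)
    if not env_file.exists():
        return loaded
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded
```

Calling `load_env()` once at startup and then reading `os.environ.get("SOME_API_KEY")` (a placeholder name) keeps secrets out of the source code.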

💻 2.2 Demo

Multi-Agent-GPT provides a web UI. Run web.py to launch the agent and start a conversation:

python ./web.py

The program prints a local URL (http://XXX). Open it in a local browser to see the UI.

📻 2.3 News

1 Chat_with_Image

By integrating the BLIP model, agents can understand the content of an image and use it to give better-informed replies in the conversation.
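
The following is a minimal sketch of how an image-captioning model might be exposed to an agent as a tool: the wrapper takes an image path, asks a captioner for a description, and returns plain text the agent can reason over. The class `ImageCaptionTool`, its `caption_fn` parameter, the stub `fake_blip`, and the example path are all hypothetical; the project's Tools/ImageCaption.py may be structured quite differently, and a real BLIP call would replace the stub.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ImageCaptionTool:
    """Hypothetical tool wrapper: turns an image path into text for the agent."""

    caption_fn: Callable[[str], str]  # assumed interface: image path -> caption
    name: str = "image_caption"
    description: str = "Describe the content of the image at the given path."

    def run(self, image_path: str) -> str:
        caption = self.caption_fn(image_path).strip()
        if not caption:
            return f"No caption could be produced for {image_path}."
        return f"The image at {image_path} shows: {caption}"


def fake_blip(image_path: str) -> str:
    # Stand-in for a real BLIP call so the sketch runs without model weights.
    return "a dog running on a beach"


tool = ImageCaptionTool(caption_fn=fake_blip)
print(tool.run("Imgs/Show/example.png"))  # example path is illustrative only
```

Passing the captioner in as a function keeps the tool testable without loading the model, which is the main reason for this design in the sketch.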

🗄️ 3 Structure

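- .env
- Agents/
  - openai_agents.py  # defines the GPT-3.5-based agent
- Database/
- Docs/
- Imgs/
  - Show/                # stores some example images
- Models
  - BLIP                 # large image-understanding model
- Tools/
  - ImageCaption.py      # BLIP-based image understanding tool
  - ImageGeneration.py  # defines a text-to-image tool based on OpenAI DALL-E
  - search.py            # web search tool based on Google-search
- Utils/
  - data_io.py
  - stdio.py            # captures the running program's log output, mainly to obtain the agent's verbose information
  - utils_image.py      # helper functions for image processing
  - utils_json.py       # extracts useful fields from existing log output (supports stdio)
- python_new_funciton.py # test file used during development
- readme.md
- requirements.txt
- web.py                 # main entry point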
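
The structure above notes that stdio.py captures the program's log output to obtain the agent's verbose information, and that utils_json.py extracts useful fields from it. A minimal sketch of that idea, using only the standard library, might look like the following. The function names, the stub `fake_agent_run`, and the "Action:" / "Action Input:" line format are assumptions made for illustration and are not taken from the project's code.

```python
import io
import re
from contextlib import redirect_stdout


def capture_stdout(func, *args, **kwargs):
    """Run func and return (result, everything it printed to stdout)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def extract_fields(log_text):
    """Pull 'Action' / 'Action Input' pairs out of captured verbose text."""
    pattern = re.compile(
        r"^Action:\s*(.+?)\s*\nAction Input:\s*(.+?)\s*$", re.MULTILINE
    )
    return [
        {"action": action, "action_input": action_input}
        for action, action_input in pattern.findall(log_text)
    ]


def fake_agent_run(question):
    # Hypothetical stand-in for an agent running with verbose output enabled.
    print("Thought: I should look this up")
    print("Action: search")
    print(f"Action Input: {question}")
    print("Observation: (tool result)")
    return "final answer"


answer, log = capture_stdout(fake_agent_run, "weather in Paris")
print(extract_fields(log))
```

Capturing stdout this way only sees text written through Python's `sys.stdout`; output from subprocesses or from the `logging` module's default stderr handler would need separate handling.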

multi-agent-gpt's People

Contributors

yangxuanyi


multi-agent-gpt's Issues

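Future expansion plans

  1. Add a speech recognition model
  2. Add document upload with a document summarization feature
  3. Build a local database with RAG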
