yangxuanyi / multi-agent-gpt

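Multi-Agent-GPT: a multimodal expert assistant GPT built on RAG and agents. It integrates tools for modalities such as text, images, and audio, and supports local deployment and private database construction.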

License: MIT License

Python 100.00%

multi-agent-gpt's Introduction

A multimodal expert assistant GPT platform built with RAG and agents. It integrates tools for modalities such as text, images, and audio, and supports local deployment and private database construction.

project_display.mp4

💡 1 RoadMap

1 Basic Function

2 Supporting Information Modality

text
image
audio
video

3 Model Interface API

ChatGPT
Dalle
Google-Search
BLIP

👨‍💻 2 Development

Project technology stack: Python + torch + langchain + gradio

⚡ 2.1 Installation

Create a virtual environment in Anaconda:

conda create -n agent python=3.10

Activate the virtual environment and install the dependencies:

conda activate agent

pip install -r ./requirements.txt

Install the BLIP model locally: open the BLIP website and download all files to Models/BLIP.
Then follow the prompts in .env to configure the keys for the APIs you intend to use.
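The key configuration step can be checked before launching anything. The following is a minimal sketch, assuming the .env file uses plain KEY=VALUE lines; the helper names `load_env` and `missing_keys` and the key names in the usage note are hypothetical and do not come from the project.

```python
from pathlib import Path


def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict.

    Blank lines, comment lines starting with '#', and lines without '='
    are skipped. Surrounding quotes around values are removed.
    """
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required):
    """Return the required key names that are absent or empty."""
    return [name for name in required if not values.get(name)]
```

For example, `missing_keys(load_env(), ["OPENAI_API_KEY"])` returns an empty list when that (assumed) key is set, and the list of unset names otherwise. Check the project's own .env for the real key names.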

💻 2.2 Demo

Multi-Agent-GPT provides a web UI. Run web.py to launch the agents and start a conversation:

python ./web.py

The program prints a local URL (http://XXX). Open it in a local browser to see the UI.
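For readers unfamiliar with gradio, here is a minimal sketch of how a chat UI of this kind can be served. It is not the project's web.py: the `respond` handler is a hypothetical placeholder that echoes the message instead of calling an agent, and `build_ui` is an illustrative name.

```python
def respond(message, history):
    """Placeholder chat handler.

    A real handler would forward `message` (and possibly `history`)
    to the agent and return its answer.
    """
    return f"(agent reply to: {message})"


def build_ui():
    """Build a gradio chat interface around the handler."""
    # Imported lazily so the handler can be used without gradio installed.
    import gradio as gr

    return gr.ChatInterface(fn=respond, title="Multi-Agent-GPT (sketch)")
```

Calling `build_ui().launch()` starts a local server and prints the URL to open in a browser.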

📻 2.3 News

1 Chat_with_Image

By integrating the BLIP model, agents can interpret the content of an image and draw on it in their replies.
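One common way to wire this up is to caption the image with BLIP and pass the caption to the agent as text. The sketch below assumes the files in Models/BLIP are a Hugging Face Transformers BLIP captioning checkpoint, which the README does not confirm; `caption_image` and `build_image_prompt` are hypothetical names, not the project's ImageCaption.py.

```python
def caption_image(image_path, model_dir="Models/BLIP"):
    """Generate a caption for an image with a locally stored BLIP checkpoint.

    Assumes `model_dir` holds a Transformers-format BLIP captioning model.
    """
    # Heavy dependencies are imported lazily, only when a caption is needed.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_dir)
    model = BlipForConditionalGeneration.from_pretrained(model_dir)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40)
    return processor.decode(output_ids[0], skip_special_tokens=True)


def build_image_prompt(caption, question):
    """Combine the caption and the user's question into one text prompt."""
    return f"Image description: {caption}\nUser question: {question}"
```

Because the caption is plain text, the language model never sees the pixels. It answers from the description, so the quality of the reply depends on how well the caption covers what the user asks about.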

🗄️ 3 Structure

- .env
- Agents/
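  - openai_agents.py     # defines the GPT-3.5-based agent
- Database/
- Docs/
- Imgs/
  - Show/                # stores some example images
- Models/
  - BLIP                 # large image-understanding model
- Tools/
  - ImageCaption.py      # BLIP-based image understanding tool
  - ImageGeneration.py   # defines a text-to-image tool based on OpenAI DALL-E
  - search.py            # web search tool based on Google-search
- Utils/
  - data_io.py
  - stdio.py             # captures the running program's log output, mainly to obtain the agent's verbose information
  - utils_image.py       # helper functions for image processing
  - utils_json.py        # extracts useful fields from existing log output (used by stdio)
- python_new_funciton.py # test file used during development
- readme.md
- requirements.txt
- web.py                 # main entry point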
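The log-capture idea behind stdio.py and utils_json.py can be sketched as follows. This is an illustration, not the project's code: it assumes the agent's verbose output goes to standard output and contains lines starting with "Action:" and "Observation:", and the names `run_and_capture` and `extract_fields` are hypothetical.

```python
import contextlib
import io
import re

# Matches ANSI color codes, which verbose agent output often contains.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def run_and_capture(fn, *args, **kwargs):
    """Call fn and return (result, text printed to stdout during the call)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = fn(*args, **kwargs)
    return result, ANSI_ESCAPE.sub("", buffer.getvalue())


def extract_fields(log_text):
    """Collect the values of 'Action:' and 'Observation:' lines from a log."""
    fields = {"Action": [], "Observation": []}
    for line in log_text.splitlines():
        stripped = line.strip()
        for name in fields:
            prefix = name + ":"
            if stripped.startswith(prefix):
                fields[name].append(stripped[len(prefix):].strip())
    return fields
```

Wrapping the agent call in `run_and_capture` lets the UI show intermediate steps alongside the final answer without changing the agent itself.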
