zilliztech / akcio

Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It leverages the power of LLM to generate responses and uses vector databases to fetch relevant documents to enhance the quality and relevance of the output.

Home Page: https://github.com/zilliztech/akcio/wiki

License: Other

Python 99.88% Dockerfile 0.12%
chatbot chatgpt dolly ernie-bot langchain llm milvus minimax openai retrieval-augmented-generation

akcio's Introduction

Akcio: Enhancing LLM-Powered ChatBot with CVP Stack

OSSChat | Documentation | Contact | LICENSE


ChatGPT has constraints due to its limited knowledge base, sometimes resulting in hallucinated answers when asked about unfamiliar topics. We are introducing a new AI stack, ChatGPT + vector database + prompt-as-code, or the CVP Stack, to overcome this constraint.

We have built OSSChat as a working demonstration of the CVP stack. Now we are presenting the technology behind OSSChat in this repository under the code name Akcio.

With this project, you can build a knowledge-enhanced ChatBot using an LLM service such as ChatGPT. By the end, you will know how to start a backend service with FastAPI, which exposes APIs to support further applications. Alternatively, we show how to use Gradio to build an online demo.

Overview

Akcio allows you to create a ChatGPT-like system with added intelligence obtained through semantic search of a customized knowledge base. Instead of sending the user query directly to the LLM service, our system first retrieves relevant information from its stores by semantic search or keyword match. It then feeds both the user's needs and this helpful information into the LLM, allowing it to better tailor its response and provide more accurate and helpful information.
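The flow above can be sketched in a few lines. Everything here is a toy stand-in: keyword overlap instead of semantic search over a vector database, and an echo function instead of a real LLM service.

```python
import re

def _tokens(text):
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, corpus, top_k=1):
    """Toy keyword-overlap retrieval standing in for semantic search."""
    return sorted(corpus, key=lambda doc: len(_tokens(query) & _tokens(doc)),
                  reverse=True)[:top_k]

def build_prompt(query, docs):
    """Feed both the user's question and the retrieved context to the LLM."""
    context = "\n".join(docs)
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {query}"

def answer(query, corpus, llm):
    return llm(build_prompt(query, retrieve(query, corpus)))

corpus = [
    "Milvus is an open-source vector database.",
    "FastAPI is a Python web framework.",
]
# An echo "LLM" stub so the sketch is runnable without any service.
reply = answer("What is Milvus?", corpus, llm=lambda prompt: prompt)
```

Because the retrieved document is injected into the prompt, the LLM answers from your knowledge base instead of relying solely on what it memorized during training.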

You can find more details and instructions at our documentation.

Akcio offers two AI platforms to choose from: Towhee or LangChain. It also supports different LLM services and databases:

  • LLM: OpenAI, Llama-2, Dolly, Ernie, MiniMax, DashScope, ChatGLM, SkyChat
  • Embedding: OpenAI, HuggingFace
  • Vector Store: Zilliz Cloud, Milvus
  • Scalar Store (Optional): Elastic
  • Memory Store: PostgreSQL, MySQL and MariaDB, SQLite, Oracle, Microsoft SQL Server
  • Rerank: MS MARCO Cross-Encoders

Note that the set of supported options can differ between the Towhee and LangChain modes.

Option 1: Towhee

The option using Towhee simplifies the process of building a system by providing pre-defined pipelines. These built-in pipelines require less coding and make system building much easier. If you require customization, you can either modify the configuration or create your own pipeline from the rich set of Towhee Operators.

  • Pipelines

    • Insert: The insert pipeline builds a knowledge base by saving documents and corresponding data in the database(s).
    • Search: The search pipeline enables the question-answering capability, powered by information retrieval (semantic search and optional keyword match) and an LLM service.
    • Prompt: A prompt operator prepares messages for the LLM by assembling the system message, chat history, and the user's query processed by a template.
  • Memory: The memory storage stores chat history to support context in conversation. (available: most SQL databases)
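The prompt assembly step can be sketched as follows. The OpenAI-style message format and the template string are illustrative assumptions, not the actual Towhee operator.

```python
# Illustrative prompt assembly: system message + chat history + templated
# user query. The {"role": ..., "content": ...} message shape follows the
# common OpenAI chat convention (an assumption here).

QUERY_TEMPLATE = "Use the retrieved context to answer: {question}"

def build_messages(system_msg, history, question, template=QUERY_TEMPLATE):
    messages = [{"role": "system", "content": system_msg}]
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": template.format(question=question)})
    return messages

msgs = build_messages(
    "You are a knowledge-enhanced assistant.",
    [("Hi", "Hello! How can I help?")],
    "What is Akcio?",
)
```

Keeping history as (user, assistant) pairs makes it easy to persist and reload turns from the memory store between requests.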

Option 2: LangChain

The option using LangChain employs an Agent to enable the LLM to utilize specific tools, which places greater demands on the LLM's ability to comprehend tasks and make informed decisions.

  • Agent
    • ChatAgent: assembles all modules together to build up the question-answering system.
    • Other agents (todo)
  • LLM
    • ChatLLM: the large language model or service used to generate answers.
  • Embedding
    • TextEncoder: an encoder that converts each text input to a vector.
    • Other encoders (todo)
  • Store
    • VectorStore: a vector database that stores document chunks as embeddings, and performs document retrieval via semantic search.
    • ScalarStore: optional; a database that stores metadata for each document chunk, supporting additional information retrieval. (available: Elastic)
    • MemoryStore: memory storage that stores chat history to support context in conversation.
  • DataLoader
    • DataParser: a tool that loads data from a given source and splits documents into processed doc chunks.
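As an illustration of what DataParser does, here is a minimal, hypothetical chunker; the real implementation supports multiple sources and smarter, structure-aware splitting.

```python
def split_into_chunks(text, chunk_size=300, overlap=50):
    """Naive fixed-size character chunking with overlap (illustrative only).

    Overlap between adjacent chunks helps preserve context that would
    otherwise be cut in half at a chunk boundary.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = ("word " * 200).strip()  # 999-character toy document
chunks = split_into_chunks(doc)
```

Each chunk is later embedded and stored in the vector database, so the chunk size effectively bounds how much context one retrieval hit can carry.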

Deployment

  1. Downloads

    $ git clone https://github.com/zilliztech/akcio.git
    $ cd akcio
  2. Install dependencies

    $ pip install -r requirements.txt
  3. Configure modules

    You can configure all arguments by modifying config.py to set up your system with default modules.

    • LLM

      By default, the system uses the OpenAI service as the LLM option. To set your OpenAI API key without modifying the configuration file, you can pass it as an environment variable.

      $ export OPENAI_API_KEY=your_keys_here
      Check the documentation for how to switch the LLM. If you want to use another supported LLM service, change the LLM option and set it up, either by directly modifying the configuration file or via environment variables.
      • For example, to use Llama-2 locally, which does not require any account, you just need to change the LLM option:

        $ export LLM_OPTION=llama_2
      • For example, to use Ernie instead of OpenAI, you need to change the option and set up the ERNIE Bot SDK token:

        $ export LLM_OPTION=ernie
        $ export EB_API_TYPE=your_api_type
        $ export EB_ACCESS_TOKEN=your_ernie_access_token
    • Embedding

      By default, the embedding module uses methods from Sentence Transformers to convert text inputs to vectors.

    • Store

      Before getting started, all database services used as stores must be running and configured with write and create access.

      • Vector Store: You need to prepare the vector database service in advance. For example, you can refer to the Milvus Documents or Zilliz Cloud to learn how to start a Milvus service.
      • Scalar Store (Optional): Only used when USE_SCALAR is true in the configuration. If enabled (i.e. USE_SCALAR=True), the default scalar store is Elastic, and you need to prepare the Elasticsearch service in advance.
      • Memory Store: By default, both the LangChain and Towhee modes allow interaction with any database supported by SQLAlchemy 2.0.

      The system will use the default store configs. To set up custom connections for each database, you can also export environment variables instead of modifying the configuration file.

      For the Vector Store, set ZILLIZ_URI:

      $ export ZILLIZ_URI=your_zilliz_cloud_endpoint
      $ export ZILLIZ_TOKEN=your_zilliz_cloud_api_key  # skip this if using Milvus instance

      For the Memory Store, set SQL_URI:

      $ export SQL_URI={database_type}://{user}:{password}@{host}/{database_name}
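For instance, a local PostgreSQL URI (with illustrative credentials and database name) can be assembled from the template above:

```python
# Assemble an SQLAlchemy-style SQL_URI from its parts.
# All values below are illustrative; substitute your own credentials.
parts = {
    "database_type": "postgresql",
    "user": "postgres",
    "password": "postgres",
    "host": "localhost",
    "database_name": "chat_history",
}
sql_uri = "{database_type}://{user}:{password}@{host}/{database_name}".format(**parts)
# e.g. export SQL_URI with this value before starting the service
```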
      By default, the scalar store (Elastic) is disabled. The following commands enable it and connect your Elastic Cloud deployment:

      $ export USE_SCALAR=True
      $ export ES_CLOUD_ID=your_elastic_cloud_id
      $ export ES_USER=your_elastic_username
      $ export ES_PASSWORD=your_elastic_password

      To use host & port instead of cloud id, you can manually modify the VECTORDB_CONFIG in config.py.


  4. Start service

    The main script runs a FastAPI service at the default address localhost:8900.

    • Option 1: using Towhee
      $ python main.py --towhee
    • Option 2: using LangChain
      $ python main.py --langchain
  5. Access via browser

    You can open http://localhost:8900/docs in a browser to access the web service.

    /: Check service status

    /answer: Generate answer for the given question, with assigned session_id and project

    /project/add: Add data to a project (the project will be created if it does not exist)

    /project/drop: Drop a project, deleting its data in both vector and memory storage.

    Check Online Operations to learn more about these APIs.
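For illustration, a request to the /answer endpoint might be assembled as below. The exact parameter names (session_id, project, question) are assumptions based on the endpoint description above, not a verified schema, and the actual HTTP call is left commented out because it needs the running service.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical call to the /answer endpoint; parameter names are
# assumptions, check the Online Operations docs for the real schema.
base = "http://localhost:8900"
params = {
    "session_id": "test-session",
    "project": "akcio_demo",
    "question": "What is Akcio?",
}
url = f"{base}/answer?{urlencode(params)}"
# import requests
# resp = requests.post(url)  # requires the FastAPI service to be running
```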

Load data

The insert function in operations loads project data from URL(s) or file(s).

There are two options to load project data:

Option 1: Offline

We recommend this method, which loads data in separate steps. There are also advanced options for loading documents, for example generating and inserting potential questions for each doc chunk. Refer to offline_tools for instructions.

Option 2: Online

When the FastAPI service is up, you can send a POST request to http://localhost:8900/project/add to load data.

Parameters:

{
  "project": "project_name",
  "data_src": "path_to_doc",
  "source_type": "file"
}

or

{
  "project": "project_name",
  "data_src": "doc_url",
  "source_type": "url"
}

This method is only recommended for loading small amounts of data, not large ones.
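As a quick sketch, the same request could be scripted in Python. The endpoint and JSON body come from the description above; the actual HTTP call is commented out because it needs the running service.

```python
import json

# JSON body for POST /project/add, mirroring the parameters shown above.
payload = {
    "project": "project_name",
    "data_src": "doc_url",
    "source_type": "url",
}
body = json.dumps(payload)
# import requests
# requests.post("http://localhost:8900/project/add", json=payload)
```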


LICENSE

Akcio is published under the Server Side Public License (SSPL) v1.

akcio's People

Contributors

bennu-li, chiiizzzy, codingjaguar, jaelgu, zc277584121


akcio's Issues

[Bug]: when post /project/add, service failed to parse uploaded txt or pdf document. error message: "name 'enc' is not defined "

Current Behavior

curl -X 'POST' \
  'http://localhost:8900/project/add?project=test' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F '[email protected];type=text/plain'

Response body

[
  {
    "status": false,
    "msg": "Failed to load data:\nNode-token-counter-1 runs failed, error msg: name 'enc' is not defined, Traceback (most recent call last):\n  File \"/export/xxx/anaconda3/envs/akcio/lib/python3.10/site-packages/towhee/runtime/nodes/node.py\", line 158, in _call\n    return True, self._op(*inputs), None\n  File \"/home/xxx/.towhee/operators/towhee/token-counter/versions/main/count_token.py\", line 13, in __call__\n    num_tokens = len(enc.encode(str(data)))\nNameError: name 'enc' is not defined\n, Traceback (most recent call last):\n  File \"/export/xxx/anaconda3/envs/akcio/lib/python3.10/site-packages/towhee/runtime/nodes/node.py\", line 171, in process\n    self.process_step()\n  File \"/export/xxx/anaconda3/envs/akcio/lib/python3.10/site-packages/towhee/runtime/nodes/_map.py\", line 63, in process_step\n    assert succ, msg\nAssertionError: name 'enc' is not defined, Traceback (most recent call last):\n  File \"/export/xxx/anaconda3/envs/akcio/lib/python3.10/site-packages/towhee/runtime/nodes/node.py\", line 158, in _call\n    return True, self._op(*inputs), None\n  File \"/home/xxx/.towhee/operators/towhee/token-counter/versions/main/count_token.py\", line 13, in __call__\n    num_tokens = len(enc.encode(str(data)))\nNameError: name 'enc' is not defined\n\n\n"
  },
  400
]

towhee version: 1.1.1

Expected Behavior

No response

Steps To Reproduce

No response

Environment

No response

Anything else?

No response

[Enhancement]: Support Llama-2 in LangChain mode

What would you like to be added?

Akcio offers two options to build the system: LangChain or Towhee.
The option using Towhee already supports Llama-2 as the LLM.
To support Llama-2 for LangChain, we need to add a llama_2_chat.py under https://github.com/zilliztech/akcio/tree/main/src_langchain/llm.

With llama-2 supported in LangChain mode, the following steps should start service successfully:

  1. Set up

Change to your own Milvus & Postgres connection details (modify config.py if needed.)

$ export LLM_OPTION=llama_2
$ export MILVUS_URI=https://localhost:19530
$ export SQL_URI=postgresql://postgres:postgres@localhost/chat_history
  2. Start service
$ python main.py --langchain

OR start gradio demo:

$ python gradio_demo.py --langchain

Why is this needed?

No response

Anything else?

No response

[Enhancement]: Add GPTCache to the project

What would you like to be added?

It would be nice to tweak this project a bit to include GPTCache.

Thanks.

Why is this needed?

GPTCache reduces the API bill and also improves speed. It's also in the same Zilliz family, so integrating it would be beneficial.

Anything else?

No response

[Bug]: Trouble setting up the service P2

Current Behavior

This is the procedure I followed:

  • I set the environment variable OPENAI_API_KEY
  • I created an account on Zilliz and set up a standard cluster using the free trial, in order to have a URI, user, and password
  • I tried the connection to the Zilliz account using the following script, and it seems to be connected; then I set the environment variables
import os
from pymilvus import connections, utility

connections.connect(uri=os.getenv('MILVUS_URI', 'https://the_link_given:19530'), user=os.getenv('MILVUS_USER', 'db_admin'), password=os.getenv('MILVUS_PASSWORD', 'my_password'), secure=True)
print(utility.list_collections())
  • I installed PostgreSQL, set up a standard database named chathistory, and checked the connection to the database with the code below (it was working). Then I set the environment variable
import psycopg2
import os

connect_str = os.getenv('SQL_URI', 'postgresql://postgres:My_password@localhost/chathistory')

try:
    conn = psycopg2.connect(connect_str)
    print("its working!")
except Exception as e:
    print(f"not working: {e}")
  • Then I tried to run the code using:
py main.py --towhee

I've seen that a similar problem has already been solved in #43,
so I tried both Zilliztech/akcio and Jaelgu/akcio, but I get exactly the same error.
Any advice on how to fix it?

The error shown is:

Traceback (most recent call last):
  File "C:\Users\erric\Desktop\Corso python\Notizie internazionale 2\internazionale\Lib\site-packages\pymilvus\client\grpc_handler.py", line 131, in _wait_for_channel_ready
    grpc.channel_ready_future(self._channel).result(timeout=timeout)
  File "C:\Users\erric\Desktop\Corso python\Notizie internazionale 2\internazionale\Lib\site-packages\grpc\_utilities.py", line 151, in result
    self._block(timeout)
  File "C:\Users\erric\Desktop\Corso python\Notizie internazionale 2\internazionale\Lib\site-packages\grpc\_utilities.py", line 97, in _block
    raise grpc.FutureTimeoutError()
grpc.FutureTimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\erric\Desktop\Corso python\Notizie internazionale 2\03 - Secondo Tentativo Akcio\akcio-main\main.py", line 40, in <module>
    from src_towhee.operations import chat, insert, drop, check, get_history, clear_history, count  # pylint: disable=C0413
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\erric\Desktop\Corso python\Notizie internazionale 2\03 - Secondo Tentativo Akcio\akcio-main\src_towhee\operations.py", line 14, in <module>
    towhee_pipelines = TowheePipelines()
                       ^^^^^^^^^^^^^^^^^
  File "C:\Users\erric\Desktop\Corso python\Notizie internazionale 2\03 - Secondo Tentativo Akcio\akcio-main\src_towhee\pipelines\__init__.py", line 58, in __init__
    connections.connect(
  File "C:\Users\erric\Desktop\Corso python\Notizie internazionale 2\internazionale\Lib\site-packages\pymilvus\orm\connections.py", line 355, in connect
    connect_milvus(**kwargs, user=user, password=password, token=token, db_name=db_name)
  File "C:\Users\erric\Desktop\Corso python\Notizie internazionale 2\internazionale\Lib\site-packages\pymilvus\orm\connections.py", line 302, in connect_milvus
    gh._wait_for_channel_ready(timeout=timeout)
  File "C:\Users\erric\Desktop\Corso python\Notizie internazionale 2\internazionale\Lib\site-packages\pymilvus\client\grpc_handler.py", line 134, in _wait_for_channel_ready
    raise MilvusException(
pymilvus.exceptions.MilvusException: <MilvusException: (code=2, message=Fail connecting to server on in01-54ec8f16cac25a2.aws-us-west-2.vectordb.zillizcloud.com:19530. Timeout)>

Expected Behavior

No response

Steps To Reproduce

No response

Environment

No response

Anything else?

No response

[Enhancement]: Docker image to start service

What would you like to be added?

  • Dockerfile to build docker image
  • Docker image to start FastAPI service
  • Update deployment method in docs

Why is this needed?

The deployment option of Docker image offers greater flexibility, consistency, scalability, and manageability compared to starting a service directly from source code.

Anything else?

No response

Trouble setting up the service

Each time I try running python main.py --towhee it gives me this error:

Traceback (most recent call last):
File "C:\Users\victo\AppData\Local\Programs\Python\Python311\Lib\site-packages\pymilvus\client\grpc_handler.py", line 119, in _wait_for_channel_ready
grpc.channel_ready_future(self._channel).result(timeout=timeout)
File "C:\Users\victo\AppData\Local\Programs\Python\Python311\Lib\site-packages\grpc\_utilities.py", line 151, in result
self._block(timeout)
File "C:\Users\victo\AppData\Local\Programs\Python\Python311\Lib\site-packages\grpc\_utilities.py", line 97, in _block
raise grpc.FutureTimeoutError()
grpc.FutureTimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Users\victo\Documents\CODE FOLDER\Protected Akcio\akcio\main.py", line 24, in
from src_towhee.operations import chat, insert, drop # pylint: disable=C0413
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\victo\Documents\CODE FOLDER\Protected Akcio\akcio\src_towhee\operations.py", line 14, in
towhee_pipelines = TowheePipelines()
^^^^^^^^^^^^^^^^^
File "C:\Users\victo\Documents\CODE FOLDER\Protected Akcio\akcio\src_towhee\pipelines\__init__.py", line 51, in __init__
connections.connect(
File "C:\Users\victo\AppData\Local\Programs\Python\Python311\Lib\site-packages\pymilvus\orm\connections.py", line 349, in connect
connect_milvus(**kwargs, user=user, password=password, token=token, db_name=db_name)
File "C:\Users\victo\AppData\Local\Programs\Python\Python311\Lib\site-packages\pymilvus\orm\connections.py", line 282, in connect_milvus
gh._wait_for_channel_ready(timeout=timeout)
File "C:\Users\victo\AppData\Local\Programs\Python\Python311\Lib\site-packages\pymilvus\client\grpc_handler.py", line 123, in _wait_for_channel_ready
raise MilvusException(Status.CONNECT_FAILED,
pymilvus.exceptions.MilvusException: <MilvusException: (code=2, message=Fail connecting to server on in01-9434ec54c12222e.gcp-us-west1.vectordb.zillizcloud.com:443. Timeout)>

Reproduction steps:
I cloned the git repo and ran the dependency installing script. Then I modified the config.py file with my OpenAI API key like so:

'openai': {
    'openai_model': 'gpt-3.5-turbo',
    'openai_api_key': 'my_key_here',  # will use environment value 'OPENAI_API_KEY' if None
    'llm_kwargs': {
        'temperature': 0.8,
        # 'max_tokens': 200,
    }
},

I left the embedding code untouched but I downloaded sentence-transformers model multi-qa-mpnet-base-cos-v1 using pip install -U sentence-transformers.

For the vector db, I set up a Zilliz cloud cluster on a standard plan so that it would let me input a user and password. Inside the cluster I initialized a collection with the IP metric and 768 dimensions. To set up a connection to this cluster I modified the config.py file with the cluster URI, my username, and the corresponding password:

'connection_args': {
    'uri': os.getenv('MILVUS_URI', 'my_uri_here'),
    'user': os.getenv('MILVUS_USER', 'my_user_here'),
    'password': os.getenv('MILVUS_PASSWORD', 'my_password_here'),
    'secure': True if os.getenv('MILVUS_SECURE', 'False').lower() == 'true' else False
},

The memory db was set up using PostgreSQL through pgAdmin 4. I simply set up a database with the default owner and all other properties, then connected to it by modifying the config.py file with the instructed string {database_type}://{user}:{password}@{host}/{database_name}; my string looked like: postgresql://{user}:{password}@localhost:5432/{database_name}.

Everything else was left as is. Running the LangChain option gives me a link to localhost:8900, but the link produces an empty page.

Error: Tuple Indices Must Be Integers or Slices in Chatbot Responses

Hello,
So I am running main.py with langchain and after I inserted an md file I was able to ask it a question. However, when I try to ask a second question, I get this response:

[
  {
    "status": true,
    "msg": "Something went wrong:\ntuple indices must be integers or slices, not str",
    "debug": {
      "original question": "Which tech does Lagrange use?",
      "modified question": "Which tech does Lagrange use?",
      "answer": "Something went wrong:\ntuple indices must be integers or slices, not str"
    }
  },
  200
]

P.S Every time I use a new session id, my first question is instantly answered but the second one returns this response.

Memory problem of the bot

Hello, so I am using a unique session ID and PostgreSQL for memory. The thing is, I don't see how I should set it up so that the bot uses the chat_history for context. For example, I will ask the bot to call me Lydacious, and in my next message it will say "Unfortunately, I couldn't find any information about what you asked me to call you earlier.". I don't know if it is relevant, but in my prompt I've made sure the bot never comes up with new answers, only takes them from the doc chunks that have been provided.

StatusCode.UNKNOWN, token not found

Hi team,

I tried to run this project and got the following token error, even though I have set the tokens and they can be seen in the config object.

I'd appreciate any tips to get this running. Thanks.

python3 gradio_demo.py --towhee


2023-12-08 10:15:15,782 - 140703044237120 - decorators.py-decorators:88 - WARNING: [__internal_register] retry:4, cost: 0.27s, reason: <_InactiveRpcError: StatusCode.UNKNOWN, token not found>

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNKNOWN
details = "token not found"
debug_error_string = "UNKNOWN:Error received from peer ipv4:34.160.220.160:443 {grpc_message:"token not found", grpc_status:2, created_time:"2023-12-08T10:15:29.907598478+07:00"}"

I printed out the config object it looks like this:

config: type='RecursiveCharacter' chunk_size=300 splitter_kwargs={} embedding_model='BAAI/bge-base-en' openai_api_key='sk-tdnNgbL----------lkQudP2XntZ' embedding_device=-1 embedding_normalize=True milvus_uri='https://in03-f2-------cf.api.gcp-us-west1.zillizcloud.com' milvus_token='ec22ee0b9cef0daffc401b049--------c8a8061f7f41f27a76276bde1fb469b5' milvus_host=None milvus_port=None milvus_user=None milvus_password=None es_enable=False es_connection_kwargs={'hosts': ['https://127.0.0.1:9200'], 'basic_auth': ('elastic', 'my_password')} token_model='gpt-3.5-turbo' llm_kwargs={'temperature': 0.2, 'max_tokens': 20} openai_model='gpt-3.5-turbo' llm_src='openai'
