Giter Site home page Giter Site logo

gcp-serverless-rag's Introduction

This repo contains example code for an end-to-end serverless RAG implementation on Google Cloud. The aim of the example here is not deliver high quality results and it should only be used as a guiding tool for your own serverless RAG application.

1.1. Prerequisites

Before moving forward, create a service account with the following roles:

  • BigQuery Admin
  • Cloud Datastore Owner
  • Cloud Functions Admin
  • Cloud Functions Invoker
  • Cloud Run Admin
  • Cloud Run Invoker
  • Document AI Administrator
  • Eventarc Event Receiver
  • Logging Admin
  • Service Account Token Creator
  • Storage Admin
  • Vertex AI User
  • Workflows Admin

This service account can now be used to run all the Cloud Functions and the workflow cited below. It, however, doesn't follow the least permission approach.

For least permission approach, create a custom role with the following permissions and assign that to a service account. This however has not been properly tested and can result in permission issues!

  • bigquery.datasets.create
  • bigquery.datasets.get
  • bigquery.jobs.create
  • bigquery.tables.create
  • bigquery.tables.get
  • bigquery.tables.update
  • bigquery.tables.updateData
  • datastore.entities.create
  • datastore.entities.get
  • datastore.entities.list
  • datastore.entities.update
  • cloudfunctions.functions.invoke
  • eventarc.events.receiveEvent
  • run.routes.invoke
  • documentai.processors.create
  • documentai.processors.get
  • documentai.processors.list
  • documentai.processors.processBatch
  • documentai.processors.update
  • eventarc.events.receiveEvent
  • logging.logEntries.create
  • iam.serviceAccounts.signBlob
  • storage.objects.create
  • storage.objects.delete
  • storage.objects.get
  • storage.objects.list
  • aiplatform.endpoints.predict
  • workflows.executions.create

1.2. Cloud Workflows

This workflow is where everything is put together. While creating this workflow, you need to create an EventArc trigger for:

Event provider: Cloud Storage

Event type: google.cloud.storage.object.v1.finalized

Bucket: $BUCKET_NAME

Now, this workflow will get triggered automatically when a file gets uploaded into a designated bucket using the Pre-signed URL Cloud Function. After this, the workflow calls the Chunker function, before Indexer kicks in. See below for the details of all these functions.

1.3. Cloud Functions

1.3.1. Presigned URL

This function creates a presigned URL for a file upload to a Cloud Storage Bucket. BUCKET_NAME must be provided as an environment variable to the function to represent the parent bucket where all files will be uploaded - this should be the same as the bucket you chose above for trigger. The function expects two parameters in the request body:

  1. object_name - The name of the file to be uploaded
  2. collection_name - The name of the folder inside the bucket where the file will be uploaded. This name should be unique to each user, so that each user has its own collection where their documents get uploaded.

When deploying this funciton, please use the 1st gen runtime since there seems to be a bug in the 2nd gen runtime that doesn't see the signBlob permissions correctly.

Note

You'll need to provide your PROJECT_ID and LOCATION before deploying the following functions

1.3.2. Chunker

Parses the uploaded document and split it into chunks. Document AI will be used to parse the document and will be split at the paragraph level. All the document metadata will be saved in Firestore. OUTPUT_BUCKET_NAME must be provided as an environment variable to represent the parent bucket where Document AI will save its response.

1.3.3. Indexer

Reads the data out of Firestore for a specific document, embeds the data in batches and stores the result into BigQuery Vector Store. It also accordingly updates the index status of all the chunks who embeddings have been successfully saved. This status can be polled by a frontend in order to know when a document is completely indexed and the query process can start.

1.3.4. Query

Once the other functions have successfully run, you are ready to run the query. This function uses the “similarity search” retriever to retrieve the data similar to the query. That data is then sent along with the query to Gemini Pro on Vertex AI to generate an answer.

curl -m 70 -X POST <Query_Function_URL> \
-H "Authorization: bearer $(gcloud auth print-identity-token)" \
-H "Content-Type: application/json" \
-d '{
  "query": "YOUR_QUERY",
}'

gcp-serverless-rag's People

Contributors

iamulya avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.