aws-solutions / generative-ai-application-builder-on-aws

Generative AI Application Builder on AWS facilitates the development, rapid experimentation, and deployment of generative artificial intelligence (AI) applications without requiring deep experience in AI. The solution includes integrations with Amazon Bedrock and its included LLMs, such as Amazon Titan, and pre-built connectors for third-party LLMs.

Home Page: https://aws.amazon.com/solutions/implementations/generative-ai-application-builder-on-aws

License: Apache License 2.0

Topics: bedrock, chatbot, generative-ai, llms, retrieval-augmented-generation


Generative AI Application Builder on AWS

NOTE: If you want to use the solution without any custom changes, navigate to the Solution Landing Page and click "Launch in the AWS Console" under the deployment options for a one-click deployment into your AWS account.

The Generative AI Application Builder on AWS solution (GAAB) provides a web-based management dashboard to deploy customizable Generative AI (Gen AI) use cases. This Deployment dashboard allows customers to deploy, experiment with, and compare different combinations of Large Language Model (LLM) use cases. Once customers have successfully configured and optimized their use case, they can take their deployment into production and integrate it within their applications.

The Generative AI Application Builder is published under an Apache 2.0 license and is targeted at novice to experienced users who want to experiment with and productionize different Gen AI use cases. The solution uses LangChain open-source software (OSS) to configure connections to your choice of Large Language Models (LLMs) for different use cases. The first release of GAAB allows users to deploy chat use cases, which enable querying over users' enterprise data in a chatbot-style user interface (UI), along with an API to support custom end-user implementations.

Some of the features of GAAB are:

  • Rapid experimentation with the ability to productionize at scale
  • Extendable and modularized architecture using nested Amazon CloudFormation stacks
  • Enterprise ready for company-specific data to tackle real-world business problems
  • Integration with Amazon Bedrock, Amazon SageMaker and select third-party LLM providers
  • Multi-LLM comparison and experimentation with metric tracking using Amazon CloudWatch dashboards
  • Growing list of model providers and Gen AI use cases

For a detailed solution walkthrough, refer to the Generative AI Application Builder on AWS Implementation Guide.


Architecture Overview

There are three unique user personas referred to in the solution walkthrough below:

  • The DevOps user is responsible for deploying the solution within the AWS account and for managing the infrastructure, updating the solution, monitoring performance, and maintaining the overall health and lifecycle of the solution.
  • The admin users are responsible for managing the content contained within the deployment. These users get access to the Deployment dashboard UI and are primarily responsible for curating the business user experience. This is our primary target customer.
  • The business users represent the individuals for whom the use case has been deployed. They are the consumers of the knowledge base and are responsible for evaluating and experimenting with the LLMs.

NOTE:

  • You have the option of deploying the solution in a VPC-enabled configuration. With a VPC-enabled configuration, you can choose
    • whether the solution should build the VPC for this deployment, or
    • whether to deploy the solution into an existing VPC in your AWS account.
  • To see the VPC related architecture diagrams, please visit the implementation guide.

Deployment Dashboard

When the DevOps user deploys the Deployment Dashboard, the following components are deployed in the AWS account:

[Architecture diagram]

  1. The admin users can log in to the deployed Deployment Dashboard UI.
  2. Amazon CloudFront delivers the web UI which is hosted in an Amazon S3 bucket.
  3. AWS WAF protects the APIs from attacks. This solution configures a set of rules called a web access control list (web ACL) that allows, blocks, or counts web requests based on configurable, user-defined web security rules and conditions.
  4. The web app leverages a set of REST APIs that are exposed using Amazon API Gateway.
  5. Amazon Cognito authenticates users and backs both the CloudFront web UI and API Gateway.
  6. AWS Lambda provides the business logic for the REST endpoints. This backing Lambda manages and creates the necessary resources to perform use case deployments using AWS CloudFormation.
  7. Amazon DynamoDB is used as a configuration store for the deployment details.
  8. When a new use case is created by the admin user, the backing Lambda initiates a CloudFormation stack creation event for the requested use case (as sketched below).
  9. If the configured deployment uses a third-party LLM, then a secret will be created in AWS Secrets Manager to store the API key.
  10. All of the LLM configuration options provided by the admin user in the deployment wizard are saved in AWS Systems Manager Parameter Store. This parameter is used by the deployment to configure the LLM at runtime.
  11. Operational metrics are collected by various services and sent to Amazon CloudWatch to generate custom dashboards used for monitoring the solution's health.
Note: Although the Deployment dashboard can be launched in most AWS regions, the deployed use cases have some restrictions based on service availability. See Supported AWS Regions in the Implementation Guide for more details.
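The following is a minimal sketch of steps 8 and 10 using boto3. The parameter name, template URL, and configuration schema are illustrative assumptions, not the solution's actual values:

import json

import boto3

cloudformation = boto3.client("cloudformation")
ssm = boto3.client("ssm")

def deploy_use_case(use_case_id: str, llm_config: dict) -> str:
    """Persist the wizard's LLM options, then launch the use case stack."""
    # Step 10: save the LLM configuration so the use case can read it at runtime.
    ssm.put_parameter(
        Name=f"/gaab/use-cases/{use_case_id}/config",  # hypothetical key
        Value=json.dumps(llm_config),
        Type="SecureString",
        Overwrite=True,
    )
    # Step 8: initiate the CloudFormation stack creation for the use case.
    response = cloudformation.create_stack(
        StackName=f"gaab-use-case-{use_case_id}",
        TemplateURL="https://my-asset-bucket.s3.amazonaws.com/use-case.template.json",  # hypothetical
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    )
    return response["StackId"]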

Use Cases

Once the Deployment Dashboard is deployed, the admin user can then deploy multiple use case stacks. When a use case stack is deployed by the admin user, the following components are deployed in the AWS account:

[Architecture diagram]

  1. Business users can log in to the use case UI.
  2. Amazon CloudFront delivers the web UI which is hosted in an Amazon S3 bucket.
  3. The web UI leverages a WebSocket integration built using Amazon API Gateway. The API Gateway is backed by a custom Lambda Authorizer function, which returns the appropriate IAM policy based on the Amazon Cognito group the authenticating user is part of.
  4. Amazon Cognito authenticates users and backs both the CloudFront web UI and API Gateway.
  5. The LangChain Orchestrator is a collection of Lambda functions and layers that provides the business logic for fulfilling requests coming from the business user (a simplified sketch of this request flow follows this list).
  6. The LangChain Orchestrator leverages Parameter Store and DynamoDB to get the configured LLM options and necessary session information (such as the chat history).
  7. If the deployment has enabled a knowledge base, then the LangChain Orchestrator will leverage Amazon Kendra to run a search query to retrieve document excerpts.
  8. Using the chat history, query, and context from Amazon Kendra, the LangChain Orchestrator creates the final prompt and sends the request to the LLM hosted on Amazon Bedrock or Amazon SageMaker.
  9. If using a third-party LLM outside of Amazon Bedrock or Amazon SageMaker, the API key is stored in AWS Secrets Manager and must be obtained before making the API call to the third-party LLM provider.
  10. As the response comes back from the LLM, the LangChain Orchestrator Lambda streams the response back through the API Gateway WebSocket to be consumed by the client application.
  11. Operational metrics are collected by various services and sent to Amazon CloudWatch to generate custom dashboards used for monitoring the deployment's health.
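To make the orchestration concrete, here is a minimal sketch of a single chat turn using boto3. The parameter name, configuration keys, model payload, and endpoint URL are illustrative assumptions; the real solution builds this flow with LangChain and adds history management, error handling, and metrics:

import json

import boto3

ssm = boto3.client("ssm")
kendra = boto3.client("kendra")
bedrock = boto3.client("bedrock-runtime")
apigw = boto3.client(
    "apigatewaymanagementapi",
    endpoint_url="https://example.execute-api.us-east-1.amazonaws.com/prod",  # hypothetical
)

def handle_message(connection_id: str, question: str) -> None:
    # Step 6: read the LLM options configured in the deployment wizard.
    config = json.loads(
        ssm.get_parameter(
            Name="/gaab/use-cases/demo/config",  # hypothetical key
            WithDecryption=True,
        )["Parameter"]["Value"]
    )
    # Step 7: retrieve document excerpts from the knowledge base, if enabled.
    context = ""
    if config.get("RagEnabled"):
        results = kendra.retrieve(IndexId=config["KendraIndexId"], QueryText=question)
        context = "\n".join(item["Content"] for item in results["ResultItems"])
    # Step 8: build the final prompt and invoke the model with streaming.
    prompt = f"Answer using only these references:\n{context}\n\nQuestion: {question}"
    stream = bedrock.invoke_model_with_response_stream(
        modelId=config["ModelId"],
        body=json.dumps({"inputText": prompt}),  # payload shape varies by model
    )
    # Step 10: relay each chunk back over the API Gateway WebSocket.
    for event in stream["body"]:
        chunk = json.loads(event["chunk"]["bytes"])
        apigw.post_to_connection(ConnectionId=connection_id, Data=json.dumps(chunk))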

Deployment

NOTE:

  • To use Amazon Bedrock, you must request access to models before they are available for use. Refer to Model access in the Amazon Bedrock User Guide for more details.
  • You can also test the UI project locally by deploying the API endpoints and the rest of the infrastructure. To do so, follow either of the two options below, and then refer to the Deployment Dashboard and Chat UI projects for details.

There are two options for deployment into your AWS account:

1. Using cdk deploy

Before building and deploying locally, note the following prerequisites:

Note: Configure the AWS CLI with your AWS credentials or have them exported in the CLI terminal environment. In case the credentials are invalid or expired, running cdk deploy produces an error.

Also, if you have not run cdk bootstrap in this account and region, please follow the instructions here to execute cdk bootstrap as a one-time process before proceeding with the steps below.

After cloning the repo from GitHub, complete the following steps:

  cd <project-directory>/source/infrastructure
  npm install
  npm run build
  cdk synth
  cdk deploy DeploymentPlatformStack --parameters AdminUserEmail=<replace with admin user's email>

Note: Because cdk deploy is executed with a stack name, it does not synthesize the other CloudFormation stacks in the infrastructure folder. To ensure all stacks are synthesized based on the infrastructure code changes, be sure to run cdk synth. For a complete list of cdk commands that can be run, see Toolkit commands.

For the Deployment Dashboard to deploy LLM chat use cases, you additionally need to stage synthesized CDK assets (such as Lambdas and synthesized CloudFormation templates) from the source/infrastructure/cdk.out directory to a configured S3 bucket in your account, from which these resources are pulled at deployment time. To make staging easy, you can use the source/stage-assets.sh script, which should be run from the source directory.

cd <project-directory>/source
./stage-assets.sh

When run, the script looks like this:

>>> ./stage-assets.sh
This script should be run from the 'source' folder
The region to upload CDK artifacts to (default:us-east-1)?
>>> us-west-2
>>> All assets will be uploaded to cdk-hnb659fds-assets-123456789-us-west-2
>>> Do you want to proceed? (y/n) y

You must provide the full region name as the first input to the script as shown in the above example.

Note: Assets must be staged every time there is a change in the codebase so that the staged assets are up to date. It is also recommended to run cdk synth before staging.

2. Using a custom build

Refer to the section Creating a custom build.

Source code

Project structure

├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Config
├── LICENSE.txt
├── NOTICE.txt
├── README.md
├── buildspec.yml
├── deployment
│   ├── build-open-source-dist.sh
│   ├── build-s3-dist.sh
│   ├── cdk-solution-helper
│   ├── clean-for-scan.sh
│   ├── get-cdk-version.js
│   └── manifest.yaml
├── pyproject.toml
├── pytest.ini
├── sonar-project.properties
├── source
    ├── images
    ├── infrastructure                       [CDK infrastructure]
    ├── lambda                               [Lambda functions for the application]
    ├── pre-build-lambda-layers.sh           [pre-builds lambda layers for the project]
    ├── run-all-tests.sh                     [shell script that can run unit tests for the entire project]
    ├── stage-assets.sh
    ├── test
    ├── ui-chat                              [Web App project for chat UI]
    └── ui-deployment                        [Web App project for deployment dashboard UI]
└── docs

SageMaker Model Input Documentation

The project provides a docs folder that gives you access to sample SageMaker inputs. Because SageMaker models can accept and produce a variety of input and output schemas, the solution requests these values from users to allow correct model invocation. This allows the solution to support a wide set of SageMaker models.

The input schemas are essentially your model's payload, with placeholders for the actual values. The placeholders enable replacing the actual model values at runtime and are represented by a keyword enclosed in angle brackets, like <<prompt>>. Note that <<prompt>> and <<temperature>> are reserved placeholders for the model prompt and temperature, respectively.

The model's output JSONPath provides the solution with a path for retrieving the LLM's textual response from the model response.
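As an illustration, here is a minimal sketch of how a placeholder-based input schema and an output JSONPath could drive an endpoint invocation. The payload shape, endpoint name, and output path ($[0].generated_text) are hypothetical; always use your model's actual schema:

import json

import boto3

runtime = boto3.client("sagemaker-runtime")

# Hypothetical input schema using the reserved placeholders.
INPUT_SCHEMA = {"inputs": "<<prompt>>", "parameters": {"temperature": "<<temperature>>"}}

def render(node, prompt, temperature):
    """Recursively replace the reserved placeholders with runtime values."""
    if isinstance(node, dict):
        return {key: render(value, prompt, temperature) for key, value in node.items()}
    if node == "<<prompt>>":
        return prompt
    if node == "<<temperature>>":
        return temperature
    return node

def invoke(endpoint_name: str, prompt: str, temperature: float) -> str:
    payload = render(INPUT_SCHEMA, prompt, temperature)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    body = json.loads(response["Body"].read())
    # Apply the configured output JSONPath, e.g. $[0].generated_text.
    return body[0]["generated_text"]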

Please always refer to the model documentation and SageMaker JumpStart Jupyter notebook samples to see the most up-to-date model payloads and supported parameters.

Creating a custom build

1. Clone the repository

Run the following command:

git clone https://github.com/aws-solutions/<repository_name>

2. Build the solution for deployment

  1. Install the dependencies:
cd <rootDir>/source/infrastructure
npm install
  2. (Optional) Run the unit tests:

Note: To run the unit tests, docker must be installed and running, and valid AWS credentials must be configured.

cd <rootDir>/source
chmod +x ./run-all-tests.sh
./run-all-tests.sh
  3. Configure the bucket name of your target Amazon S3 distribution bucket:
export DIST_OUTPUT_BUCKET=my-bucket-name
export VERSION=my-version
  4. Build the distributable:
cd <rootDir>/deployment
chmod +x ./build-s3-dist.sh
./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $VERSION $CF_TEMPLATE_BUCKET_NAME

Parameter details:

$DIST_OUTPUT_BUCKET - This is the global name of the distribution. For the bucket name, the AWS Region is added to the global name (example: 'my-bucket-name-us-east-1') to create a regional bucket. The Lambda artifacts should be uploaded to the regional buckets for the CloudFormation template to pick them up for deployment.

$SOLUTION_NAME - The name of this solution (example: generative-ai-application-builder-on-aws)
$VERSION - The version number of the change
$CF_TEMPLATE_BUCKET_NAME - The name of the S3 bucket where the CloudFormation templates should be uploaded

When you create and use buckets, we recommend that you:

  • Use randomized names or UUIDs as part of your bucket naming strategy.
  • Ensure that buckets aren't public.
  • Verify bucket ownership prior to uploading templates or code artifacts.
  5. Deploy the distributable to an Amazon S3 bucket in your account.

Note: You must have the AWS CLI installed.

aws s3 cp ./global-s3-assets/ s3://my-bucket-name-<aws_region>/generative-ai-application-builder-on-aws/<my-version>/ --recursive --acl bucket-owner-full-control --profile aws-cred-profile-name
aws s3 cp ./regional-s3-assets/ s3://my-bucket-name-<aws_region>/generative-ai-application-builder-on-aws/<my-version>/ --recursive --acl bucket-owner-full-control --profile aws-cred-profile-name

Anonymized data collection

This solution collects anonymized operational metrics to help AWS improve the quality and features of the solution. For more information, including how to disable this capability, please see the implementation guide.


Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

generative-ai-application-builder-on-aws's People

Contributors

amazon-auto, ihmaws, jamesnixon-aws, knihit, mukitmomin, omarrad, supreetkt, tabdunabi


generative-ai-application-builder-on-aws's Issues

Enable option to allow anonymous users

Is your feature request related to a problem? Please describe.
I want my chatbot to be available publicly; it's a customer service rep.

Describe the feature you'd like
Allow an option on deployment to enable anonymous users.

Additional context
Add any other context or screenshots about the feature request here.

CORS error after deployment

Describe the bug
After deploying the CloudFormation stack, the dashboard returns an error when loading deployments and also when creating deployments.

To Reproduce
Install the CloudFormation template from the Solutions Library link.

Expected behavior
The admin page should show 0 deployments instead of loading infinitely, and without a CORS (preflight) error.

Please complete the following information about the solution:

  • Version: 1.1.0

To get the version of the solution, you can look at the description of the created CloudFormation stack. For example, "(SO0276) - Generative AI Application Builder on AWS Solution. Version v1.0.0".

  • [x] Region: [e.g. us-east-1]
  • Was the solution modified from the version published on this repository?
  • If the answer to the previous question was yes, are the changes available on GitHub?
  • [x] Have you checked your service quotas for the services this solution uses?
  • Were there any errors in the CloudWatch Logs?

Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).

Additional context
Add any other context about the problem here.

Access to XMLHttpRequest at 'https://x.execute-api.us-east-1.amazonaws.com/prod/deployments' from origin 'https://x.cloudfront.net' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: It does not have HTTP ok status.

Chat over single uploaded document

Is your feature request related to a problem? Please describe.
It would be nice to be able to chat over a private, stand-alone document that you don't necessarily want to upload to the global knowledge base.

Describe the feature you'd like
It would be great if there were an upload button in the GUI that let you upload a local document and then ask questions about it.

Additional context

Add language parameter for Kendra search

Is your feature request related to a problem? Please describe.
When you create a Kendra index in a language other than English, it is not possible to query the knowledge database.

Describe the feature you'd like
It would be great to have a language parameter when configuring the connection to the Kendra index (in the RAG additional parameters), or to have the application detect the language of the index.

Allow option to not enable prompt customization

Is your feature request related to a problem? Please describe.
I do not want my users to change the prompt.
While deploying, allow an option to select whether the prompt can be changed.

Additional context
Add any other context or screenshots about the feature request here.

Option to use Bedrock knowledge base instead of Kendra

Is your feature request related to a problem? Please describe.
I tested the app and it works fine, but Kendra is way too expensive as a source knowledge base when there are millions of documents to index.

Describe the feature you'd like
It would be great to also be able to choose a Bedrock knowledge base.

Deployment of use case fails when an email address is provided for the use case.

Describe the bug
Deployment of use case fails when an email address is provided in the input parameters.

To Reproduce
Steps to follow:

  • Deploy the solution and access the deployment dashboard.
  • Deploy a use case from the deployment dashboard and add an email address when creating the use case. The deployment fails because the AWS CloudFormation role that is passed to the use case does not have the cognito-idp:AdminGetUser IAM action.

Expected behavior
Passing an email address to the use case should not fail the create stack operation. On success, it should provision the user and create a unique group for the deployed Gen AI use case.

Please complete the following information about the solution:

  • Version: v1.3.0
  • Region: us-east-1
  • Was the solution modified from the version published on this repository? Fails either way
  • If the answer to the previous question was yes, are the changes available on GitHub?
  • Have you checked your service quotas for the services this solution uses? Yes
  • Were there any errors in the CloudWatch Logs? The error is from the AWS CloudFormation service, indicating the missing action.

Incorrect README documentation

Describe the bug
The READMEs under both source/ui-chat and source/ui-deployment for running and building the UI locally have a bug in the sample code for creating the runtimeConfig file, where the file extension is .js instead of .json:
touch source/ui-deployment/public/runtimeConfig.js    # incorrect
touch source/ui-deployment/public/runtimeConfig.json  # correct

To Reproduce
If a user mistakenly uses the wrong file extension, the npm commands will work and not error out, but localhost won't load anything. It isn't immediately obvious what the issue is if a user blindly copy-pasted the code.

Expected behavior
When the user runs npm start, the website should load locally in the browser.

Please complete the following information about the solution:

  • [-] Version: [e.g. v1.0.0]

To get the version of the solution, you can look at the description of the created CloudFormation stack. For example, "(SO0276) - Generative AI Application Builder on AWS Solution. Version v1.0.0".

  • [-] Region: [e.g. us-east-1]
  • [No] Was the solution modified from the version published on this repository?
  • [-] If the answer to the previous question was yes, are the changes available on GitHub?
  • [-] Have you checked your service quotas for the services this solution uses?
  • [-] Were there any errors in the CloudWatch Logs?

Screenshots
N/A

Additional context
N/A

UI/Backend misalignment of prompt limit enforcement

Describe the bug
The prompt character limits in the UI are currently hard-coded, are too small, and do not align with the actual limits of the LLM in use.

To Reproduce
Input a prompt larger than 2000 characters

Expected behavior
The prompt limit enforced on the UI aligns with the limits of the LLM in use.

Please complete the following information about the solution:

  • Version: v1.3.0

Initial deployment failed to create UseCasesTableXXX based on KMS validation error: com.amazonaws.services.kms.model.NotFoundException: Key 'arn:aws:kms:us-east-1:xxx:key/xxxx' does not exist

Describe the bug
The very first attempt to deploy the solution failed. After following the steps (stack name and admin email), the stack failed to deploy the nested stack "DeploymentPlatformStorageDeploymentPlatformStorageNestedStackDe-GKN9DKVAKLTC" due to a failure to create the UseCasesTableXXX, resulting in the following error:

Resource handler returned message: "KMS validation error: com.amazonaws.services.kms.model.NotFoundException: Key 'arn:aws:kms:us-east-1:xxx:key/xxx' does not exist (Service: AWSKMS; Status Code: 400; Error Code: NotFoundException; Request ID: xxx; Proxy: null) (Service: DynamoDb, Status Code: 400, Request ID: XXX)" (RequestToken: xxx, HandlerErrorCode: InvalidRequest)

To Reproduce
Deploy into us-east-1, specifying only the Stack Name and Admin Email properties.

Expected behavior
The solution is expected to deploy successfully.

Please complete the following information about the solution:

  • Version: v1.1.1
  • Region: us-east-1
  • Was the solution modified from the version published on this repository? NO
  • If the answer to the previous question was yes, are the changes available on GitHub?
  • Have you checked your service quotas for the services this solution uses?
  • Were there any errors in the CloudWatch Logs? CloudFormation error above is sufficient

Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).

Additional context
Add any other context about the problem here.

Support for querying non-English documents

Describe the feature you'd like
Requesting support for querying non-English documents indexed in Kendra, with configuration options during the Gen AI application setup.

Additional context
I have many non-English documents already indexed in Kendra. I can search them inside Kendra when specifying the default language of the source documents (as in the captured screenshot). I wish to achieve the same result using the Gen AI application builder, taking advantage of the LLM models in Bedrock.

[Screenshot: 1_Query_Non_English_Doc_Possible_Inside_Kendra]

Add option to use Bedrock in a different AWS region

Is your feature request related to a problem? Please describe.
Bedrock is not currently available in many AWS regions, and not all regions have access to all the models and variants. Currently, the only way to use this solution is to deploy it in a region where Bedrock is available, but it would be desirable to deploy it in a region without Bedrock.

Describe the feature you'd like
Add a VPC in a selected region where Bedrock is available and configure VPC peering with the VPC where the solution is deployed.

Don't use a static login UI; use the Cognito Hosted UI for more functionality

Is your feature request related to a problem? Please describe.
Yes, as it would also be an easier way to address issues #4 and #5.

Describe the feature you'd like
Use the Cognito Hosted UI instead of a static login UI, following the pattern used at https://github.com/awslabs/cognito-at-edge

This will allow you to natively support:

  • Identity federation from social login
  • Identity federation from an enterprise IdP
  • Multi-factor authentication
  • Password reset and recovery
  • Self-service sign-up

UI: add option to see the Prompt history per session

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the feature you'd like
A clear and concise description of what you want to happen.

Additional context
Add any other context or screenshots about the feature request here.

Source code highlighting for chatbot.

Is your feature request related to a problem? Please describe.
When I ask the chatbot to write code for a specific need, the source code response is not readable.

Describe the feature you'd like
I'd like the source code in responses to be formatted and highlighted based on language syntax. While there are situations where I can use CodeWhisperer or equivalent solutions, having a good code highlighting capability within this solution would be very helpful.

Additional context
Readable response to technical source code questions.

Additional permissions are required

Describe the bug
I deployed the solution to my AWS account. I started to create the first deployment, but it failed due to a permissions issue for generative-ai-application-bui-CfnDeployRole*. I added "iam:PassRole" and "cloudformation:CreateStack", and the next deployment was successful.
When I tried to update the deployment, it failed due to the missing "cloudformation:UpdateStack" permission. I added it and redeployed the stack in the CloudFormation console, because the stack was not visible anymore in the App Deployments. Redeployment was successful.

The final policy is below.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Action": [
                "iam:PassRole",
                "cloudformation:CreateStack",
                "cloudformation:UpdateStack"
            ],
            "Resource": "*"
        }
    ]
}

To Reproduce
Steps to reproduce the behavior.

Expected behavior
A clear and concise description of what you expected to happen.

Please complete the following information about the solution:

  • Version: 1.2.2

To get the version of the solution, you can look at the description of the created CloudFormation stack. For example, "(SO0276) - Generative AI Application Builder on AWS Solution. Version v1.0.0".

  • Region: us-west-2
  • Was the solution modified from the version published on this repository? No, as is
  • If the answer to the previous question was yes, are the changes available on GitHub?
  • Have you checked your service quotas for the services this solution uses?
  • Were there any errors in the CloudWatch Logs? In CloudFormation

Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).

Additional context
Add any other context about the problem here.

Ability to work with Bedrock Agents

Describe the feature you'd like
Provide a builder client to connect to a Bedrock Agent with agentId and agentAliasId. This will allow usage of knowledge bases and action groups with this system.

Additional context
Bedrock Agents allow LLMs to reason and integrate with real-time information.

[Question] Chat service failed to respond. Please contact your administrator for support and quote the following trace id.

Describe the bug
Deployed version v1.2.0 with RAG enabled using the Bedrock amazon.titan-text-express-v1 model, but kept getting the error from the chatbot: "Chat service failed to respond. Please contact your administrator for support and quote the following trace id." Amazon Bedrock was raising the error.

To Reproduce

  1. Download the CloudFormation template
  2. Set SendAnonymousUsageData option to "No":
    "AnonymousDataAWSCondition": {
      "Fn::Equals": [
        {
          "Fn::FindInMap": [
            "Solution",
            "Data",
            "SendAnonymousUsageData"
          ]
        },
        "No"
      ]
    },
  3. Deploy to us-west-1 or us-east-1
  4. Deployments were successful
  5. Create new Text type use case with RAG enabled
  6. Deployments were successful
  7. Setup Kendra data sources with S3 bucket
  8. Login to application to access chatbot
  9. Use default setting for prompt and start chatting
  10. Get errors about Chat service failed to respond.

Expected behavior
Chat bot should be usable and answer questions based on the data set provided in Kendra

Please complete the following information about the solution:

  • Version: v1.2.0
  • CloudFormation stack description: (SO0276-BedrockChat) - generative-ai-application-builder-on-aws - BedrockChat - Version v1.2.0
  • Region: us-west-1 or us-east-1
  • Was the solution modified from the version published on this repository? No. But only setting the SendAnonymousUsageData to no as described in the section above
  • Have you checked your service quotas for the services this solution uses?
  • Were there any errors in the CloudWatch Logs?
    Yes. See below for detail.

Screenshots

[Screenshot: 2023-12-19 at 1:46 PM]
  1. At the very beginning, the error I was getting was caused by the fact that Titan Text G1 - Express model access in Amazon Bedrock hadn't been granted. The implementation guide also did not mention how to enable Amazon Bedrock access to the desired model. This may help anyone else who bumped into the same issue.
@message | {"level":"ERROR","location":"generate:179","message":"Error occurred while building Bedrock BedrockModelProviders.AMAZON Model. Error: Error raised by bedrock service: An error occurred (AccessDeniedException) when calling the InvokeModelWithResponseStream operation: You don't have access to the model with the specified model ID.","timestamp":"2023-12-19 18:45:36,237+0000","service":"BEDROCK_CHAT","xray_trace_id":"1-6581e4c8-c0b4da51fd0ffa3b5deafd11"}
  2. With Titan Text G1 - Express model access granted, the chatbot seems to work but failed most of the time; I had to turn off the prompt template to get it to work just a bit.
    [Screenshot: 2023-12-19 at 1:58 PM]
    The second type of message I received from CloudWatch specifically mentioned Malformed input request: expected maxLength: 42000, actual: 54243, please reformat your input and try again. However, I'm not sure how I can change the input that is passed over to Bedrock. Is this configurable? Any guidance would be much appreciated! Thank you.
@message | {"level":"ERROR","location":"lambda_handler:64","message":"An exception occurred in the processing of Bedrock chat: Error occurred while building Bedrock AMAZON_TITAN Model. Ensure that the model params and their data-types provided are correct. Error: Error raised by bedrock service: An error occurred (ValidationException) when calling the InvokeModelWithResponseStream operation: Malformed input request: expected maxLength: 42000, actual: 54243, please reformat your input and try again.","timestamp":"2023-12-19 18:56:49,734+0000","service":"BEDROCK_CHAT","xray_trace_id":"1-6581e76f-45b8cf12d3a91f9d3082c6dc"}

UI Branding option

Is your feature request related to a problem? Please describe.
The UI is plain

Describe the feature you'd like
It would be amazing to be able to customize the look and feel of the chat interface, where a customer can add their own logo, description, and icons for the bot and customer avatars.

Additional context
Add any other context or screenshots about the feature request here.

Conversation with RAG enabled Claude V2 is ending abruptly

Describe the bug
I have set up a RAG-enabled Bedrock Claude V2 application, filled the Kendra DB with content, and I am now interacting with the newly deployed stack. The application stops streaming into the browser abruptly after being idle for some time (latency between query and response is very notable). Upon debugging, I can see that the WebSocket sends a ##END_CONVERSATION message, so the client behavior is correct. I suspect a Lambda timeout or something similar is the blocker here, but I was unable to dive deep enough to find which Lambda might be timing out. Since some handler sends the ##END_CONVERSATION message, I guess it is second level.

To Reproduce
Ask the system to generate a text with more than 1000 words

Expected behavior
I would expect the system to stream all the response data before ending the conversation.

Please complete the following information about the solution:

  • [v1.3.1] Version: [e.g. v1.0.0]
  • [us-east-1] Region: [e.g. us-east-1]
  • [no] Was the solution modified from the version published on this repository?
  • If the answer to the previous question was yes, are the changes available on GitHub?
  • [yes] Have you checked your service quotas for the sevices this solution uses?
  • [no] Were there any errors in the CloudWatch Logs?

Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).
[Screenshot]

Additional context
CloudWatch logs:

START RequestId: 52cbce78-1724-4cbb-a104-91182945da59 Version: $LATEST
--
/opt/python/aws_lambda_powertools/metrics/metrics.py:129: UserWarning: No application metrics to publish. The cold-start metric may be published if enabled. If application metrics should never be empty, consider using 'raise_on_empty_metrics'
self.provider.flush_metrics(raise_on_empty_metrics=raise_on_empty_metrics)
{     "level": "INFO",     "location": "generate:130",     "message": "Prompt for LLM: \n\nHuman: You are a friendly AI assistant. You provide answers only based on the provided reference passages.\n\nHere are reference passages in <references></references> tags:\n<references>\n{context}\n</references>\n\nCarefully read the references above and thoughtfully answer the question below. If the answer can not be extracted from the references, then respond with \"Sorry I don't know\". It is very important that you only use information found within the references to answer. Try to be brief in your response.\n\nHere is the current chat history:\n{chat_history}\n\nQuestion: {question}\n\nAssistant:",     "timestamp": "2024-03-05 08:58:34,252+0000",     "service": "BEDROCK_CHAT",     "xray_trace_id": "1-65e6deb9-ebf1a943fcc1ae4992122783" }
{     "_aws": {         "Timestamp": 1709629114252,         "CloudWatchMetrics": [             {                 "Namespace": "Langchain/LLM",                 "Dimensions": [                     [                         "service"                     ]                 ],                 "Metrics": [                     {                         "Name": "LangchainQueries",                         "Unit": "Count"                     }                 ]             }         ]     },     "service": "GAABUseCase-15091dff",     "LangchainQueries": [         1     ] }
{     "_aws": {         "Timestamp": 1709629130051,         "CloudWatchMetrics": [             {                 "Namespace": "AWS/Kendra",                 "Dimensions": [                     [                         "service"                     ]                 ],                 "Metrics": [                     {                         "Name": "KendraQueries",                         "Unit": "Count"                     },                     {                         "Name": "KendraFetchedDocuments",                         "Unit": "Count"                     },                     {                         "Name": "KendraProcessingTime",                         "Unit": "Seconds"                     }                 ]             }         ]     },     "service": "GAABUseCase-15091dff",     "KendraQueries": [         1     ],     "KendraFetchedDocuments": [         5     ],     "KendraProcessingTime": [         0.5714249610900879     ] }
/opt/python/aws_lambda_powertools/metrics/metrics.py:129: UserWarning: No application metrics to publish. The cold-start metric may be published if enabled. If application metrics should never be empty, consider using 'raise_on_empty_metrics'
self.provider.flush_metrics(raise_on_empty_metrics=raise_on_empty_metrics)
{     "_aws": {         "Timestamp": 1709629157170,         "CloudWatchMetrics": [             {                 "Namespace": "Langchain/LLM",                 "Dimensions": [                     [                         "service"                     ]                 ],                 "Metrics": [                     {                         "Name": "LangchainQueryProcessingTime",                         "Unit": "Seconds"                     }                 ]             }         ]     },     "service": "GAABUseCase-15091dff",     "LangchainQueryProcessingTime": [         42.9008150100708     ] }
/opt/python/aws_lambda_powertools/metrics/provider/base.py:209: UserWarning: No application metrics to publish. The cold-start metric may be published if enabled. If application metrics should never be empty, consider using 'raise_on_empty_metrics'
self.flush_metrics(raise_on_empty_metrics=raise_on_empty_metrics)
END RequestId: 52cbce78-1724-4cbb-a104-91182945da59
REPORT RequestId: 52cbce78-1724-4cbb-a104-91182945da59	Duration: 43881.57 ms	Billed Duration: 43882 ms	Memory Size: 256 MB	Max Memory Used: 201 MB

Chat Failed with RAG for Cohere and Meta models

Describe the bug
The chat application fails to respond when using Cohere and Meta models with RAG. It is due to a missing Disambiguation Prompt in the model defaults files.

To Reproduce
Deploy a chat use case with RAG enabled using Cohere and Meta.

Expected behavior
Should work

Please complete the following information about the solution:

  • Version: [e.g. v1.3.2]

To get the version of the solution, you can look at the description of the created CloudFormation stack. For example, "(SO0276) - Generative AI Application Builder on AWS Solution. Version v1.0.0".

  • Region: all
  • Was the solution modified from the version published on this repository?
  • If the answer to the previous question was yes, are the changes available on GitHub?
  • Have you checked your service quotas for the services this solution uses?
  • Were there any errors in the CloudWatch Logs?

Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).

Additional context
Add any other context about the problem here.

Create ability to display formatted answers from model

I would like to display formatted answers if the model sends formatting information, like <ul><li> or something along those lines.
I would like to see bullet lists and more paragraphs instead of just walls of text.

If the model sends formatting characters, we should display them and allow for the formatting in the UI.

Or we should create another prompt that formats text and send the output through that.
