
ssc-assistant's Introduction

SSC Assistant

Second iteration of the SSC chatbot/assistant.

Main screen of the SSC Assistant

The SSC Assistant leverages the Azure OpenAI API to utilize advanced language models such as GPT-4o. It employs the Retrieval Augmented Generation (RAG) method, integrating external APIs and tools to offer a wide array of options for users.

Display of the Assistant's options menu

When the Assistant uses a tool in its API call, it indicates which tools were used in the response and displays the relevant metadata. For example, here is how it presents information retrieved using the GEDS tool:

Answer with options

Developer(s)

This section will help developers understand this project, how to set it up from the ground up, and how to run it on their machine.

Codespaces

Simply create your branch and create a Codespace for it from GitHub. Once that Codespace loads, you will have everything you need to start working,

or Dev Containers (recommended)

Here is how you can develop locally with Dev Containers.

Pre-requisites

  • VSCode with Dev Container extension
  • Docker engine installed on Linux (Can be via WSL 2.0 on Windows)

NOTE: If you are using Docker Desktop, you are required to have an Enterprise license.

Steps

Press Ctrl + Shift + P in VSCode and then Dev Containers: Reopen in Container (or, if you have never opened it before, Dev Containers: Open Folder in Container... and select the ssc-assistant repo).

NOTE: You can share your SSH key with the container via ssh-agent.

Development steps:

Development steps are the same once you are inside the container (regardless of whether it was spun up from Codespaces or locally via the Dev Containers extension):

Press Ctrl + Shift + ~ to open two new terminals and simply start the services:

NOTE: It is important that you first log in via az login --use-device-code

API:

cd app/api && flask run --debug --port=5001

Frontend:

cd app/frontend && npm run dev

All the necessary npm install and pip install commands have already been run, and you can simply reach the Codespace URL that points to port 8080.

NOTE: You will need to have a 163dev account in order to develop and test on DEV/locally (request an account).

Issues with Dev Container Workspace Cache

If you encounter an issue starting npm in dev containers you can refer to this workaround/bug.

Manual setup (deprecated)

We are developing on Python 3.11+.

Please set up your environment like so:

We have two Python projects in this repo; we create two virtual envs and switch between them via the command line and/or VSCode.

First we setup the backend API project:

python3 -m venv .venv_api
source .venv_api/bin/activate
pip install -r app/api/requirements.txt --upgrade

You should then see something like this in your shell, denoting which environment you are in. To leave this .venv, simply type deactivate

(.venv_api) ➜  ~/git/ssc-assistant/

Now ensure that VSCode uses the proper .venv folder by pressing Ctrl + Shift + P and then type Python: Select Interpreter

and then the azure function project:

python3 -m venv .venv_func
source .venv_func/bin/activate
pip install -r az-functions/create-index/requirements.txt --upgrade

(for this virtual env, .vscode/settings.json should already point to the proper folder; otherwise, re-follow the steps above to ensure VSCode uses the proper venv for that section of the project)

Finally, for the frontend:

cd app/frontend
npm install
npm run dev

Infrastructure

The current infrastructure of this project is as follows:

  • Azure Sandbox Subscription

    • Azure Function to transform the raw data (SSCPlus, etc.) to a Search Service index
  • Service dependencies

    • SSCPlus Data Fetch services (loads up raw data into blobs)
    • Azure OpenAI Services

Spinning up the infrastructure

Prerequisites:

  • Azure CLI, with at least the Contributor role on the subscription; then simply az login
  • terraform
cd terraform/
terraform init
terraform plan -var-file="secret.tfvars"
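
Once the plan looks right, apply it with the same variable file (assuming secret.tfvars holds the required secrets):

terraform apply -var-file="secret.tfvars"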

163dev Account Permissions

To sign into a Microsoft account during development, a @163dev.onmicrosoft.com account must be used.

The following permissions must be granted to the account through Azure:

Grant Permissions

Documentation

ssc-assistant's People

Contributors

guillaumeturcotte, kyleaitken, danielsamchenko, kingbain, harshakakumanu, dependabot[bot]

Stargazers

Max Neuvians

Watchers


Forkers

kingbain

ssc-assistant's Issues

improve logging feedback code (check for errors or 4xx return codes, validation, etc.)

This ticket serves to re-work how system messages (Error codes/success messages) are displayed to the user.

Currently, if there's an error with a completion from the GPT API, it's displayed as a snack bar at the bottom of the screen. Additionally, when a user submits feedback for a response, a dialog box appears saying "Feedback submitted, thank you!" regardless of the response code from the API.

We will instead show these alert messages in the chat as a third type of chat message. The alerts should:

  • Not persist between reloads
  • Be dismissible (and removed from the chat history)
  • Be internationalized
  • Appear centered in the screen

The feedback form should no longer show the "Feedback received, thank you!" dialog box after submission and instead close immediately on submit.

Create separate component for Feedback form

  • The feedback form must be in a separate component along with all of its related code.
  • remove the UUID from the parameters SAVED inside the table (we already have it as its own field, PartitionKey)

bug: auth redirect is null

[Tue, 16 Apr 2024 13:17:03 GMT] : [] : @azure/[email protected] : Info - MsalProvider - msal:handleRedirectEnd results in setting inProgress from handleRedirect to none

  • when installed as an application, the user has to click twice before logging in; the first time, the redirect is set to None and authentication does not complete full circle.

hitting maximum field size on DB for logging data..

ERROR:utils.db:The property value exceeds the maximum allowed size (64KB). If the property value is a string, it is UTF-16 encoded and the maximum number of characters should be 32K or less.
RequestId:8b5e09f0-3002-0056-0ec8-897032000000
Time:2024-04-08T15:22:52.3609420Z
ErrorCode:PropertyValueTooLarge
Content: {"odata.error":{"code":"PropertyValueTooLarge","message":{"lang":"en-US","value":"The property value exceeds the maximum allowed size (64KB). If the property value is a string, it is UTF-16 encoded and the maximum number of characters should be 32K or less.\nRequestId:8b5e09f0-3002-0056-0ec8-897032000000\nTime:2024-04-08T15:22:52.3609420Z"}}}
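
One possible mitigation, as a minimal sketch (not the repo's current code), is to truncate oversized string properties before the entity is written to the table:

# Azure Table Storage caps a single property at 64 KB, i.e. roughly 32K UTF-16
# characters for strings. The check below is approximate: characters outside the
# Basic Multilingual Plane count as two UTF-16 code units.
MAX_PROPERTY_CHARS = 32_000

def truncate_entity_properties(entity: dict, limit: int = MAX_PROPERTY_CHARS) -> dict:
    """Return a copy of the entity with oversized string properties truncated."""
    safe = {}
    for key, value in entity.items():
        if isinstance(value, str) and len(value) > limit:
            safe[key] = value[: limit - 3] + "..."  # mark that the value was cut
        else:
            safe[key] = value
    return safe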

Enable token usage on streaming interfaces

Token usage is currently only returned on non-streaming endpoints; however, since the frontend uses streaming endpoints exclusively, we should focus on making token usage available there too.

This way, the UI will be able to show the user a couple of indicators displaying the current token usage for the conversation/context window.
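
A minimal sketch of what this could look like on the API side, assuming the openai Python SDK (1.x) and an Azure OpenAI API version that supports stream_options; the deployment name and environment variables are placeholders:

import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

stream = client.chat.completions.create(
    model="gpt-4o",  # Azure deployment name (placeholder)
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    stream_options={"include_usage": True},  # ask for usage in the final chunk
)

usage = None
for chunk in stream:
    if chunk.usage is not None:  # only the last chunk carries the usage object
        usage = chunk.usage
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

if usage is not None:
    print(f"\nprompt={usage.prompt_tokens} completion={usage.completion_tokens} total={usage.total_tokens}")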

Apply new domain name

Acceptance criteria

  • new domain is usable to reach the ssc-assistant application and API
    • ssc-assistant.cio-sandbox-ect.ssc-spc.cloud-nuage.canada.ca
    • ssc-assistant-api.cio-sandbox-ect.ssc-spc.cloud-nuage.canada.ca

Improve create-index azure function (durable, logging, params)

There are a few tweaks that need to be done to the new function that indexes to an Azure Search Service.

Tasks

  • make the function durable (see the sketch after this list)
  • improve logging?
  • make the directory it reads from parameterizable
  • fix the terraform template code to inject the proper variables (see the README.md doc on .env)
  • add a terraform Azure Automation task to trigger this function (see chatbot-infra for an example of an Azure Automation task). Could also look at other alternatives.
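
For the first task, here is a minimal sketch of the durable orchestration pattern using the Python azure-functions-durable package; the activity names list_blobs and index_document are placeholders, not the repo's actual functions:

import azure.durable_functions as df

def orchestrator_function(context: df.DurableOrchestrationContext):
    # The directory to read from is passed as orchestration input (parameterizable).
    source_dir = context.get_input() or "sscplus-raw"
    blob_names = yield context.call_activity("list_blobs", source_dir)
    for name in blob_names:
        yield context.call_activity("index_document", name)
    return f"indexed {len(blob_names)} documents from {source_dir}"

main = df.Orchestrator.create(orchestrator_function)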

R&D: windows deployment (taskbar icon + app)

Investigate the possibility of deploying the assistant (via Company Portal) as a taskbar shortcut (in addition to the desktop) and as an app.

tasks

  • can we script a shortcut to the desktop + taskbar?
  • can we script a shortcut that launches as an installed app in Windows (same as via Edge/Chromium)?

In the quick menu, allow users to switch from GPT-3.5 to GPT-4 and vice versa

  • add UI component to allow selection of different GPT versions that we support
  • pass this value to the backend API
  • modify the backend API to support this new attribute, defaulting to the current model if not passed (see the sketch after this list).
  • Use this model value in client creation (at start of API) and allow users to leverage specific clients when making the queries
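
On the backend side, a minimal sketch of the fallback logic (the deployment names are placeholders, not necessarily what we actually have deployed):

SUPPORTED_MODELS = {"gpt-35-turbo", "gpt-4"}   # placeholder deployment names
DEFAULT_MODEL = "gpt-4"                        # current default when nothing is passed

def resolve_model(message_request: dict) -> str:
    """Return the requested model if supported, otherwise fall back to the default."""
    requested = message_request.get("model")
    return requested if requested in SUPPORTED_MODELS else DEFAULT_MODEL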

SSC Assistant - enable the comments/survey - front end

In preparation for expanding the SSC Assistant pilot to all SSC employees, re-enable the comments/survey. Front-end tasks:

  • Add a thumbs up and thumbs down icon alongside the existing icons in the user interface.
  • Implement functionality to open a modal window when either the thumbs up or thumbs down icon is clicked. This modal should contain a text field for user feedback.
  • In api.ts, add a new API call for the feedback endpoint.
  • When sending data to the feedback endpoint, include the current conversation in the payload.
  • add new conversation UUID as part of feedback and messagerequest

SSC Assistant - enable the comments/survey - back end

In preparation for expanding the SSC Assistant pilot to all SSC employees, re-enable the comments/survey. Back-end tasks:

acceptance criteria

  • have a new endpoint in the backend that can receive feedback
  • feedback will be stored in the same database as the chat logs audit section
  • feedback endpoint must be secured like other endpoints and require a specific key to be used.
  • the feedback endpoint should receive at least three things: the current conversation chain, the "highlighted" message (the message the feedback was clicked on), the feedback itself (good or bad), and an optional free-text message (see the sketch after this list).
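
A minimal sketch of such an endpoint, assuming Flask like the rest of the API; the route, header name, field names, and save_feedback helper are placeholders, not the actual implementation:

import os
from flask import Flask, abort, jsonify, request

app = Flask(__name__)
EXPECTED_KEY = os.environ.get("FEEDBACK_API_KEY", "")  # placeholder key scheme

def save_feedback(record: dict) -> None:
    """Placeholder for persisting the record to the same DB as the chat logs."""
    print(record)

@app.route("/api/1.0/feedback", methods=["POST"])
def feedback():
    if request.headers.get("X-API-Key") != EXPECTED_KEY:  # secured like other endpoints
        abort(401)
    body = request.get_json(force=True)
    # required: conversation chain, highlighted message, and the good/bad flag
    for field in ("conversation", "highlighted_message", "positive"):
        if field not in body:
            abort(400, description=f"missing field: {field}")
    body.setdefault("comment", None)  # optional free-text message
    save_feedback(body)
    return jsonify({"status": "ok"}), 201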

Requirements for professional services

Mandatory Skills:

  • responsive UI
  • accessible UI
  • react
  • typescript/javascript
  • git

Optional/Good to have skills:

  • python
  • cloud (Azure)
  • terraform/terragrunt
  • github actions / devops
  • linux

Flag potentially dangerous questions?

An example of what currently happens if Azure OpenAI judges your question to be unacceptable:

openai.BadRequestError: Error code: 400 - {'error': {'message': "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766", 'type': None, 'param': 'prompt', 'code': 'content_filter', 'status': 400, 'innererror': {'code': 'ResponsibleAIPolicyViolation', 'content_filter_result': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': True, 'severity': 'medium'}, 'violence': {'filtered': False, 'severity': 'safe'}}}}}

The chatbot simply errors out (HTTP 400), so we display an error toast, but nothing is logged.
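
A minimal sketch of catching and logging these rejections, assuming the openai Python SDK; the shape of the error body is taken from the example above:

import logging
from openai import BadRequestError

logger = logging.getLogger(__name__)

def safe_completion(client, **kwargs):
    """Log content-filter rejections instead of surfacing only a generic HTTP 400."""
    try:
        return client.chat.completions.create(**kwargs)
    except BadRequestError as exc:
        body = exc.body if isinstance(exc.body, dict) else {}
        if body.get("code") == "content_filter":
            # The prompt tripped Azure OpenAI's content policy: log it for auditing,
            # then let the caller show a friendly message instead of a raw error.
            logger.warning("Prompt blocked by content filter: %s", body.get("innererror"))
            return None
        raise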

Add a clear chat button

  • add a new button to clear the chat
    • make sure clear chat button is accessible and also responsive (mobile or larger screens)
  • make sure the clear chat button clears all the current chat and retains only the default "welcome" message.
  • use only Material UI components for this.
  • store chat history in a cookie and allow multiple "conversations" to be kept that way, so in the future we can have multiple discussions going, à la OpenAI/CANChat.

Fix Citations numbering

The current raw text we get back from using the /api/1.0/completion/myssc endpoint returns us something like this:

{
  "message": {
    "content": "The President of Shared Services Canada (SSC) is granted with HR authorities and many are then delegated to lower levels as defined in SSC's Instrument of Human Resource Delegation of Authorities [doc2]. Unfortunately, there is no information available about the phone number of Guillaume Turcotte in the retrieved documents.",
    "context": {
      "citations": [
        {
          "content": "Archibus is an application about ..."
        },
        {
          "content": "Human Resources Delegation of Authorities Through various Human Resources (HR) legislation, the President of Shared Services Canada (SSC)..."
        }
      ]
    }
  }
}

As you can see in the content and context:

  1. Citations directly referred to are numbered (e.g. [doc1], [doc3]).
  2. The number of citations passed in the request (via the max parameter) will always be matched and returned in the array.
  3. The citations will not always all be used in the content section.

The current code will only return the CITED citations in the Assistant bubble, for example:

image

As you can see here, even though we get back an array of 10 citations in the context, we only display the one being cited; however, the citation is numbered 2 (we retain the position of the index in the array). This was done for simplicity, as it was easier to code.

We need to fix this so the numbering starts at 1 and the proper citation is retained regardless; a possible fix (sketched below) would be to add a new key to the citation mapping with the original position in the array and the current citation key, etc.

criteria of success

  • citations now start at 1 in the content of the response
  • citations listed at the bottom of the box must match the citation numbers referenced in the text.
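
A possible fix, as a minimal sketch: remap the [docN] markers to a sequence starting at 1 and keep only (in order) the citations that are actually cited:

import re

def renumber_citations(content: str, citations: list) -> tuple:
    """Remap [docN] markers to start at 1 and keep the matching citations in order."""
    cited_order = []  # original 1-based indices, in order of first appearance
    for match in re.finditer(r"\[doc(\d+)\]", content):
        idx = int(match.group(1))
        if idx not in cited_order:
            cited_order.append(idx)

    new_number = {orig: new for new, orig in enumerate(cited_order, start=1)}
    content = re.sub(
        r"\[doc(\d+)\]",
        lambda m: f"[doc{new_number[int(m.group(1))]}]",
        content,
    )
    used_citations = [citations[orig - 1] for orig in cited_order]
    return content, used_citations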

Open API to the MySSC plus team

This issue is going to be a big one. We will be exposing the chatbot API to the MySSC Plus team. To do that, we will also take this opportunity to update the current API so it is more streamlined.

  • create TF template for the new service in the ssc_assistant-rg resource group.
  • add code for a Python Flask API that supports a POST for chat request.
  • request needs to support history
  • look into using the search service already spun up, along with llama_index (to speed up loading vs. in-memory)
  • provide documentation for the API so the other teams can view it to help them develop more quickly (something like openapi/swagger)

API: message request management fixes

Work to be done in the API portion of the code.

Completion criteria

  • rename app/api/prompt.py to something more appropriate to message request management
  • handle messages from MessageRequest to ensure that they do not exceed the maximum of 20 (with the system prompt) enforced by us.
  • give messages a sensible default (max=10, currently the case), which the user can set inside the message request; if not provided, default to 10 (see the sketch after this list).
  • truncate messages if need be (but not the system prompt, example: truncate history)
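
A minimal sketch of the enforcement described above (the role field and structure are assumptions, not the actual MessageRequest schema):

MAX_MESSAGES = 20   # hard cap we enforce, system prompt included
DEFAULT_MAX = 10    # default when the message request does not set max

def trim_messages(messages: list, requested_max=None) -> list:
    """Keep the system prompt and only the most recent history within the cap."""
    limit = min(requested_max or DEFAULT_MAX, MAX_MESSAGES)
    system = [m for m in messages if m.get("role") == "system"]
    history = [m for m in messages if m.get("role") != "system"]
    room = max(limit - len(system), 0)
    kept = history[-room:] if room else []   # truncate oldest history first
    return system + kept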

Improve logging or store questions in db for audit purposes

Current logging leaves much to be desired. We need to improve it to enhance our ability to audit the questions asked, in order to improve the chatbot/MySSC+ content.

Tasks

  • create TF template for a database that will receive the data.
  • log every request made to the server (payload sent, etc.) to the db (see the sketch after this list)
  • log every response made by the server to the db
  • have a new endpoint in the backend that can receive feedback
  • feedback will be stored in the same database as the chat logs audit section
  • feedback endpoint must be secured like other endpoints and require a specific key to be used.
  • the feedback endpoint should receive at least three things: the current conversation chain, the "highlighted" message (the message the feedback was clicked on), the feedback itself (good or bad), and an optional free-text message.
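
A minimal sketch of the request/response logging using Flask hooks; save_log_entry is a placeholder for writing an audit row to the database, not the repo's actual helper:

import uuid
from flask import Flask, g, request

app = Flask(__name__)

def save_log_entry(**fields) -> None:
    """Placeholder for writing one audit row to the database."""
    print(fields)

@app.before_request
def log_request():
    g.request_id = str(uuid.uuid4())
    save_log_entry(kind="request", request_id=g.request_id,
                   path=request.path, payload=request.get_json(silent=True))

@app.after_request
def log_response(response):
    save_log_entry(kind="response", request_id=g.request_id,
                   status=response.status_code, payload=response.get_data(as_text=True))
    return response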

BONUS

  • investigate whether we can control the headers the app receives from the reverse proxy/IdP authentication flow (user ID or other information for stats purposes: unique users, etc.)

Experiment with Cognitive Search Embedding

  • Create a new project that will host the SSC assistant/chatbot search service dedicated to its content
  • create an Azure function that creates said index along with mappings and vectorization
  • demo a search result with the embedding search alone (before we tie it to Azure OpenAI)
  • llama_index now supports using Azure OpenAI embeddings, do initial test with some documents

get rid of the VITE_REDIRECT_URI dependency

Load this variable (VITE_REDIRECT_URI) differently. Instead of setting it in GitHub and loading it via the action, since it always matches the domain of the currently deployed app, we should use something else to determine that value (either from the Azure deploy or locally).
