dissorial / doc-chatbot

Document chatbot — multiple files, topics, chat windows and chat history. Powered by GPT.

TypeScript 96.43% JavaScript 1.33% CSS 2.25%
openai typescript gpt-3 gpt-4 langchain mongoose nextjs openai-api chat chatbot

doc-chatbot's Issues

Question about storage location

Thanks for sharing this great repo.

I have a quick question: why does the latest version of this repo use localStorage instead of MongoDB? Private information could leak if it is stored locally. Thanks.

Kind regards.
Kai

Unexpected token A, "An error o" is not valid JSON

After logging in with a profile I have used on a local instance:

  • I don't see past namespaces
  • When uploading a new document, it doesn't ingest. I get an error instead: Unexpected token 'A', "An error o"... is not valid JSON
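The message in this report is what `JSON.parse` produces when it is handed plain text starting with "An error o...", so the server most likely answered with a plain-text error rather than JSON. A minimal sketch of a more tolerant client, assuming the response body is read as text first; `parseApiBody` is a hypothetical helper, not part of the repo:

```typescript
// Hypothetical helper: parse an API response body without throwing on non-JSON text.
type ParsedBody =
  | { ok: true; data: unknown }
  | { ok: false; rawText: string };

function parseApiBody(bodyText: string): ParsedBody {
  try {
    return { ok: true, data: JSON.parse(bodyText) };
  } catch {
    // The server answered with plain text (for example an error page),
    // so surface the raw text instead of a JSON syntax error.
    return { ok: false, rawText: bodyText };
  }
}

// Possible usage with fetch (the /api/upload route name appears in other issues on this page):
// const res = await fetch("/api/upload", { method: "POST", body: formData });
// const parsed = parseApiBody(await res.text());
// if (!parsed.ok) console.error(`Upload failed (${res.status}): ${parsed.rawText}`);
```

Reading the body with `res.text()` keeps the original server message visible, which usually makes the underlying failure easier to diagnose.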

Web app authentication issues

Hi Dissorial,

Thanks for all the work. I am now trying to run the app on Vercel.

I copied the code from your Git repository
Changed the env file to use the API secrets I was using on desktop
Changed the URL to the web app URL

I am getting an authentication error and I'm not sure why. I am using the NextAuth secret that I generated on desktop. Could this be the cause? How would I generate a NextAuth secret on Vercel?

I have added the user in the Google API console, but it says access denied.

Any idea what might be causing this?

Error when uploading files deployed in Azure web app.

Hi @dissorial, I know this question is out of context, but I would be really happy if you know why these errors persist.
I tried to deploy doc-chatbot in an Azure web app and I get these errors when uploading files. Thank you.

  1. API resolved without sending a response for /api/upload, this may result in stalled requests.
  2. error - unhandledRejection: Error: EXDEV: cross-device link not permitted, rename '/tmp/80zJfS6ogGLdDgG39JsfUzzJ.pdf' -> '/home/site/wwwroot/tmp/The 7 Alternative.pdf'
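`EXDEV` is the error Node.js raises when `rename` is asked to move a file between two different devices or mounts, which matches a move from `/tmp` to `/home/site/wwwroot`. A common workaround is to fall back to copy plus delete. The sketch below assumes a synchronous move like the `renameSync` call seen in other traces on this page; `moveFileSync` is a hypothetical helper, not the repo's code:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: move a file, falling back to copy + delete when the
// source and destination live on different devices (rename fails with EXDEV).
function moveFileSync(src: string, dest: string): void {
  try {
    fs.renameSync(src, dest);
  } catch (err) {
    // Re-throw anything that is not a cross-device error.
    if ((err as { code?: string }).code !== "EXDEV") throw err;
    fs.copyFileSync(src, dest);
    fs.unlinkSync(src);
  }
}

// Possible usage (paths are illustrative only):
// moveFileSync("/tmp/upload.pdf", path.join(process.cwd(), "tmp", "upload.pdf"));
```

On a single filesystem the helper behaves exactly like `renameSync`; the copy path only runs when the rename is rejected with `EXDEV`.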

Sharing of Namespaces between Google accounts

Hey, awesome project. Keep up the great work!

Ideally, I'd like to create Namespaces that are shared between all users. Is this possible now, or would it need to be a new feature?

Also, if it were possible to have admin accounts and limit creation of Namespaces to admins, I'd for sure use that feature.

Cheers.

Possible use of self-hosted vector DB

Is it possible to set up the server using a self-hosted Milvus database? We are a tiny NGO that uses a lot of information but operates on an incredibly low to nonexistent budget :)

Issue on upload.ts

Hi @dissorial, I have tried to deploy the app in an Azure web app. The deployment succeeded, but I have run into one issue, and I'm hoping you know what is causing it.

A. When I upload a file, the network response is "{"message":"Files /tmp/BSzeVfW4xjD0TZ8ncpF_T99A.pdf uploaded and moved!"}"
B. Right after uploading, I add the namespace name "Test" and click the Ingest button, and the network response is "{"message":"Data ingestion complete"}"

I assumed the above events would create data in Pinecone, but that is not happening.

To give you context, here is what I changed in the code to make the deployment work:

  1. I changed the env to NODE_ENV=production.
  2. With the current code deployed in the Azure web app, uploading and ingesting a file fails with the error "EXDEV: cross-device link not permitted" on the uploaded file.
  3. So in upload.ts I changed
    if (process.env.NODE_ENV !== 'production') to if (process.env.NODE_ENV == 'production')
    to force the development code path to run instead of the production one. After that I get results A and B above, but no Pinecone index is created and the file is not uploaded to the web app either.

I hope my points above are clear.
Hoping for your response. Thank you.

Failed to ingest your data

On the settings page it says:

Your Pinecone namespaces

You currently do not have any namespaces
Failed to ingest your data

Failed to ingest - Vercel Deployed

I have been testing locally and it has worked smoothly.

But now I have deployed to Vercel and I get "failed to ingest" for whatever file I try to process.
These files have been ingested previously on my local instance, so I guess they are not the problem.

Is there any extra step that needs to be performed on Vercel apart from setting all the env variables and deploying?
I mean, I don't fully understand where the uploaded file goes while it is not ingested.

Thanks for your help in advance.

Failed to ingest

npx nextauth secret error

When I tried to generate a secret using the commands "npx nextauth secret" and "npx nextauth jwt-secret", the output indicated that there is no npm module named 'nextauth'.

ERROR 1 (running npx nextauth secret):

npm ERR! code E404
npm ERR! 404 Not Found - GET https://registry.npmjs.org/nextauth - Not found
npm ERR! 404
npm ERR! 404 'nextauth@*' is not in this registry.
npm ERR! 404
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.

npm ERR! A complete log of this run can be found in:
npm ERR! /home/ai/.npm/_logs/2023-05-18T08_37_34_893Z-debug-0.log


ERROR 2 (running npx nextauth jwt-secret):

npm ERR! code E404
npm ERR! 404 Not Found - GET https://registry.npmjs.org/nextauth - Not found
npm ERR! 404
npm ERR! 404 'nextauth@*' is not in this registry.
npm ERR! 404
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.

npm ERR! A complete log of this run can be found in:
npm ERR! /home/ai/.npm/_logs/2023-05-18T08_39_35_623Z-debug-0.log
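The 404 above only says that npm has no package called `nextauth`. The secret itself is just a random string, so it can be produced without any NextAuth tooling; the NextAuth.js documentation suggests `openssl rand -base64 32` for this. A minimal equivalent in Node.js is sketched below. `generateSecret` is a hypothetical helper, and whether this repo's env file uses the exact name `NEXTAUTH_SECRET` is an assumption:

```typescript
import { randomBytes } from "node:crypto";

// Generate a random secret: 32 random bytes, base64-encoded (44 characters).
function generateSecret(byteLength: number = 32): string {
  return randomBytes(byteLength).toString("base64");
}

// Print a line that can be pasted into the env file
// (variable name assumed; check the repo's example env file).
console.log(`NEXTAUTH_SECRET=${generateSecret()}`);
```

The same value can be set as an environment variable in a hosting dashboard, so a secret generated on one machine can be reused in another deployment.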

Idea: ingestion via email? Since we already have Google oauth

A few things come to mind:

  • Title of email = namespace
  • Ingests the body of the email and attachments: pdf, doc, txt or md
  • Ability to add new documents to existing namespaces
  • Stores documents in the associated email account or its Google Drive

What do you think?

API resolved without sending a response

I am on Windows 11, and I think I have everything set up correctly, but I am getting this error when trying to load a new namespace:

API resolved without sending a response for /api/upload, this may result in stalled requests.
error - unhandledRejection: Error: ENOENT: no such file or directory, rename 'C:\Users\avina\AppData\Local\Temp\B4yWRxiHcgn9Q2jTRMqO23jW.pdf' -> 'F:\pdf-chatbot/docs/na-13075-questionnaire-about-military-service.pdf'
at Object.renameSync (node:fs:1040:3)
at eval (webpack-internal:///(api)/./pages/api/upload.ts:38:55)
at F:\pdf-chatbot\node_modules\multiparty\index.js:139:9
at F:\pdf-chatbot\node_modules\multiparty\index.js:118:9
at process.processTicksAndRejections (node:internal/process/task_queues:77:11) {
errno: -4058,
syscall: 'rename',
code: 'ENOENT',
path: 'C:\Users\avina\AppData\Local\Temp\B4yWRxiHcgn9Q2jTRMqO23jW.pdf',
dest: 'F:\pdf-chatbot/docs/na-13075-questionnaire-about-military-service.pdf'
}
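One possible cause of an `ENOENT` on `rename` is that the destination folder (here `docs`) does not exist yet; this is a guess from the trace, not a confirmed diagnosis. A small sketch that creates the parent folder before moving a file, with `ensureParentDir` as a hypothetical helper name:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: create the destination's parent directory if it is missing.
// Returns the directory path so callers can log or reuse it.
function ensureParentDir(destPath: string): string {
  const dir = path.dirname(destPath);
  // With recursive: true, mkdirSync does not fail when the directory already exists.
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// Possible usage before a move (file name is illustrative only):
// ensureParentDir(path.join(process.cwd(), "docs", "example.pdf"));
```

If the folder already exists, the call is a no-op, so it is safe to run on every upload.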

Invalid PDF structure - Failed to Ingest

I accidentally tried to upload a PDF that was corrupt. Now I can't seem to ingest any PDFs at all. I've tried recreating the Pinecone DB and also cleared out the Namespaces from Mongo. I'm going to try rebuilding everything and will update this issue, but just wanted to highlight it. Cheers!

Issue running npx netauth jwt-secret

Hi, I tried to follow the instructions and run "npx netauth jwt-secret", which causes an error:

AzureAD+KevinClaytonBaroro@DESKTOP-SVJE8CL MINGW64 ~/doc-chatbot (master)
$ npx netauth jwt-secret
npm ERR! code E404
npm ERR! 404 Not Found - GET https://registry.npmjs.org/netauth - Not found
npm ERR! 404
npm ERR! 404 'netauth@*' is not in this registry.
npm ERR! 404
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.

npm ERR! A complete log of this run can be found in: C:\Users\User\AppData\Local\npm-cache_logs\2023-05-17T15_15_52_219Z-debug-0.log

Do you have a fix for this?

PDF Size Best Practices

Just a quick question regarding the size of the PDFs.

Say I had a PDF that was 1000 pages long. Could I expect better results if I were to split the PDF into multiple documents, or would it not make any difference?

Many thanks.

data map - Unhandled Runtime Error

It can't load the front page / chats. I followed every step and I'm also logged in.

ges\index.tsx (118:23) @ map

  116 | setMessageState((state) => ({
  117 |   ...state,
> 118 |   messages: data.map((message: any) => ({
      |                 ^
  119 |     type: message.sender === 'user' ? 'userMessage' : 'apiMessage',
  120 |     message: message.content,
  121 |     sourceDocs: message.sourceDocs?.map((doc: any) => ({

Streaming Chat

Thanks for sharing this amazing repo.

Is it possible to enable "stream" mode for GPT? I tried adding a parameter like "streaming: true", but it did not work. Any tips? Thanks in advance.

Kind regards.
Kai

Add support for CSV

Hi, thank you for this awesome upgrade of Mayo's app. Is it possible to add support for CSV files? Someone added CSV support to Mayo's source and I successfully incorporated it there, but I'm having trouble figuring out where to put the code in your source code.

Ingest Data and Failed to Initialize the Pinecone Client

Hi,
We followed all the steps and are hosting this application on Azure with Ubuntu 22.04 and Node version 18.16.0.
But we can't upload any files (Q/A docs, PDFs, or TXT files); we get errors.
Please check and advise.

Can't upload files above 8 MB

It seems to upload only very small files. It would also be awesome if you fixed the previous authenticated version, for example by uploading both the old and new versions. Also, it's not good at answering slightly broad questions, which makes it seem dumb or useless at times. But thanks for the update.

Unexpected token 'R', "Request En"... is not valid JSON when uploading

I am getting this error under three different circumstances:

  1. When one of the attached docs is around 5-10 MB
  2. When uploading more than 6 files (whatever the size)
  3. When the total size of the input files is higher than 7-8 MB, whatever the number of files

For the first case, I changed the name of the doc, then split it.

Splitting it solved the issue, meaning the problem is not caused by the file itself (bad OCR, scanned pages, or whatever). It is a nice, clean PDF with text (no tabs).

Any idea?
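The fragment "Request En" is consistent with a plain-text "Request Entity Too Large" reply, which would point to a request body size limit on the server or host; that reading is an inference, not something the report confirms. Under that assumption, a client-side check before uploading could look like the sketch below. `MAX_TOTAL_BYTES` is a made-up placeholder value and `checkUploadSize` is a hypothetical helper:

```typescript
// Hypothetical limit: replace with whatever the server or host actually allows.
const MAX_TOTAL_BYTES = 4 * 1024 * 1024;

// Minimal shape shared with the browser File object.
interface SizedFile {
  name: string;
  size: number; // bytes
}

// Sum the file sizes and report whether the batch fits under the limit.
function checkUploadSize(
  files: SizedFile[],
  maxTotalBytes: number = MAX_TOTAL_BYTES
): { ok: boolean; totalBytes: number } {
  const totalBytes = files.reduce((sum, f) => sum + f.size, 0);
  return { ok: totalBytes <= maxTotalBytes, totalBytes };
}

// Possible usage in an upload handler:
// const check = checkUploadSize(Array.from(fileInput.files ?? []));
// if (!check.ok) alert(`Selected files total ${check.totalBytes} bytes, which is over the limit.`);
```

Rejecting an oversized batch in the browser gives a readable message instead of a JSON parse error from a non-JSON server reply.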

storage location / system prompt / web app?

  1. Where are the documents / PDFs currently stored after we upload them via the browser?

  2. Where can I look at / change the system prompt for OpenAI QA?

  3. Would it be possible for you to share a little guide on how to host this web app so that it can be accessed via a browser from anywhere? (I am currently running it locally.) Sorry, I am not a coder, but I was able to Lego-build this.

Thanks very much for your help so far!

Callback redirection OAuth2

Hey, thanks for sharing your project. I was testing it but I cannot log in. I generated a Google client ID and secret and configured the redirect URL in the credentials as http://localhost:3000/settings. Is that right? I would appreciate it if you could help me test your app.

Regrets...
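For reference, NextAuth.js handles OAuth callbacks at `/api/auth/callback/<provider>` by default, so the redirect URI registered with Google normally points there rather than at an application page. The sketch below builds that URI, assuming the app uses NextAuth's Google provider with the default API route; `nextAuthCallbackUrl` is a hypothetical helper:

```typescript
// Hypothetical helper: build the default NextAuth.js callback URL for a provider.
function nextAuthCallbackUrl(baseUrl: string, providerId: string): string {
  return new URL(`/api/auth/callback/${providerId}`, baseUrl).toString();
}

console.log(nextAuthCallbackUrl("http://localhost:3000", "google"));
// prints "http://localhost:3000/api/auth/callback/google"
```

For a deployed instance, the same path would be appended to the deployment's own base URL.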

Function request: Source Provide

Is there any way to also provide the source file and when we click the source, it will bring us to the page that the bot mentioned?

data.map is not a function

Hiya! Thanks for this guide! I've followed every step very closely but I get this error.
When I skip it and jump to login, I can't use my Google account; it doesn't let me select it or enter credentials. Could you please help out?

Thanks in advance!

Unhandled Runtime Error
TypeError: data.map is not a function

Source
pages\index.tsx (118:23) @ map

  116 | setMessageState((state) => ({
  117 |   ...state,
> 118 |   messages: data.map((message: any) => ({
      |                 ^
  119 |     type: message.sender === 'user' ? 'userMessage' : 'apiMessage',
  120 |     message: message.content,
  121 |     sourceDocs: message.sourceDocs?.map((doc: any) => ({
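`data.map is not a function` means `data` was not an array when line 118 ran, for example because the API returned an error object instead of a message list. A defensive sketch is shown below; the `sender` and `content` field names come from the snippet above, while `toUiMessages` and the two interfaces are hypothetical and not the repo's actual types:

```typescript
interface ApiMessage {
  sender: string;
  content: string;
}

interface UiMessage {
  type: "userMessage" | "apiMessage";
  message: string;
}

// Hypothetical helper: convert a chat-history payload into UI messages,
// returning an empty list when the payload is not an array (e.g. an error object).
function toUiMessages(data: unknown): UiMessage[] {
  if (!Array.isArray(data)) return [];
  return (data as ApiMessage[]).map((m): UiMessage => ({
    type: m.sender === "user" ? "userMessage" : "apiMessage",
    message: m.content,
  }));
}

// Possible usage inside the state update:
// setMessageState((state) => ({ ...state, messages: toUiMessages(data) }));
```

The guard only hides the crash; logging the non-array payload is still worthwhile, because it usually carries the real error, such as an authentication or database failure.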
