Giter Site home page Giter Site logo

cgpt_python's Introduction

QnA with OpenAI

This is explained in this blog

Code for this is in qna folder.

Implementation of the same is detailed in this blog

Env

Set OPENAI_API_KEY environment variable to your OPENAI Key. This is required by all demos in this repo.

Create Embedding

  • Book is here. Download chapters of this book and copy it to S3 location.
  • Modify the code create_embedding.py for your bucket path and folder. output folder does not matter, embeddings are stored in local directory.
  • These embeddings are then loaded by detect.py. If you change the filenames in anyway, then update this code.
  • To create embedding run the following command. File must be .pdf. This is sent to textract for text extraction and then embedding is created for each page. This is then stored in _embedding.csv file in local directory as well as .txt file with all the text returned by textract in one file.
    • These extra txt files are not used in this demo, but are used in 'Summarization' demo. so if you want you can delete these from here or move to summarization folder.
python create_embedding.py <filename.pdf>

Runnning the demo

  • Execute the API backend
python detect.py
  • Open frontend in browser i.e index.html (open directly, this is not served via backend) and then ask some questions from the story. You can choose some from below or you can create your own after reading the stories. Better you get your own pdf document and use it for this demo.

Why Embedding

First ask any question from below and check the full checkbox. This will submit the whole chapter to the openai, you will see that this will definitely fail for chapter 3 and may be for others as well due to token limits.

Sample questions

Chapter 1 and 2

  • where did mrs dorling lived
  • Why did the narrator of the story want to forget the address
  • what happened to table silver
  • why mrs dorling took belongings of narrators mother
  • why did boys return the horse
  • who was mourad

Mother(ch 3)

  • How does Mrs. Pearson feel about her family's treatment of her, and why is she hesitant to stand up for herself?
  • What is Mrs. Fitzgerald's role in the play, and how does she influence Mrs. Pearson?
  • How does Mrs. Pearson's behavior change after the body switch with Mrs. Fitzgerald, and how do her family members react to her new demeanor?
  • How do Doris and Cyril Pearson initially respond to their mother's changed behavior, and what is their opinion of her actions?
  • What impact does Mrs. Pearson's newfound assertiveness have on her husband, George, and his perception of himself within the family and the community?

Ghat (Ch4)

  • Who is the dying man in the story, and what is his request to Amitav?
  • How did Shahid's illness affect his friendship with Amitav?
  • What kind of poet was Shahid, and how did his poetry differ from contemporary styles?
  • What was Shahid's passion besides poetry and writing?
  • When and how did Shahid pass away, and what impact did it have on Amitav?

Birth (Ch5)

  • What is the significance of Joe Morgan's visit to Andrew?
  • What challenges does Andrew face during the delivery and resuscitation of the child, and how does he handle the situation?
  • what happened to susan during delivery?
  • What does the successful revival of the baby mean to Andrew?
  • How does Joe Morgan react to the outcome of the childbirth?

Melon City (ch6)

  • How did the architect defend himself when he was accused by the King?
  • IWhat unfortunate incident occurred when the King rode under the arch?
  • Why were the workmen and masons also blamed by the King?
  • How did the King eventually meet his fate?

Summarization

  • Similarly for summarization, execute the corresponding summary.py (only one of the two APIs runs as they are on same port, change the port if you want to run them together)
  • For any given pdf, first step will be to extract the text out of it and then send to GPT. This is performed using AWS Textract, but you can use any other method. I have provided some sample txt files summarization/data folder, which you can use.

Health Bot

This is explained in this blog

Main code is in main.py and function definition is in functions.json

NER - Named Entity Recognition

This is based on summarization in a sense that we instruct GPT to identify named entities and respond with JSON structure of key value pairs for these.

  • Text data for recipts are in summarization/data folder i.e bestbuy, fiveguys and wage1. Corresponding pdfs are in pdf folder in root.

cgpt_python's People

Contributors

skamalj avatar

Stargazers

 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.