Comments (2)
Maybe I can help. I was so fascinated by the paper and the concept that I really wanted to try it, but I don't know Python as well as I know TypeScript, so I first made this to do the heavy lifting for me and have since spent a day hand-editing the places where ChatGPT struggled.
This is where everything starts
If you're curious about a TypeScript version (with a bonus docker-compose file to run MariaDB and phpMyAdmin), I'm close to putting my work on GitHub once it actually runs. This is what I have done so far:
# Project
- Add `debug.ts`
- Add `types.ts`
- `SqlStep` type added
- `ChatContext` type added
- Add `utils.ts`
- Adding `sleep()` method
- Adding `input()` method with `prompt-sync`
- Adding `new_message_list()` to type safely start `ChatCompletionRequestMessage[]` arrays
- Maps `List` to `Array`
- Maps `Dict` to `Record`
- Add Dependencies
- tiktoken
- mysql2
- langchain
- winston
- chalk
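A minimal sketch of the `utils.ts` type mappings listed above. The `ChatMessage` type is a hypothetical local stand-in for `ChatCompletionRequestMessage` from the `openai` package, and the sample data is made up for illustration.

```typescript
// Hypothetical stand-in for `ChatCompletionRequestMessage` from the `openai`
// package, declared locally so the sketch compiles on its own.
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

// Python `Dict[str, int]` becomes `Record<string, number>`.
const stock: Record<string, number> = { apple: 3, pear: 5 };

// Python `List[str]` becomes `string[]` (equivalently `Array<string>`).
const names: string[] = Object.keys(stock);

// Returns a typed, empty message array so later `push()` calls are type-checked.
function new_message_list(): ChatMessage[] {
  return [];
}

const messages = new_message_list();
messages.push({ role: "user", content: "How many apples are in stock?" });
```

Starting from a typed empty array means the compiler rejects a message with a misspelled `role` instead of letting it reach the API call.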
## call_ai_function.ts
- Adding `export` to all `function`
- Typing `model` as `TiktokenModel`
- Adding `async` to `call_ai_function()`
- Adding `async` to `populate_sql_statement()`
## chat.ts
- Adding `export` to all `function`
- Modify `generate_context` model param from `any` to `TiktokenModel`
- Modify return to `object` from `tuple`
- Adding types from `openai` to `create_chat_message()` to return `ChatCompletionRequestMessage`
- Typing `full_message_history` as `ChatCompletionRequestMessage[]`
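The tuple-to-object change can be sketched as follows. The function and field names here are hypothetical and not taken from `chat.ts`, and the word count is only a crude stand-in for real token counting.

```typescript
// Hypothetical result shape; the real fields in chat.ts may differ.
type ContextResult = {
  context: string[];
  tokens_used: number;
};

// Python would `return context, tokens_used` (a tuple).
// Returning an object gives each value a name at the call site.
function build_context(history: string[], limit: number): ContextResult {
  // Keep at most the last `limit` messages.
  const context = history.slice(Math.max(0, history.length - limit));
  // Crude stand-in for token counting: count whitespace-separated words.
  const tokens_used = context.reduce(
    (total, message) => total + message.split(" ").length,
    0,
  );
  return { context, tokens_used };
}

// Destructuring replaces Python's tuple unpacking.
const { context, tokens_used } = build_context(["a b", "c", "d e f"], 2);
```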
## chatdb_prompts.ts
- ChatDB had a lot of trouble with this one. It was easy to fix by hand.
- Renaming `user_inp` to `user_input` for use in `chatdb.ts`
## chatdb.ts
- Adding `export` to all `function`
- Modify `get_steps_from_response()` to return `SqlStep[]`
- Modify `chain_of_memory()` to be `async` and accept `SqlStep` instead of `Array<object>`
- Fix `init_system_msg()` PromptTemplate creation and making `async`
- Modify `generate_chat_responses()` to be `async`
- Typing `historical_message` as `ChatCompletionRequestMessage[]`
- Fix named-argument calls that were incorrectly transpiled: `{ user_input }` was `(user_inp = user_inp)`
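The named-argument fix above follows a common pattern: Python keyword arguments map onto a single destructured object parameter. A minimal sketch, assuming a hypothetical `build_prompt()` helper (the real signatures in `chatdb.ts` may differ):

```typescript
// Hypothetical argument shape, for illustration only.
type PromptArgs = { user_input: string; table_info?: string };

// Python keyword arguments become one destructured object parameter.
function build_prompt({ user_input, table_info = "" }: PromptArgs): string {
  return `${table_info}\nUser: ${user_input}`.trim();
}

const user_input = "List all fruits";

// Python:           build_prompt(user_inp=user_inp)
// Broken transpile: build_prompt((user_inp = user_inp))
//                   which is an assignment expression, not a named argument.
const promptText = build_prompt({ user_input });
```

Renaming `user_inp` to `user_input` on both sides lets the call use the `{ user_input }` shorthand.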
## chatgpt.ts
- Adding `const sleep = (seconds: number) => new Promise((resolve)=> setTimeout(resolve, seconds * 1000));` to mimic python `sleep()`
- Adding `export async` to `create_chat_completion`
- Edit `create_chat_completion()` messages param from `any[]` to `ChatCompletionRequestMessage[]`
## config.ts
- Adding `import "dotenv/config";` was `load_dotenv();`
- Adding `const getEnv = (key: string) => process.env[key];`
- Alias for what was `os.getenv`
- Commenting out `Singleton` class...
- Commenting out `Azure` stuff for now...
- Adding `mysql_database: string | undefined;` prop to `Config` class
- `export const config = new Config();` was `const cfg = new Config();`
## fruit_shop_schema.ts
- Adding `export` to `const`
## mysql.ts
- Edit `import from "mysql2";` was `import * as pymysql from "pymysql";`
- Add `export` to `class MySQLDB`
- Remove `Cursor` references from `class MySQLDB`
- Modify `insert` and `update` from `data:any` to `data: Record<string, string>`
- Remove
```
if (require.main === module) {
import { cfg } from "./config";
let mysql: MySQLDB = new MySQLDB(cfg.mysql_host, cfg.mysql_user, cfg.mysql_password, cfg.mysql_port, "try2");
}
```
- Remove connect/disconnect in favor of `mysql2` connection pool features
- `./scripts/test-db-connection.ts` added to test DB
## sql_examples.ts
- Remove `import re;`
- Add `export` to all entries
- Rename `ex_*` was `eg_*`
- Rename `examples` was `egs`
## tables.ts
- Add `export` to all `function` and `const`
- Add `async` to `init_database()`
- 2nd Pass at `get_table_info()` since the regex calls were not converted correctly
## token_counter.ts
- Adding imports from `tiktoken`
- Adding `export` to all `function`
- Use `TiktokenModel` instead of `string` for the model parameter of `count_string_tokens()`
- Use `TiktokenModel` instead of `string` for the model parameter of `count_message_tokens()`
- Move `let encoding: Tiktoken;` outside of try/catch
- Move `let tokens_per_message: number;` to top of method body
- Move `let tokens_per_name: number;` to top of method body
- The lower loop over `messages` in `count_message_tokens()` was missing the call on `encoding`
- Fixed errors in named model args
- 2nd pass on the loop through ChatGPT produced the correct loop
- Typing the `messages` parameter as `ChatCompletionRequestMessage[]`
- ChatGPT incorrectly converted `len(encoding.encode(string))` to `string.length`
- Edit `len(encoding.encode(string))` => `encoding.encode(string).length`
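To show why the `string.length` conversion was wrong, here is a small sketch. The `Encoder` interface and `toyEncoder` are assumptions made for illustration: the toy emits one token per whitespace-separated word and stands in for a real `tiktoken` encoder, which would be passed in instead.

```typescript
// Minimal interface covering the one method used here (an assumption,
// not the tiktoken type definition).
interface Encoder {
  encode(text: string): ArrayLike<number>;
}

// Toy encoder (assumption): one token id per whitespace-separated word.
const toyEncoder: Encoder = {
  encode: (text) => text.split(/\s+/).filter(Boolean).map((_, i) => i),
};

// Python: len(encoding.encode(string))
function count_string_tokens(text: string, encoding: Encoder): number {
  return encoding.encode(text).length;
}

const query = "SELECT name FROM fruits";
// 4 tokens with the toy encoder.
const tokenCount = count_string_tokens(query, toyEncoder);
// 23 characters: this is what the mis-converted `string.length` returned.
const charCount = query.length;
```

Character count and token count only coincide by accident, so any budget check built on `string.length` would trim the message history at the wrong point.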
from chatdb.
I've gone through your code and paper, but I'm still confused about how you pass the table structure to the prompt so that the model knows which operations it should take. I've seen `table_schema.py`, but it only contains fixed SQL statements. If this part of the design is missing, does it mean that the experiment is only valid in the fruit shop dataset scenario?
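Yes. There is a write-up here, [A symbolic memory framework for large models that improves precise memory and complex reasoning: ChatDB - Zhihu], which claims:
Some earlier work combining large language models with databases (such as DB-GPT and ChatExcel) also uses large language models to generate SQL or Excel instructions, but ChatDB is fundamentally different from them. DB-GPT and ChatExcel focus more on using large language models to convert natural language into SQL or Excel instructions, and mostly only address querying, with the data source itself given in advance. ChatDB instead uses the database as a symbolic memory module, covering not just queries but all database operations: insert, delete, update, and select. The whole database is built from scratch and continually records and updates the large language model's history. Moreover, the database in ChatDB, i.e. the symbolic memory module, is tightly coupled and integrated with the large language model, and can help it perform complex multi-step reasoning.
My understanding is that this work aims to build more general database conversation on top of Text-to-SQL. More concretely, the currently implemented scenario adds the dynamic abilities of inserting, deleting, and updating on top of static queries. But it certainly still depends on Text-to-SQL accuracy (which is currently not high), so this work leans toward conceptual design and is hard to extend.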