
kipdata / fncksql

492 stars · 40 forks · 2.49 MB

SQL as a Function for Rust

Home Page: http://www.kipdata.site/

License: Apache License 2.0

Rust 99.92% Dockerfile 0.08%
data database embeddings myrocks oltp postgresql query-engine rust rust-lang sql sql-query sql-server sqlite sqlite-database web

fncksql's People

Contributors

arlottang, crwen, eliasyaoyc, ge-fighting, guojidan, joeyscat, kkould, lewiszlw, loloxwg, mayingbo, mayooot, sacloudy, yingmanwumen


fncksql's Issues

PANIC occurs during bind `Insert`

Bug Report

What version of KipSQL are you using?
0.0.1-alpha.6

What version of Rust are you using?

❯ rustc -V
rustc 1.76.0-nightly (de686cbc6 2023-12-14)

What's the status of the running?

What did you do?

create table blog_1 (id int primary key, title varchar unique, created_at datetime);

+----------------------+
| CREATE TABLE SUCCESS |
+======================+
| blog_1               |
+----------------------+

insert into blog_1 (id, title) values (0, 'KipSQL', '2022-10-12 12:22:33');
thread 'main' panicked at /home/jojo/code/open-source/kipsql/src/binder/insert.rs:52:52:
index out of bounds: the len is 2 but the index is 2
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

What did you expect to see?
Insert bind fails with error

insert into blog_1 (id, title) values (0, 'KipSQL', '2022-10-12 12:22:33');
Oops!: bind error: column count not match, expect 2, got 3

What did you see instead?
A PANIC occurs when the number of columns does not match the number of value expressions while binding Insert.

insert into blog_1 (id, title) values (0, 'KipSQL', '2022-10-12 12:22:33');
thread 'main' panicked at /home/jojo/code/open-source/kipsql/src/binder/insert.rs:52:52:
index out of bounds: the len is 2 but the index is 2
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
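The expected fix above amounts to validating arity before indexing into the column list. A minimal sketch of that check; the `BindError` name and signature are hypothetical, not FnckSQL's actual API:

```rust
// Hypothetical sketch: check the value count against the target column
// count and return a bind error, instead of indexing out of bounds.
#[derive(Debug, PartialEq)]
enum BindError {
    ColumnCountMismatch { expect: usize, got: usize },
}

fn check_insert_arity(expect: usize, got: usize) -> Result<(), BindError> {
    if expect != got {
        // Mirrors the desired message: "column count not match, expect 2, got 3"
        return Err(BindError::ColumnCountMismatch { expect, got });
    }
    Ok(())
}
```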

Insert operations where some columns are null are not supported

Bug Report

What version of KipSQL are you using?
latest

What version of Rust are you using?
nightly
What's the status of the running?

What did you do?
create table t1 (a int, b bool null);

insert into table t1 values (1, null);

Like BusTub, we should support inserting rows where only some of the columns are null.

What did you expect to see?
The insert succeeds, and the value is stored as e.g. `bool(null)`.

What did you see instead?
the value is 'NULL'

Reduce performance issues caused by `Build Plan`

Feature Request

I ran this case on a table with 200,000 rows: select * from t1 where c1 > 500 and c1 < 1000, and used pprof-rs to generate a flame graph.
[flame graph image]
The most obvious hotspot is StatisticMetaLoader::load, whose result is then cached, so I generated another flame graph after loading.
[flame graph image, after statistics are cached]
In the new flame graph, HepOptimizer::find_best is a relatively large performance bottleneck.

Perf: Optimize the loading of `StatisticsMetaLoader`

Feature Request

As mentioned in issue #162, StatisticsMetaLoader::load takes a considerable amount of time.
Currently, StatisticsMetaLoader loads a table's statistical information at table granularity. When a table has only a few frequently used indexes, statistics for the rarely used indexes are still loaded as well. We can reduce the granularity to increase the loading speed.

Tips: StatisticsMeta is stored at Index granularity

pub fn load(&self, table_name: TableName) -> Result<&Vec<StatisticsMeta>, DatabaseError> {
    let option = self.cache.get(&table_name);

    if let Some(statistics_metas) = option {
        Ok(statistics_metas)
    } else {
        let paths = self.tx.statistics_meta_paths(&table_name)?;
        let mut statistics_metas = Vec::with_capacity(paths.len());

        for path in paths {
            statistics_metas.push(StatisticsMeta::from_file(path)?);
        }

        Ok(self
            .cache
            .get_or_insert(table_name, |_| Ok(statistics_metas))?)
    }
}
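The finer-grained loading suggested above could cache statistics per (table, index) so that rarely used indexes are only loaded on demand. A minimal sketch under that assumption; all names here are illustrative, not FnckSQL's real types:

```rust
use std::collections::HashMap;

// Illustrative per-(table, index) cache: each index's serialized
// statistics are loaded lazily and at most once.
struct IndexStatsCache {
    cache: HashMap<(String, u64), Vec<u8>>, // (table, index_id) -> meta bytes
}

impl IndexStatsCache {
    fn load_or<F>(&mut self, table: &str, index_id: u64, load: F) -> &Vec<u8>
    where
        F: FnOnce() -> Vec<u8>,
    {
        // `load` only runs on a cache miss for this specific index.
        self.cache
            .entry((table.to_string(), index_id))
            .or_insert_with(load)
    }
}
```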

`SUBSTRING` enhancement

Bug Report

What version of KipSQL are you using?

latest

What version of Rust are you using?

1.77.0

What's the status of the running?

What did you do?
When I call the SUBSTRING function with a length larger than the string, the connection is reset.

What did you expect to see?

=> select substring('abc', 1, 10);
---------------------------------
 abc

What did you see instead?

=> select substring('abc', 1, 10);
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

I think it's better to handle this by returning all the characters from the start index onward (like PostgreSQL and MySQL).
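The clamping behavior described above can be sketched as follows (1-based start index, mirroring SQL's SUBSTRING; this is an illustration of the semantics, not FnckSQL's implementation):

```rust
// If the requested range runs past the end of the string, return
// everything from the start index onward instead of failing.
fn sql_substring(s: &str, start: usize, len: usize) -> String {
    let chars: Vec<char> = s.chars().collect();
    let begin = start.saturating_sub(1).min(chars.len());
    let end = begin.saturating_add(len).min(chars.len());
    chars[begin..end].iter().collect()
}
```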

catalog

Background

Build the most basic component of the database: the catalog. It provides metadata about database tables, which is used for the subsequent binder and logical-planner construction.
It is also a component that runs through the entire query-engine processing flow, supplying table metadata throughout.

To simplify implementation, there is no introduction of the concepts of database and schema here.
Under RootCatalog, a HashMap directly contains all TableCatalogs.
The TableCatalog contains all ColumnCatalogs.

root->table->column

Tasks

  • root
  • table
  • column
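The root->table->column hierarchy above might be shaped roughly like this; field names are illustrative, and the real catalogs carry more metadata (ids, constraints, etc.):

```rust
use std::collections::HashMap;

// Three-level catalog: RootCatalog -> TableCatalog -> ColumnCatalog.
struct ColumnCatalog {
    name: String,
    datatype: String,
    nullable: bool,
}

struct TableCatalog {
    name: String,
    columns: Vec<ColumnCatalog>,
}

struct RootCatalog {
    tables: HashMap<String, TableCatalog>,
}

impl RootCatalog {
    // Resolve a column by (table, column) name, as a binder would.
    fn column(&self, table: &str, column: &str) -> Option<&ColumnCatalog> {
        self.tables
            .get(table)?
            .columns
            .iter()
            .find(|c| c.name == column)
    }
}
```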

parser

Background

In KipSQL, we use sqlparser-rs to parse SQL.

The parser returns an AST, a tree that represents the user's query. For example, the SQL below will produce an AST like:

create table t1 (v1 int not null, v2 int);
AST: CreateTable {
    or_replace: false,
    temporary: false,
    external: false,
    global: None,
    if_not_exists: false,
    transient: false,
    name: ObjectName(
        [
            Ident {
                value: "t1",
                quote_style: None,
            },
        ],
    ),
    columns: [
        ColumnDef {
            name: Ident {
                value: "v1",
                quote_style: None,
            },
            data_type: Int(
                None,
            ),
            collation: None,
            options: [
                ColumnOptionDef {
                    name: None,
                    option: NotNull,
                },
            ],
        },
        ColumnDef {
            name: Ident {
                value: "v2",
                quote_style: None,
            },
            data_type: Int(
                None,
            ),
            collation: None,
            options: [],
        },
    ],
    constraints: [],
    hive_distribution: NONE,
    hive_formats: Some(
        HiveFormat {
            row_format: None,
            storage: None,
            location: None,
        },
    ),
    table_properties: [],
    with_options: [],
    file_format: None,
    location: None,
    query: None,
    without_rowid: false,
    like: None,
    clone: None,
    engine: None,
    default_charset: None,
    collation: None,
    on_commit: None,
    on_cluster: None,
    order_by: None,
}

Tasks

  • Implement it behind traits, so it is convenient to plug in your own parser later
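One possible shape for the trait-based design mentioned above, so alternative front ends can be swapped in later (all names here are hypothetical):

```rust
// A minimal parser abstraction: any front end that can turn SQL text
// into some AST type can implement this trait.
trait Parser {
    type Ast;
    type Error;
    fn parse(&self, sql: &str) -> Result<Self::Ast, Self::Error>;
}

// Toy implementation purely for illustration: "parses" by trimming.
struct EchoParser;

impl Parser for EchoParser {
    type Ast = String;
    type Error = ();
    fn parse(&self, sql: &str) -> Result<String, ()> {
        let s = sql.trim();
        if s.is_empty() {
            Err(())
        } else {
            Ok(s.to_string())
        }
    }
}
```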

Feature: drop table should support if_exists

Feature Request

Is your feature request related to a problem? Please describe:

Executing `drop table t` when table t does not exist returns an error; I think an if_exists parameter should be supported.
Oops!: executor error: storage_ap error: kipdb error

Describe the feature you'd like:

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:
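The requested behavior can be sketched like this: with if_exists, dropping a missing table becomes a no-op instead of an error (the error type and signature are illustrative, not FnckSQL's API):

```rust
use std::collections::HashMap;

// Drop a table from a toy catalog; `if_exists` suppresses the
// "does not exist" error, matching DROP TABLE IF EXISTS semantics.
fn drop_table(
    tables: &mut HashMap<String, ()>,
    name: &str,
    if_exists: bool,
) -> Result<(), String> {
    match tables.remove(name) {
        Some(_) => Ok(()),
        None if if_exists => Ok(()),
        None => Err(format!("table `{name}` does not exist")),
    }
}
```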

Fix null failure of insert part

Bug Report

When inserting null values, the DataType conversion fails.

The AST recognizes NULL in Insert Values as DataValue::Null, whereas in KipSQL a null of type i32 is represented as i32(None). Therefore, Binder::bind_insert should convert the DataType::Null of each expression in the columns according to the data type of the corresponding catalog entry (if the ColumnCatalog's DataType is DataType::i32, then DataValue::Null -> DataValue::i32(None)).
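The conversion described above can be sketched as follows, with a simplified stand-in for the real DataValue enum:

```rust
// Simplified stand-ins for FnckSQL's value and type enums.
#[derive(Debug, PartialEq)]
enum DataValue {
    Null,
    Int32(Option<i32>),
    Boolean(Option<bool>),
}

#[derive(Clone, Copy)]
enum LogicalType {
    Int32,
    Boolean,
}

// Re-type a bare NULL from the AST against the column's declared type;
// already-typed values pass through unchanged.
fn retype_null(value: DataValue, ty: LogicalType) -> DataValue {
    match (value, ty) {
        (DataValue::Null, LogicalType::Int32) => DataValue::Int32(None),
        (DataValue::Null, LogicalType::Boolean) => DataValue::Boolean(None),
        (v, _) => v,
    }
}
```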

Bug: In multi-level nested subquery, the field relationships of different tables are wrong.

Bug Report

It seems to project twice on the temp table, and I don't think it can handle more complex situations.

explain SELECT b FROM t1 WHERE a in (SELECT a FROM t2 where (select a from t1));
                               PLAN
-------------------------------------------------------------------
Projection [t1.b] [Project]                                      +
  LeftSemi Join On t1.a = (t2.a) as (_temp_table_2_.a) [HashJoin]+
    Scan t1 -> [a, b] [SeqScan]                                  +
    Projection [(t2.a) as (_temp_table_2_.a)] [Project]          +
      Projection [t2.a] [Project]                                +
        Inner Join Where _temp_table_1_.a [HashJoin]             +
          Scan t2 -> [a] [SeqScan]                               +
          Projection [(t2.a) as (_temp_table_1_.a)] [Project]    +
            Projection [t2.a] [Project]                          +
              Scan t1 -> [] [SeqScan]

I think giving every query its own binder context may be a solution, but it could be a lot of work.

#168 (comment)

csv storage

Feature Request

Is your feature request related to a problem? Please describe:

  • for test
  • support bound and projection

Describe the feature you'd like:

  • add csv storage

Teachability, Documentation, Adoption, Migration Strategy:

  • use arrow type

Bind column incorrectly

Bug Report

What version of KipSQL are you using?
latest

What version of Rust are you using?

rustc 1.77.0-nightly

What did you do?

explain select t.a from t1;
            PLAN
----------------------------
 Projection [t.a] [Project]+
   Scan t1 -> [] [SeqScan]
(1 row)

What did you expect to see?

some error message

Implement SQL Insert

Feature Request

Convert the SQL into a logical plan of Insert -> Projection -> Values

The new Values operator converts rows into a BoxedExecutor as a temporary data source

Cannot select from existing table after rebooting

Bug Report

What version of KipSQL are you using?

What version of Rust are you using?

What's the status of the running?

What did you do?
create table t1 (a int primary key, b int, k int)
insert into t1 (a, b, k) values (-99, 1, 1), (-1, 2, 2), (5, 2, 2)
quit
cargo run
select * from t1

What did you expect to see?
(-99, 1, 1), (-1, 2, 2), (5, 2, 2)

What did you see instead?
Oops!: bind error: invalid table bind table t1

TODO: SQL 2016

Feature Request

If you want to contribute code to FnckSQL and are looking for requirements, you can refer to the following list.

The corresponding test files are attached under each requirement's description; they live in the tests directory.
e.g. E011_02 => tests/slt/sql_2016/E011_02.slt

Tips: When you complete a requirement and submit the PR, please do not forget to delete the comment corresponding to the Case.

  • DataType: DOUBLE PRECISION
  • DataType: REAL
    • E011_02
  • DataType: NUMERIC
    • E011_03
  • DataType: CHAR/CHARACTER (VARYING)
    • E021_01
    • E021_02
    • E021_10
    • pr: #174
  • Function: CHARACTER_LENGTH()/CHAR_LENGTH()
    • E021_04
  • Function: OCTET_LENGTH()
    • E021_05
  • Function: SUBSTRING()
  • Function: LOWER()/UPPER()
    • E021_08
  • Function: TRIM()
    • E021_09
  • Function: POSITION()
  • Perf: Eliminate duplicate aggregations
    • E051_01: SELECT DISTINCT A, B FROM TABLE_E051_02_01_06 WHERE A = 1 GROUP BY A, B
    • pr: #132
  • TODO: Support AS() on Select
    • E051_07
    • E051_08
    • pr: #131
  • TODO: Support Aliases the result set
  • TODO: Rename columns in the FROM clause
  • TODO: Support BETWEEN on Where
  • TODO: Support ESCAPE on LIKE
  • TODO: Support subquery with ALL/ANY/SOME on WHERE
    • E061_07
    • E061_12
  • TODO: Support EXISTS on WHERE
    • E061_08
    • E071_06
  • TODO: Support Subquery on WHERE
  • TODO: Support Subquery on WHERE with IN/Not IN
    • E061_11
    • E061_13
    • pr: #147
  • TODO: Support UNION / UNION DISTINCT
  • TODO: Support UNION ALL
  • TODO: Support EXCEPT DISTINCT
    • E071_03
  • TODO: Columns combined via table operators need not have exactly the same data type
    • E071_05: SELECT A FROM TABLE_E071_05_01_011 UNION ALL SELECT B FROM TABLE_E071_05_01_012
  • TODO: Support SELECT INTO
  • TODO: Support Custom CONSTRAINT name
    • E141_01
  • TODO: Multiple primary keys
    • E141_03
    • E141_08
  • DataType: NAME
    • E141_07
  • Keyword: CURRENT_DATE
    • E141_07
    • F051_06
    • pr: #181
  • DataType: TIME
  • DataType: TIMESTAMP
    • E141_07
  • Keyword: CURRENT_TIMESTAMP
    • E141_07
  • Keyword: LOCALTIME
    • E141_07
    • F051_07
  • Keyword: LOCALTIMESTAMP
    • E141_07
    • F051_08
  • TODO: Support multiple transaction levels
    • E152_02
  • TODO: Updatable queries with subqueries
    • E153
  • TODO: Support VIEW & CREATE VIEW & DROP VIEW
    • F031_02
    • F032_16
  • TODO: EXCEPT with VIEW
    • F081
  • TODO: WHERE, GROUP BY, and HAVING clauses supported in queries with grouped views
    • F131_01
  • TODO: Multiple tables supported in queries with grouped views
    • F131_02
  • TODO: Set functions supported in queries with grouped views
    • F131_03
  • TODO: Subqueries with GROUP BY and HAVING clauses and grouped views
    • F131_04
  • TODO: Support Using on Join
    • F041_01
    • F041_02
    • F041_03
    • F041_04
    • F041_07
    • pr: #146
  • TypeString: TIME
    • F051_02
    • F051_04
    • F051_05
  • TypeString: TIMESTAMP
    • F051_03
    • F051_04
    • F051_05
  • TODO: Explicit defaults
  • TODO: Support CASE
  • TODO: Searched CASE
  • Keyword: NULLIF
  • Function: COALESCE
  • TODO: Support Subquery on Select

Support `like`

Feature Request

e.g. select v2 from t where v1 like 'G%';
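A small recursive matcher illustrating LIKE semantics ('%' matches any run of characters, '_' matches exactly one); this is a sketch of the semantics only, not FnckSQL's implementation:

```rust
// Match a string against a SQL LIKE pattern.
fn like_match(s: &str, pattern: &str) -> bool {
    fn go(s: &[char], p: &[char]) -> bool {
        match p.split_first() {
            None => s.is_empty(),
            // '%' may consume zero or more characters.
            Some((&'%', rest)) => (0..=s.len()).any(|i| go(&s[i..], rest)),
            // '_' consumes exactly one character.
            Some((&'_', rest)) => !s.is_empty() && go(&s[1..], rest),
            // Anything else must match literally.
            Some((c, rest)) => s.first() == Some(c) && go(&s[1..], rest),
        }
    }
    let sc: Vec<char> = s.chars().collect();
    let pc: Vec<char> = pattern.chars().collect();
    go(&sc, &pc)
}
```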

add path parameter for main

Feature Request

By default, kipsql provides `cargo run` for testing SQL statements. As an embedded database, kipsql also needs to perform some maintenance operations, and driving these from a CLI is more user-friendly than writing code. Therefore, reuse this test main directly and just add an optional path parameter.

e.g. cargo run -- --path '/home/kould/kipsql_data'
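A minimal sketch of the optional path argument using only the standard library (a real CLI would likely use a crate such as clap; the default path below is illustrative):

```rust
// Find `--path <value>` among the program arguments, falling back to
// a default data directory when the flag is absent.
fn data_path(args: &[String]) -> String {
    args.iter()
        .position(|a| a == "--path")
        .and_then(|i| args.get(i + 1))
        .cloned()
        .unwrap_or_else(|| "./fncksql_data".to_string())
}
```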

PG Wire support

Feature Request

skip startup step

  • Simple Query
  • Extended Query

Question:
KipSQL is currently mainly used as an embedded SQL database. Its closest counterpart is SQLite, but SQLite does not support remote connections.
It may therefore be somewhat redundant for KipSQL to support PG Wire; however, if it is used for operations and maintenance, it could be a good choice.

Tips:
This is a lower-priority issue (I think), because there is no good answer to the Question above.

[1] http://mysql.taobao.org/monthly/2020/03/02/
[2] https://zhuanlan.zhihu.com/p/493045524

DDL Support

Feature Request

Function

  • CURRENT_TIMESTAMP

Create Table

  • PRIMARY KEY (column_1, column_2, ...)
  • UNIQUE (column_1, column_2, ...)
  • Options
    • Default e.g. CREATE TABLE CUSTOMER (C_ID INTEGER DEFAULT '0' NOT NULL)

Ref: #103

Create Index

  • CREATE UNIQUE INDEX u_index ON table(column_1, column_2, ...);
  • Perf Unique Index Choose (options)

Alter Table

  • ALTER TABLE table ADD column_1 INTEGER; ref: #104
  • ALTER TABLE table MODIFY column_1 INTEGER;
  • ALTER TABLE table CHANGE COLUMN column_1 column_2 INTEGER;
  • ALTER TABLE table DROP COLUMN column_1; ref: #109
  • RENAME table TO new_table;

Ref:

Desc

Keyword splicing abbreviation in `TableCodec`

Feature Request

Use a certain format to simplify key generation
like: TableName + TypeNum + 0 + TypeId + 0 + DataValue
e.g. T0_Index_0_0_0_0000000000000000000 => T020000000000000000000000
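The key layout described above might be sketched as a simple byte concatenation (purely illustrative; the real TableCodec encoding may well differ):

```rust
// Build a compact key: TableName + TypeNum + 0 + TypeId + 0 + value bytes.
fn encode_key(table: &str, type_num: u8, type_id: u8, value: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(table.len() + 4 + value.len());
    key.extend_from_slice(table.as_bytes());
    key.push(type_num);
    key.push(0); // separator
    key.push(type_id);
    key.push(0); // separator
    key.extend_from_slice(value);
    key
}
```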

support filter operator

Feature Request

Is your feature request related to a problem? Please describe:
currently, we only support `select a from t1`
we should add a filter operation, e.g.
select * from t1 where a = 1

Describe the feature you'd like:
where a > 1, a < 1, a = 1
where c1 > c2
where c1 > c2 * 2
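Evaluating predicates like `where c1 > c2 * 2` requires walking a small expression tree per row. A toy sketch (not FnckSQL's expression types):

```rust
// Minimal expression tree over integer columns and literals.
enum Expr {
    Col(usize),
    Lit(i64),
    Mul(Box<Expr>, Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
}

// Evaluate an expression against one row; comparisons yield 1 or 0.
fn eval(e: &Expr, row: &[i64]) -> i64 {
    match e {
        Expr::Col(i) => row[*i],
        Expr::Lit(v) => *v,
        Expr::Mul(a, b) => eval(a, row) * eval(b, row),
        Expr::Gt(a, b) => (eval(a, row) > eval(b, row)) as i64,
    }
}
```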

Support `Decimal` type

Feature Request

TPC-H:

CREATE TABLE lineitem (
    l_orderkey INT,
    l_partkey INT,
    l_suppkey INT,
    l_linenumber INT,
    l_quantity DECIMAL(15,2),
    l_extendedprice DECIMAL(15,2),
    l_discount DECIMAL(15,2),
    l_tax DECIMAL(15,2),
    l_returnflag CHAR(1),
    l_linestatus CHAR(1),
    l_shipdate DATE,
    l_commitdate DATE,
    l_receiptdate DATE,
    l_shipinstruct CHAR(25),
    l_shipmode CHAR(10),
    l_comment VARCHAR(44),
    PRIMARY KEY (l_orderkey, l_linenumber)
);

Support `Text` type

Feature Request

Separate Varchar from Text, use a prefix of the same length, and combine it with the suffix to form a complete string.

Tips: The suffix of Varchar is string, and the suffix of Text is file.

support alter table

Feature Request

this issue is a sub-issue of #97

support alter table:

  • ALTER TABLE table ADD column_1 INTEGER;
  • ALTER TABLE table MODIFY column_1 INTEGER;
  • ALTER TABLE table CHANGE COLUMN column_1 column_2 INTEGER;
  • ALTER TABLE table DROP COLUMN column_1;
  • RENAME table TO new_table;

support sort operator

Feature Request

Is your feature request related to a problem? Please describe:
currently, we just support like select * from t1 where a=1
we should add order operation for exp.
select a, b from t1 order by a

Describe the feature you'd like:
order by a desc/asc
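The order operation above can be sketched as an in-memory sort keyed on one column, with an asc/desc flag (illustrative only; the real executor sorts arbitrary tuples):

```rust
// Sort rows by the value in column `col`, ascending or descending.
fn order_by(rows: &mut Vec<Vec<i64>>, col: usize, desc: bool) {
    rows.sort_by(|a, b| {
        let ord = a[col].cmp(&b[col]);
        if desc { ord.reverse() } else { ord }
    });
}
```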

Invalid field specified when inserting

Bug Report

let _ = kipsql.run("create table t1 (a int, b int)").await?;
let _ = kipsql.run("insert into t1 (b, a) values (1, 1), (3, 3), (5, 4)").await?;

let vec_batch_full_fields = kipsql.run("select * from t1").await?;
print_batches(&vec_batch_full_fields)?;


a b
1 1
3 3
5 4

Tips: Fix it by adding a Projection between Insert and Values
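The suggested Projection fix amounts to mapping the supplied values back into the table's declared column order. A minimal sketch (hypothetical helper, not FnckSQL's code):

```rust
// For each column in the table's declared order, find its position in
// the INSERT column list and pick the matching value; None if a
// declared column was not supplied.
fn reorder_values<T: Clone>(
    table_columns: &[&str],
    insert_columns: &[&str],
    values: &[T],
) -> Option<Vec<T>> {
    table_columns
        .iter()
        .map(|col| {
            insert_columns
                .iter()
                .position(|c| c == col)
                .map(|i| values[i].clone())
        })
        .collect()
}
```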

support desc table

Feature Request

we need support desc table to query table structure:

  • DESC table_name;

I will try to implement this

support update operator

Feature Request

  1. UPDATE table_name SET field1=new_value1, field2=new_value2 [WHERE clause]
  2. logical plan
  3. physical plan
  4. executor
  5. storage: replace the vector with a hash map

Support `Begin` and `Commit`

Feature Request

Extract KipDB transactions and put them under user control

Invalid:

The Update statement involves two transaction creations:

  1. scan
  2. insert

PR

binder

Background

The binder is a very important module in the entire database system.
Its function is to bind the AST generated by parsing with schema information, specifically including:

  1. Checking whether input names are legal, and whether there are duplicates or ambiguities
  2. Inferring the return type of expressions and checking whether it is legal
  3. Converting input names into internal IDs

Tasks

In this stage, you should implement binding for 3 types of statements

  • BoundCreateTable

    • schema id
    • table name
    • columns
  • BoundSelect

    • select list
    • from list
    • ...
  • BoundInsert

    • table ref id
    • column ids
    • column types
    • values

Perf: Composite Index Range Detach optimization

Feature Request

TODO: For the case select * from t1 where (c1 = 7 or c1 = 10) and c2 < 2;, the derived range is (-inf, (10, 2)); in fact it would be better to be ((7), (7, 2)), ((10), (10, 2))

on pr: #154 TODO part

Support `sub query`

Feature Request

e.g. SELECT * FROM employees WHERE salary > (SELECT salary FROM employees WHERE last_name = 'Abel');

Roadmap v0.1

Milestone

The sql-query-engine-rs v0.1 milestone is to build basic SQL queries on CSV storage, like:

DDL:

CREATE TABLE student (
    id      INTEGER PRIMARY KEY,
    name    VARCHAR NOT NULL,
    age     INTEGER
);

DQL:

SELECT * FROM t
SELECT a, b FROM t
SELECT a, b, 1 FROM t

DML:

INSERT INTO t(a, b) VALUES (1, 10)

Therefore, we should implement the following database components:

Reading and writing CSV

Feature Request

By supporting CSV reading and output, it provides testing, backup and other applications.
