dtstack / dt-sql-parser

SQL Parsers for BigData, built with antlr4.

Home Page: https://dtstack.github.io/monaco-sql-languages/

License: MIT License

JavaScript 0.05% TypeScript 96.95% ANTLR 3.00% Shell 0.01% PLpgSQL 0.01%
bigdata sql parser antlr4 autocompletion flink hive impala mysql postgresql

dt-sql-parser's People

Contributors

aretecode, cythia828, dependabot[bot], haydenorz, hsunboy, jackwang032, liuxy0551, luckyfbb, mortalyoung, mumiao, nankanull, onlyflyer, profbramble, salvoravida, wewoor, xigua-jn, yuchen-ecnu

dt-sql-parser's Issues

Documentation error: `text` property is missing from result

The Type of SQL
Standard

Your Code

import { GenericSQL } from 'dt-sql-parser';

const parser = new GenericSQL()
const sql = 'select id,name,sex from user1;'
const tokens = parser.getAllTokens(sql)
console.log(tokens)

Problem
The output is expected to be

/*
[
    {
        channel: 0
        column: 0
        line: 1
        source: [SqlLexer, InputStream]
        start: 0
        stop: 5
        tokenIndex: -1
        type: 137
        _text: null
        text: "SELECT"
    },
    ...
]
*/

But the `text` property is missing in the latest 4.0.0-beta.* versions.

Is anyone there?

The Type of SQL
e.g. Impala

Your Code
e.g. parser.parserSql(....);

Problem
e.g. the code does not work.

Roadmap 2022

Q3

  • Improve FlinkSQL and add unit tests
  • Improve SparkSQL and add unit tests
  • Bug fixes
  • Investigate approaches for supporting autocomplete

Q4

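How can I get the JSON data for the entire AST?

After obtaining the tree object via parse, how can it be converted into JSON data for the whole AST?

Below are some example nodes showing the result I want: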

{
    "type": "createDatabase",
    "children": [
        
    ]
}


{
    "type": "SelectElements",
    "children": [
        
    ]
}

{
    "type": "TableName",
    "value": "user"
}
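
One possible approach, sketched under assumptions: ANTLR4's JavaScript runtime exposes `children` and `getText()` on parse-tree nodes, and rule contexts carry a `ruleIndex` that can be looked up in the parser's `ruleNames`. The names `TreeLike`, `JsonNode`, and `treeToJson` below are hypothetical and not part of dt-sql-parser, and the mock tree only stands in for the object returned by parse.

```typescript
// Hypothetical minimal shape of a parse-tree node; real ANTLR4 nodes have more members.
interface TreeLike {
    ruleIndex?: number; // assumed to be present on rule contexts only
    children?: TreeLike[] | null;
    getText(): string;
}

interface JsonNode {
    type: string;
    value?: string;
    children?: JsonNode[];
}

// Recursively convert a node into a plain, JSON-serializable object.
function treeToJson(node: TreeLike, ruleNames: string[]): JsonNode {
    const kids = node.children ?? [];
    const type = node.ruleIndex !== undefined ? ruleNames[node.ruleIndex] : 'Terminal';
    if (kids.length === 0) {
        // Leaf: a terminal token (or an empty rule), keep its text as the value.
        return { type, value: node.getText() };
    }
    return { type, children: kids.map((child) => treeToJson(child, ruleNames)) };
}

// Mock tree standing in for the result of parsing "select * from user".
const leaf = (text: string): TreeLike => ({ getText: () => text });
const rule = (ruleIndex: number, children: TreeLike[]): TreeLike => ({
    ruleIndex,
    children,
    getText: () => children.map((c) => c.getText()).join(''),
});

const mockRuleNames = ['selectStatement', 'tableName'];
const mockTree = rule(0, [leaf('select'), leaf('*'), leaf('from'), rule(1, [leaf('user')])]);

const astJson = JSON.stringify(treeToJson(mockTree, mockRuleNames), null, 4);
console.log(astJson);
```

With a real parser, the mock tree and `mockRuleNames` would be replaced by the parsed tree and the generated parser's rule-name table, if the installed version exposes them.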

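Can it support xx.xxx.xxx as t1?

Is it possible to support table names with two levels of qualifiers, for example database1.mode.tableName? At the moment this seems to raise an error. Thank you very much!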

tsc compilation errors

Version information

  • dt-sql-parser version: 4.0.0-beta.2.2
  • tsc version: 4.5.4
$ tsc -v
Version 4.5.4

Error messages

Running npx tsc produces the following errors:

node_modules/dt-sql-parser/dist/lib/flinksql/FlinkSqlParserListener.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof FlinkSqlParserListener;
                 ~

node_modules/dt-sql-parser/dist/lib/flinksql/FlinkSqlParserVisitor.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof FlinkSqlParserVisitor;
                 ~

node_modules/dt-sql-parser/dist/lib/generic/SqlParserListener.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof SqlParserListener;
                 ~

node_modules/dt-sql-parser/dist/lib/generic/SqlParserVisitor.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof SqlParserVisitor;
                 ~

node_modules/dt-sql-parser/dist/lib/hive/HiveSqlListener.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof HiveSqlListener;
                 ~

node_modules/dt-sql-parser/dist/lib/hive/HiveSqlVisitor.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof HiveSqlVisitor;
                 ~

node_modules/dt-sql-parser/dist/lib/plsql/PlSqlParserListener.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof PlSqlParserListener;
                 ~

node_modules/dt-sql-parser/dist/lib/plsql/PlSqlParserVisitor.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof PlSqlParserVisitor;
                 ~

node_modules/dt-sql-parser/dist/lib/spark/SparkSqlListener.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof SparkSqlListener;
                 ~

node_modules/dt-sql-parser/dist/lib/spark/SparkSqlVisitor.d.ts:4:16 - error TS1005: '(' expected.

4     constructor: typeof SparkSqlVisitor;
                 ~

Found 10 errors.

Is there a logic problem in the cleanSql method?

/**
 * Remove comments and leading/trailing whitespace
 * @param {String} sql
 */
function cleanSql(sql: string) {
    sql.trim(); // remove leading and trailing whitespace
    const tokens = lexer(sql);
    let resultSql = '';
    let startIndex = 0;
    tokens.forEach((ele: Token) => {
        if (ele.type === TokenType.Comment) {
            resultSql += sql.slice(startIndex, ele.start);
            startIndex = ele.end + 1;
        }
    });
    resultSql += sql.slice(startIndex);
    return resultSql;
}

The whitespace-trimming line above has no effect on the source sql string, because trim() returns a new string instead of modifying the original. Should it be changed to the following?

/**
 * Remove comments and leading/trailing whitespace
 * @param {String} origiSql
 */
function cleanSql(origiSql: string) {
    const sql = origiSql.trim(); // remove leading and trailing whitespace
    const tokens = lexer(sql);
    let resultSql = '';
    let startIndex = 0;
    tokens.forEach((ele: Token) => {
        if (ele.type === TokenType.Comment) {
            resultSql += sql.slice(startIndex, ele.start);
            startIndex = ele.end + 1;
        }
    });
    resultSql += sql.slice(startIndex);
    return resultSql;
}
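
The difference can be checked in isolation. The sketch below is not the library's code: `CommentRange` and `stripComments` are hypothetical stand-ins for the `Token` objects and the comment-removal loop, and the comment position is located by hand instead of by `lexer`. It shows that a bare `trim()` call leaves the string untouched, and that the slicing loop must run on the same string the offsets were computed from.

```typescript
// Hypothetical stand-in for a comment token: inclusive start/end offsets.
interface CommentRange {
    start: number;
    end: number;
}

// Same slicing loop as in the issue, with comment ranges passed in directly.
function stripComments(sql: string, comments: CommentRange[]): string {
    let result = '';
    let startIndex = 0;
    for (const c of comments) {
        result += sql.slice(startIndex, c.start);
        startIndex = c.end + 1;
    }
    return result + sql.slice(startIndex);
}

const raw = '  select 1 /* note */ from t  ';

raw.trim(); // return value discarded: raw still has its surrounding spaces
const lengthAfterBareTrim = raw.length;

const trimmed = raw.trim(); // strings are immutable, so the result must be kept
const commentStart = trimmed.indexOf('/*');
const commentEnd = trimmed.indexOf('*/') + 1; // index of the closing '/'
const cleaned = stripComments(trimmed, [{ start: commentStart, end: commentEnd }]);
console.log(JSON.stringify(cleaned));
```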

Nested parentheses in SQL cause an infinite loop in splitSql

The Type of SQL
Any SQL type

Your Code

import { splitSql } from 'dt-sql-parser';

const sql = `with category(name, cn_name) as (values('测试名称(test), 名称'))`
const sqlList = splitSql(sql)
console.log(sqlList)

Problem
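Running the code above causes an infinite loop.
Cause:
matchFunction in src/utils/index.ts only matches a left parenthesis against the next right parenthesis and does not account for other left parentheses in between.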
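
A depth counter is one common way to handle nested brackets. The sketch below is an assumption about how such a matcher could look, not the code in src/utils/index.ts; `findMatchingParen` is a hypothetical name. It also skips single-quoted string literals so that brackets inside them, as in the example above, are not counted, and it returns -1 instead of looping when no match exists.

```typescript
// Return the index of the ')' matching the '(' at openIndex, or -1 if unbalanced.
function findMatchingParen(sql: string, openIndex: number): number {
    let depth = 0;
    let inString = false;
    for (let i = openIndex; i < sql.length; i++) {
        const ch = sql[i];
        if (inString) {
            if (ch === '\\') {
                i++; // skip the escaped character
            } else if (ch === "'") {
                inString = false;
            }
            continue;
        }
        if (ch === "'") {
            inString = true;
        } else if (ch === '(') {
            depth++;
        } else if (ch === ')') {
            depth--;
            if (depth === 0) return i;
        }
    }
    return -1; // the loop is bounded by sql.length, so it always terminates
}

const nestedSql = "with category(name, cn_name) as (values('测试名称(test), 名称'))";
const outerOpen = nestedSql.indexOf('(', nestedSql.indexOf(' as '));
const outerClose = findMatchingParen(nestedSql, outerOpen);
console.log(outerOpen, outerClose);
```

Because the scan is a single bounded `for` loop, unbalanced input ends with -1 rather than a hang.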

parser validate sql too slow

The Type of SQL
hive

Your Code

const { HiveSQL } = require('dt-sql-parser');
const parser = new HiveSQL();

const sql = `select 
from_unixtime(unix_timestamp('20210728','yyyyMMdd'),'yyyy-MM-dd') as stat_date,
-- from_unixtime(unix_timestamp('20210728','yyyyMMdd'),'yyyy-MM-dd HH:mm:ss') as stat_time,
'5' as track_type,
tenant_id,
-- 5  as track_type,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
count(DISTINCT uid_now) as now_cnt,
sum(case when uid_last is NULL then 0 else 1 end) as return_user_cnt 
from (select a.tenant_id,
        a.uid_now,
        b.uid_last 
from ( SELECT tenant_id ,
                user_id as uid_now 
        FROM tezign_dw.dw_fact_web_tracking_log_add_h
        where dt is not null
        and biz_line_id = 'd14eae7480b640cab48af3db8567d298' and product_line_id = '0358a581866642d9b8e6c1c94e5a5f00' AND is_page_flag =1 
        and cast(create_date as date) > DATE_ADD(from_unixtime(unix_timestamp('20210728','yyyyMMdd'),'yyyy-MM-dd'),-7)
        and cast(create_date as date) <= from_unixtime(unix_timestamp('20210728','yyyyMMdd'),'yyyy-MM-dd')
        and tenant_id not in ('t2', 't3', 't4', 't11', 't32', 't39', 't52', 't53', 't127', 't141', 't147', 't168', 't169', 't171',
           't172', 't173', 't174', 't178', 't187', 't192', 't197', 't199', 't202') 
        and tenant_id is not NULL and length(user_id)<11 
        group by tenant_id,user_id) a 
left join (SELECT tenant_id,
                    user_id as uid_last 
            FROM tezign_dw.dw_fact_web_tracking_log_add_h
            where dt is not null
            and biz_line_id = 'd14eae7480b640cab48af3db8567d298' and product_line_id = '0358a581866642d9b8e6c1c94e5a5f00' AND is_page_flag =1 
            and cast(create_date as date)> DATE_ADD(from_unixtime(unix_timestamp('20210728','yyyyMMdd'),'yyyy-MM-dd'),-14)
            and cast(create_date as date)<= DATE_ADD(from_unixtime(unix_timestamp('20210728','yyyyMMdd'),'yyyy-MM-dd'),-7)
            and tenant_id not in ('t2', 't3', 't4', 't11', 't32', 't39', 't52', 't53', 't127', 't141', 't147', 't168',
                         't169', 't171', 't172', 't173', 't174', 't178', 't187', 't192', 't197', 't199', 't202') 
            and tenant_id is not NULL and length(user_id)<11 
            group by tenant_id,user_id) b 
on a.tenant_id=b.tenant_id 
and a.uid_now = b.uid_last 
group by a.tenant_id,a.uid_now,b.uid_last) c 
group by tenant_id`;

const errors = parser.validate(sql);

console.log(errors);

Problem
Validating this SQL is too slow; it takes about 10 seconds.

validate is inaccurate for Hive SQL

The Type of SQL

hive sql

Your Code

import {HiveSQL} from 'dt-sql-parser';

const parser = new HiveSQL();
const sqlStr = 'select * from abc from cc;'; // invalid syntax
const errors = parser.validate(sqlStr);
console.log('errors:', errors);

Problem
No error information is printed.

SQL parsing time performance test

Run a basic test of the time it takes to parse SQL:

  • 100 lines of SQL
  • 1000 lines of SQL
  • 5000 lines of SQL

The main metric is parsing time.
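
A minimal timing harness for such a test might look like the sketch below. It is an assumption about the setup, not an existing benchmark in the repository: `buildSql` and `timeParse` are hypothetical helpers, and `placeholderParse` marks the spot where a real call such as `(sql) => new HiveSQL().validate(sql)` would go.

```typescript
// Build a script with the requested number of one-line statements.
function buildSql(lineCount: number): string {
    const lines: string[] = [];
    for (let i = 0; i < lineCount; i++) {
        lines.push(`select id, name from table_${i} where id = ${i};`);
    }
    return lines.join('\n');
}

// Run parse once on the given SQL and return the elapsed wall-clock milliseconds.
function timeParse(parse: (sql: string) => unknown, sql: string): number {
    const startedAt = Date.now();
    parse(sql);
    return Date.now() - startedAt;
}

// Placeholder: replace with the real parser call under test.
const placeholderParse = (sql: string): number => sql.split('\n').length;

const timings = [100, 1000, 5000].map((lineCount) => ({
    lineCount,
    elapsedMs: timeParse(placeholderParse, buildSql(lineCount)),
}));
console.log(timings);
```

A single run per size is noisy; repeating each measurement several times and reporting the median would give steadier numbers.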

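Complete the FlinkSQL grammar according to the official documentation (targeting Flink 1.16) and add unit tests

Insert

  1. Add multi-statement (statement set) support (
    in Flink 1.14 and earlier the syntax is BEGIN STATEMENT SET; ... END;
    from Flink 1.15 onward it is EXECUTE STATEMENT SET BEGIN ... END;
    )
  2. Add columnList

Drop

  1. Add catalog deletion (drop catalog)
  2. Add temporary table deletion (drop temporary table)
  3. Complete the unit tests

Alter

  1. Add support for modifying views (alter view)
  2. Fix the alter function syntax and specify the supported languages

Create

  1. Add the create table as select syntax (1.16)
  2. Add the create temporary table syntax (the official syntax section does not list it, but other parts of the documentation do describe temporary tables, so it is added here)
  3. Fix how temporary functions are specified in the create function syntax
  4. Add language to create function, along with its usage

describe

  1. Add desc

explain

  1. Add syntax support for explainDetails
  2. Revise the existing plan for syntax
  3. Add explain <statement_set>
  4. Add explain <insert_statement>

Use

  1. Add use module

Show

  1. Improve the syntax

(To be continued...)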

Provide declaration file

dt-sql-parser ships no default type declaration file, so TypeScript reports the following error:

Could not find a declaration file for module 'dt-sql-parser'

Parsing fails for certain SQL: the trailing semicolon is not stripped automatically

The Type of SQL
SparkSQL

Your Code

select regexp_replace('abc', '\'', '233') ;
select regexp_replace('abc', '\'fefefef', '233') ; select * from table_test;

Problem
After calling splitSql, the first line of SQL yields:

select regexp_replace('abc', '\'', '233') ;

Expected result:

select regexp_replace('abc', '\'', '233') 

After calling splitSql, the second line of SQL yields the following and is not recognized as two statements:

select regexp_replace('abc', '\'', '233') ; select * from table_test;

Expected result, recognized as two statements:

select regexp_replace('abc', '\'', '233') ; 

select * from table_test;
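
The splitting behaviour asked for here can be sketched with a small scanner that tracks whether it is inside a quoted literal and treats a backslash as escaping the next character. This is an illustration under assumptions, not the library's splitSql: `splitStatements` is a hypothetical name, comments are not handled, and each statement keeps its trailing semicolon (stripping it would be a separate design choice).

```typescript
// Split on ';' outside of quoted literals; '\' escapes the next character inside a literal.
function splitStatements(sql: string): string[] {
    const statements: string[] = [];
    let quote: string | null = null; // the opening quote character, if inside a literal
    let start = 0;
    for (let i = 0; i < sql.length; i++) {
        const ch = sql[i];
        if (quote !== null) {
            if (ch === '\\') {
                i++; // skip the escaped character, e.g. the quote in \'
            } else if (ch === quote) {
                quote = null;
            }
        } else if (ch === "'" || ch === '"' || ch === '`') {
            quote = ch;
        } else if (ch === ';') {
            statements.push(sql.slice(start, i + 1).trim());
            start = i + 1;
        }
    }
    const rest = sql.slice(start).trim();
    if (rest.length > 0) statements.push(rest); // statement without a final ';'
    return statements;
}

// String.raw keeps the backslashes exactly as they appear in the SQL text.
const twoStatements = String.raw`select regexp_replace('abc', '\'fefefef', '233') ; select * from table_test;`;
const splitResult = splitStatements(twoStatements);
console.log(splitResult);
```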

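Infinite loop in the lexer function

[image]
There is a problem with this code.

If the input ends with a comment, this falls into an infinite loop. For example:
let sql = 'select 1 --test';

Suggestion: change while (char !== '\n') to while (char !== '\n' && current < input.length)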
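
The effect of the extra bounds check can be seen in a reduced form of the loop. The original code is only available as an image, so the sketch below is a reconstruction under assumptions: `skipLineComment` is a hypothetical name, and `input` and `current` are borrowed from the suggested condition.

```typescript
// Advance past a '--' line comment starting at `current`; return the index just after it.
function skipLineComment(input: string, current: number): number {
    let char = input[current];
    // Without the `current < input.length` check, a comment that ends the input
    // never reaches '\n': char becomes undefined and the loop never exits.
    while (char !== '\n' && current < input.length) {
        current++;
        char = input[current];
    }
    return current;
}

const trailingComment = 'select 1 --test';
const commentAt = trailingComment.indexOf('--');
const afterComment = skipLineComment(trailingComment, commentAt);
console.log(afterComment === trailingComment.length);
```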

Unable to integrate it with my react app

Hi Team,

I'm unable to integrate it with my React app, which was created using the command 'npx create-react-app my-app'.
When I try to run the code, it gives me the error below in the terminal. I've tried it with both Node versions 16 and 14.

Starting the development server...

<--- Last few GCs --->

[1317:0x7fea1e200000] 21952 ms: Scavenge 1010.6 (1033.9) -> 1007.1 (1034.1) MB, 1.4 / 0.0 ms (average mu = 0.194, current mu = 0.170) allocation failure
[1317:0x7fea1e200000] 21956 ms: Scavenge 1011.3 (1034.6) -> 1008.2 (1034.9) MB, 1.4 / 0.0 ms (average mu = 0.194, current mu = 0.170) allocation failure
[1317:0x7fea1e200000] 21960 ms: Scavenge 1012.3 (1035.6) -> 1008.9 (1035.9) MB, 1.4 / 0.0 ms (average mu = 0.194, current mu = 0.170) allocation failure

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0x10152a515 node::Abort() (.cold.1) [/usr/local/bin/node]
2: 0x10022b989 node::Abort() [/usr/local/bin/node]
3: 0x10022baff node::OnFatalError(char const*, char const*) [/usr/local/bin/node]
4: 0x1003ab2c7 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
5: 0x1003ab263 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
6: 0x10054c975 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/bin/node]
7: 0x1005509bd v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/usr/local/bin/node]
8: 0x10054d29d v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/usr/local/bin/node]
9: 0x10054a7bd v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
10: 0x1005494d8 v8::internal::Heap::HandleGCRequest() [/usr/local/bin/node]
11: 0x1004f5cd1 v8::internal::StackGuard::HandleInterrupts() [/usr/local/bin/node]
12: 0x1008d9ca8 v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*) [/usr/local/bin/node]
13: 0x100c83dd9 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/usr/local/bin/node]

Here is the package.json file code.

{
  "name": "my-app",
  "version": "0.1.0",
  "private": true,
  "dependencies": {
    "@testing-library/jest-dom": "^5.15.0",
    "@testing-library/react": "^11.2.7",
    "@testing-library/user-event": "^12.8.3",
    "dt-sql-parser": "^4.0.0-beta.2.2",
    "react": "^17.0.2",
    "react-dom": "^17.0.2",
    "react-scripts": "4.0.3",
    "web-vitals": "^1.1.2"
  },
  "scripts": {
    "start": "node --max_old_space_size=4092 & react-scripts start",
    "build": "node --max_old_space_size=4092 & react-scripts build",
    "test": "react-scripts test",
    "eject": "react-scripts eject"
  },
  "eslintConfig": {
    "extends": [
      "react-app",
      "react-app/jest"
    ]
  },
  "browserslist": {
    "production": [
      ">0.2%",
      "not dead",
      "not op_mini all"
    ],
    "development": [
      "last 1 chrome version",
      "last 1 firefox version",
      "last 1 safari version"
    ]
  }
}

I also tried export NODE_OPTIONS=--max_old_space_size=4096 in the terminal, but it doesn't work. I'm running the app on a Mac.

Anyone, please help me with this.

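With dt-sql-parser 4.0.0-beta.3.2, vite 2.9, and vue 3.2, the project works in the development environment after importing dt-sql-parser and yarn build reports no errors, but after deployment the browser reports TypeError: Cannot read properties of undefined (reading 'prototype') at generic.560e0fae.js:1:30666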

import FlinkSQL from "dt-sql-parser/dist/parser/flinksql";

sqlParse(val) {
    const parser = new FlinkSQL();
    // Only the first validation error, if any, is shown
    const validParser = parser.validate(val)[0];
    const decora = this.decorations || [];
    let newDecora = [];
    if (validParser) {
        const warningLabel = `Error while compiling statement: at line ${validParser.startLine}:${validParser.startCol}, ${validParser.message}`;
        const range = new monaco.Range(
            validParser.startLine,
            validParser.startCol,
            validParser.endLine,
            validParser.endCol + 1
        );

        newDecora = [
            {
                range,
                options: {
                    inlineClassName: "inlineDecoration",
                    glyphMarginClassName: "glyphMarginClass",
                    hoverMessage: {
                        value: warningLabel
                    }
                }
            }
        ];
        this.decorations = this.instance.deltaDecorations(decora, newDecora);
    } else {
        // No errors: clear any existing decorations
        this.decorations = this.instance.deltaDecorations(decora, []);
    }
}

UglifyJs Unexpected token: punc «,»

"dt-sql-parser": "^3.0.2"

When building for production with webpack, uglifyjs-webpack-plugin reports an error:

...from UglifyJs
Unexpected token: punc «,» [./node_modules/dt-sql-parser/lib/core/comment.js:120,24
