

NL2LF

(Continuously updated...)
Recent update log:

1. STRUG: Structure-Grounded Pretraining for Text-to-SQL
2. SmBoP: Semi-autoregressive Bottom-up Semantic Parsing
3. SDSQL: Improving Text-to-SQL with Schema Dependency Learning

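Resources for Natural Language to Logical Form (NL2LF) research, focusing on NL2SQL first.
This collection gathers research material on converting natural language to logical forms. At this stage it mainly covers NL2SQL, including public evaluation datasets, related papers with partial code implementations, and related blog posts and WeChat official-account articles.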

NL2SQL
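I. Main evaluation datasets (dataset)
II. Main papers, methods, and code implementations (papers&code)
    1. WikiSQL
    2. Spider
III. Extended related resources (extend-resources)
    1. Related Works
    2. SQL2Seq
    3. Graph Neural Networks (GNN)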

NL2SQL & Text2SQL

I. Main Evaluation Datasets (DataSet)










II. Main Papers, Methods, and Code Implementations (Papers&Code)

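Papers are evaluated mainly on WikiSQL and Spider; see each task's home page for the corresponding leaderboard.
The representative methods are collected below, with further updates and additions to come...
Note: each score row has the form | model | Dev accuracy | Test accuracy |. Exe_score denotes execution accuracy.
Log_score denotes logical-form accuracy; on Spider, Log_score does not include value prediction.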
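
To make the difference between the two metrics concrete, here is a minimal Python sketch. It is not the official evaluation script of either benchmark. It assumes a hypothetical `players` table in an in-memory SQLite database, and it approximates logical-form accuracy with a case- and whitespace-insensitive string comparison, which is a simplification.

```python
import sqlite3


def normalize(sql):
    # Lowercase, drop semicolons, collapse whitespace (a simplifying assumption).
    return " ".join(sql.lower().replace(";", " ").split())


def logical_match(pred_sql, gold_sql):
    # Logical-form style check: the two queries must have the same form.
    return normalize(pred_sql) == normalize(gold_sql)


def execution_match(conn, pred_sql, gold_sql):
    # Execution style check: the two queries must return the same rows.
    try:
        pred_rows = conn.execute(pred_sql).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to execute counts as wrong
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare as unordered collections; repr() keeps the sort type-safe.
    return sorted(map(repr, pred_rows)) == sorted(map(repr, gold_rows))


# Hypothetical toy database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "A", 25), ("Bob", "B", 18), ("Cy", "A", 30)],
)

gold = "SELECT name FROM players WHERE points > 20"
pred = "SELECT name FROM players WHERE NOT (points <= 20)"

print("logical match:  ", logical_match(pred, gold))
print("execution match:", execution_match(conn, pred, gold))
```

In this toy case the predicted query differs in form from the gold query but returns the same rows, so it counts toward execution accuracy and not toward logical-form accuracy.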

1. WikiSQL:







  • Schema Dependency Guided 🔥🔥

    Performs multi-task learning that exploits the dependency relations between the question and the schema.

    Paper

    Exe_score

    | Model | Dev accuracy | Test accuracy |
    | --- | --- | --- |
    | SDSQL + EG | 92.5 | 92.4 |
    | SDSQL | 88.7 | 88.8 |


  • Information Extraction Approach 🔥🔥

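    Information extraction approach: a unified BERT-based extraction model identifies the slot types mentioned in the query, combining sequence labeling, relation extraction, and text-matching-based linking.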

    Paper

    Exe_score

    | Model | Dev accuracy | Test accuracy |
    | --- | --- | --- |
    | BERT-IE-SQL + EG | 92.6 | 92.5 |
    | BERT-IE-SQL | 88.7 | 88.8 |


  • MRC Approach 🔥

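    Machine reading comprehension approach: unlike traditional slot-filling methods, this approach casts NL2SQL as a question-answering problem and predicts the different slots through a unified MRC framework.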

    Paper

    Code

    Exe_score

    | Model | Dev accuracy | Test accuracy |
    | --- | --- | --- |
    | BERT-MRC-SQL + STILTs training + AGG enhancement | 87.8 | 87.4 |
    | BERT-MRC-SQL + STILTs training | 86.2 | 86.0 |
    | BERT-MRC-SQL | 85.9 | 85.9 |
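
As a rough illustration of casting slot prediction as question answering (the MRC approach listed above), the following Python sketch builds one MRC-style (query, context) pair per SQL slot. The slot names, the prompt wording, and the `[SEP]`-joined context format are all hypothetical and are not taken from the paper. The point is only that a single reader model can serve every slot when the slot is encoded in the query.

```python
from dataclasses import dataclass


@dataclass
class MRCExample:
    slot: str      # which SQL slot this example asks about
    query: str     # the MRC "question" describing the slot
    context: str   # the passage the answer is read from


# Hypothetical slot prompts, for illustration only.
SLOT_PROMPTS = {
    "select_column": "Which column should be returned?",
    "aggregation": "Which aggregation function applies to the returned column?",
    "where_column": "Which columns are constrained by a condition?",
    "where_value": "Which span of the question is the value of each condition?",
}


def build_mrc_examples(question, columns):
    # Assumed context layout: the user question followed by the table header.
    context = question + " [SEP] " + " | ".join(columns)
    return [
        MRCExample(slot=slot, query=prompt, context=context)
        for slot, prompt in SLOT_PROMPTS.items()
    ]


examples = build_mrc_examples(
    "How many points did Ann score?", ["name", "team", "points"]
)
for example in examples:
    print(example.slot, "->", example.query)
```

Each example would be fed to the same reader model, and the answers across slots would then be assembled into one SQL query.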




2. Spider:









  • SmBoP

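    Compared with top-down autoregressive parsing, a semi-autoregressive bottom-up parser has several advantages. First, because the sub-trees at each decoding step are generated in parallel, the theoretical runtime is logarithmic rather than linear. Second, the bottom-up approach learns representations of meaningful semantic sub-programs at each step, rather than of semantically vague partial trees. Finally, SmBoP's Transformer-based layers contextualize the sub-trees with one another, so that, unlike traditional beam search, each tree is scored conditioned on the other trees that have been explored.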

    Paper

    Code https://github.com/OhadRubin/SmBop

    Log_score

    | Model | Dev accuracy | Test accuracy |
    | --- | --- | --- |
    | SmBoP + GraPPa (DB content used) | 74.7 | 69.5 |
    | SmBoP + BART | 66.0 | 60.5 |

    Exe_score

    | Model | Dev accuracy | Test accuracy |
    | --- | --- | --- |
    | SmBoP + GraPPa (DB content used) | - | 71.1 |






  • GAZP 🆕

    GAZP combines a forward semantic parser with a backward utterance generator to synthesize data (e.g. utterances and SQL queries) in the new environment, then selects cycle-consistent examples to adapt the parser. Unlike data augmentation, which typically synthesizes unverified examples in the training environment, GAZP synthesizes examples in the new environment whose input-output consistency is verified.

    Paper

    Exe_score

    | Model | Dev accuracy | Test accuracy |
    | --- | --- | --- |
    | GAZP + BERT | - | 53.5 |
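
The cycle-consistency selection described for GAZP can be sketched in Python as follows. This is a minimal sketch and not the authors' code: the generator and parser are stand-in lookup tables rather than trained models, and consistency is checked by comparing normalized query strings, which is an assumption made here for simplicity.

```python
def normalize(sql):
    # Case- and whitespace-insensitive comparison (a simplifying assumption).
    return " ".join(sql.lower().split())


def select_cycle_consistent(sampled_queries, generate_utterance, parse):
    """Keep (utterance, query) pairs whose round trip reproduces the query."""
    kept = []
    for query in sampled_queries:
        utterance = generate_utterance(query)  # backward model: SQL -> utterance
        reparsed = parse(utterance)            # forward model: utterance -> SQL
        if reparsed is not None and normalize(reparsed) == normalize(query):
            kept.append((utterance, query))
    return kept


# Hypothetical stand-ins for the two trained models.
TOY_GENERATOR = {
    "SELECT name FROM singer": "list all singer names",
    "SELECT count(*) FROM concert": "how many concerts are there",
}
TOY_PARSER = {
    "list all singer names": "select name from singer",
    # Deliberately wrong parse, so this pair is filtered out.
    "how many concerts are there": "SELECT count(*) FROM stadium",
}

adaptation_data = select_cycle_consistent(
    list(TOY_GENERATOR), TOY_GENERATOR.get, TOY_PARSER.get
)
print(adaptation_data)
```

Only the pair whose parse reproduces the sampled query survives. In the setting described above, the surviving pairs from the new environment are what the parser is adapted on.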


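The bottom-up, semi-autoregressive decoding described in the SmBoP entry above can be illustrated with a toy Python sketch. It is not the SmBoP implementation: the leaves are plain integers, the only operators are `+` and `*`, and the learned scorer is replaced by a hypothetical distance-to-target score. What it does show is the control flow. At every step, all combinations of trees in the beam are built in one batch, the top K are kept, and the maximum tree height grows by one, so a balanced tree over n leaves needs about log2(n) steps.

```python
from itertools import product


def bottom_up_search(leaves, target, beam_size=4, steps=2):
    """Toy bottom-up beam search over arithmetic expression trees."""
    # Each beam entry is (expression string, value); step 0 holds the leaves.
    beam = [(str(v), v) for v in leaves]
    for _ in range(steps):
        # "Keep" every current tree, then add every binary combination.
        # All candidates of a step are independent of each other, which is
        # what makes the step parallelizable in a real model.
        candidates = dict(beam)
        for (left, lv), (right, rv) in product(beam, repeat=2):
            candidates[f"({left} + {right})"] = lv + rv
            candidates[f"({left} * {right})"] = lv * rv
        # Hypothetical scorer: closer to the target is better; ties are
        # broken by shorter, then lexicographically smaller, expressions.
        ranked = sorted(
            candidates.items(),
            key=lambda item: (abs(item[1] - target), len(item[0]), item[0]),
        )
        beam = ranked[:beam_size]
    return beam[0]


expr, value = bottom_up_search([1, 2, 3, 4], target=21, beam_size=4, steps=2)
print(expr, "=", value)
```

With these four leaves, two steps are enough to build a height-2 tree whose value is 21, whereas a single step cannot get closer than 16.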


III. Extended Related Resources (extend resources)

1. Related Works
1.1 Pre-training 🔥🔥🔥

A novel weakly supervised Structure-Grounded pretraining framework (STRUG) for text-to-SQL that can effectively learn to capture text-table alignment based on a parallel text-table corpus.

A new method for Text-to-SQL parsing, Grammar Pre-training (GP), is proposed to decode deep relations between the question and the database.

An effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data.

A pretrained language model that jointly learns representations for NL sentences and (semi-)structured tables.

Adapting a semantic parser trained on a single language.

1.2 Systems
1.3 Surveys
1.4 Blogs
1.5 Other Papers
1.6 Tools
2. SQL2Seq

Paper

Code

3. Graph Neural Networks (GNN)

Paper

Code

Contributors

baeseulki
