table-meets-llm's Introduction

Table meets LLM

SUC is a benchmark for evaluating the table structural understanding capabilities of LLMs. It was proposed in the paper "Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study".

Benchmark Setting

In SUC, we design several specific tasks for each structural understanding capability, ordered by increasing difficulty. The relevant code is in unified_benchmark_generator.py. The benchmark supports zero-shot and 1-shot settings as well as multiple input choices. More detailed commands can be found here.

# Multiple arguments can be set up when generating the benchmark
--dataset, choices=[tabfact, feverous, sqa, hybridqa, totto], help="choose which dataset you intend to use in your experiments"
--liearize_list, choices=[nl_sep, html, markdown, mark_down_grid, json, xml, latex], help="choose which linearization function you want to use, currently, SUC supports using nl_sep, html, markdown, mark_down_grid, json, xml, latex"
--use_partition_mark, help="whether to use partition mark in your experiments"
--use_format_explanation, help="whether to use format explanation in your experiments"
--use_role_prompting, help="whether to use role prompting in your experiments"
--swap_input_order, help="whether to put external text (such as questions or statements) ahead of tables"
--objective, choices=[oneshot, zero-shot], help="whether to add 1-shot example to the prompt"

# find examples of multiple arguments in unified_benchmark_generate.sh
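
The input-design options above can be read as switches on how a prompt is assembled. The following is a minimal sketch under stated assumptions, not the code in unified_benchmark_generator.py: the function name build_prompt, the role sentence, the `<table>` partition marks, and the format explanation wording are all hypothetical and only show where each option could act.

```python
def build_prompt(table_text, question, *, use_partition_mark=False,
                 use_format_explanation=False, use_role_prompting=False,
                 swap_input_order=False, example=None):
    """Assemble a prompt from a linearized table and an external text.

    Every literal string below is a hypothetical placeholder; the real
    benchmark may word these pieces differently.
    """
    parts = []
    if use_role_prompting:
        # Hypothetical role sentence placed at the top of the prompt.
        parts.append("You are a helpful assistant for table analysis.")
    if use_format_explanation:
        # Hypothetical description of the serialization format.
        parts.append("The table below is serialized row by row.")
    if example is not None:
        # A single worked example turns the zero-shot prompt into 1-shot.
        parts.append("Example:\n" + example)

    table_block = table_text
    if use_partition_mark:
        # Hypothetical marks that delimit where the table starts and ends.
        table_block = "<table>\n" + table_text + "\n</table>"
    question_block = "Question: " + question

    # swap_input_order puts the external text ahead of the table.
    if swap_input_order:
        parts.extend([question_block, table_block])
    else:
        parts.extend([table_block, question_block])
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("| name | age |\n| Ann | 30 |", "How many rows are there?",
                       use_partition_mark=True, swap_input_order=True))
```

Keeping each option as an independent keyword argument makes it straightforward to enumerate combinations when comparing input designs.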

Downstream Tasks Setting

Our comparisons on SUC show that several combinations of input designs strongly affect LLM performance on SUC tasks. In the paper, we give guidance on how to apply these benchmark insights to improve LLM performance on downstream tasks, and we examine self-augmented prompting, which uses the LLM's own knowledge to generate additional knowledge. The code for downstream tasks is in unified_babel_convertor.py. The downstream task setting supports both manual prompt engineering and self-augmented prompting. Multiple prompt choices can be found in config.py.

cd table_meets_llm
# generate table/databases downstream tasks
python unified_babel_convertor.py --task cosql dart tabfact feverous hybridqa spider totto sql2text logic2text sqa webqsp --objective zero --split train validation --unified --unified_file_output ./exps/downstream_tasks_20230113_log/

# generate self-augmented information
python unified_babel_convertor.py --task totto tabfact hybridqa sqa feverous --objective oneshot --heuristic heur_8 --split validation --unified --unified_file_output  ./exps/downstream_tasks_20230120_self_augmented_p2_log/heur_8  --linear_func html
python unified_babel_convertor.py --task totto tabfact hybridqa sqa feverous --objective oneshot --heuristic heur_9 --split validation --unified --unified_file_output ./exps/downstream_tasks_20230120_self_augmented_p2_log/heur_9  --linear_func html
python unified_babel_convertor.py --task totto tabfact hybridqa sqa feverous --objective oneshot --heuristic heur_10 --split validation --unified --unified_file_output ./exps/downstream_tasks_20230120_self_augmented_p2_log/heur_10 --linear_func html


# more detailed information can be found in unified_babel_convertor.sh
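
Self-augmented prompting, as described above, first asks the LLM for additional knowledge about the table and then reuses that knowledge in the downstream prompt. Below is a minimal two-stage sketch. The function self_augmented_answer, the `llm` callable, and both prompt wordings are assumptions made for illustration; they are not taken from unified_babel_convertor.py or config.py, and the heuristics heur_8 to heur_10 are not modeled here.

```python
from typing import Callable


def self_augmented_answer(llm: Callable[[str], str],
                          table_text: str, question: str) -> str:
    """Two-stage prompting with a hypothetical `llm(prompt) -> str` callable."""
    # Stage 1: ask the model to generate additional knowledge about the table.
    knowledge_prompt = (
        table_text
        + "\n\nDescribe the structure and the key values of this table."
    )
    knowledge = llm(knowledge_prompt)

    # Stage 2: append the generated knowledge to the downstream prompt.
    final_prompt = (
        f"{table_text}\n\n"
        f"Additional knowledge: {knowledge}\n\n"
        f"Question: {question}"
    )
    return llm(final_prompt)


if __name__ == "__main__":
    # Stand-in model that only echoes the prompt length, so the sketch
    # runs without any API access.
    def echo_llm(prompt: str) -> str:
        return f"(model output for a {len(prompt)}-character prompt)"

    print(self_augmented_answer(echo_llm, "| name | age |\n| Ann | 30 |",
                                "Who is listed in the table?"))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client.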

Additional Info

  • Dataset collection code can be found in scripts/dataset_collection. We use Hugging Face datasets as our data loader.
  • Serialization Functions can be found in utils/structured_data_linearize.py.
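
To illustrate what a serialization (linearization) function does, here is a minimal sketch that turns one small table into markdown, HTML, and JSON strings. The function names to_markdown, to_html, and to_json and their signatures are hypothetical; they are not the functions in utils/structured_data_linearize.py, whose exact output format may differ.

```python
import json
from html import escape


def to_markdown(header, rows):
    """Serialize a table as a markdown pipe table."""
    lines = [
        "| " + " | ".join(str(h) for h in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(str(c) for c in row) + " |" for row in rows]
    return "\n".join(lines)


def to_html(header, rows):
    """Serialize a table as a single-line HTML <table> element."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in header)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def to_json(header, rows):
    """Serialize a table as a JSON list with one object per row."""
    return json.dumps([dict(zip(header, row)) for row in rows])


if __name__ == "__main__":
    header = ["name", "age"]
    rows = [["Ann", 30], ["Bob", 25]]
    for serialize in (to_markdown, to_html, to_json):
        print(serialize(header, rows))
        print()
```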

Reference

@misc{sui2023gpt4table,
      title={Table meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study}, 
      author={Yuan Sui and Mengyu Zhou and Mingjie Zhou and Shi Han and Dongmei Zhang},
      year={2023},
      eprint={2305.13062},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Table Provider

To be continued.

table-meets-llm's Issues

Request for Dataset

Dear author,
I recently downloaded your sample generation code from the GitHub repository and tried running it. My code encountered some issues and was unable to run successfully. I have tried various methods to solve this problem, but have not been successful.
Could you please provide the dataset after running the sample code? This will be very helpful for me to better understand and use your work.
Thank you for your time and help.
Sincerely
