nvBench: Natural Language to Visualization (NL2VIS) Benchmarks

nvBench is a large dataset for complex, cross-domain NL2VIS tasks. It covers 105 domains, supports seven common visualization types, and contains 25,750 (NL, VIS) pairs. This repository contains the NL2VIS corpus in both JSON and Vega-Lite formats.

Introduction to nvBench

  • nvBench.json stores the JSON format of (NL, VIS) pairs in the nvBench benchmark.

  • nvBench_VegaLite contains all (NL, VIS) pairs in the nvBench benchmark, and renders the VIS using the Vega-Lite visualization library.

  • database contains all databases used by the nvBench benchmark.
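
As a rough illustration of how a visualization object could map to a Vega-Lite specification, the sketch below builds a spec dictionary from the vis_obj fields described in the nvBench.json section. This is not the repository's own rendering code: the function name to_vega_lite, the internal field names "x" and "y", and the choice to use only the first data series are assumptions made for the example.

```python
import json

def to_vega_lite(vis_obj):
    """Build a minimal Vega-Lite spec (as a dict) from a vis_obj.

    Hypothetical helper: handles only pie and bar charts and only the
    first series in x_data / y_data.
    """
    xs = vis_obj["x_data"][0]
    ys = vis_obj["y_data"][0]
    # Use neutral field names and carry the original names as titles.
    values = [{"x": x, "y": y} for x, y in zip(xs, ys)]
    spec = {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": values},
    }
    if vis_obj["chart"] == "pie":
        spec["mark"] = "arc"
        spec["encoding"] = {
            "theta": {"field": "y", "type": "quantitative", "title": vis_obj["y_name"]},
            "color": {"field": "x", "type": "nominal", "title": vis_obj["x_name"]},
        }
    else:
        # Fallback for this sketch: treat everything else as a bar chart.
        spec["mark"] = "bar"
        spec["encoding"] = {
            "x": {"field": "x", "type": "nominal", "title": vis_obj["x_name"]},
            "y": {"field": "y", "type": "quantitative", "title": vis_obj["y_name"]},
        }
    return spec

example_vis_obj = {
    "chart": "pie",
    "x_name": "Rank",
    "y_name": "CNT(Rank)",
    "x_data": [["AssocProf", "AsstProf", "Instructor", "Professor"]],
    "y_data": [[8, 15, 8, 27]],
    "classify": [],
}
print(json.dumps(to_vega_lite(example_vis_obj), indent=2))
```

The printed JSON can be pasted into any Vega-Lite viewer. Stacked and grouped charts would additionally need the classify data, which this sketch ignores.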

nvBench.json

(NL, VIS) JSON format

Each (NL, VIS) pair is represented as a JSON object in nvBench.json, with the following fields:

  • key: the id of the (NL, VIS) pair in the nvBench benchmark.
  • vis_query: contains the query for VIS, with two parts: vis_part and data_part.
  • chart: the visualization type, one of Bar, Pie, Line, Scatter, Stacked Bar, Grouping Line, and Grouping Scatter.
  • db_id: the database the visualization is drawn from.
  • vis_obj: the JSON representation of a visualization object, with chart (chart type), x_name (name of the X-axis), y_name (name of the Y-axis), x_data (data for the X-axis), y_data (data for the Y-axis), and classify (Z-axis data, used for stacked bar, grouping line, and grouping scatter charts).
  • nl_queries: contains the NL queries for querying this visualization object.

Below is an example:

"8": {
        "vis_query": {
            "vis_part": "Visualize PIE",
            "data_part": {
                "sql_part": "SELECT Rank , COUNT(Rank) FROM Faculty GROUP BY Rank",
                "binning": ""
            },
            "VQL": "Visualize PIE SELECT Rank , COUNT(Rank) FROM Faculty GROUP BY Rank"
        },
        "chart": "Pie",
        "hardness": "Easy",
        "db_id": "activity_1",
        "vis_obj": {
            "chart": "pie",
            "x_name": "Rank",
            "y_name": "CNT(Rank)",
            "x_data": [
                [
                    "AssocProf",
                    "AsstProf",
                    "Instructor",
                    "Professor"
                ]
            ],
            "y_data": [
                [
                    8,
                    15,
                    8,
                    27
                ]
            ],
            "classify": [],
            "describe": "GROUP BY Rank"
        },
        "nl_queries": [
            "A pie chart showing the number of faculty members for each rank.",
            "What is the number of the faculty members for each rank? Return a pie.",
            "Compute the total the number of rank across rank as a pie chart."
        ]
    }
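
A minimal sketch of reading such a record in Python follows. It assumes, based on the example above, that the top level of nvBench.json is an object keyed by pair id and that the first inner list of x_data and y_data holds the first data series. The helper name rows_from_vis_obj is hypothetical, and the sample is a shortened copy of the record above.

```python
import json

# One-record excerpt in the layout shown above (values copied from the example).
SAMPLE = """
{
  "8": {
    "vis_query": {
      "vis_part": "Visualize PIE",
      "data_part": {
        "sql_part": "SELECT Rank , COUNT(Rank) FROM Faculty GROUP BY Rank",
        "binning": ""
      },
      "VQL": "Visualize PIE SELECT Rank , COUNT(Rank) FROM Faculty GROUP BY Rank"
    },
    "chart": "Pie",
    "hardness": "Easy",
    "db_id": "activity_1",
    "vis_obj": {
      "chart": "pie",
      "x_name": "Rank",
      "y_name": "CNT(Rank)",
      "x_data": [["AssocProf", "AsstProf", "Instructor", "Professor"]],
      "y_data": [[8, 15, 8, 27]],
      "classify": [],
      "describe": "GROUP BY Rank"
    },
    "nl_queries": [
      "A pie chart showing the number of faculty members for each rank."
    ]
  }
}
"""

def rows_from_vis_obj(vis_obj):
    """Pair the x and y values of the first series as (x, y) tuples."""
    xs = vis_obj["x_data"][0]
    ys = vis_obj["y_data"][0]
    return list(zip(xs, ys))

# For the real file, replace json.loads(SAMPLE) with json.load() on an
# open handle to nvBench.json (assumed to share this top-level layout).
records = json.loads(SAMPLE)
for key, record in records.items():
    rows = rows_from_vis_obj(record["vis_obj"])
    print(key, record["chart"], record["db_id"], rows)
```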

Citation

When you use the nvBench dataset and the corresponding baseline models, we would appreciate it if you cite the following:

@inproceedings{nvBench_SIGMOD21,
  author    = {Yuyu Luo and
               Nan Tang and
               Guoliang Li and
               Chengliang Chai and
               Wenbo Li and
               Xuedi Qin},
  title     = {Synthesizing Natural Language to Visualization (NL2VIS) Benchmarks from NL2SQL Benchmarks},
  booktitle = {Proceedings of the 2021 International Conference on Management of
               Data, {SIGMOD} Conference 2021, June 20–25, 2021, Virtual Event, China},
  publisher = {{ACM}},
  year      = {2021},
}

NL2VIS Baselines

Please adapt the Seq2Seq baselines from the Spider repository: replace the data preprocessing step and feed in the (NL, VIS) pairs of nvBench for training and testing.
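
One possible shape for that preprocessing step is sketched below. It flattens each record into one (source, target) example per NL query. Using the VQL string as the target sequence is an assumption made for illustration, and build_pairs is a hypothetical helper; neither is taken from the Spider baselines or the paper.

```python
def build_pairs(records):
    """Flatten nvBench-style records into seq2seq examples.

    Each NL query becomes one example whose target is the record's VQL
    string (an assumption for this sketch).
    """
    pairs = []
    for key, record in records.items():
        target = record["vis_query"]["VQL"]
        for nl in record["nl_queries"]:
            pairs.append({
                "id": key,
                "db_id": record["db_id"],
                "source": nl,
                "target": target,
            })
    return pairs

# Tiny in-memory stand-in for the parsed contents of nvBench.json.
records = {
    "8": {
        "vis_query": {
            "VQL": "Visualize PIE SELECT Rank , COUNT(Rank) FROM Faculty GROUP BY Rank"
        },
        "db_id": "activity_1",
        "nl_queries": [
            "A pie chart showing the number of faculty members for each rank.",
            "What is the number of the faculty members for each rank? Return a pie.",
        ],
    }
}

for pair in build_pairs(records):
    print(pair["source"], "->", pair["target"])
```

Keeping db_id on every example makes it possible to split training and test data by database, should a cross-domain evaluation be wanted.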

Publications

For more details, please refer to our research paper.

Contributors

#  Contributor  Affiliation                                           Contact
1  Guoliang Li  Professor, Tsinghua University                        [email protected]
2  Nan Tang     Senior Scientist, Qatar Computing Research Institute  [email protected]
3  Yuyu Luo     PhD Student, Tsinghua University                      [email protected]

If you have any questions or feedback about this project, please feel free to contact Yuyu Luo ([email protected]).

License

nvBench is available under the MIT license.
