ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

  • Authors: Ahmed Masry, Do Long, Jia Qing Tan, Shafiq Joty, Enamul Hoque
  • Paper Link: ChartQA

Updates

  • Added the VisionTaPas model.
  • Added the Mask-RCNN training and inference code used to generate the visual features for VL-T5.
  • Added the full ChartQA dataset (including the bounding-box annotations).
  • Added the T5 and VL-T5 model code along with instructions.
  • Added the first version of the ChartQA dataset (without the annotations folder).

ChartQA Dataset

First Version (does not have the annotations folder)

The ChartQA dataset is available in the ChartQA Dataset folder in this repository. You can also download it from the following Google Drive link: ChartQA Dataset

Full Version (with the annotations folder)

The full ChartQA dataset (including the annotations) can be downloaded from the following Google Drive link: Full ChartQA Dataset. The dataset has the following structure:

├── ChartQA Dataset                   
│   ├── train   
│   │   ├── train_augmented.json # ChartQA-M (machine-generated) questions/answers. 
│   │   ├── train_human.json     # ChartQA-H (human-authored) questions/answers. 
│   │   ├── annotations           # Chart Images Annotations Folder
│   │   │   ├── chart1_name.json
│   │   │   ├── chart2_name.json
│   │   │   ├── ...
│   │   ├── png                   # Chart Images Folder
│   │   │   ├── chart1_name.png
│   │   │   ├── chart2_name.png
│   │   │   ├── ...
│   │   ├── tables                # Underlying Data Tables Folder
│   │   │   ├── chart1_name.csv
│   │   │   ├── chart2_name.csv
│   │   │   ├── ...
│   ├── val
│   │   ├── ...
│   └── test
│       ├── ...
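
The layout above can be walked with a few lines of Python. The following is a minimal sketch, not part of the released code: the function name `list_chart_files` is hypothetical, and it assumes that a chart's image, table, and annotation share the same base file name (as the `chart1_name.*` entries in the tree suggest) and that every split uses the same sub-folder names as `train`.

```python
from pathlib import Path


def list_chart_files(dataset_root, split):
    """Pair each chart image in a split with its table and annotation paths.

    Assumption: files in png/, tables/ and annotations/ share a base name.
    """
    split_dir = Path(dataset_root) / split
    records = []
    for png_path in sorted((split_dir / "png").glob("*.png")):
        name = png_path.stem  # e.g. "chart1_name"
        table_path = split_dir / "tables" / f"{name}.csv"
        annotation_path = split_dir / "annotations" / f"{name}.json"
        records.append({
            "name": name,
            "image": png_path,
            # None marks a file that is absent on disk, e.g. when using
            # the first dataset version, which has no annotations folder.
            "table": table_path if table_path.exists() else None,
            "annotation": annotation_path if annotation_path.exists() else None,
        })
    return records
```

Calling `list_chart_files("ChartQA Dataset", "train")` would return one dictionary per chart image. Returning `None` for absent files, rather than raising an error, keeps the sketch usable with either dataset version.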

Note: To produce the annotations (e.g., bounding boxes) for the charts, we processed the SVG files of these charts automatically. However, some of the SVG files were corrupt, noisy, or missing, so the annotations provided in this dataset are somewhat noisy. Moreover, the Pew Research Center chart images did not have any SVG files when we crawled them, so we had to annotate them manually, using some heuristics to accelerate the annotation process.

Each annotation JSON file has the following format (similar to the PlotQA and FigureQA datasets):

models: A list of dictionaries, where each dictionary contains the following keys:
    ** Bar and Line Charts **
      name: The legend label of the data points (bars, line).
      color: Color of the data points (bars, line).
      bboxes: Bounding boxes of the data points (bars, line segments).
      x: x-value of the data points.
      y: y-value of the data points.
    ** Pie Charts **
      name: The label of the pie slice.
      color: Color of the pie slice.
      bbox: Bounding box of the pie slice.
      value: Value of the pie slice.
      text_label: Text label of the pie slice.
      text_bbox: Bounding box of the text label.
      points: Coordinates of the start/end/center points of the pie slice.

type: Chart Type (v_bar, h_bar, line, pie).

general_figure_info: A dictionary containing the following keys:
    title: Bounding box and the text corresponding to the title of the plot.
    x_axis: Bounding boxes and axis labels corresponding to the x-axis of the chart image.
    y_axis: Bounding boxes and axis labels corresponding to the y-axis of the chart image.
    legend: Bounding boxes and labels corresponding to the legend of the chart image.
    figure_info: Bounding box corresponding to the plot area of the chart image.
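
To illustrate how the format above might be consumed, here is a small sketch that loads one annotation file and summarizes its `models` entries. The function names `load_annotation` and `summarize_annotation` are hypothetical. The sketch assumes that `bboxes` is a list with one entry per bar or line segment; it does not assume any particular layout for an individual bounding box, since that is not specified here. Because the annotations are noisy, optional keys are read defensively.

```python
import json


def load_annotation(path):
    """Read one annotation JSON file into a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_annotation(annotation):
    """Return the chart type and one summary row per entry in `models`."""
    chart_type = annotation["type"]  # v_bar, h_bar, line, or pie
    rows = []
    for model in annotation["models"]:
        if chart_type == "pie":
            # Each pie entry describes one slice with a single `bbox`.
            rows.append({
                "name": model.get("name"),
                "value": model.get("value"),
                "num_boxes": 1 if "bbox" in model else 0,
            })
        else:
            # Bar and line entries describe one series with many `bboxes`
            # (assumed here to be a list).
            rows.append({
                "name": model.get("name"),
                "value": None,
                "num_boxes": len(model.get("bboxes", [])),
            })
    return chart_type, rows
```

For example, `summarize_annotation(load_annotation(path))` on a bar chart would yield one row per legend entry, with `num_boxes` counting the bars in that series.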

Models

VL-T5

Please refer to VL-T5

T5

Please refer to T5

VisionTapas

Please refer to VisionTapas

Contact

If you have any questions about this work, please contact Ahmed Masry at the following email address: [email protected]. Please note that my school email address, which was mentioned in the paper ([email protected]), has been deactivated since I have graduated.

Reference

Please cite our paper if you use our models or dataset in your research.

@inproceedings{masry-etal-2022-chartqa,
    title = "{C}hart{QA}: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning",
    author = "Masry, Ahmed  and
      Long, Do  and
      Tan, Jia Qing  and
      Joty, Shafiq  and
      Hoque, Enamul",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2022",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-acl.177",
    doi = "10.18653/v1/2022.findings-acl.177",
    pages = "2263--2279",
}
