This project is forked from declare-lab/cicero.

This repository contains the dataset and code used in our ACL 2022 paper, CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues.

License: MIT License

CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues

[Figure: CICERO Inferences]

Introducing CICERO, a new dataset for dialogue reasoning with contextualized commonsense inference. It contains 53K inferences across five commonsense dimensions (cause, subsequent event, prerequisite, motivation, and emotional reaction), collected from 5.6K dialogues. To show the usefulness of CICERO for dialogue reasoning, we design several challenging generative and multi-choice answer selection tasks for state-of-the-art NLP models to solve.

Read the paper

Data Format

The CICERO dataset can be found in the data directory. Each line of each file is a JSON object representing a single instance. The JSON objects have the following key-value pairs:

| Key | Value |
| --- | --- |
| ID | Dialogue ID with dataset indicator. |
| Dialogue | Utterances of the dialogue in a list. |
| Target | Target utterance. |
| Question | One of the five questions (inference types). |
| Choices | Five possible answer choices in a list. One of the answers is human written. The other four answers are machine generated and selected through the Adversarial Filtering (AF) algorithm. |
| Human Written Answer | Index of the human written answer in a single-element list. Index starts from 0. |
| Correct Answers | List of all correct answers indicated as plausible or speculatively correct by the human annotators. Includes the index of the human written answer. |
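The constraints described in the table can be checked mechanically. The following is a minimal sketch, not code from this repository: `EXPECTED_KEYS` and `validate_instance` are illustrative names, and it assumes that `Correct Answers` holds indices into `Choices`, which is how the example instance in this README reads.

```python
from typing import Any

# Keys listed in the data format table.
EXPECTED_KEYS = {
    "ID", "Dialogue", "Target", "Question",
    "Choices", "Human Written Answer", "Correct Answers",
}


def validate_instance(instance: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    missing = EXPECTED_KEYS - instance.keys()
    if missing:
        return [f"missing keys: {sorted(missing)}"]

    problems = []
    choices = instance["Choices"]
    human = instance["Human Written Answer"]
    correct = instance["Correct Answers"]

    # The table states that there are five answer choices.
    if len(choices) != 5:
        problems.append(f"expected 5 choices, found {len(choices)}")

    # The human written answer is a single-element list with a 0-based index.
    if len(human) != 1:
        problems.append("'Human Written Answer' should hold exactly one index")
    elif not 0 <= human[0] < len(choices):
        problems.append("human written index is out of range")
    elif human[0] not in correct:
        # The table states that the correct answers include the human written one.
        problems.append("human written index is not among the correct answers")

    # Assumption: every entry of 'Correct Answers' is an index into 'Choices'.
    if any(not 0 <= index < len(choices) for index in correct):
        problems.append("a correct-answer index is out of range")

    return problems
```

A check like this is mainly useful as a sanity pass before training or evaluation; it only covers the properties stated in the table.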

An example of the data is shown below.

{
    "ID": "daily-dialogue-1291",
    "Dialogue": [
        "A: Hello , is there anything I can do for you ?",
        "B: Yes . I would like to check in .",
        "A: Have you made a reservation ?",
        "B: Yes . I am Belen .",
        "A: So your room number is 201 . Are you a member of our hotel ?",
        "B: No , what's the difference ?",
        "A: Well , we offer a 10 % charge for our members ."
    ],
    "Target": "Well , we offer a 10 % charge for our members .",
    "Question": "What subsequent event happens or could happen following the target?",
    "Choices": [
        "For future discounts at the hotel, the listener takes a credit card at the hotel.",
        "The listener is not enrolled in a hotel membership.",
        "For future discounts at the airport, the listener takes a membership at the airport.",
        "For future discounts at the hotel, the listener takes a membership at the hotel.",
        "The listener doesn't have a membership to the hotel."
    ],
    "Human Written Answer": [
        3
    ],
    "Correct Answers": [
        3
    ]
}
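Because each line of a data file is one JSON object, the files can be read line by line. The sketch below assumes that layout; `read_instances` and `human_written_answer` are illustrative helper names, not functions provided by this repository.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def read_instances(path: Path) -> Iterator[dict[str, Any]]:
    """Yield one instance per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, such as a trailing newline
                yield json.loads(line)


def human_written_answer(instance: dict[str, Any]) -> str:
    """Return the text of the human written choice (index is 0-based)."""
    index = instance["Human Written Answer"][0]
    return instance["Choices"][index]
```

For the example instance above, `human_written_answer` would return the choice at index 3, "For future discounts at the hotel, the listener takes a membership at the hotel." To iterate over a real split, pass the path of a file from the data directory to `read_instances`.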

Experiments

The details of the answer selection (MCQ) experiments can be found here. The details of the answer generation (NLG) experiments can be found here.

[Figure: CICERO Tasks]

Citation

CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues. Deepanway Ghosal, Siqi Shen, Navonil Majumder, Rada Mihalcea, and Soujanya Poria. ACL 2022.
