atoke's Introduction

ATOKE

This is the repository for our paper [AAAI 2024] History Matters: Temporal Knowledge Editing in Large Language Model.

Datasets

Overview

The ATOKE benchmark contains three temporal knowledge editing datasets.

The datasets are included in datasets/. There are three files:

  • AToKe-SE.json: Single Edit dataset.
  • AToKe-ME.json: Multiple Edits dataset.
  • AToKe-EE.json: Extending Edit dataset.
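A minimal loading sketch, assuming each file is a JSON list of instance dicts and that the script runs from the repository root. The helper name `load_atoke` is ours for illustration and is not part of the repository.

```python
import json
from pathlib import Path


def load_atoke(path):
    """Load one ATOKE file and return its list of instance dicts."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the top-level JSON value is a list of data instances.
    if not isinstance(data, list):
        raise ValueError(f"expected a list of dicts in {path}")
    return data


if __name__ == "__main__":
    # Assumption: the working directory is the repository root.
    for name in ("AToKe-SE.json", "AToKe-ME.json", "AToKe-EE.json"):
        path = Path("datasets") / name
        if path.exists():
            print(name, len(load_atoke(path)), "instances")
```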

Data format

Each dataset is saved as a list of dicts, each of which represents one data instance. We use AToKe-ME, which has the most complex format, as the example; AToKe-SE and AToKe-EE are both simplifications of it. An AToKe-ME instance is shown below.

{
    "case_id": 3,
    "requested_rewrite": [
        {   ### Note: Multi-step editing - first step
            "prompt": "{}'s team is",
            "time_prompt": "From 1947 to 1948, {} is a player of",
            "target_new": {"str": "Southampton F.C.", "id": "<1!Phb4wLjn>"},
            "target_true": {"str": "Manchester United F.C.", "id": "<zUrH0rVXjH>"},
            "subject": "Billy Wrigglesworth",
            "relation": "<playsFor>", "relation_id": "<Xu7>",
            "time_new": {"since": 1947, "until": 1948, "gptj": false},
            "time_true": {"since": 1937, "until": 1947, "gptj": true},
            "evaluation": {
                "completion": "Billy Wrigglesworth's team is",
                "questions": "Which club does Billy Wrigglesworth play for?",
                "paraphrase_completion": "Billy Wrigglesworth plays for",
                "time_completion": "From 1947 to 1948, Billy Wrigglesworth is a player of",
                "time_questions": "Which club does Billy Wrigglesworth affiliate with from 1947 to 1948?",
                "paraphrase_time_completion": "From 1947 to 1948, Billy Wrigglesworth plays for"
            }
        },
        {   ### Note: Multi-step editing - second step
            "prompt": "{}'s team is",
            "time_prompt": "From 1948 to 1953, {} is a player of",
            "target_new": {"str": "Arsenal F.C.", "id": "<6zNA4ZR7x?>"},
            "target_true": {"str": "Southampton F.C.", "id": "<1!Phb4wLjn>"},
            "subject": "Billy Wrigglesworth",
            "relation": "<playsFor>", "relation_id": "<Xu7>",
            "time_new": {"since": 1948, "until": 1953, "gptj": false},
            "time_true": {"since": 1947, "until": 1948, "gptj": false},
            "evaluation": {
                "completion": "Billy Wrigglesworth's team is",
                "questions": "Which club does Billy Wrigglesworth play for?",
                "paraphrase_completion": "Billy Wrigglesworth plays for",
                "time_completion": "From 1948 to 1953, Billy Wrigglesworth is a player of",
                "time_questions": "Which club does Billy Wrigglesworth affiliate with from 1948 to 1953?",
                "paraphrase_time_completion": "From 1948 to 1953, Billy Wrigglesworth plays for"
            }
        }
    ],
    "history_evaluation": [
        {
            "completion": "Billy Wrigglesworth used to play for",
            "questions": "Which team did Billy Wrigglesworth play for before?",
            "paraphrase_completion": "Billy Wrigglesworth's previous team was",
            "time_completion": "From 1937 to 1947, Billy Wrigglesworth's team was",
            "time_questions": "Which club did Billy Wrigglesworth play for from 1937 to 1947?",
            "paraphrase_time_completion": "From 1937 to 1947, Billy Wrigglesworth played for"
        },
        {
            "completion": "Billy Wrigglesworth used to play for",
            "questions": "Which team did Billy Wrigglesworth play for before?",
            "paraphrase_completion": "Billy Wrigglesworth's previous team was",
            "time_completion": "From 1947 to 1948, Billy Wrigglesworth's team was",
            "time_questions": "Which club did Billy Wrigglesworth play for from 1947 to 1948?",
            "paraphrase_time_completion": "From 1947 to 1948, Billy Wrigglesworth played for"
        }
    ],
    "answer": [["Manchester United F.C."], ["Southampton F.C."]],
    "answer_alias": [["Manchester Red Devils", ...],["Soton FC", ...]],
    "new_answer": [["Southampton F.C."], ["Arsenal F.C."]],
    "new_answer_alias": [["Soton FC", ...],["Arsenal football club", ...]],
}
  • requested_rewrite: the list of facts we want to inject into the model. It has a single element in AToKe-SE and AToKe-EE, and several elements covering consecutive time intervals in AToKe-ME. (In general, we follow the format of the MQuAKE dataset. In particular, we place the questions about the current time here.)
    • time_true and time_new: the time interval the model currently holds and the new time interval we want to inject into the model. "gptj": true marks the time interval that is current for the model before editing.
    • evaluation: includes both completion (cloze-style, for GPT-J) and natural language QA formats, and includes both direct questions about the current time and questions that state the current time interval explicitly. paraphrase_completion is used to reduce the effect of different textual expressions.
  • history_evaluation: includes both completion and natural language QA formats, and includes both direct questions about the past and questions that state a past time interval explicitly.
  • answer and answer_alias: the gold answers before injecting new facts into the language model. answer_alias is a list of aliases of each answer.
  • new_answer and new_answer_alias: the gold answers after injecting new facts into the language model, with their aliases.
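The fields above can be turned into flat edit requests as in the following sketch. It assumes the `{}` placeholder in `prompt` and `time_prompt` is filled with `subject`, which matches the `evaluation` strings in the example. The function name `build_edit_requests` and the output keys are illustrative and not part of the repository.

```python
def build_edit_requests(instance):
    """Turn one ATOKE instance into a list of flat edit requests, one per step."""
    requests = []
    for step, rewrite in enumerate(instance["requested_rewrite"]):
        subject = rewrite["subject"]
        requests.append({
            "step": step,
            # Assumption: "{}" in the prompt templates stands for the subject.
            "prompt": rewrite["prompt"].format(subject),
            "time_prompt": rewrite["time_prompt"].format(subject),
            "target_new": rewrite["target_new"]["str"],
            "target_true": rewrite["target_true"]["str"],
            "time_new": (rewrite["time_new"]["since"], rewrite["time_new"]["until"]),
        })
    return requests


# A trimmed copy of the first editing step from the AToKe-ME example.
example = {
    "requested_rewrite": [
        {
            "prompt": "{}'s team is",
            "time_prompt": "From 1947 to 1948, {} is a player of",
            "target_new": {"str": "Southampton F.C."},
            "target_true": {"str": "Manchester United F.C."},
            "subject": "Billy Wrigglesworth",
            "time_new": {"since": 1947, "until": 1948},
        }
    ]
}

for request in build_edit_requests(example):
    print(request["time_prompt"], "->", request["target_new"])
```

For AToKe-ME the returned list has one entry per editing step, in the order the edits are meant to be applied.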

For AToKe-EE only:

  • The object before and after editing is unchanged and the time interval is extended.
  • There is no history_evaluation, because the facts don't change, just the time interval of the facts.
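The two AToKe-EE properties can be checked with a small sketch like the one below. It assumes that AToKe-EE keeps the `target_new`, `target_true`, `time_new`, and `time_true` fields of AToKe-ME, that `since` and `until` are integer years, and that "extended" means the new interval contains the old one and is strictly larger. The function name and the sample record are hypothetical and are not taken from the dataset.

```python
def is_extending_edit(rewrite):
    """Return True if the rewrite keeps the object and only widens the interval."""
    same_object = rewrite["target_new"]["str"] == rewrite["target_true"]["str"]
    old, new = rewrite["time_true"], rewrite["time_new"]
    # Assumption: the new interval must fully cover the old one ...
    covers_old = new["since"] <= old["since"] and new["until"] >= old["until"]
    # ... and must differ from it, so that something is actually extended.
    changed = (new["since"], new["until"]) != (old["since"], old["until"])
    return same_object and covers_old and changed


# Hypothetical record for illustration only; not real dataset content.
sample = {
    "target_new": {"str": "Example F.C."},
    "target_true": {"str": "Example F.C."},
    "time_true": {"since": 2000, "until": 2005},
    "time_new": {"since": 2000, "until": 2008},
}
print(is_extending_edit(sample))
```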

Evaluation

We evaluate the model's ability to edit temporal knowledge in terms of two dimensions: the time at which the model is questioned (historical, current) and the form of the time at which it is questioned (relative, explicit).

         | Historical                                     | Current
Relative | Historical Relative time Question Score (HRS)  | Current Relative time Question Score (CRS)
Explicit | Historical Explicit time Question Score (HES)  | Current Explicit time Question Score (CES)

In addition, we reduce the effect of question wording by paraphrasing each question. We measure this as the Paraphrase Question Score.

Examples are as follows:

  • time:
    • historical: "Billy Wrigglesworth used to play for"
    • current: "Billy Wrigglesworth's team is"
  • time form:
    • relative: "Billy Wrigglesworth's team is"
    • explicit: "From 1947 to 1948, Billy Wrigglesworth is a player of"
    • paraphrase: "From 1947 to 1948, Billy Wrigglesworth plays for"
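The sketch below shows one way the four scores could be computed from a data instance. Several parts are assumptions rather than the paper's protocol: the mapping from scores to the `completion` and `time_completion` fields, pairing step i with `new_answer[i]` and history item i with `answer[i]`, and counting a prediction as correct when it contains a gold answer or alias (case-insensitive). `generate` is a hypothetical callable that maps a prompt to the edited model's output text.

```python
from collections import Counter


def matches(prediction, gold):
    """Case-insensitive check that any gold string occurs in the prediction."""
    text = prediction.lower()
    return any(g.lower() in text for g in gold)


def prompts_with_gold(instance):
    """Yield (score_name, prompt, gold_strings) for one data instance."""
    for step, rewrite in enumerate(instance["requested_rewrite"]):
        gold = instance["new_answer"][step] + instance["new_answer_alias"][step]
        yield "CRS", rewrite["evaluation"]["completion"], gold
        yield "CES", rewrite["evaluation"]["time_completion"], gold
    # AToKe-EE has no history_evaluation, so default to an empty list.
    for step, history in enumerate(instance.get("history_evaluation", [])):
        gold = instance["answer"][step] + instance["answer_alias"][step]
        yield "HRS", history["completion"], gold
        yield "HES", history["time_completion"], gold


def score_instance(instance, generate):
    """Return the fraction of correct answers per score for one instance."""
    hits, totals = Counter(), Counter()
    for name, prompt, gold in prompts_with_gold(instance):
        totals[name] += 1
        hits[name] += matches(generate(prompt), gold)
    return {name: hits[name] / totals[name] for name in totals}


# A single-step instance trimmed from the AToKe-ME example.
demo = {
    "requested_rewrite": [{
        "evaluation": {
            "completion": "Billy Wrigglesworth's team is",
            "time_completion": "From 1947 to 1948, Billy Wrigglesworth is a player of",
        },
    }],
    "history_evaluation": [{
        "completion": "Billy Wrigglesworth used to play for",
        "time_completion": "From 1937 to 1947, Billy Wrigglesworth's team was",
    }],
    "answer": [["Manchester United F.C."]],
    "answer_alias": [["Manchester Red Devils"]],
    "new_answer": [["Southampton F.C."]],
    "new_answer_alias": [["Soton FC"]],
}

# A stand-in "model" that always answers with the new fact.
print(score_instance(demo, lambda prompt: "Southampton F.C."))
```

A model that answers every prompt with the newly injected fact gets the current-time questions right and the historical ones wrong, which is the failure mode the historical scores are meant to expose.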

METO

Issues or Questions?

If you come across any issues while using the datasets or have any questions regarding the repository or the paper, please don't hesitate to reach out. You can contact Xunjian Yin at ([email protected]) or create an issue.

Citation

If you use our code in your research, please cite our work:

@article{yin2023history,
  title={History Matters: Temporal Knowledge Editing in Large Language Model},
  author={Yin, Xunjian and Jiang, Jin and Yang, Liming and Wan, Xiaojun},
  journal={arXiv preprint arXiv:2312.05497},
  year={2023}
}


atoke's Issues

A question about how to verify the facts by GPT-J

Hi !
I hope this message finds you well. I was reviewing the ATOKE repository on GitHub. I appreciate the effort you've put into this, but I have a few questions for clarification.

In your article, GPT-J is used during benchmark construction to verify whether the model already has the facts. I have some questions about this. Are you directly comparing the decoded results with the original answers, or are you filtering by a probability threshold on the decoded results? In addition, how is the verification done when the original answer is not a single token?

I'm eager to understand this better. Your assistance would be greatly appreciated.

Thank you for your time and consideration.
