
About reward (agenttuning) · 2 comments · CLOSED

thudm commented on June 12, 2024
About reward

from agenttuning.

Comments (2)

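lr-tsinghua11 commented on June 12, 2024
  1. For the Held-in tasks, yes. During SFT, AgentLM learns from GPT-4's high-quality interaction dialogues (selected by reward filtering) and performs well on those tasks (evaluated by reward). It also generalizes to the remaining Held-out agent tasks.
  2. The 6 Held-in tasks are a subset of AgentBench. How the reward is computed can be found in the "Dataset details" of each dataset in the appendix of the AgentBench paper.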
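
As a rough illustration of the reward filtering mentioned in point 1, the minimal Python sketch below keeps only the trajectories whose final reward meets a threshold. The `Trajectory` type, the `filter_by_reward` helper, and the threshold of 1.0 are hypothetical names and values chosen for illustration; they are not taken from the AgentTuning code.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    """Hypothetical container for one GPT-4 interaction trajectory."""
    task: str
    messages: list[str]
    reward: float  # final task reward, assumed here to lie in [0, 1]


def filter_by_reward(trajectories, threshold=1.0):
    """Keep trajectories whose reward is at least `threshold` (assumed value)."""
    return [t for t in trajectories if t.reward >= threshold]


samples = [
    Trajectory("db", ["SELECT ..."], 1.0),
    Trajectory("db", ["SELECT ..."], 0.0),
    Trajectory("os", ["ls"], 1.0),
]
kept = filter_by_reward(samples)
print(len(kept))  # 2
```

Only the two trajectories with reward 1.0 survive; the kept set would then serve as SFT data.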


DryPilgrim commented on June 12, 2024

I have the following questions. Thank you very much for your answers :)

  1. I cannot find how the reward is computed in the Dataset details in the appendix of the AgentBench paper. For example, Section C.1 for DB only says "Metrics. We measure the Success Rate of agents in completing instructions." That is not a reward score computed for a trajectory (and the DB data in AgentBench contains no trajectories).
  2. The DB data in AgentBench has no interaction trajectories, so how is CoT with Actions used?
  3. Why is #Dev larger than #Test in AgentBench? For example, DB has #Dev=60 and #Test=300. Is the training set larger than the test set?
    References:
  • AgentBench's use of CoT with Actions:
Section 2 of the AgentBench paper says: "Since LLM-as-Agent requires LLMs' strong reasoning ability, CoT (Wei et al., 2022b), which has been considered a de facto strategy in related evaluation together with actions (Yao et al., 2023b), is also adopted in AGENTBENCH."
  • DB data from AgentBench:
{
    "description": "how many weeks did julie covington's \"don't cry for me argentina\" spend at the top of australia's singles chart?",
    "label": [
        "7"
    ],
    "create": {
        "database": "wikitq",
        "init": "wikitq_init.sql"
    },
    "table": {
        "table_name": "Music Chart History",
        "table_info": {
            "columns": [
                {
                    "name": "#",
                    "type": "INT"
                },
                {
                    "name": "Title",
                    "type": "TEXT"
                },
                {
                    "name": "Artist",
                    "type": "TEXT"
                },
                {
                    "name": "Highest pos. reached",
                    "type": "INT"
                },
                {
                    "name": "weeks at No. 1",
                    "type": "TEXT"
                }
            ],
            "rows": [
                [
                    "1.",
                    "\"Don't Cry for Me Argentina\"",
                    "Julie Covington",
                    "1",
                    "7"
                ],
                [
                    "2.",
                    "\"The Way You That You Do It\"",
                    "Pussyfoot",
                    "1",
                    "7"
                ],
                [
                    "3.",
                    "\"I Just Want to Be Your Everything\"",
                    "Andy Gibb",
                    "1",
                    "7"
                ],
                [
                    "4.",
                    "\"That's Rock and Roll\"",
                    "Shaun Cassidy",
                    "2",
                    ""
                ],
                [
                    "5.",
                    "\"Living Next Door to Alice\"",
                    "Smokie",
                    "2",
                    ""
                ],
                [
                    "6.",
                    "\"I Go To Rio\"",
                    "Peter Allen",
                    "1",
                    "5"
                ],
                [
                    "7.",
                    "\"Torn Between Two Lovers\"",
                    "Mary McGregor",
                    "1",
                    "4"
                ],
                [
                    "8.",
                    "\"Walk Right In\"",
                    "Dr Hook",
                    "1",
                    "5"
                ],
                [
                    "9.",
                    "\"You're Moving Out Today\"",
                    "Carole Bayer Sager",
                    "1",
                    "4"
                ],
                [
                    "10.",
                    "\"If You Leave Me Now\"",
                    "Chicago",
                    "1",
                    "5 (pkd #1 in 76 & 77)"
                ],
                [
                    "11.",
                    "\"Don't Give Up on Us\"",
                    "David Soul",
                    "1",
                    "3"
                ],
                [
                    "12.",
                    "\"Lido Shuffle\" / \"What Can I Say\"",
                    "Boz Scaggs",
                    "2",
                    ""
                ],
                [
                    "13.",
                    "\"You and Me\"",
                    "Alice Cooper",
                    "2",
                    ""
                ],
                [
                    "14.",
                    "\"Dance Little Lady Dance\"",
                    "Tina Charles",
                    "4",
                    ""
                ],
                [
                    "15.",
                    "\"When I Need You\"",
                    "Leo Sayer",
                    "8",
                    ""
                ],
                [
                    "16.",
                    "\"Don't Fall in Love\"",
                    "Ferrets",
                    "2",
                    ""
                ],
                [
                    "17.",
                    "\"I Feel Love\"",
                    "Donna Summer",
                    "1",
                    "1"
                ],
                [
                    "18.",
                    "\"Help is on its Way\"",
                    "Little River Band",
                    "1",
                    "1"
                ],
                [
                    "19.",
                    "\"You Gotta Get Up and Dance\"",
                    "Supercharge",
                    "3",
                    ""
                ],
                [
                    "20.",
                    "\"Mull of Kintyre\"",
                    "Wings",
                    "1",
                    "11 (pkd #1 in 77 & 78)"
                ],
                [
                    "21.",
                    "\"Don't Leave Me This Way\"",
                    "Thelma Houston",
                    "6",
                    ""
                ],
                [
                    "22.",
                    "\"Ain't Gonna Bump No More with No Big Fat Woman\"",
                    "Joe Tex",
                    "2",
                    ""
                ],
                [
                    "23.",
                    "\"You're in My Heart\"",
                    "Rod Stewart",
                    "1",
                    "1"
                ],
                [
                    "24.",
                    "\"Ma Baker\"",
                    "Boney M",
                    "5",
                    ""
                ],
                [
                    "25.",
                    "\"Lucille\"",
                    "Kenny Rogers",
                    "7",
                    ""
                ],
                [
                    "26.",
                    "\"Livin' la Vida Loca\"",
                    "Ricky Martin",
                    "1",
                    "3"
                ],
                [
                    "27.",
                    "\"Smooth\"",
                    "Santana featuring Rob Thomas",
                    "1",
                    "12"
                ],
                [
                    "28.",
                    "\"No Scrubs\"",
                    "TLC",
                    "3",
                    ""
                ],
                [
                    "29.",
                    "\"All Star\"",
                    "Smash Mouth",
                    "4",
                    ""
                ],
                [
                    "30.",
                    "\"Baby One More Time\"",
                    "Britney Spears",
                    "1",
                    "2"
                ],
                [
                    "31.",
                    "\"Say My Name\"",
                    "Destiny's Child",
                    "1",
                    "3"
                ],
                [
                    "32.",
                    "\"Genie in a Bottle\"",
                    "Christina Aguilera",
                    "1",
                    "5"
                ],
                [
                    "33.",
                    "\"Smooth Criminal\"",
                    "Michael Jackson",
                    "7",
                    ""
                ],
                [
                    "34.",
                    "\"I Will Always Love You\"",
                    "Whitney Houston",
                    "1",
                    "10"
                ],
                [
                    "35.",
                    "\"You Are Not Alone\"",
                    "Michael Jackson",
                    "1",
                    "5"
                ]
            ]
        }
    },
    "evaluation": "",
    "example": "",
    "type": [
        "other"
    ],
    "heads": [
        "#",
        "Title",
        "Artist",
        "Highest pos. reached",
        "weeks at No. 1"
    ],
    "add_description": "The name of this table is Music Chart History, and the headers of this table are #,Title,Artist,Highest pos. reached,weeks at No. 1.",
    "sql": {
        "query": "SELECT weeks_at_No_1 FROM `Music Chart History` WHERE Artist = 'Julie Covington' AND Title = 'Don\\'t Cry for Me Argentina';",
        "length": 123
    },
    "source": "wikitq"
}
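
To make the quoted "Success Rate" metric concrete, here is a minimal Python sketch that loads an abbreviated copy of the sample above into an in-memory SQLite database, runs one answer query, and scores it against `label`. The exact-match rule in `is_success`, the helper names, and the answer query are assumptions for illustration; AgentBench's actual DB evaluation may differ, and whether this binary outcome is what the maintainers mean by "reward" is not confirmed in this thread.

```python
import sqlite3

# Abbreviated copy of the sample above: three of the 35 rows.
sample = {
    "label": ["7"],
    "table": {
        "table_name": "Music Chart History",
        "table_info": {
            "columns": [
                {"name": "#", "type": "INT"},
                {"name": "Title", "type": "TEXT"},
                {"name": "Artist", "type": "TEXT"},
                {"name": "Highest pos. reached", "type": "INT"},
                {"name": "weeks at No. 1", "type": "TEXT"},
            ],
            "rows": [
                ["1.", "\"Don't Cry for Me Argentina\"", "Julie Covington", "1", "7"],
                ["2.", "\"The Way You That You Do It\"", "Pussyfoot", "1", "7"],
                ["6.", "\"I Go To Rio\"", "Peter Allen", "1", "5"],
            ],
        },
    },
}


def quote(identifier):
    """Quote an SQLite identifier so names with spaces or dots are valid."""
    return '"' + identifier.replace('"', '""') + '"'


def build_db(sample):
    """Load the sample's table into an in-memory SQLite database."""
    table = sample["table"]
    info = table["table_info"]
    name = quote(table["table_name"])
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'{quote(c["name"])} {c["type"]}' for c in info["columns"])
    conn.execute(f"CREATE TABLE {name} ({cols})")
    marks = ", ".join("?" for _ in info["columns"])
    conn.executemany(f"INSERT INTO {name} VALUES ({marks})", info["rows"])
    return conn


def is_success(answer_rows, label):
    """Exact match between the flattened query result and the label (assumed rule)."""
    return [str(v) for row in answer_rows for v in row] == label


def success_rate(outcomes):
    """Fraction of samples judged successful."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


conn = build_db(sample)
# Hypothetical agent answer; the column name is quoted because it contains spaces.
rows = conn.execute(
    'SELECT "weeks at No. 1" FROM "Music Chart History" WHERE Artist = ?',
    ("Julie Covington",),
).fetchall()
outcomes = [is_success(rows, sample["label"])]
print(success_rate(outcomes))  # 1.0
```

Under these assumptions each sample yields a 0/1 outcome, and the success rate is simply the mean of those outcomes over the evaluated samples.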
