lifan-yuan / craft

Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets"

Python 100.00%

craft's Issues

Some problem about the code

Dear author, when I run the code, I get some errors. I do not know whether the problem is in the code itself. I would appreciate it if you could help me solve this problem.

Plans to release model outputs?

Hi,

Interesting work! Are there any plans to release the model outputs (especially the induced library of tools and the test set solutions that use those tools) for the datasets studied in the paper?

About the reusable code snippets

Hi Yuan,
Thanks for your awesome and insightful work! I would like to know whether and when the reusable code snippets for these tasks could be released. It would help a lot in facilitating future research. Thanks😊!

Heming

Some potential issue in codes

In the file https://github.com/lifan-yuan/CRAFT/blob/main/tab_and_math/retrieve_tools.py:

if table is None:
    prompt = retrieval_template_tab.format(table, query)
else:
    prompt = retrieval_template_math + "\n".join([f"Query: {query}", "Let's think step by step:"]) + "\n"

It seems that "if table is None:" is not correct, because the tabular template would then be formatted only when no table is given. I think it should be:

if table is not None:
    prompt = retrieval_template_tab.format(table, query)
else:
    prompt = retrieval_template_math + "\n".join([f"Query: {query}", "Let's think step by step:"]) + "\n"
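
A minimal, self-contained sketch of the proposed branching, for illustration only. The two template strings and the `build_retrieval_prompt` wrapper are hypothetical placeholders; the real templates in retrieve_tools.py are not shown in this issue.

```python
# Hypothetical placeholder templates (assumptions, not the repository's actual text).
retrieval_template_tab = "Table:\n{}\nQuery: {}\n"
retrieval_template_math = "Decide which tools could help with the query.\n"


def build_retrieval_prompt(query, table=None):
    """Use the tabular template only when a table is actually supplied."""
    if table is not None:
        # A table is present, so it can be formatted into the tabular template.
        return retrieval_template_tab.format(table, query)
    # No table: fall back to the math template plus the step-by-step cue.
    return retrieval_template_math + "\n".join(
        [f"Query: {query}", "Let's think step by step:"]
    ) + "\n"


tab_prompt = build_retrieval_prompt("What is the total?", table="a | b\n1 | 2")
math_prompt = build_retrieval_prompt("What is 2 + 3?")
```

With the original condition, a call without a table would format the literal string "None" into the tabular template, and a call with a table would silently drop it.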

GQA evaluation

Thank you for the great work! I want to reproduce the evaluation on GQA, but I am not sure how to do that.

I am working with the 1000 GQA samples that you provided with the code, using gpt-3.5-turbo-0613.
However, I got an accuracy of 33.2, which is more than 10% lower than the reported accuracy.
I used 'results/craft_tools/5_deduplicated_tool.csv' as the toolset, with the default configuration in retrieval_gqa_config.yaml.

Can you help me reproduce the results?
Also, if possible, can you provide the output of the code you got using gpt-3.5?

Thanks in advance!
