lupantech / chameleon-llm

Codes for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".

Home Page: https://chameleon-llm.github.io

License: Apache License 2.0

Shell 0.06% Jupyter Notebook 75.29% Python 24.64%
ai chatgpt gpt-4 llm openai python tool

chameleon-llm's People

Contributors

guspan-tanadi, lupantech

chameleon-llm's Issues

ModuleNotFoundError: No module named 'func_timeout'

The example on the main page does not seem to work.
Am I missing something?

pip install -r requirements.txt
cd run_scienceqa

python run.py \
--model chameleon \
--label chameleon_gpt4 \
--policy_engine gpt-4 \
--kr_engine gpt-4 \
--qg_engine gpt-4 \
--sg_engine gpt-4 \
--test_split test \
--test_number -1
Traceback (most recent call last):
  File "/home/marc/code/chameleon-llm/run_scienceqa/run.py", line 11, in <module>
    from utilities import *
  File "/home/marc/code/chameleon-llm/utilities.py", line 4, in <module>
    import func_timeout
ModuleNotFoundError: No module named 'func_timeout'
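
The traceback indicates that the func_timeout module imported by utilities.py is not available in the active Python environment. A minimal sketch for checking which modules can be found before launching run.py follows; the helper name missing_modules is hypothetical and not part of the repository. If func_timeout shows up as missing, installing it (for example with pip install func_timeout) is the likely fix.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # func_timeout is the module named in the traceback; json is a stdlib control.
    print(missing_modules(["json", "func_timeout"]))
```

An empty list means every listed module is importable.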

What does this mean?

../results/scienceqa/chameleon_chatgpt_minitest.json
Result file exists: ../results/scienceqa/chameleon_chatgpt_minitest.json
Count: 100, Correct: 44, Wrong: 56

Discrepancy in accuracy on minitest set for gpt-3.5-turbo

Hi @lupantech, thank you for your excellent work.

I observed inconsistent accuracies on the minitest set. Specifically, I got acc_average values of 49.29 for gpt-3.5-turbo and 46.93 for Llama-2-7b, while gpt-3.5's reported test set accuracy is 79.93.

Upon analyzing the "true_false" values in chameleon_chatgpt_test_cache.jsonl with matching pids in minitest set, I calculated an accuracy of 0.7948.

Could you help to clarify this discrepancy or share your minitest evaluation results, if available?
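
The check described above (accuracy over the cached "true_false" values restricted to the minitest pids) can be sketched roughly as below. The record layout is an assumption: each JSONL line is taken to be an object with a "pid" key and a boolean "true_false" key, and the helper name subset_accuracy is hypothetical.

```python
import json


def subset_accuracy(lines, pids):
    """Accuracy over the records whose pid is in `pids`.

    Assumes each JSONL line is an object with a "pid" key and a boolean
    "true_false" key; this layout is an assumption, not confirmed by the repo.
    """
    wanted = {str(p) for p in pids}
    hits = total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if str(record["pid"]) in wanted:
            total += 1
            hits += bool(record["true_false"])
    return hits / total if total else float("nan")


# Tiny in-memory stand-in for the cache file (made-up records).
sample = [
    '{"pid": "1", "true_false": true}',
    '{"pid": "2", "true_false": false}',
    '{"pid": "3", "true_false": true}',
]
print(subset_accuracy(sample, ["1", "2"]))
```

With a real cache file, the lines argument could be the open file object itself, since iterating a text file yields one line at a time.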

Questions on TabMWP

Thank you for your work. When running the TabMWP dataset, I found that some of the examples execute very slowly. Is there any way to speed them up?
A full run takes only 4-5 hours on the ScienceQA dataset, but it can take 20 times longer on the TabMWP dataset.
Looking forward to hearing from you, thanks!

An issue: program_generator on TabMWP

Hi, thanks for your awesome work.

I use GPT-3.5 Turbo as my model. When I run run.py in run_tabmwp, the program_generator produces the following program for the first question in TabMWP.

question description:
Hannah baked cookies each day for a bake sale. How many more cookies did Hannah bake on Saturday than on Sunday?

My program generated:
[image: screenshot of the generated program]

Your sample program:
cookies_baked = {"Friday": 163, "Saturday": 281, "Sunday": 263}\nans = cookies_baked["Saturday"] - cookies_baked["Sunday"]

I'd like to ask what could cause my generated program to look so unreasonable, even though I ran it with exactly the same parameters as you did. This happens frequently on other TabMWP examples as well; practically all the programs I generated are unreasonable, resulting in an extremely low average accuracy.

python run.py \
--model chameleon \
--label chameleon_chatgpt \
--test_split test \
--policy_engine gpt-3.5-turbo \
--rl_engine gpt-3.5-turbo \
--cl_engine gpt-3.5-turbo \
--tv_engine gpt-3.5-turbo \
--kr_engine gpt-3.5-turbo \
--sg_engine gpt-3.5-turbo \
--pg_engine gpt-3.5-turbo \
--test_number 1000 \
--rl_cell_threshold 18 \
--cl_cell_threshold 18
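
For reference, the sample program quoted in this issue can be evaluated as follows. This is only a sketch of what a well-formed generated program should compute; the helper run_program is hypothetical, and how the repository actually executes generated programs is not shown here.

```python
# Sample program quoted in the issue, with "\n" as the line separator.
program = (
    'cookies_baked = {"Friday": 163, "Saturday": 281, "Sunday": 263}\n'
    'ans = cookies_baked["Saturday"] - cookies_baked["Sunday"]'
)


def run_program(source):
    """Execute a generated program and return its `ans` variable (None if unset)."""
    namespace = {}
    exec(source, namespace)  # only appropriate for programs you have inspected
    return namespace.get("ans")


print(run_program(program))  # 281 - 263 = 18
```

Comparing the value of ans from a locally generated program against this one is a quick way to confirm whether the generator output or a later stage is at fault.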

For open VQA task

Hi, thanks for your great work. Could this method be applied to open VQA tasks, where an open-ended answer is needed instead of choosing from a given list?

Planner issues

Hello, I would like to ask whether the planner here refers to the natural language planner, or whether it goes by another name.

[image: screenshot]

Is the code a bit wrong?

In the update_modules function in run_scienceqa/model.py, the line default_modules = eval(["solution_generator", "answer_generator"]) calls eval() on a list rather than on a string.
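
To illustrate the concern: Python's eval() accepts only a string, bytes, or code object, so passing a list raises a TypeError. The sketch below shows that behavior and one possible alternative that parses a list literal with ast.literal_eval and falls back to a default. The helper parse_modules and the module name in the example input are hypothetical; how update_modules actually receives its input is not shown in this issue.

```python
import ast


def parse_modules(text, default=("solution_generator", "answer_generator")):
    """Parse a list literal such as '["a", "b"]'; fall back to `default` on failure."""
    try:
        modules = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return list(default)
    if not isinstance(modules, list):
        return list(default)
    return modules


# eval() rejects a list argument outright.
try:
    eval(["solution_generator", "answer_generator"])
except TypeError as exc:
    print("TypeError:", exc)

print(parse_modules('["example_module", "solution_generator"]'))
print(parse_modules("not a list"))
```

Assigning the list directly, without eval(), would avoid the error for the default case.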

Request for Release of Image Captioner and Text Detector modules

Hi,

I am writing to kindly request an update regarding the release of the code for the Image Captioner and Text Detector modules as promised in the README file of the project.

"For the current version, the results for the Image Captioner and Text Detector are off-the-shelf and stored in data/scienceqa/captions.json and data/scienceqa/ocrs.json, respectively. The live calling these two modules are coming soon!"

As an eager user of the project, I'm excited to explore and utilize this module's functionality. Therefore, I am reaching out to kindly inquire about the current status of the code release.
