lupantech / scienceqa

Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".

License: MIT License

Python 97.57% Shell 2.43%

scienceqa's Issues

Sometimes the images are not related to the question at all

Hi, is this by design?
For example, in the problem below the picture only shows some green plants in small trays, while actually solving the problem depends on understanding the text.
"180":{ "question":"Which of the following was a dependent variable in this experiment?", "choices":[ "the temperature of the heating pad", "the number of days until a seed germinated" ], "answer":1, "hint":"The passage below describes an experiment. Read the passage and think about the variables that are described.\n\nKenneth wanted to grow cucumbers from seeds. He read that using a heating pad to heat up potting soil could help make seeds germinate, or sprout, faster. Kenneth wondered whether the temperature of the heating pad would affect how quickly the seeds germinated.\nKenneth prepared two potting trays, each made up of ten small pots of soil. He planted one cucumber seed in each small pot and arranged the potting trays near a sunny window. He set an electric heating pad to 75\u00b0F and placed it under one potting tray. He set a second heating pad to 85\u00b0F and placed it under the other potting tray. Kenneth observed the pots daily, and he counted the number of days it took until a seed germinated in each pot.\nHint: An independent variable is a variable whose effect you are investigating. A dependent variable is a variable that you measure.\nFigure: germinating plants in a potting tray.", "image":"image.png", "task":"closed choice", "grade":"grade6", "subject":"natural science", "topic":"science-and-engineering-practices", "category":"Designing experiments", "skill":"Identify independent and dependent variables", "lecture":"Experiments have variables, or parts that change. You can design an experiment to find out how one variable affects another variable. For example, imagine that you want to find out if fertilizer affects the number of tomatoes a tomato plant grows. To answer this question, you decide to set up two equal groups of tomato plants. Then, you add fertilizer to the soil of the plants in one group but not in the other group. 
Later, you measure the effect of the fertilizer by counting the number of tomatoes on each plant.\nIn this experiment, the amount of fertilizer added to the soil and the number of tomatoes were both variables.\nThe amount of fertilizer added to the soil was an independent variable because it was the variable whose effect you were investigating. This type of variable is called independent because its value does not depend on what happens after the experiment begins. Instead, you decided to give fertilizer to some plants and not to others.\nThe number of tomatoes was a dependent variable because it was the variable you were measuring. This type of variable is called dependent because its value can depend on what happens in the experiment.", "solution":"", "split":"test" },

How did you limit the answer generation space to just the options you provided?

Hello,

I am trying to fine-tune MiniGPT-4 on a student engagement dataset. Its labels are image IDs and captions describing each image. I managed to get it to perform reasonably well on the dataset. However, it sometimes hallucinates or gives answers outside the set I want, since MiniGPT-4 is designed for free-form, open-ended VQA.

How did you limit the answer generation space to just the options you provided? The paper mentions using a linear classifier at the end to limit the output tokens to just those answer options; is that how it was done?

Tony,

About experiments code

Hey, I think your work is very meaningful. I saw in the paper that you ran experiments with the UnifiedQA model, but that part of the code is not currently available in the repo.
Do you plan to release it?

Topic-level evaluation result

Hi, this is a great dataset. Thanks for the hard work.

Instead of subject-level results, I am interested in topic-level evaluation results (biology, chemistry, etc.), so I wonder:

  1. How do I compute accuracy for topic level?
  2. Do the models on the leaderboard provide topic-level results or provide the raw result files so that people can compute topic-level accuracy for them?

Thanks.
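
For the first question, one possible approach is to group test problems by their "topic" field and count correct predictions per group. The sketch below is not from the repo: it assumes each problem carries the "topic", "answer", and "split" keys seen in problems.json, and it assumes a hypothetical prediction format that maps each question ID to a predicted choice index. The helper name accuracy_by_field is made up for illustration.

```python
from collections import defaultdict


def accuracy_by_field(problems, predictions, field="topic", split="test"):
    """Return {field value: accuracy} over the problems in the given split."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for qid, problem in problems.items():
        if problem.get("split") != split:
            continue
        key = problem[field]
        total[key] += 1
        # A missing prediction never equals the answer, so it counts as wrong.
        if predictions.get(qid) == problem["answer"]:
            correct[key] += 1
    return {key: correct[key] / total[key] for key in total}


# Tiny in-memory stand-in for problems.json; real use would load the file
# with json.load instead.
problems = {
    "1": {"topic": "biology", "answer": 0, "split": "test"},
    "2": {"topic": "biology", "answer": 1, "split": "test"},
    "3": {"topic": "chemistry", "answer": 2, "split": "test"},
    "4": {"topic": "chemistry", "answer": 0, "split": "train"},
}
# Hypothetical prediction format: question ID -> predicted choice index.
predictions = {"1": 0, "2": 0, "3": 2}

topic_accuracy = accuracy_by_field(problems, predictions)
print(topic_accuracy)
```

Passing field="subject" or field="category" would give the same breakdown at a different granularity, as long as the raw per-question predictions are available.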

Question about lecture

Hi, I've looked at problems.json, and the lecture does not seem to match the question. How are the lectures generated? For example, for qid 5 the lecture is: "People can use the engineering-design process to develop solutions to problems. One step in the process is testing if a potential solution meets the requirements of the design. How can you determine what a test can show? You need to figure out what was tested and what was measured.\nImagine an engineer needs to design a bridge for a windy location. She wants to make sure the bridge will not move too much in high wind. So, she builds a smaller prototype, or model, of a bridge. Then, she exposes the prototype to high winds and measures how much the bridge moves.\nFirst, identify what was tested. A test can examine one design, or it may compare multiple prototypes to each other. In the test described above, the engineer tested a prototype of a bridge in high wind.\nThen, identify what the test measured. One of the criteria for the bridge was that it not move too much in high winds. The test measured how much the prototype bridge moved.\nTests can show how well one or more designs meet the criteria. The test described above can show whether the bridge would move too much in high winds."
However, the question itself is about Gordon's test.

Hugging Face dataset

Hey, awesome work! I wanted to make this more accessible by putting it on the Hugging Face Hub: https://huggingface.co/datasets/derek-thomas/ScienceQA

There were a lot of fields in the description card that I filled in as best I could. Would you consider reviewing it and, once it meets your expectations, adding a link in your GitHub repo?

Thanks,
Derek

Google drive link

It seems that the files cannot be downloaded from the Google Drive link. Is there a OneDrive download link?

Question about "context" in the data set

image

First of all, thank you for open-sourcing a very good dataset.

The image above is from your official website, and I am a little confused about the "context" in the red box. I could not find a matching key in the "problems.json" file you provided. Can you tell me which parts make up the "context"?

I would like to incorporate your dataset into my multimodal work. I would be very grateful if you could reply.

Questions about the process of dataset building

Thanks for your awesome work! It paves the way towards multimodal reasoning agents. I noticed that the questions are collected from IXL Learning. Since IXL Learning is a website, would you mind explaining in detail how you obtained the data from it, and how you processed the crawled data?

Thanks in advance :)

Prompt in zero-shot setting.

Hi, nice work!
My questions are: what is the prompt for the GPT-3 zero-shot setting, and how do you ensure that the model's output is in a parsable form?
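
As a generic illustration of the parsing side of this question, and not necessarily what the authors did, one common pattern is to end the prompt with a fixed answer phrase and then extract the option letter with a regular expression. The prompt layout and the names build_prompt and parse_answer below are assumptions made for this sketch.

```python
import re

OPTION_LETTERS = "ABCDE"


def build_prompt(question, choices):
    """Format a question so the model is nudged to reply with an option letter."""
    options = " ".join(
        f"({OPTION_LETTERS[i]}) {choice}" for i, choice in enumerate(choices)
    )
    return f"Question: {question}\nOptions: {options}\nAnswer: The answer is"


def parse_answer(output, num_choices):
    """Return the chosen index, or None if no valid option letter is found."""
    # The lookahead rejects words that merely start with A-E, such as "Apple".
    match = re.search(r"answer is \(?([A-E])\)?(?![A-Za-z])", output)
    if match is None:
        return None
    index = OPTION_LETTERS.index(match.group(1))
    return index if index < num_choices else None
```

A caller would still need a policy for the None case, such as counting it as wrong or retrying the request.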

The image format is dict not PIL.

Hi, I load the dataset using the following command:

import datasets

data = datasets.load_dataset('derek-thomas/ScienceQA', 'test')
for sample in data['test']:
    image = sample['image']  # arrives as a dict, not a PIL image

But sample['image'] is a dictionary with the keys 'bytes' and 'path', not a PIL image, and I don't know how to process it.
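
One possible way to handle such a value is to decode the raw bytes with Pillow. The sketch below assumes the dict has exactly the 'bytes' and 'path' keys described above and that Pillow is installed. The helper names image_bytes and to_pil are hypothetical and not part of the datasets library.

```python
import io
from pathlib import Path


def image_bytes(record):
    """Return raw image bytes from a {'bytes': ..., 'path': ...} dict."""
    if record.get("bytes") is not None:
        return record["bytes"]
    if record.get("path"):
        # Fall back to reading the file when only a path is given.
        return Path(record["path"]).read_bytes()
    raise ValueError("record has neither 'bytes' nor 'path'")


def to_pil(record):
    """Decode the record into a PIL image (requires Pillow)."""
    from PIL import Image  # imported lazily so image_bytes works without Pillow

    return Image.open(io.BytesIO(image_bytes(record)))
```

Inside the loop this would be called as to_pil(sample['image']). If some samples have no image at all, the value may be None rather than a dict, so a check before the call is probably needed.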
