
posgnu / rci-agent

A codebase for "Language Models can Solve Computer Tasks"

Home Page: https://posgnu.github.io/rci-web/

License: MIT License

Languages: Python 11.32%, JavaScript 23.83%, CSS 3.10%, HTML 61.75%
Topics: large-language-models, prompting, reasoning

Contributors: posgnu, robinrcm

rci-agent's Issues

How to reproduce the results in Table 1&2?

Hi authors,

Thanks for your excellent work! Could you provide more details on how to reproduce the results in Tables 1 and 2? For example:

  • Which LM did you use? (Section 3.1 said "InstructGPT-3 + RLHF", but which specific checkpoint?)
  • What are the hyperparameters?
  • What's the full prompt?
  • Would it be possible to provide the dataset split/model predictions/relevant code in order to reproduce the results?

Thanks in advance!

Bug Report: Some task can't be addressed when Headless parameter is enabled

Bug Description:
I ran into a strange bug while running two nearly identical benchmarks for the "enter-time" task (it may affect many other tasks as well). The only difference between the two runs is the value of the "headless" parameter: in the first run I set it to False (headless = False), and in the second I left it at True, which was the default value.

Steps to Reproduce:

  1. Git clone the SNow_benchmark branch from my fork and follow the installation steps in README.md.
  2. Set the headless parameter to False and run the benchmark for the "enter-time" task:
    python main.py --env enter-time --llm chatgpt --num-episodes 1 --irci 1 --sgrounding
  3. Set the headless parameter to True and run the benchmark for the "enter-time" task:
    python main.py --env enter-time --llm chatgpt --num-episodes 1 --irci 1 --sgrounding --headless

Expected Behavior:
The results should be identical, regardless of the value of the headless parameter.

Actual Behavior:
When the headless parameter is enabled (set to True), certain actions are not allowed or not counted, and the task fails. I can rerun the benchmark several times and still get the same result.

(RCI-agent-WSL) thirdcore@DESKTOP-5I4C9HH:~/rci-agent$ python main.py --env enter-time --llm chatgpt --num-episodes 1 --irci 1 --sgrounding 
False
INFO:root:Starting WebDriver Instance 0
INFO:selenium.webdriver.common.selenium_manager:Applicable driver not found; attempting to install with Selenium Manager (Beta)
INFO:root:Send a request to the language model from initialize_plan
INFO:root:The number of generated action steps: 4
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: clickxpath //*[@id="tt"]
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: type 02:07PM
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: clickxpath //*[@id="subbtn"]
success rate: 1.0
(RCI-agent-WSL) thirdcore@DESKTOP-5I4C9HH:~/rci-agent$ python main.py --env enter-time --llm chatgpt --num-episodes 1 --irci 1 --sgrounding --headless
True
INFO:root:Starting WebDriver Instance 0
INFO:selenium.webdriver.common.selenium_manager:Applicable driver not found; attempting to install with Selenium Manager (Beta)
INFO:root:Send a request to the language model from initialize_plan
INFO:root:The number of generated action steps: 4
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: clickxpath //*[@id="tt"]
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: type 1017AM
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: clickxpath //*[@id="subbtn"]
success rate: 0.0

Additional Information:
I'm still investigating the root cause. It seems that when the browser is not displayed, some actions are restricted or not properly registered, which leads to the task failure. Have you seen the same behavior, or is there something I'm missing?
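
One way to narrow this down is to compare what each run actually executed. The sketch below is not part of the repository: `executed_instructions` and `first_divergence` are hypothetical helpers that pull the "The executed instruction" lines out of the two logs above and report the first step where they differ.

```python
from itertools import zip_longest

PREFIX = "INFO:root:The executed instruction: "


def executed_instructions(log_text):
    """Return the instructions a run logged, in order."""
    return [
        line[len(PREFIX):].strip()
        for line in log_text.splitlines()
        if line.startswith(PREFIX)
    ]


def first_divergence(log_a, log_b):
    """Return (step_index, instr_a, instr_b) for the first differing step, else None."""
    steps_a = executed_instructions(log_a)
    steps_b = executed_instructions(log_b)
    # zip_longest pads the shorter run with None, so a missing step also counts.
    for index, (a, b) in enumerate(zip_longest(steps_a, steps_b)):
        if a != b:
            return index, a, b
    return None


# Instruction lines copied from the two runs in this report.
windowed_log = """\
INFO:root:The executed instruction: clickxpath //*[@id="tt"]
INFO:root:The executed instruction: type 02:07PM
INFO:root:The executed instruction: clickxpath //*[@id="subbtn"]
"""
headless_log = """\
INFO:root:The executed instruction: clickxpath //*[@id="tt"]
INFO:root:The executed instruction: type 1017AM
INFO:root:The executed instruction: clickxpath //*[@id="subbtn"]
"""

print(first_divergence(windowed_log, headless_log))
```

For the logs above, the first difference is at step index 1: `type 02:07PM` in the windowed run versus `type 1017AM` in the headless run. The two episodes may simply have had different target times, but the headless string also has no colon, which might be worth checking against what the page displayed. That is only a guess, not a confirmed cause.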

MINIWOB_BASE_URL environment variable not defined

Hello,

After installing the required packages, I tried to run an experiment on the choose-list environment using the command you provided:

python main.py --env choose-list --llm chatgpt --num-episodes 1 --irci 1 --sgrounding

I got the following error:

(RCI-agent) PS C:\Users\Tom\Desktop\rci-agent> python main.py --env choose-list --llm chatgpt --num-episodes 1 --irci 1 --sgrounding
INFO:root:Starting WebDriver Instance 0
C:\Users\Tom\miniconda3\envs\RCI-agent\lib\site-packages\gym\utils\passive_env_checker.py:20: UserWarning: WARN: It seems a Box observation space is an image but the `dtype` is not `np.uint8`, actual type: int32. If the Box observation space is not an image, we recommend flattening the observation to have only a 1D vector.
  logger.warn(
C:\Users\Tom\miniconda3\envs\RCI-agent\lib\site-packages\gym\utils\passive_env_checker.py:174: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed a `seed` instead of using `Env.seed` for resetting the environment random number generator.
  logger.warn(
C:\Users\Tom\miniconda3\envs\RCI-agent\lib\site-packages\gym\utils\passive_env_checker.py:187: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed `options` to allow the environment initialisation to be passed additional information.
  logger.warn(
INFO:selenium.webdriver.common.selenium_manager:Applicable driver not found; attempting to install with Selenium Manager (Beta)

DevTools listening on ws://127.0.0.1:51802/devtools/browser/6802c38e-d0ec-42dd-b55b-0574baaefc72
ERROR:root:Page did not load properly. Wrong MINIWOB_BASE_URL?
INFO:root:Closed instance 0
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\Tom\miniconda3\envs\RCI-agent\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "c:\users\tom\desktop\rci-agent\computergym\computergym\miniwob\miniwob_interface\instance.py", line 128, in run
    self.create_driver()
  File "c:\users\tom\desktop\rci-agent\computergym\computergym\miniwob\miniwob_interface\instance.py", line 200, in create_driver
    raise e
        (No symbol) [0x0034A304]
        (No symbol) [0x0035C482]
        (No symbol) [0x0034A0B6]
        (No symbol) [0x00327E08]
        (No symbol) [0x00328F2D]
        GetHandleVerifier [0x006C8E3A+2540266]
        GetHandleVerifier [0x00708959+2801161]
        GetHandleVerifier [0x0070295C+2776588]
        GetHandleVerifier [0x004F2280+612144]
        (No symbol) [0x00404F6C]
        (No symbol) [0x004011D8]
        (No symbol) [0x004012BB]
        (No symbol) [0x003F4857]
        BaseThreadInitThunk [0x76C97D59+25]
        RtlInitializeExceptionChain [0x77C9B74B+107]
        RtlClearBits [0x77C9B6CF+191]

Running environment.py directly gives the same error. The cause seems to be that the environment variable MINIWOB_BASE_URL is not defined.

Debug console:

import os 
base_url=os.environ.get("MINIWOB_BASE_URL")
print(base_url)
None

Am I supposed to define this environment variable myself?

P.S. I'm running Windows 11 with Python 3.9.16 in a conda environment.
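
For anyone hitting the same error, here is a minimal sketch of one way to define the variable from Python before the environment is created. It assumes, as I believe is the case in the upstream MiniWoB++ setup, that MINIWOB_BASE_URL should point at the directory holding the MiniWoB HTML task files. The helper name `miniwob_base_url` and the example directory are hypothetical, so substitute the path from your own checkout.

```python
import os
from pathlib import Path


def miniwob_base_url(html_dir):
    """Build a file:// URL with a trailing slash for a local directory."""
    # abspath makes the path absolute, which Path.as_uri() requires.
    url = Path(os.path.abspath(html_dir)).as_uri()
    return url if url.endswith("/") else url + "/"


# Hypothetical location: replace with the directory that holds the MiniWoB HTML files.
html_dir = Path("path") / "to" / "miniwob" / "html"

# setdefault keeps any value that was already exported in the shell.
os.environ.setdefault("MINIWOB_BASE_URL", miniwob_base_url(html_dir))
print(os.environ["MINIWOB_BASE_URL"])
```

The same value could instead be set in the shell before launching main.py (for example with `$env:MINIWOB_BASE_URL = "..."` in PowerShell). Whether rci-agent expects exactly this directory layout is something the maintainers would need to confirm.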
