posgnu / rci-agent Goto Github PK

View Code? Open in Web Editor NEW

205.0 8.0 27.0 2.18 MB

A codebase for "Language Models can Solve Computer Tasks"

Home Page: https://posgnu.github.io/rci-web/

License: MIT License

Python 11.32% JavaScript 23.83% CSS 3.10% HTML 61.75%

large-language-models prompting reasoning

rci-agent's People

Contributors

Stargazers

Watchers

rci-agent's Issues

How to reproduce the results in Table 1&2?

Hi authors,

Thanks for your excellent work! I'm wondering if you could provide more details on how to produce the results in Table 1 and 2, for example:

Which LM did you use? (Section 3.1 said "InstructGPT-3 + RLHF", but which specific checkpoint?)
What are the hyperparameters?
What's the full prompt?
Would it be possible to provide the dataset split/model predictions/relevant code in order to reproduce the results?

Thanks in advance!

Executing custom tasks

Hello!
Thanks for this fantastic repo! The paper is also very amazing and insightful.

I was wondering whether it's possible to define custom HTML pages and tasks to be executed.
I was thinking of adding the custom HTML file in computergym/miniwob/miniwob_interface/html/miniwob directory and also including it in available_tasks.txt
Would this approach work? Please, let me know your thoughts about this.

Regards

How to reproduce the results in the paper?

Thanks for the good work.
How can I reproduce the results presented in Table 17 in the paper? What hyper-parameter should I set?
Thanks~

Any plans to port this to the new openAI python library?

Bug Report: Some task can't be addressed when Headless parameter is enabled

Bug Description:
I encountered a strange bug while running two almost similar benchmarks for the "enter-time" task (but it might consider many other tasks). The only difference between the two runs is the value of the "headless" parameter. In the first case, I set it to False (headless = False), while in the second case, I left it as True, which was the default value.

Steps to Reproduce:

Git clone the SNow_benchmark branch from my fork and follow the installation in the README.md
.
Set the headless parameter to False and run the benchmark for the "enter-time" :
python main.py --env enter-time --llm chatgpt --num-episodes 1 --irci 1 --sgrounding
Set the headless parameter to True and run the benchmark for the "enter-time" :
python main.py --env enter-time --llm chatgpt --num-episodes 1 --irci 1 --sgrounding --headless
Expected Behavior:
The results should be identical, regardless of the value of the headless parameter.

Actual Behavior:
When the headless parameter is disabled (set to False), certain actions are not allowed or counted, resulting in a failed task. (I could benchmark the task several time, I will still get the same results)

(RCI-agent-WSL) thirdcore@DESKTOP-5I4C9HH:~/rci-agent$ python main.py --env enter-time --llm chatgpt --num-episodes 1 --irci 1 --sgrounding 
False
INFO:root:Starting WebDriver Instance 0
INFO:selenium.webdriver.common.selenium_manager:Applicable driver not found; attempting to install with Selenium Manager (Beta)
INFO:root:Send a request to the language model from initialize_plan
INFO:root:The number of generated action steps: 4
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: clickxpath //*[@id="tt"]
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: type 02:07PM
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: clickxpath //*[@id="subbtn"]
success rate: 1.0
(RCI-agent-WSL) thirdcore@DESKTOP-5I4C9HH:~/rci-agent$ python main.py --env enter-time --llm chatgpt --num-episodes 1 --irci 1 --sgrounding --headless
True
INFO:root:Starting WebDriver Instance 0
INFO:selenium.webdriver.common.selenium_manager:Applicable driver not found; attempting to install with Selenium Manager (Beta)
INFO:root:Send a request to the language model from initialize_plan
INFO:root:The number of generated action steps: 4
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: clickxpath //*[@id="tt"]
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: type 1017AM
INFO:root:Send a request to the language model from generate_action
INFO:root:The executed instruction: clickxpath //*[@id="subbtn"]
success rate: 0.0

Additional Information:
I'm still investigating the root cause of this issue. It seems that when the browser is not displayed, some actions are restricted or not properly accounted for, leading to the task failure. Did you have the same behavior, is there something that I'm missing ?

MINIWOB_BASE_URL environment variable not defined

Hello,

After installing the different required packages, I tried to run an experiment on the choose-list environment using the command line you provided :

python main.py --env choose-list --llm chatgpt --num-episodes 1 --irci 1 --sgrounding

And I got the following error :

(RCI-agent) PS C:\Users\Tom\Desktop\rci-agent> python main.py --env choose-list --llm chatgpt --num-episodes 1 --irci 1 --sgrounding
INFO:root:Starting WebDriver Instance 0
C:\Users\Tom\miniconda3\envs\RCI-agent\lib\site-packages\gym\utils\passive_env_checker.py:20: UserWarning: WARN: It seems a Box observation space is an image but the `dtype` is not `np.uint8`, actual type: int32. If the Box observation space is not an image, we recommend flattening the observation to have only a 1D vector.
  logger.warn(
C:\Users\Tom\miniconda3\envs\RCI-agent\lib\site-packages\gym\utils\passive_env_checker.py:174: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed a `seed` instead of using `Env.seed` for resetting the environment random number generator.
  logger.warn(
C:\Users\Tom\miniconda3\envs\RCI-agent\lib\site-packages\gym\utils\passive_env_checker.py:187: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed `options` to allow the environment initialisation to be passed additional information.
  logger.warn(
INFO:selenium.webdriver.common.selenium_manager:Applicable driver not found; attempting to install with Selenium Manager (Beta)

DevTools listening on ws://127.0.0.1:51802/devtools/browser/6802c38e-d0ec-42dd-b55b-0574baaefc72
ERROR:root:Page did not load properly. Wrong MINIWOB_BASE_URL?
INFO:root:Closed instance 0
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\Tom\miniconda3\envs\RCI-agent\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "c:\users\tom\desktop\rci-agent\computergym\computergym\miniwob\miniwob_interface\instance.py", line 128, in run
    self.create_driver()
  File "c:\users\tom\desktop\rci-agent\computergym\computergym\miniwob\miniwob_interface\instance.py", line 200, in create_driver
    raise e
        (No symbol) [0x0034A304]
        (No symbol) [0x0035C482]
        (No symbol) [0x0034A0B6]
        (No symbol) [0x00327E08]
        (No symbol) [0x00328F2D]
        GetHandleVerifier [0x006C8E3A+2540266]
        GetHandleVerifier [0x00708959+2801161]
        GetHandleVerifier [0x0070295C+2776588]
        GetHandleVerifier [0x004F2280+612144]
        (No symbol) [0x00404F6C]
        (No symbol) [0x004011D8]
        (No symbol) [0x004012BB]
        (No symbol) [0x003F4857]
        BaseThreadInitThunk [0x76C97D59+25]
        RtlInitializeExceptionChain [0x77C9B74B+107]
        RtlClearBits [0x77C9B6CF+191]

I tried to run the file environment.py and got the same issue. The reason is that the environment variable MINIWOB_BASE_URL is not defined.

debug console :

import os 
base_url=os.environ.get("MINIWOB_BASE_URL")
print(base_url)
None

Am I supposed to define this environment variable myself?

PS : I'm running on Windows 11, Python 3.9.16, and I use a conda env.

posgnu / rci-agent Goto Github PK

rci-agent's People

Contributors

Stargazers

Watchers

Forkers

rci-agent's Issues

How to reproduce the results in Table 1&2?

Executing custom tasks

How to reproduce the results in the paper?

Any plans to port this to the new openAI python library?

Bug Report: Some task can't be addressed when Headless parameter is enabled

MINIWOB_BASE_URL environment variable not defined

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent