
[BUG] about convlab-2 (8 comments, closed)

thu-coai commented on June 27, 2024
[BUG]


Comments (8)

zqwerty commented on June 27, 2024

Thanks! We will check it in a few days.


zqwerty commented on June 27, 2024

Please check whether you have re-installed ConvLab-2, since the code has changed (pip install -e . in the ConvLab-2 directory).
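For reference, a minimal re-installation might look like the following; this is just a sketch, assuming the repository was cloned into a local directory named ConvLab-2 and that you want the latest code first:

    cd ConvLab-2
    git pull            # fetch the latest code changes
    pip install -e .    # re-install the package in editable mode, as suggested above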


sherlock1987 commented on June 27, 2024

Yes, I have already updated the code; I am sure about that.


zqwerty commented on June 27, 2024

I have tried these:
in mle/multiwoz/: python train.py
in ppo/: python train.py --load_path ../mle/multiwoz/save/best_mle
in policy/: python evaluate.py --model_name PPO --load_path ppo/save/9_ppo (I chose 9_ppo randomly)
and get:

All 100 0.73
reward: 11.143058441558441
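
For convenience, the three steps above can also be chained in one shell session; this is only a sketch, assuming it is run from the policy directory referenced above and that the relative paths match the commands quoted:

    cd mle/multiwoz && python train.py && cd ../..                                # 1. MLE pre-training
    cd ppo && python train.py --load_path ../mle/multiwoz/save/best_mle && cd ..  # 2. PPO training from the MLE policy
    python evaluate.py --model_name PPO --load_path ppo/save/9_ppo                # 3. evaluate a saved PPO checkpoint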


sherlock1987 commented on June 27, 2024

Oh, that is cool. For me PPO also works well, but for GDPL some problems still exist.


LinZichuan commented on June 27, 2024

(Quoting zqwerty's commands and results above.)

Hi @zqwerty, thanks for the tips. This works for commit 2422980!
But when I ran these commands on the latest commit (c6372b1), I found the problem is still unsolved. PPO performance still rises to about 65% at the beginning and then starts dropping in the later training stages (to around 35%).
Could you help look into it? Thanks!


zqwerty commented on June 27, 2024

Moved to #54.


ShuoZhangXJTU commented on June 27, 2024

(Quoting zqwerty's commands and results above.)

Hey guys, I tested PPO on the latest version of ConvLab-2 today and got a success rate of 84%, which is much higher than the reported 73%. I wonder if there are any mistakes? If not, I think the performance record should be updated.

