
[BUG] about convlab-2 (8 comments, closed)

thu-coai commented on June 27, 2024
[BUG]


Comments (8)

zqwerty commented on June 27, 2024

Thanks! We will check it in a few days.


zqwerty commented on June 27, 2024

Please check whether you have re-installed ConvLab-2, since the code has changed (pip install -e . in the ConvLab-2 directory).
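For reference, a minimal re-installation might look like the following; this is just a sketch, assuming the repository was cloned into a local directory named ConvLab-2 and that you want the latest code first:

    cd ConvLab-2
    git pull            # fetch the latest code changes
    pip install -e .    # re-install the package in editable mode, as suggested above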


sherlock1987 commented on June 27, 2024

Yes, I have already updated the code; I am sure about that.


zqwerty commented on June 27, 2024

I have tried these:
in mle/multiwoz/: python train.py
in ppo/: python train.py --load_path ../mle/multiwoz/save/best_mle
in policy/: python evaluate.py --model_name PPO --load_path ppo/save/9_ppo (I chose 9_ppo randomly)
and get:

All 100 0.73
reward: 11.143058441558441
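
For convenience, the three steps above can also be chained in one shell session; this is only a sketch, assuming it is run from the policy directory referenced above and that the relative paths match the commands quoted:

    cd mle/multiwoz && python train.py && cd ../..                                # 1. MLE pre-training
    cd ppo && python train.py --load_path ../mle/multiwoz/save/best_mle && cd ..  # 2. PPO training from the MLE policy
    python evaluate.py --model_name PPO --load_path ppo/save/9_ppo                # 3. evaluate a saved PPO checkpoint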


sherlock1987 commented on June 27, 2024

Oh, that is cool. For me PPO also works well, but for GDPL some problems still exist.


LinZichuan commented on June 27, 2024

(Quoting zqwerty's commands and results above.)

Hi @zqwerty, thanks for the tips. This works for commit 2422980!
But when I ran these commands on the latest commit (c6372b1), I found the problem is still unsolved. PPO performance still rises to about 65% at the beginning and then starts dropping in the later training stages (to around 35%).
Could you help look into it? Thanks!


zqwerty commented on June 27, 2024

Moved to #54.


ShuoZhangXJTU commented on June 27, 2024

(Quoting zqwerty's commands and results above.)

Hey guys, I tested PPO on the latest version of ConvLab-2 today and got a success rate of 84%, which is much higher than the reported 73%. I wonder if there are any mistakes? If not, I think the performance record should be updated.

