Comments (5)
def reformat_state(state):
    if 'belief_state' in state:
        state = state['belief_state']
    new_state = []
    for domain in state.keys():
        domain_data = state[domain]
        if 'semi' in domain_data:
            domain_data = domain_data['semi']
        for slot in domain_data.keys():
            val = domain_data[slot]
            if val is not None and val not in ['', 'not mentioned', '未提及', '未提到', '没有提到']:
                new_state.append(domain + '-' + slot + '-' + val)
    # lower
    new_state = [item.lower() for item in new_state]
    return new_state
This code is in dst/evaluate.py. I want to know about the dialog state: can it be calculated using only the semi part?
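To make the question concrete, here is a small sketch that runs the function above on a made-up state. The sample values are hypothetical, and the layout (each domain holding a `book` and a `semi` dict) is assumed from the MultiWOZ-style format the function appears to expect. It only shows what the quoted code does, not whether that is the intended evaluation protocol.

```python
def reformat_state(state):
    # Copy of the function quoted above, with indentation restored.
    if 'belief_state' in state:
        state = state['belief_state']
    new_state = []
    for domain in state.keys():
        domain_data = state[domain]
        if 'semi' in domain_data:
            # When a 'semi' dict exists, only its slots are read;
            # sibling keys such as 'book' are never visited.
            domain_data = domain_data['semi']
        for slot in domain_data.keys():
            val = domain_data[slot]
            if val is not None and val not in ['', 'not mentioned', '未提及', '未提到', '没有提到']:
                new_state.append(domain + '-' + slot + '-' + val)
    return [item.lower() for item in new_state]


# Hypothetical sample state, for illustration only.
sample_state = {
    'belief_state': {
        'hotel': {
            'book': {'booked': [], 'people': '2', 'day': 'Monday', 'stay': '3'},
            'semi': {'name': 'not mentioned', 'area': 'North', 'parking': '', 'stars': '4'},
        },
        'taxi': {
            'book': {'booked': []},
            'semi': {'destination': '', 'departure': 'not mentioned'},
        },
    }
}

nested_result = reformat_state(sample_state)
print(nested_result)

# A flat state without 'semi' (also hypothetical): every slot of the domain is read.
flat_result = reformat_state({'restaurant': {'name': 'Curry Garden', 'area': ''}})
print(flat_result)
```

In this sketch the `book` slots (`people`, `day`, `stay`) never reach the output, so any metric computed from the returned list covers only the `semi` slots of such a state.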
@function2-llx please look at this
Hi! When I tried to evaluate the translation-training SUMBT model, I found that eval mode was not set, which affected the results. In my local test, the difference is nearly two points of Joint Acc on the MultiWOZ-zh human-val dataset. I think the SUMBT model needs to be re-evaluated after the code is fixed, because the currently reported results do not reflect the model's real performance.

My local results on MultiWOZ-zh:

Eval mode not set:
{'Joint Acc': 0.4821722435545804, 'Turn Acc': 0.9738983360760534, 'Joint F1': 0.8826705748001639}
Eval mode set:
{'Joint Acc': 0.49972572682391664, 'Turn Acc': 0.9751935149631128, 'Joint F1': 0.8885012208542876}
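For readers wondering why a missing eval mode can move the numbers: layers such as dropout behave differently during training and evaluation. The sketch below is a pure-Python toy, not SUMBT or ConvLab-2 code. `ToyDropout` is a hypothetical stand-in that imitates the `training` flag and `eval()` method of a PyTorch `torch.nn.Module`, and it assumes the evaluated model contains dropout-like layers.

```python
import random


class ToyDropout:
    """Toy stand-in for a dropout layer (hypothetical, for illustration only)."""

    def __init__(self, p=0.5, seed=0):
        self.p = p                       # probability of zeroing a value
        self.training = True             # starts in training mode, as PyTorch modules do
        self._rng = random.Random(seed)  # private RNG so the demo is repeatable

    def eval(self):
        # Switch to evaluation mode; mirrors what model.eval() does in PyTorch.
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: dropout is the identity function.
            return list(values)
        # Training mode: zero each value with probability p and rescale
        # the survivors by 1 / (1 - p) (inverted dropout).
        keep = 1.0 - self.p
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout(p=0.5, seed=0)
features = [1.0] * 8

train_out = layer(features)   # eval mode not set: inputs are randomly perturbed
layer.eval()
eval_out_a = layer(features)  # eval mode set: inputs pass through unchanged
eval_out_b = layer(features)

print("train mode:", train_out)
print("eval mode: ", eval_out_a)
```

In an actual PyTorch evaluation script, the corresponding step is calling `model.eval()` on the model before running inference, usually together with `torch.no_grad()`.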
How did you get the above results? By running dst/evaluate.py?
Yes, but the results I reported were not obtained with the pretrained model provided by the project. With the project's pretrained model, however, I got similar results.
update SUMBT & test results #69