Thanks for releasing code! I'm trying to understand how this impleme

multi-value slot values about ic-dst HOT 3 OPEN

rizar commented on July 2, 2024

multi-value slot values

from ic-dst.

Comments (3)

Yushi-Hu commented on July 2, 2024

Hi Dzmitry,

Thanks for your interest in our work!
Yes, you are correct that in this implementation, a slot is considered correctly predicted if at least one of the gold slot values is predicted. I followed the evaluation pipeline from a very popular prior work, TODBERT (the evaluation implementation is in the "evaluate" function here). The prior work on MultiWOZ 2.4 (ASSIST-DST) also follows this.
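To make the "at least one gold value" rule concrete, here is a minimal sketch. It is not the actual ic-dst or TODBERT code: the function names are hypothetical, and it assumes gold multi-values are stored as a single string joined by "|" and that states are plain slot-to-value dicts.

```python
def slot_matches_any(gold: str, pred: str, sep: str = "|") -> bool:
    """Lenient match: the prediction must equal at least one gold alternative."""
    # Assumption: alternatives in the gold label are separated by `sep`.
    gold_values = {v.strip() for v in gold.split(sep)}
    return pred.strip() in gold_values


def joint_goal_correct_lenient(gold_state: dict, pred_state: dict) -> bool:
    """A turn counts as correct only if the slot sets agree and every slot matches."""
    if gold_state.keys() != pred_state.keys():
        return False
    return all(slot_matches_any(gold_state[s], pred_state[s]) for s in gold_state)
```

Under this rule, a gold label of "cheap|moderate" accepts a prediction of either "cheap" or "moderate", but not "expensive".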

For MultiWOZ 2.1 and 2.2 this does not make much difference, because the multi-value slots are not annotated well and in most cases only one value is present. I think that's why most prior works simply ignore this issue. For MultiWOZ 2.4 it makes a bigger difference, because the annotators found that many slots actually have multiple values. Currently, DST work generally assumes that each slot corresponds to only one value. I totally agree that we should carefully rethink this assumption.


rizar commented on July 2, 2024

Thanks Yushi for your fast response!

Yes, indeed, the SimpleTOD evaluation code compares multi-values in the same way as yours. Do you think some other implementations (including ASSIST-DST and the links I posted above) are effectively stricter and require the entire multi-value literal to be predicted correctly?

Thanks for the explanation about 2.1 and 2.4; I will take a look at the exact percentage of multi-values in different MultiWOZ versions.

As for what the right evaluation approach should be, that depends on the exact semantics of the "|" operator. My understanding is that if "|" is logical OR, then all values should be predicted correctly. But if somewhere in the dataset it is used to indicate alternative spellings, then the "one-of" evaluation approach would be more appropriate. As usual, it all boils down to having a consistent and well-documented annotation approach, something that MultiWOZ still seems to lack.


Yushi-Hu commented on July 2, 2024

Thanks rizar for your insights!

As for the first question, I checked some implementations, and in most cases they don't handle the multi-label scenario carefully. Some implementations just use the first possible value as the gold answer. I agree with the way ASSIST-DST handles the problem: it normalizes the labels by sorting the possible values, which effectively gives a stricter evaluation.
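A rough sketch of what the sorting-based normalization could look like follows. This is an illustration only, not ASSIST-DST's actual code: the function names are hypothetical, and it assumes multi-values are "|"-joined strings on both the gold and predicted side.

```python
def normalize_multi_value(value: str, sep: str = "|") -> str:
    """Sort the alternatives so that value order does not affect comparison."""
    parts = sorted(v.strip() for v in value.split(sep))
    return sep.join(parts)


def slot_matches_strict(gold: str, pred: str, sep: str = "|") -> bool:
    """Strict match: the full set of values must be predicted, in any order."""
    return normalize_multi_value(gold, sep) == normalize_multi_value(pred, sep)
```

With this rule, "moderate|cheap" matches "cheap|moderate", but predicting only "cheap" against a gold label of "cheap|moderate" counts as wrong.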

For your second comment, I totally agree! It boils down to the need for a well-documented annotation approach.
