conll / reference-coreference-scorers

This is the reference implementation of commonly used coreference metrics.
Home Page: http://conll.github.io/reference-coreference-scorers
License: Other
The BLANC scorer is implemented in scorer.pl but not in scorer.bat. Is there a reason?
Does
perl scorer.pl blanc ...
work on Windows as well?
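For reference, the invocation pattern is the same as for the other metrics; key.conll and sys.conll here are hypothetical file names:

perl scorer.pl blanc key.conll sys.conll

Since scorer.pl is plain Perl, it should in principle run under a Windows Perl distribution such as Strawberry Perl, though I have not verified this.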
When I evaluate the TüBa-D/Z test set from the SemEval 2010 shared task, I get the following error:
Found too many repeated mentions (> 10) in the response, so refusing to score. Please fix the output.
Does anyone know how to fix it?
Thanks
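For context: the scorer refuses to run when the same mention span occurs more than ten times in the response. In CoNLL-style column format, a duplicated mention can look like the following hypothetical row, where the coreference column assigns the identical single-token span to an entity twice:

0001 0 her (1)|(1)

Deduplicating such spans in the system output before scoring should make the error go away.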
Hi, is there a way to represent and score disjoint (discontinuous) multi-word units? I get the same result when I try both options (A and B) below, so presumably both are treated as contiguous multi-word units?
Case A
took 0 4 (1
her 5 8 -
life 9 13 1)
Case B
took 0 4 (1
her 5 8 1
life 9 13 1)
Thanks!
Filip
As far as I understand, this is the new repository location for the software linked in the following paper.
Pradhan, S., Luo, X., Recasens, M., Hovy, E., Ng, V., & Strube, M. (2014). Scoring Coreference Partitions of Predicted Mentions: A Reference Implementation. In Annual Meeting of the Association for Computational Linguistics (Vol. 2, pp. 30–35). http://www.aclweb.org/anthology/P/P14/P14-2006
If this is the case, to make citations easier I would suggest adding this reference to the README file that is visible on the entry page of the repository.
Hi,
I think CEAFe precision and recall are reversed: their trend is consistently opposite to the trend of precision and recall under the B-cubed and MUC metrics.
Can you please check?
Thanks,
Joe
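For reference, and assuming the scorer follows the CEAF definitions from Luo (2005): with key entities K_i, response entities R_j, an entity similarity \phi, and an optimal one-to-one alignment g^* between key and response entities,

R = \frac{\Phi(g^*)}{\sum_i \phi(K_i, K_i)}, \quad P = \frac{\Phi(g^*)}{\sum_j \phi(R_j, R_j)}, \quad \text{where } \Phi(g^*) = \sum_i \phi(K_i, g^*(K_i))

A quick sanity check is to swap the key and response files: under these definitions, precision and recall should swap as well.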
It no longer outputs these metrics, whether they are specified explicitly or "all" metrics are requested.
I've come across a paper on the ACL website (https://www.aclweb.org/anthology/P16-1060/) which argues that the traditional metrics from the conll2012 scripts are not well suited to evaluating the coreference resolution task, and which introduces the LEA scorer (implemented in this repository). However, recent publications on this task are still mainly evaluated with the old metrics, and I can't see why. I would be grateful for an explanation.
Thanks :)
While preparing a recent shared task, I found that the metrics behave differently on the no-coreference case: MUC reports 0% F1, while the rest (blanc, ceaf, bcub) report 100%.
A very small test case is the following; the behavior can be checked by running the scorer with this file as both key and response (see the command after the example).
0001 0 A (0)
0001 1 B (1)
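A minimal invocation, assuming the rows above are wrapped in the usual #begin document ... #end document markers and saved as singletons.conll (a hypothetical name):

perl scorer.pl muc singletons.conll singletons.conll

Re-running with bcub, ceafe, or blanc in place of muc shows the discrepancy.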
The inconsistency can hurt when one wants to compute a document-level average: the 0% score produced by MUC can change the result dramatically. In addition, I think it is reasonable to score 100% when the key and response are identical.
Are there suggested practices for cross-document coreference? My thought was to treat each multi-document set as one document. Let me know if there is any support or best practice.