evaluation_measures's Introduction

About

evaluation_measures is a framework that implements evaluation measures for information retrieval (IR) systems. The following measures are implemented.

  • MRR (Mean Reciprocal Rank)

E. M. Voorhees (1999). "The TREC-8 Question Answering Track Report". In Proceedings of the 8th Text Retrieval Conference (TREC-8). pp. 77–82.

  • DCG (Discounted cumulative gain) and nDCG (Normalized Discounted cumulative gain)

Kalervo Järvelin, Jaana Kekäläinen: Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems 20(4), 422–446 (2002)

  • ERR (Expected Reciprocal Rank for Graded Relevance)

Olivier Chapelle, Donald Metzler, Ya Zhang, and Pierre Grinspan. 2009. Expected reciprocal rank for graded relevance. In Proceedings of the 18th ACM conference on Information and knowledge management (CIKM '09). ACM, New York, NY, USA, 621-630. DOI=10.1145/1645953.1646033 http://doi.acm.org/10.1145/1645953.1646033

  • session nDCG

K. Järvelin, S. L. Price, L. M. L. Delcambre, and M. L. Nielsen. Discounted cumulated gain based evaluation of multiple-query IR sessions. In ECIR, pages 4–15, 2008.

  • session ERR

Our original method.

  • q-measure

Tetsuya Sakai. 2004. Ranking the NTCIR systems based on multigrade relevance. In Proceedings of the 2004 international conference on Asian Information Retrieval Technology (AIRS'04), Sung Hyon Myaeng, Ming Zhou, Kam-Fai Wong, and Hong-Jiang Zhang (Eds.). Springer-Verlag, Berlin, Heidelberg, 251-262. DOI=10.1007/978-3-540-31871-2_22 http://dx.doi.org/10.1007/978-3-540-31871-2_22

  • Risk-sensitive measure

L. Wang, P. N. Bennett, and K. Collins-Thompson. Robust Ranking Models via Risk-Sensitive Optimization. In Proc. of SIGIR 2012. See also the TREC 2013 Web Track: http://research.microsoft.com/en-us/projects/trec-web-2013/
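
To make the first two measures in the list concrete, here is a minimal Python sketch of MRR, DCG, and nDCG. It is an illustration written for this page, not the code of evaluation_measures: the function names are hypothetical, and the DCG uses the widely used log2(rank + 1) discount with the raw grade as gain, which may differ from the formulation in the cited paper or in this framework.

```python
import math


def reciprocal_rank(grades):
    """Return 1/rank of the first relevant item (grade > 0), or 0.0 if none."""
    for rank, grade in enumerate(grades, start=1):
        if grade > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over several ranked lists (one per query)."""
    return sum(reciprocal_rank(grades) for grades in runs) / len(runs)


def dcg(grades):
    """Discounted cumulative gain: each grade is divided by log2(rank + 1).

    Assumption: raw grade as gain and a log2(rank + 1) discount.
    """
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(grades, start=1))


def ndcg(grades):
    """DCG normalized by the DCG of the same grades sorted in descending order."""
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Two queries: first relevant result at rank 2 and at rank 1.
    print(mean_reciprocal_rank([[0, 1, 0], [1, 0, 0]]))
    # A ranking that is already in ideal order has nDCG 1.0.
    print(ndcg([3, 2, 1]))
```

Each list holds the relevance grades of the returned documents in rank order, so `[0, 1, 0]` means only the second result is relevant. Here nDCG normalizes against the returned grades only; a full evaluation would usually build the ideal ranking from all judged documents for the query.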

==================

License

evaluation_measures is BSD 2-Clause licensed.

evaluation_measures's People

Contributors

miyamamoto

evaluation_measures's Issues

session ERR

Your list of evaluation measures includes a variant of expected reciprocal rank for evaluating sessions. Could you please point me to the reference paper that describes this adaptation of ERR to sessions?

Thank you.

Why *should* max_grade be 2 for ERR?

There is a comment in the source,

# NOTE: max_grade should be *2

However, max_grade is a configurable parameter. Furthermore, the ERR paper does not seem to imply that there is an acceptable range of grades.
What is meant by this comment?

Thanks!

The example might be misleading

In the ERR metric, max_grade defaults to 2. However, the test example uses a grade of 3 without setting max_grade=3, which can be misleading.
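
For context on the two max_grade questions above: in the ERR paper (Chapelle et al., 2009), a grade g is mapped to a satisfaction probability R(g) = (2^g - 1) / 2^g_max, so g_max has to be at least the largest grade in the data for R(g) to stay at or below 1. The sketch below is a minimal Python illustration of that formula, not this repository's implementation. The names `stop_probability` and `err` are hypothetical, and raising an error on out-of-range grades is a choice made for this sketch.

```python
def stop_probability(grade, max_grade):
    """R(g) = (2^g - 1) / 2^max_grade, as defined in the ERR paper."""
    return (2 ** grade - 1) / (2 ** max_grade)


def err(grades, max_grade=2):
    """Expected reciprocal rank under the cascade user model.

    Assumption of this sketch: grades outside [0, max_grade] are rejected,
    because they would give a stop probability that is negative or above 1.
    """
    p_reach = 1.0  # probability that the user reaches the current rank
    score = 0.0
    for rank, grade in enumerate(grades, start=1):
        if not 0 <= grade <= max_grade:
            raise ValueError(f"grade {grade} is outside [0, {max_grade}]")
        p_stop = stop_probability(grade, max_grade)
        score += p_reach * p_stop / rank
        p_reach *= 1.0 - p_stop
    return score


if __name__ == "__main__":
    print(stop_probability(3, 2))   # 1.75, not a valid probability
    print(err([3], max_grade=3))    # 0.875
```

With max_grade=2, a grade of 3 gives a stop probability of 7/4, so a test that uses grade 3 needs max_grade=3 (or higher) for the result to be meaningful.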
