Just some comment on this here: <a href="https://github.com/roaminsight/roamresear

hammering home the point about roamresearch HOT 6 OPEN

roamanalytics commented on June 24, 2024

hammering home the point

from roamresearch.

Comments (6)

ndingwall commented on June 24, 2024

Oh, the "Hammering home the point" section is still about P-R curves. I now realize that it's confusing because we talk about ROC AUC and then go back to P-R without making the switch clear. I'll update the text to clarify this. Anyway, I think we agree: linear interpolation is fine for ROC AUC, but not P-R.

I'm not sure I follow your second point. Is there any reason to think that a better-than-chance classifier could be improved by randomly increasing or decreasing all of the scores? (Okay, not quite randomly, but the classifier doesn't know how rounding works, so from its point of view these changes are essentially random!)

Finally, I would assume that anyone using this function is computing the operating points on a test set, so now it's just a matter of how we convert the list of operating points into a single number. It makes sense for that number to represent the area under a curve defined by those operating points, and so we just need to choose how to interpolate. We agree that we shouldn't interpolate linearly. Step interpolation arises naturally in the same way that coin-flipping leads to linear interpolation for ROC, so that seems like the right choice. Another option is to return to the ROC space, compute the curve and transform it into the P-R space. But that would require a pretty big change to the API: lists containing precision and recall numbers aren't enough to compute ROC.
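To make the interpolation choice concrete, here is a minimal sketch of how the two options turn a list of operating points into a single number. The function name `pr_area` and its arguments are hypothetical and are not the function discussed in this thread. The sketch assumes the points are sorted by increasing recall, and that the "step" variant holds the precision of the higher-recall point across each segment, which is one common convention.

```python
def pr_area(recall, precision, interpolation="step"):
    """Area under a P-R curve defined by operating points.

    Assumes `recall` is sorted in increasing order and that
    `precision[i]` is the precision observed at `recall[i]`.
    """
    if len(recall) != len(precision):
        raise ValueError("recall and precision must have the same length")

    area = 0.0
    for i in range(1, len(recall)):
        width = recall[i] - recall[i - 1]
        if interpolation == "step":
            # Horizontal segment: hold the precision of the higher-recall point.
            height = precision[i]
        elif interpolation == "linear":
            # Trapezoid: average the precision at the two endpoints.
            height = (precision[i] + precision[i - 1]) / 2.0
        else:
            raise ValueError("interpolation must be 'step' or 'linear'")
        area += width * height
    return area


recall = [0.0, 0.5, 1.0]
precision = [1.0, 1.0, 0.5]
step_area = pr_area(recall, precision, "step")      # 0.75
linear_area = pr_area(recall, precision, "linear")  # 0.875
```

On these made-up points the linear version reports the larger area, because it credits the classifier with intermediate precision values that no threshold on the test set actually achieves.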

I might have missed your point entirely though, in which case please let me know!

Anyway, thanks for your feedback on this!


amueller commented on June 24, 2024

I think we agree on most things. I really recommend reading the paper that I cited, though ;) I commented on your PR on what I think would be the right thing to do.


amueller commented on June 24, 2024

Hm but maybe the simple point that I'm not sure got across is: in the IR book, they remove the dips by computing a maximum (on the test set!!). You don't do that in your code at all.
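For anyone following along, the maximum used in the IR book can be sketched as follows: the interpolated precision at a recall level is the highest precision found at that recall level or any higher one. The helper name `interpolate_precision` is hypothetical, the input is assumed to be ordered by increasing recall, and the numbers are made up for illustration.

```python
def interpolate_precision(precision):
    """Return IR-book style interpolated precision.

    `precision` is assumed to be ordered by increasing recall. Each
    output value is the maximum precision at that position or any
    later (higher-recall) position, which removes the dips.
    """
    result = list(precision)
    # Sweep from the highest recall down, carrying the running maximum.
    for i in range(len(result) - 2, -1, -1):
        result[i] = max(result[i], result[i + 1])
    return result


raw = [1.0, 0.5, 0.67, 0.5]
smoothed = interpolate_precision(raw)  # [1.0, 0.67, 0.67, 0.5]
```

Note that the maximum is taken over precision values measured with the gold labels, so it describes the test-set curve after the fact rather than something the classifier could choose in advance.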


ndingwall commented on June 24, 2024

Yes, I was confused by that as well:

"The justification is that almost anyone would be prepared to look at a few more documents if it would increase the percentage of the viewed set that were relevant (that is, if the precision of the larger set is higher)." Source for anyone else following this conversation: http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html

This seems wrong to me, because you wouldn't know if the precision of the larger set is higher without peeking at the gold labels. My reference to the paper was just in their use of horizontal segments to interpolate between points, but I'm not convinced by their overall strategy.

I'm about to get on a flight so I'll review the rest of your comments and read that paper. Thanks for all your feedback - super helpful!


amueller commented on June 24, 2024

Yeah, I just discussed exactly the same point with a colleague and we both have the same view. Also, thanks for your input, it totally helped me wrap my head around this. I think I'm good now ;)


ndingwall commented on June 24, 2024

Great - happy to help!

