Comments (3)

Callidior commented on June 9, 2024

Hi!

Since maxdiv_score_intervals is a fairly new function, I cannot rule out possible bugs right away. However, it is entirely possible for cross-entropy to become negative. Only the KL divergence is guaranteed to be non-negative; cross-entropy between continuous distributions (as opposed to discrete ones) can indeed become negative, because differential entropy itself can be negative.
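
For example, the cross-entropy between two univariate Gaussians has a simple closed form, which makes this easy to verify numerically. The following sketch is independent of libmaxdiv and only illustrates the point:

```python
import numpy as np

def gaussian_cross_entropy(mu_p, var_p, mu_q, var_q):
    """H(p, q) = E_p[-log q(x)] for univariate Gaussians, in nats."""
    return 0.5 * np.log(2 * np.pi * var_q) \
        + (var_p + (mu_p - mu_q) ** 2) / (2 * var_q)

# Two identical, sharply peaked Gaussians:
h_pq = gaussian_cross_entropy(0.0, 0.01, 0.0, 0.01)
print(h_pq)  # ~ -0.88 nats: negative, because differential entropy can be negative

# The KL divergence KL(p || q) = H(p, q) - H(p) is still non-negative:
h_p = gaussian_cross_entropy(0.0, 0.01, 0.0, 0.01)  # equals H(p), since q == p
print(h_pq - h_p)  # 0.0
```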

In any case, maxdiv_score_intervals should give scores identical to the ones returned by maxdiv/maxdiv_exec when using the same distribution. Could you try to re-score the detected intervals with the same method used by the detector (both for the interval distribution and the full distribution) and confirm that the scores are the same?
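
For concreteness, such a check could look roughly like the sketch below. Note that the import path, the parameter names and values (`method`, `mode`), and the return format are assumptions made for illustration; please adapt them to the actual libmaxdiv API:

```python
# Hypothetical sketch; adapt the import path and signatures to the real API.
from maxdiv.maxdiv import maxdiv, maxdiv_score_intervals

# X: your time series, shaped as libmaxdiv expects it.
# Detect intervals with a fixed model and divergence (placeholder settings).
detections = maxdiv(X, method='gaussian_cov', mode='CROSSENT')
# Assumed return format: [(start, end, score), ...]

# Re-score exactly those intervals with the same settings.
intervals = [(start, end) for start, end, _ in detections]
rescored = maxdiv_score_intervals(X, intervals, method='gaussian_cov', mode='CROSSENT')

# The scores should match (up to numerical precision) if both code paths
# use the same distribution model and divergence.
for (start, end, det_score), new_score in zip(detections, rescored):
    print(start, end, det_score, new_score)
```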

Smu-Tan commented on June 9, 2024

Thank you for the help! Sure, I will try to confirm that the scores are the same and will keep you updated. Thanks!

Callidior commented on June 9, 2024

As an addendum: the scores of our "unbiased KL divergence" can also become negative if the distributions are very similar.
It is true that 0 is the minimum of the ordinary KL divergence. In the paper, the unbiased KL divergence is defined as:

$$\mathrm{KL}_{\mathrm{U}} = 2 \cdot |I| \cdot \mathrm{KL}$$

However, in the source code we made one additional tweak to make scores more comparable for time-series of different dimensionality d: We subtract the theoretical mean of this statistic, which is distributed according to a chi-square distribution, and divide by the standard deviation. Thus, the unbiased KL divergence computed by libmaxdiv is actually:

$$\mathrm{KL}_{\mathrm{U}} = \frac{2 \cdot |I| \cdot \mathrm{KL} - \nu}{\sqrt{2 \nu}}$$

where $\nu$ denotes the degrees of freedom of that chi-square distribution, which depend on the dimensionality $d$.

Due to the subtraction of this constant, the unbiased KL divergence can become negative. The ordering of anomaly scores is preserved, however.
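
Numerically, the effect looks like this. The degrees of freedom `dof` below is a placeholder; the actual value depends on the dimensionality d of the time series:

```python
import numpy as np

def unbiased_kl_score(kl, interval_len, dof):
    # Normalize the statistic 2 * |I| * KL by the mean (dof) and the standard
    # deviation (sqrt(2 * dof)) of its approximate chi-square distribution.
    stat = 2 * interval_len * kl
    return (stat - dof) / np.sqrt(2 * dof)

# Very similar distributions (KL close to 0) yield a negative score ...
print(unbiased_kl_score(kl=1e-4, interval_len=50, dof=5))  # ~ -1.58
# ... but the score is monotone in KL, so the ranking of intervals is preserved:
print(unbiased_kl_score(kl=0.5, interval_len=50, dof=5))   # ~ 14.23
```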
