ACTUAL

|      | T1 | T2 | T3 | ... |
|------|----|----|----|-----|
| Doc1 |    |    |    |     |
| Doc2 |    |    |    |     |
| Doc3 |    |    |    |     |

PREDICTED - selected from the dominant topic of each document's topic distribution.

|      | W1 | W2 | W3 | ... |
|------|----|----|----|-----|
| Doc1 |    |    |    |     |
| Doc2 |    |    |    |     |
| Doc3 |    |    |    |     |

**According to the literature, if a document is assigned to a single dominant
topic (hard assignment), the top words of that dominant topic should appear in the
actual document. If they do not:
- either the probability of the dominant topic is very low, and another topic
could be made dominant instead,
- or the top words were selected wrongly, and better word weights could recover
the same dominant topic.**
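
The check described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `dominant_topic` and `top_word_coverage`, the toy document, the topic probabilities and the top-word lists are all hypothetical and do not come from any particular topic-modelling library.

```python
def dominant_topic(doc_topic_probs):
    """Return (topic index, probability) of the most probable topic."""
    best = max(range(len(doc_topic_probs)), key=lambda t: doc_topic_probs[t])
    return best, doc_topic_probs[best]


def top_word_coverage(doc_tokens, topic_top_words):
    """Return the fraction of the topic's top words found in the document."""
    if not topic_top_words:
        return 0.0
    tokens = set(doc_tokens)
    hits = sum(1 for word in topic_top_words if word in tokens)
    return hits / len(topic_top_words)


# Hypothetical toy data: one document, k = 3 topics, 3 top words per topic.
doc_tokens = "the striker scored a late goal for the team".split()
doc_topic_probs = [0.15, 0.70, 0.15]
top_words = [
    ["market", "stock", "price"],
    ["goal", "team", "match"],
    ["election", "vote", "party"],
]

topic, prob = dominant_topic(doc_topic_probs)
coverage = top_word_coverage(doc_tokens, top_words[topic])
print(f"dominant topic={topic}, probability={prob:.2f}, coverage={coverage:.2f}")
```

A low probability for the dominant topic points to the first case in the list (another topic could be made dominant), while a confident dominant topic with low coverage points to the second (the top words or their weights may be poorly chosen).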
We now have x documents. For example, x = 4 and k (number of topics) = 3.
For x = 4, the documents are [D1, D2, D3, D4].
Actual=[1,1,2,0]
Predicted=[1,0,2,0]
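The score is the fraction of documents whose predicted topic matches the actual one: 3/4 = 0.75 (D1, D3 and D4 match; D2 does not).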
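
A minimal sketch of this score, assuming it is meant as the share of documents whose predicted dominant topic equals the actual one, and that the topic indices in both lists refer to the same topics. The function name `hard_assignment_score` is hypothetical.

```python
def hard_assignment_score(actual, predicted):
    """Return the fraction of positions where the two label lists agree."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        return 0.0
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


# Lists from the example: one dominant-topic label per document D1..D4.
actual = [1, 1, 2, 0]
predicted = [1, 0, 2, 0]

score = hard_assignment_score(actual, predicted)
print(f"score = {score:.2f}")
```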