mariyahendriksen / improving-semantic-topic-clustering-for-search-queries-with-word-co-occurrence
This project is forked from aakashsinha19/improving-semantic-topic-clustering-for-search-queries-with-word-co-occurrence
Uncovering common themes in a large number of unorganized search queries is a primary step in mining insights about aggregated user interests. Common topic modeling techniques for documents often run into sparsity problems on search query data, because queries are much shorter than documents. We present two novel techniques that can discover semantically meaningful topics in search queries: i) word co-occurrence clustering, which generates topics from words that frequently occur together; and ii) weighted bigraph clustering, which uses URLs from Google search results to induce query similarity and generate topics. We illustrate the proposed methods on two sets of queries: Lipton brand queries and make-up & cosmetics queries. A comparison with standard LDA clustering demonstrates the usefulness and improved performance of the two proposed methods.

Keywords: search queries, topic clustering, word co-occurrence, bipartite graph, co-clustering.
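To make the idea of word co-occurrence clustering more concrete, here is a minimal sketch in Python. It is not the method from the paper, whose details are not given above. It assumes a much simpler procedure: count how many queries contain each word pair, link pairs that appear together in at least `min_count` queries, and treat the connected groups of words as topics. The function names, the `min_count` threshold, and the sample queries are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(queries):
    """Count how many queries contain each unordered pair of distinct words."""
    counts = Counter()
    for query in queries:
        words = sorted(set(query.lower().split()))
        counts.update(combinations(words, 2))
    return counts


def cluster_words(queries, min_count=2):
    """Group words connected by pairs that co-occur in >= min_count queries.

    Assumption: connected components stand in for the clustering step;
    the original work may use a different similarity measure and algorithm.
    """
    parent = {}

    def find(word):
        # Union-find lookup with path halving.
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]
            word = parent[word]
        return word

    for (a, b), n in cooccurrence_counts(queries).items():
        if n >= min_count:
            parent[find(a)] = find(b)

    clusters = {}
    for word in list(parent):
        clusters.setdefault(find(word), set()).add(word)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Made-up sample queries, not data from the project.
queries = [
    "lipton green tea",
    "green tea benefits",
    "lipton iced tea",
    "matte lipstick",
    "red matte lipstick",
    "iced coffee",
]
topics = cluster_words(queries, min_count=2)
print(topics)
```

Words that never reach the threshold with any partner (here, for example, "coffee" and "benefits") are left out of every topic. Raising `min_count` yields fewer, tighter topics, while lowering it merges more words together.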