Short text similarity based on probabilistic topics

Quan, X.; Liu, G.; Lu, Z.; Ni, X.; Wenyin, L.

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/87844

Type:	Journal article
Title:	Short text similarity based on probabilistic topics
Author:	Quan, X.; Liu, G.; Lu, Z.; Ni, X.; Wenyin, L.
Citation:	Knowledge and Information Systems, 2010; 25(3):473-491
Publisher:	Springer-Verlag
Issue Date:	2010
ISSN:	0219-1377; 0219-3116
Statement of Responsibility:	Xiaojun Quan, Gang Liu, Zhi Lu, Xingliang Ni, Liu Wenyin
Abstract:	In this paper, we propose a new method for measuring the similarity between two short text snippets by comparing each of them with probabilistic topics. Specifically, our method first finds the distinguishing terms between the two short text snippets and compares them with a series of probabilistic topics extracted by the Gibbs sampling algorithm. The relationship between the distinguishing terms of the short text snippets can be discovered by examining their probabilities under each topic. The similarity between the two short text snippets is then calculated based on their common terms and the relationship between their distinguishing terms. Extensive experiments on paraphrasing and question categorization show that the proposed method can calculate the similarity of short text snippets more accurately than other methods, including the pure TF-IDF measure.
Keywords:	Text similarity measures; Information retrieval; Query expansion; Text mining; Question answering
Rights:	© Springer-Verlag London Limited 2009
DOI:	10.1007/s10115-009-0250-y
Published version:	http://dx.doi.org/10.1007/s10115-009-0250-y
Appears in Collections:	Aurora harvest 2; Computer Science publications
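
Illustrative sketch: the comparison outlined in the abstract (shared terms plus the topical relationship between distinguishing terms) might look roughly like the Python below. This is not the authors' formula. The topic table TOPIC_TERM_PROB holds made-up probabilities standing in for topics that the paper extracts with Gibbs sampling, and the helper names, the cosine comparison, the Jaccard-style overlap, and the `weight` parameter are all assumptions chosen only to make the idea concrete.

```python
import math

# Hypothetical toy topics: P(term | topic). The values are invented for
# illustration; the paper obtains its topics with Gibbs sampling.
TOPIC_TERM_PROB = [
    {"car": 0.30, "automobile": 0.25, "engine": 0.20, "road": 0.10},
    {"bank": 0.35, "money": 0.30, "loan": 0.20},
    {"river": 0.40, "water": 0.30, "bank": 0.10},
]


def topic_vector(terms, topics=TOPIC_TERM_PROB):
    """Sum P(term | topic) over the given terms, one entry per topic."""
    return [sum(topic.get(t, 0.0) for t in terms) for topic in topics]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def snippet_similarity(a, b, weight=0.5):
    """Blend term overlap with the topical closeness of distinguishing terms.

    `weight` (an assumed parameter) balances the two parts.
    """
    set_a, set_b = set(a), set(b)
    common = set_a & set_b
    only_a, only_b = set_a - set_b, set_b - set_a  # distinguishing terms
    if not only_a and not only_b:
        return 1.0  # same vocabulary, nothing left to compare
    overlap = len(common) / len(set_a | set_b)
    topical = cosine(topic_vector(only_a), topic_vector(only_b))
    return weight * overlap + (1 - weight) * topical


if __name__ == "__main__":
    print(snippet_similarity(["fast", "car"], ["fast", "automobile"]))
    print(snippet_similarity(["fast", "car"], ["fast", "loan"]))
```

In this toy setup, "car" and "automobile" never co-occur in the two snippets but both have high probability under the same topic, so the first pair scores higher than the second, which shares the same common term but has topically unrelated distinguishing terms.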

Files in This Item:

There are no files associated with this item.

Adelaide Research & Scholarship