
Clustering short texts using Wikipedia
Abstract Subscribers to the popular news or blog feeds (RSS/Atom) often face the problem of information overload, as these feed sources usually deliver a large number of items periodically. One solution to this problem could be clustering similar items in the feed reader to make the information more manageable for a user. Clustering items at the feed reader end is a challenging task, as usually only a small part of the actual article is received through the feed. In this paper, we propose a method of improving the accuracy of clustering short texts by enriching their representation with additional features from Wikipedia. Empirical results indicate that this enriched representation of text items can substantially improve the clustering accuracy when compared to the conventional bag of words representation.
Bibtextype misc
Citeulike 1510682
Doi 10.1145/1277741.1277909
Has author Somnath Banerjee, Krishnan Ramanathan, Ajay Gupta
Has remote mirror http://www.hpl.hp.com/techreports/2008/HPL-2008-41.pdf
Language English
Number of citations by publication 0
Number of references by publication 0
Pages 787-788
Title Clustering short texts using Wikipedia
Type unknown
Year 2007
Creation date 28 January 2012 21:14:35
Categories Publications without published in parameter, Publications without keywords parameter, Publications without license parameter, Publications without archive mirror parameter, Publications without paywall mirror parameter, Publications without references parameter, Publications
Modification date 6 February 2012 20:49:22
Date 2007