Improving text classification by using encyclopedia knowledge
Abstract The exponential growth of text documents available on the Internet has created an urgent need for accurate, fast, and general-purpose text classification algorithms. However, the "bag of words" representation used for these classification methods is often unsatisfactory as it ignores relationships between important terms that do not co-occur literally. In order to deal with this problem, we integrate background knowledge - in our application: Wikipedia - into the process of classifying text documents. The experimental evaluation on Reuters newsfeeds and several other corpora shows that our classification results with encyclopedia knowledge are much better than the baseline "bag of words" methods.
Bibtextype inproceedings
Doi 10.1109/ICDM.2007.77
Has author Pu Wang, Jian Hu, Zeng H.-J., Long Chen, Zheng Chen
Has extra keyword Administrative data processing, Classification (of information), Data mining, Decision support systems, Information management, Information retrieval systems, Mining, Search engine, Background knowledge, Bag of words, Classification methods, Classification results, Experimental evaluations, Exponential growth, International conferences, Newsfeeds, Reuters, Text classification, Text classification algorithms, Text documents, Wikipedia, Text processing
Isbn 0769530184; 9780769530185
Language English
Number of citations by publication 0
Number of references by publication 0
Pages 332–341
Published in Proceedings - IEEE International Conference on Data Mining, ICDM
Title Improving text classification by using encyclopedia knowledge
Type conference paper
Year 2007
Creation date 7 November 2014 19:35:01
Categories Publications without keywords parameter, Publications without license parameter, Publications without remote mirror parameter, Publications without archive mirror parameter, Publications without paywall mirror parameter, Conference papers, Publications without references parameter, Publications
Modification date 7 November 2014 19:35:01
Date 2007