
Building a text classifier by a keyword and Wikipedia knowledge
Abstract Traditional approaches for building text classifiers usually require a lot of labeled documents, which are expensive to obtain. In this paper, we propose a new text classification approach based on a keyword and Wikipedia knowledge, so as to avoid labeling documents manually. First, we retrieve a set of documents related to the keyword from Wikipedia. Then, with the help of the related Wikipedia pages, more positive documents are extracted from the unlabeled documents. Finally, we train a text classifier with these positive documents and the unlabeled documents. The experimental results on the 20Newsgroup dataset show that the proposed approach performs very competitively compared with NB-SVM, a PU learner, and NB, a supervised learner.
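The three steps named in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the seed documents stand in for Wikipedia pages retrieved for a keyword, the cosine-similarity threshold is an assumed heuristic for extracting positive documents, and a small naive Bayes model that treats the remaining unlabeled documents as negatives stands in for the paper's classifier. All function names, the stopword list, and the toy data are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "on", "in", "with", "and", "his", "of"}

def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic non-stopword tokens."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def extract_positives(seed_docs, unlabeled_docs, threshold=0.2):
    """Step 2 (assumed heuristic): keep unlabeled docs close to the seed centroid."""
    centroid = Counter()
    for doc in seed_docs:
        centroid.update(tokenize(doc))
    positives, rest = [], []
    for doc in unlabeled_docs:
        sim = cosine(centroid, Counter(tokenize(doc)))
        (positives if sim >= threshold else rest).append(doc)
    return positives, rest

def train_nb(pos_docs, neg_docs):
    """Step 3: multinomial naive Bayes with add-one smoothing."""
    counts = {1: Counter(), 0: Counter()}
    for doc in pos_docs:
        counts[1].update(tokenize(doc))
    for doc in neg_docs:
        counts[0].update(tokenize(doc))
    vocab = set(counts[1]) | set(counts[0])
    n_docs = {1: len(pos_docs), 0: len(neg_docs)}
    total_docs = len(pos_docs) + len(neg_docs)

    def predict(text):
        scores = {}
        for label in (0, 1):
            total = sum(counts[label].values())
            score = math.log((n_docs[label] + 1) / (total_docs + 2))
            for tok in tokenize(text):
                if tok in vocab:  # ignore words never seen in training
                    score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
            scores[label] = score
        return 1 if scores[1] > scores[0] else 0

    return predict

# Step 1 stand-in: pretend these were retrieved from Wikipedia for the keyword "hockey".
seed_docs = [
    "hockey is a team sport played on ice with sticks and a puck",
    "the goalie stops the puck in an ice hockey game",
]
unlabeled_docs = [
    "the team won the hockey game on home ice",
    "the goalie blocked the puck with his stick",
    "the new graphics card renders images quickly",
    "install the driver before using the graphics card",
]

positives, rest = extract_positives(seed_docs, unlabeled_docs)
predict = train_nb(seed_docs + positives, rest)
```

Calling `predict("ice hockey puck")` then classifies a new text against the keyword's topic. Treating the leftover unlabeled documents as negatives is the simplest possible choice here; a PU learner would handle the unlabeled set more carefully.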
Bibtextype inproceedings  +
Doi 10.1007/978-3-642-03348-3_28  +
Has author Qiang Qiu + , YanChun Zhang + , Junping Zhu + , Qu W. +
Has extra keyword Dataset + , Keyword + , Labeled documents + , Positive documents + , Text classification + , Text classifiers + , Unlabeled document + , Unlabeled documents + , Wikipedia + , Classifiers + , Learning systems + , Text processing + , Information retrieval systems +
Has keyword Keyword + , Text classification + , Unlabeled document + , Wikipedia +
Isbn 3642033474; 9783642033476  +
Language English +
Number of citations by publication 0  +
Number of references by publication 0  +
Pages 277–287  +
Published in Lecture Notes in Computer Science +
Title Building a text classifier by a keyword and Wikipedia knowledge +
Type conference paper  +
Volume 5678 LNAI  +
Year 2009 +
Creation date 7 November 2014 09:18:30  +
Categories Publications without license parameter  + , Publications without remote mirror parameter  + , Publications without archive mirror parameter  + , Publications without paywall mirror parameter  + , Conference papers  + , Publications without references parameter  + , Publications  +
Modification date 7 November 2014 09:18:30  +
Date 2009  +