A novel weighting scheme for efficient document indexing and classification
Abstract In this paper we propose and illustrate the effectiveness of a new topic-based document classification method. The proposed method utilizes Wikipedia, a large-scale Web encyclopaedia that has high-quality, huge-scale articles and a category system. Wikipedia is used with an N-gram technique to transform the document from a "bag of words" into a "bag of concepts". Based on this transformation, a novel concept-based weighting scheme (denoted as Conf.idf) is proposed to index the text with the flavor of the traditional tf.idf indexing scheme. Moreover, a genetic algorithm-based support vector machine optimization method is used for the purpose of feature subset and instance selection. Experimental results showed that the proposed weighting scheme outperforms the traditional indexing and weighting scheme.
Bibtextype inproceedings
Doi 10.1109/ITSIM.2010.5561553
Has author Tahayna B., Ayyasamy R.K., Alhashmi S., Eu-Gene S.
Has extra keyword Bag of words, Category systems, Document Classification, Document indexing, Feature subset, High quality, Indexing scheme, Instance selection, Novel concept, Optimization method, Support vector, Term weighting scheme, Weighting scheme, Wikipedia, Feature extraction, Gears, Genetic algorithms, Indexing (of information), Information retrieval systems, Support vector machines, Weighing
Has keyword Feature subset selection, Genetic algorithms, Support vector machines, Term weighting scheme, Wikipedia
Isbn 9781424467181
Language English
Number of citations by publication 0
Number of references by publication 0
Pages 783–788
Published in Proceedings 2010 International Symposium on Information Technology - Engineering Technology, ITSim'10
Title A novel weighting scheme for efficient document indexing and classification
Type conference paper
Volume 2
Year 2010
Creation date 6 November 2014 19:29:23
Categories Publications without license parameter, Publications without remote mirror parameter, Publications without archive mirror parameter, Publications without paywall mirror parameter, Conference papers, Publications without references parameter, Publications
Modification date 6 November 2014 19:29:23
Date 2010