A novel weighting scheme for efficient document indexing and classification

A novel weighting scheme for efficient document indexing and classification is a 2010 conference paper written in English by Tahayna B., Ayyasamy R.K., Alhashmi S., Eu-Gene S. and published in Proceedings 2010 International Symposium on Information Technology - Engineering Technology, ITSim'10.

Abstract

In this paper we propose and illustrate the effectiveness of a new topic-based document classification method. The proposed method uses Wikipedia, a large-scale Web encyclopaedia with a huge number of high-quality articles and a category system. Wikipedia is applied through an n-gram technique to transform the document from a "bag of words" into a "bag of concepts". Based on this transformation, a novel concept-based weighting scheme (denoted Conf.idf) is proposed to index the text in the manner of the traditional tf.idf indexing scheme. Moreover, a genetic algorithm-based support vector machine optimization method is used for feature subset and instance selection. Experimental results show that the proposed weighting scheme outperforms the traditional indexing and weighting scheme.
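The abstract does not give the Conf.idf formula, so the following is only a rough sketch of the general idea: n-grams are matched against a concept dictionary to build a "bag of concepts", and a classic tf.idf weight is then computed over concepts instead of words. The `CONCEPTS` table is a tiny hypothetical stand-in for Wikipedia article titles, and the function names `to_concepts` and `concept_tfidf` are invented for illustration; none of this should be read as the authors' actual method.

```python
import math
from collections import Counter

# Hypothetical stand-in for Wikipedia titles mapped to concept labels.
# A real system would look n-grams up in Wikipedia itself.
CONCEPTS = {
    ("support", "vector", "machine"): "Support vector machine",
    ("svm",): "Support vector machine",
    ("genetic", "algorithm"): "Genetic algorithm",
    ("classification",): "Statistical classification",
}
MAX_N = 3  # longest n-gram tried at each position


def to_concepts(text):
    """Turn text into a list of concepts by greedy longest n-gram match."""
    tokens = text.lower().split()
    concepts, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_N, len(tokens) - i), 0, -1):
            gram = tuple(tokens[i:i + n])
            if gram in CONCEPTS:
                concepts.append(CONCEPTS[gram])
                i += n
                break
        else:
            i += 1  # no concept starts at this token; skip it
    return concepts


def concept_tfidf(docs):
    """Plain tf.idf over concepts (an assumption, not the paper's Conf.idf)."""
    bags = [Counter(to_concepts(d)) for d in docs]
    n_docs = len(bags)
    # Document frequency: number of documents containing each concept.
    df = Counter(c for bag in bags for c in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append(
            {c: (k / total) * math.log(n_docs / df[c]) for c, k in bag.items()}
        )
    return weights


docs = [
    "svm classification with a genetic algorithm",
    "a support vector machine for classification",
    "genetic algorithm search",
]
weights = concept_tfidf(docs)
```

Note that "svm" and "support vector machine" collapse onto the same concept, which is the kind of effect a bag-of-concepts representation is meant to capture and a bag-of-words representation misses.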

Cited by

This publication is probably cited by others, but no articles for the citing works are available in WikiPapers. Cited 6 times.