Topic based semantic clustering using Wikipedia knowledge


Topic based semantic clustering using Wikipedia knowledge is a 2012 conference paper written in English by Ramudu B. and Murty M.N., published in Proceedings - 2012 International Conference on Data Science and Engineering, ICDSE 2012.

Abstract

Most existing academic social networks, such as DBLP and ArnetMiner, suggest co-authors as being related to each author based on author and co-author details extracted from academic researchers' publications. Instead of this approach, we propose grouping researchers on the basis of the topics they have in common, which would make more semantic sense for a user searching for publications in an area. Identifying relevant complete topic phrases and discovering equivalent topics (synonyms) using terms in document collections are fundamental problems in information retrieval, natural language processing, pattern recognition, etc. In this paper, we propose a novel approach wherein we identify a document's topic using Wikipedia. Wikipedia is the largest freely available crowd-sourced knowledge base, with up-to-date content covering around 20 million topics. The document collection we consider is the collection of academic researchers' publications. The documents are mapped to topics extracted from the documents using Wikipedia to generate a document-topic representation. A clustering algorithm is then applied to the document-topic representation to group semantically related researchers and generate a topic-based social network of academic researchers. We use Adaptive Rough Fuzzy Leader (ARFL), a soft clustering algorithm, since each researcher can have expertise in more than one area and can therefore belong to more than one group. We present an empirical evaluation of our proposed scheme. We also demonstrate how our proposed solution scales to various domain areas and can be used to design topic-based retrieval systems.
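The pipeline described in the abstract (a document-topic representation followed by soft clustering) can be illustrated with a minimal sketch. This is not the ARFL algorithm from the paper, whose details are not given here. It is a simplified single-pass leader-style clustering in which a researcher may join more than one cluster. The researcher names, topic weights, the `threshold` value, and the function names `cosine` and `soft_leader_clusters` are all hypothetical, and the mapping from documents to Wikipedia topics is assumed to have been done already.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse topic-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def soft_leader_clusters(doc_topics, threshold=0.5):
    """Single pass over researchers (assumed simplification, not ARFL).

    A researcher joins every existing leader whose topic vector is at
    least `threshold` similar, so membership can overlap. A researcher
    that matches no leader becomes a new leader.
    """
    clusters = {}  # leader name -> list of member names
    for name, vec in doc_topics.items():
        joined = False
        for leader in list(clusters):
            if cosine(vec, doc_topics[leader]) >= threshold:
                clusters[leader].append(name)
                joined = True
        if not joined:
            clusters[name] = [name]
    return clusters


# Hypothetical document-topic representation: researcher -> topic weights.
doc_topics = {
    "alice": {"Machine learning": 1.0},
    "bob": {"Information retrieval": 1.0},
    "carol": {"Machine learning": 1.0, "Information retrieval": 1.0},
    "dave": {"Machine learning": 2.0},
}

clusters = soft_leader_clusters(doc_topics, threshold=0.5)
for leader, members in clusters.items():
    print(leader, members)
```

In this toy input, "carol" works on both topics and so ends up in both groups, which is the overlapping behaviour that motivates a soft clustering algorithm in the abstract.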
