Multi-view LDA for semantics-based document representation
|Author(s)||Yun J., Jing L., Huang H., Yu J.|
|Published in||Journal of Computational Information Systems|
|Keyword(s)||Latent Dirichlet allocation, Semantics, Topic model, Wikipedia category (Extra: Classification and clustering, Document representation, Feature space, Multi-view, Semantic information, Wikipedia, Statistics)|
Multi-view LDA for semantics-based document representation is a 2011 journal article written in English by Yun J., Jing L., Huang H., Yu J. and published in Journal of Computational Information Systems.
Latent Dirichlet Allocation (LDA) models each document as a mixture of topics, but it incorporates no external semantic information. In this paper, we represent documents in two feature spaces, consisting of words and Wikipedia categories respectively, and propose a new method called Multi-View LDA (M-LDA) that combines LDA with the explicit, human-defined concepts in Wikipedia. M-LDA improves the document topic model by exploiting both feature spaces and the mapping relationship between them. Experimental results on classification and clustering tasks show that M-LDA outperforms traditional LDA.
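The paper's own M-LDA inference is not reproduced here, but the core idea of representing each document in two feature spaces (words and Wikipedia categories) can be sketched in a simple baseline: build a bag-of-words view and a bag-of-categories view, concatenate them, and fit a standard LDA model over the joint space. The corpus and category annotations below are invented for illustration; this approximates the multi-view representation, not the paper's joint model.

```python
# Sketch: a two-view (words + Wikipedia-category) document representation
# fed to standard LDA. This is a baseline approximation of the multi-view
# idea, NOT the paper's M-LDA inference algorithm.
import numpy as np
from scipy.sparse import hstack
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus (first view: raw words).
docs_words = [
    "neural network training gradient descent",
    "stock market trading price economy",
    "gradient optimization network learning",
    "market price finance trading",
]
# Hypothetical Wikipedia-category annotations (second view), one string
# of category labels per document.
docs_cats = [
    "Machine_learning Artificial_neural_networks",
    "Finance Stock_markets",
    "Machine_learning Mathematical_optimization",
    "Finance Economics",
]

word_vec = CountVectorizer()
cat_vec = CountVectorizer(token_pattern=r"\S+")  # keep underscored labels whole
X_words = word_vec.fit_transform(docs_words)
X_cats = cat_vec.fit_transform(docs_cats)

# Concatenate the two views into one feature space before fitting LDA.
X = hstack([X_words, X_cats])
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)  # per-document topic mixtures, rows sum to 1

print(theta.shape)  # (4, 2): 4 documents, 2 topics
```

Concatenation treats categories as extra vocabulary terms, so both views share one topic distribution per document; M-LDA goes further by also modeling the mapping relationship between the two spaces.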
- This section requires expansion. Please help!
This publication is probably cited by others, but no citing articles are available in WikiPapers.