Multi-view LDA for semantics-based document representation

From WikiPapers

Multi-view LDA for semantics-based document representation is a 2011 journal article written in English by Yun J., Jing L., Huang H., and Yu J., published in the Journal of Computational Information Systems.

Abstract

Latent Dirichlet Allocation (LDA) can model each document and word as a mixture of topics, but it does not incorporate any external semantic information. In this paper, we represent documents in two feature spaces, consisting of words and Wikipedia categories respectively, and propose a new method called Multi-View LDA (M-LDA) that combines LDA with explicit human-defined concepts from Wikipedia. M-LDA improves the document topic model by taking advantage of both feature spaces and the mapping relationship between them. Experimental results on classification and clustering tasks show that M-LDA outperforms traditional LDA.
