Identifying word translations from comparable corpora using latent topic models
|Author(s)||Vulic I., De Smet W., Moens M.-F.|
|Published in||ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies|
|Keyword(s)||Unknown (Extra: Comparable corpora, Latent Dirichlet allocation, Latent topic model, Linguistic resources, Multinomial distributions, Similarity measure, Topic model, Wikipedia, Word translation, Computational linguistics, Software agents, Statistics, Translation (languages))|
Identifying word translations from comparable corpora using latent topic models is a 2011 conference paper written in English by Vulic I., De Smet W., Moens M.-F. and published in ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.
A topic model outputs a set of multinomial distributions over words for each topic. In this paper, we investigate the value of bilingual topic models, i.e., a bilingual Latent Dirichlet Allocation model, for finding translations of terms in comparable corpora without using any linguistic resources. Experiments on a document-aligned English-Italian Wikipedia corpus confirm that the developed methods, which only use knowledge from word-topic distributions, outperform methods based on similarity measures in the original word-document space. The best results, obtained by combining knowledge from word-topic distributions with similarity measures in the original space, are also reported.
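The idea of finding translations from word-topic distributions can be illustrated with a minimal sketch. It assumes that a bilingual topic model has already been trained and that both languages share the same set of topics; the toy probabilities, the uniform topic prior, the choice of KL divergence as the comparison measure, and the function names (`topic_profile`, `kl_divergence`, `rank_translations`) are all assumptions made for illustration, not necessarily the measures or setup used in the paper.

```python
import math

# Toy values of P(word | topic) for 3 topics shared by both languages.
# All numbers are invented; the rest of each vocabulary is omitted,
# so the columns do not sum to 1.
phi_en = {
    "football": [0.60, 0.05, 0.05],
    "election": [0.05, 0.70, 0.05],
    "music":    [0.05, 0.05, 0.80],
}
phi_it = {
    "calcio":   [0.55, 0.05, 0.10],
    "elezione": [0.05, 0.65, 0.05],
    "musica":   [0.10, 0.05, 0.75],
}

def topic_profile(word_given_topic):
    """Normalise P(word | topic) over topics.

    Under an assumed uniform topic prior this equals P(topic | word).
    """
    total = sum(word_given_topic)
    return [p / total for p in word_given_topic]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) with a small epsilon to avoid log(0) and division by zero."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def rank_translations(source_word, phi_src, phi_tgt):
    """Rank target-language words by how close their topic profile is to the source word's."""
    src = topic_profile(phi_src[source_word])
    scored = [(kl_divergence(src, topic_profile(dist)), word)
              for word, dist in phi_tgt.items()]
    return [word for _, word in sorted(scored)]

if __name__ == "__main__":
    for en_word in phi_en:
        print(en_word, "->", rank_translations(en_word, phi_en, phi_it))
```

Because the comparison happens entirely in the shared topic space, no bilingual dictionary or other linguistic resource is needed; the only cross-lingual signal is the document alignment used to train the topic model.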
This publication has been cited 9 times, but WikiPapers has no articles for the citing publications.