Cross-lingual document similarity

Cross-lingual document similarity is a 2012 conference paper written in English by Muhic A., Rupnik J., Skraba P. and published in Proceedings of the International Conference on Information Technology Interfaces, ITI.

Abstract

In this paper we investigated how to compute similarities between documents written in different languages based on a weakly aligned multi-lingual collection of documents. Computing the cross-lingual similarities is based on an aligned set of basis vectors obtained by either latent semantic indexing or the k-means algorithm on an aligned multi-lingual corpus. We evaluated the methods on two data sets: Wikipedia and the European Parliament Proceedings Parallel Corpus.
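
The idea of an aligned set of basis vectors can be illustrated with a minimal sketch of the latent semantic indexing variant. This is not the authors' implementation: it assumes two languages, term-document matrices whose columns are aligned document pairs, a truncated SVD of the stacked matrix, and a least-squares projection into the shared space. The function names (aligned_lsi_bases, project, cross_lingual_similarity) and the toy vocabulary are hypothetical, and the k-means variant is not shown.

```python
import numpy as np


def aligned_lsi_bases(X_a, X_b, k):
    """Return one basis per language from a truncated SVD of the stacked corpus.

    X_a: (terms_a, n_docs) matrix for language A.
    X_b: (terms_b, n_docs) matrix for language B.
    Column j of both matrices describes the same aligned document pair.
    """
    stacked = np.vstack([X_a, X_b])
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    U_k = U[:, :k]
    # Splitting the rows gives two bases whose columns correspond to each other.
    return U_k[: X_a.shape[0]], U_k[X_a.shape[0]:]


def project(basis, doc_vec):
    """Least-squares coordinates of a document in the shared k-dimensional space."""
    coords, *_ = np.linalg.lstsq(basis, doc_vec, rcond=None)
    return coords


def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def cross_lingual_similarity(basis_a, basis_b, doc_a, doc_b):
    """Cosine similarity of two documents after mapping each through its own basis."""
    return cosine(project(basis_a, doc_a), project(basis_b, doc_b))


# Toy aligned corpus (assumed data): documents 0-1 are about animals, 2-3 about finance.
# Language A terms: dog, cat, stock, bank. Language B terms: Hund, Katze, Aktie, Bank.
X_a = np.array([[2, 1, 0, 0],
                [1, 2, 0, 0],
                [0, 0, 2, 1],
                [0, 0, 1, 2]], dtype=float)
X_b = X_a.copy()

basis_a, basis_b = aligned_lsi_bases(X_a, X_b, k=2)

dog_doc = np.array([1.0, 0.0, 0.0, 0.0])    # language A, mentions only "dog"
katze_doc = np.array([0.0, 1.0, 0.0, 0.0])  # language B, mentions only "Katze"
aktie_doc = np.array([0.0, 0.0, 1.0, 0.0])  # language B, mentions only "Aktie"

sim_same_topic = cross_lingual_similarity(basis_a, basis_b, dog_doc, katze_doc)
sim_other_topic = cross_lingual_similarity(basis_a, basis_b, dog_doc, aktie_doc)
```

In this toy setting the "dog" and "Katze" documents share no terms, yet they land close together in the shared space because their terms co-occur in aligned training documents, while the "Aktie" document does not.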
