Cross-lingual document similarity
|Author(s)||Muhic A., Rupnik J., Skraba P.|
|Published in||Proceedings of the International Conference on Information Technology Interfaces, ITI|
|Keyword(s)||cross-lingual, information retrieval, k-means, LSI, similarity, Wikipedia, linguistics, websites, information technology|
Cross-lingual document similarity is a 2012 conference paper written in English by Muhic A., Rupnik J., and Skraba P., published in the Proceedings of the International Conference on Information Technology Interfaces, ITI.
In this paper we investigated how to compute similarities between documents written in different languages, based on a weakly aligned multi-lingual collection of documents. The cross-lingual similarities are computed from an aligned set of basis vectors, obtained by applying either latent semantic indexing or the k-means algorithm to an aligned multi-lingual corpus. We evaluated the methods on two data sets: Wikipedia and the European Parliament Proceedings Parallel Corpus.
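The idea of an aligned set of basis vectors can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes two term-document matrices whose columns describe the same aligned document pairs, stacks them, takes a truncated SVD as the LSI step, and splits the resulting basis into one block per language. The function names, the least-squares projection step, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def aligned_lsi_bases(X_a, X_b, k):
    """Return one k-column basis per language from an aligned corpus.

    X_a has shape (terms_a, n_docs) and X_b has shape (terms_b, n_docs).
    Column j of both matrices is assumed to describe the same aligned pair.
    """
    stacked = np.vstack([X_a, X_b])
    # Truncated SVD of the stacked matrix (the LSI step in this sketch).
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    U_k = U[:, :k]
    n_terms_a = X_a.shape[0]
    # Rows belonging to each language form that language's basis block.
    return U_k[:n_terms_a], U_k[n_terms_a:]


def project(basis, doc):
    """Least-squares coordinates of a document in a language basis.

    Using least squares here is an assumption of this sketch.
    """
    coords, *_ = np.linalg.lstsq(basis, doc, rcond=None)
    return coords


def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def cross_lingual_similarity(basis_a, basis_b, doc_a, doc_b):
    """Cosine similarity of two documents in the shared latent space."""
    return cosine(project(basis_a, doc_a), project(basis_b, doc_b))


# Toy aligned corpus (hypothetical): 4 terms per language, 4 document pairs.
# Pairs 0-1 use the first two terms of each language, pairs 2-3 the last two.
X_a = np.array([[1, 2, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 1, 2]], dtype=float)
X_b = np.array([[1, 1, 0, 0],
                [2, 1, 0, 0],
                [0, 0, 2, 1],
                [0, 0, 1, 1]], dtype=float)

basis_a, basis_b = aligned_lsi_bases(X_a, X_b, k=2)

doc_a = np.array([1.0, 1.0, 0.0, 0.0])        # language A, first topic
doc_b_same = np.array([1.0, 1.0, 0.0, 0.0])   # language B, first topic
doc_b_other = np.array([0.0, 0.0, 1.0, 1.0])  # language B, second topic

sim_same = cross_lingual_similarity(basis_a, basis_b, doc_a, doc_b_same)
sim_other = cross_lingual_similarity(basis_a, basis_b, doc_a, doc_b_other)
```

In this toy setting `sim_same` should be clearly larger than `sim_other`, because the two language-specific bases share one coordinate system. Centroids from k-means on the stacked matrix could be split into language blocks in the same way, although the sketch does not show that variant.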