| Xiaochuan Ni|
(Alternative names for this author)
|Co-authors||Jian Hu, Jian T. Sun, Sun J.-T., Zheng Chen|
|Authorship||Publications (2), datasets (0), tools (0)|
|Citations||Total (0), average (0), median (0), max (0), min (0)|
|DBLP · Google Scholar|
|Export and share|
|BibTeX, CSV, RDF, JSON|
|Browse properties · List of authors|
Xiaochuan Ni is an author.
PublicationsOnly those publications related to wikis are shown here.
|Title||Keyword(s)||Published in||Language||DateThis property is a special property in this wiki.||Abstract||R||C|
|Cross lingual text classification by mining multilingual topics from Wikipedia||Cross lingual text classification
|Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011||English||2011||This paper investigates how to effectively do cross lingual text classification by leveraging a large scale and multilingual knowledge base, Wikipedia. Based on the observation that each Wikipedia concept is described by documents of different languages, we adapt existing topic modeling algorithms for mining multilingual topics from this knowledge base. The extracted topics have multiple types of representations, with each type corresponding to one language. In this work, we regard such topics extracted from Wikipedia documents as universal-topics, since each topic corresponds with same semantic information of different languages. Thus new documents of different languages can be represented in a space using a group of universal-topics. We use these universal-topics to do cross lingual text classification. Given the training data labeled for one language, we can train a text classifier to classify the documents of another language by mapping all documents of both languages into the universal-topic space. This approach does not require any additional linguistic resources, like bilingual dictionaries, machine translation tools, or labeling data for the target language. The evaluation results indicate that our topic modeling approach is effective for building cross lingual text classifier. Copyright 2011 ACM.||0||0|
|Mining multilingual topics from Wikipedia||English||2009||In this paper, we try to leverage a large-scale and multilingual knowledge base, Wikipedia, to help effectively analyze and organize Web information written in different languages. Based on the observation that one Wikipedia concept may be described by articles in different languages, we adapt existing topic modeling algorithm for mining multilingual topics from this knowledge base. The extracted 'universal' topics have multiple types of representations, with each type corresponding to one language. Accordingly, new documents of different languages can be represented in a space using a group of universal topics, which makes various multilingual Web applications feasible.||0||0|