Japanese/English blog distillation and cross-lingual blog analysis with multilingual Wikipedia entries as fundamental knowledge source

Japanese/English blog distillation and cross-lingual blog analysis with multilingual Wikipedia entries as fundamental knowledge source is a 2010 journal article written in Japanese by Nakasaki H., Kawaba M., Yokomoto D., Utsuro T., and Fukuhara T., and published in Transactions of the Japanese Society for Artificial Intelligence.

Abstract

The overall goal of this paper is to cross-lingually analyze multilingual blogs collected with a topic keyword. The framework of collecting multilingual blogs with a topic keyword is designed as the blog feed retrieval procedure. In this paper, we take an approach of collecting blog feeds rather than blog posts, mainly because we regard the former as a larger information unit in the blogosphere and prefer it as the information source for cross-lingual blog analysis. In the blog feed retrieval procedure, we also regard Wikipedia as a large scale ontological knowledge base for conceptually indexing the blogosphere. The underlying motivation of employing Wikipedia is in linking a knowledge base of well known facts and relatively neutral opinions with rather raw, user generated media like blogs, which include less well known facts and much more radical opinions. In our framework, first, in order to collect candidates of blog feeds for a given query, we use existing Web search engine APIs, which return a ranked list of blog posts, given a topic keyword. Next, we re-rank the list of blog feeds according to the number of hits of the topic keyword as well as closely related terms extracted from the Wikipedia entry in each blog feed. We compare the proposed blog feed retrieval method to existing Web search engine APIs and achieve significant improvement. We then apply the proposed blog distillation framework to the task of cross-lingually analyzing multilingual blogs collected with a topic keyword. Here, we cross-lingually and cross-culturally compare less well known facts and opinions that are closely related to a given topic. Results of cross-lingual blog analysis support the effectiveness of the proposed framework.
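
The re-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact scoring formula, so this example assumes a plain unweighted sum of case-insensitive substring hits for the topic keyword and the Wikipedia-derived related terms. The names `Feed`, `count_hits`, `score_feed`, and `rerank_feeds` are hypothetical, and the related terms are assumed to have already been extracted from the Wikipedia entry.

```python
from collections import namedtuple

# Hypothetical container: a blog feed is a URL plus the texts of its posts.
Feed = namedtuple("Feed", ["url", "posts"])


def count_hits(text, term):
    """Count case-insensitive, non-overlapping substring occurrences of term in text."""
    return text.lower().count(term.lower())


def score_feed(feed, topic, related_terms):
    """Sum the hits of the topic keyword and each related term over all posts.

    Assumption: every term has equal weight; the paper may weight terms differently.
    """
    # dict.fromkeys removes duplicate terms while preserving their order.
    terms = list(dict.fromkeys([topic, *related_terms]))
    return sum(count_hits(post, term) for post in feed.posts for term in terms)


def rerank_feeds(feeds, topic, related_terms):
    """Return the candidate feeds sorted by descending hit count."""
    return sorted(
        feeds,
        key=lambda feed: score_feed(feed, topic, related_terms),
        reverse=True,
    )


# Candidate feeds as they might come back from a search engine API (made-up data).
candidates = [
    Feed("feed-a", ["Some news about whaling."]),
    Feed("feed-b", ["Whaling and Greenpeace.", "The IWC debates whaling."]),
]
ranked = rerank_feeds(candidates, "whaling", ["Greenpeace", "IWC"])
print([feed.url for feed in ranked])
```

Substring counting is used here instead of whitespace tokenization because Japanese text is not space-delimited; a real system would more likely rely on a morphological analyzer.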
