Segmentation of review texts by using thesaurus and corpus-based word similarity

From WikiPapers

Segmentation of review texts by using thesaurus and corpus-based word similarity is a 2012 conference paper written in English by Suzuki Y. and Fukumoto F., published in KEOD 2012 - Proceedings of the International Conference on Knowledge Engineering and Ontology Development.

Abstract

User reviews are now widely available on shopping and hotel reservation sites. However, with the exponential growth of information on the Internet, it is becoming increasingly difficult for a user to read and understand all of the potentially interesting material in a large collection of reviews. In this paper, we propose a method for segmenting review texts by guests' criteria, such as service, location and facilities. Our system first extracts words that represent criteria from hotel review texts. We focus on topic markers such as "ha" in Japanese to extract guests' criteria. The extracted words are grouped into classes of similar words by using Japanese WordNet. Then, for each hotel, the text comprising all of its guest reviews is segmented into word sequences by using the criteria classes. Review text segmentation is difficult because the texts are short. We therefore use Japanese WordNet, extracted similar word pairs, and indexes of Wikipedia. We performed text segmentation of hotel reviews. The results showed the effectiveness of our method and indicated that it can be used for review summarization by guests' criteria.
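
The pipeline described in the abstract (extract words before a topic marker, group them into criteria classes, then segment the review by class) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the review is already tokenized, and it uses a small hand-made dictionary, TOY_THESAURUS, as a hypothetical stand-in for the Japanese WordNet lookup. The function names and the rule of keeping an unmarked sentence in the current segment are also assumptions made for illustration.

```python
# Illustrative sketch only; names and data below are hypothetical.

TOPIC_MARKER = "は"  # the Japanese topic marker written as "ha" in the abstract

# Hypothetical stand-in for a thesaurus lookup (the paper uses Japanese WordNet).
# Maps a criterion word to a criteria class.
TOY_THESAURUS = {
    "部屋": "facilities",   # room
    "風呂": "facilities",   # bath
    "駅": "location",       # station
    "スタッフ": "service",  # staff
}


def extract_criterion_words(tokens):
    """Return the tokens that immediately precede a topic marker."""
    return [tokens[i - 1] for i in range(1, len(tokens)) if tokens[i] == TOPIC_MARKER]


def segment_review(sentences):
    """Group consecutive sentences that share a criteria class.

    Returns a list of (class_label, [sentence, ...]) pairs.
    """
    segments = []
    for tokens in sentences:
        labels = [TOY_THESAURUS[w] for w in extract_criterion_words(tokens)
                  if w in TOY_THESAURUS]
        if labels:
            label = labels[0]          # first recognized criterion decides the class
        elif segments:
            label = segments[-1][0]    # assumption: no marker means same topic continues
        else:
            label = "other"
        if segments and segments[-1][0] == label:
            segments[-1][1].append(tokens)
        else:
            segments.append((label, [tokens]))
    return segments


# A toy pre-tokenized review, one token list per sentence.
review = [
    ["部屋", "は", "広い"],
    ["風呂", "は", "きれい"],
    ["駅", "は", "近い"],
    ["とても", "便利"],
    ["スタッフ", "は", "親切"],
]
segments = segment_review(review)
for label, sents in segments:
    print(label, ["".join(s) for s in sents])
```

In a real setting the tokens would come from a Japanese morphological analyzer and the class lookup from a thesaurus or corpus-based similarity measure. The toy dictionary only shows where such a resource would plug in.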
