Kazuhiro Morita

From WikiPapers

Kazuhiro Morita is an author.

Publications

Only those publications related to wikis are shown here.
Title Keyword(s) Published in Language Date Abstract R C
A new approach for Arabic text classification using Arabic field-association terms Journal of the American Society for Information Science and Technology English 2011 Field-association (FA) terms give us the knowledge to identify document fields using a limited set of discriminating terms. Although many earlier methods tried to automatically extract relevant FA terms to build a comprehensive dictionary, an effective method for doing so is still lacking. Moreover, all previous studies are based on FA terms in English and Japanese, and the extension of FA terms to other languages such as Arabic could benefit future research in the field. We present a new method to build a comprehensive Arabic dictionary using part-of-speech, pattern rules, and corpora in the Arabic language. Experimental evaluation is carried out for various fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhayah news, from which an average of 2,825 FA terms (single and compound) were selected per field. From the experimental results, recall and precision are 84% and 79%, respectively. We propose an amended text classification methodology based on field association terms. Our approach is compared with Naïve Bayes (NB) and kNN classifiers on 5,959 documents from Wikipedia dumps and Alhayah news. The new approach achieved a precision of 80.65%, followed by NB (72.79%) and kNN (36.15%). 0 0
Extraction, selection and ranking of Field Association (FA) Terms from domain-specific corpora for building a comprehensive FA terms dictionary Knowledge and Information Systems 2010 0 0
A method of building Chinese field association knowledge from Wikipedia Chinese documents Feature fields, Field association terms, Field recognition, Wikipedia 2009 International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2009 English 2009 Field Association (FA) terms form a limited set of discriminating terms that give us the knowledge to identify document fields. The primary goal of this research is to make a system that can imitate the process whereby humans recognize the fields by looking at a few Chinese FA terms in a document. This paper proposes a new approach to build a Chinese FA terms dictionary automatically from Wikipedia. 104,532 FA terms are added to the dictionary. The FA terms in the resulting dictionary are applied to recognize the fields of 5,841 documents. The average accuracy in the experiment is 92.04%. The results show that the presented method is effective in building FA terms from Wikipedia automatically. 0 0