Automatic readability classification of crowd-sourced data based on linguistic and information-theoretic features


Automatic readability classification of crowd-sourced data based on linguistic and information-theoretic features is a 2013 journal article written in English by Islam Z. and Mehler A., published in Computación y Sistemas.

Abstract

This paper presents a classifier of text readability based on information-theoretic features. The classifier was developed based on a linguistic approach to readability that explores lexical, syntactic and semantic features. For this evaluation we extracted a corpus of 645 articles from Wikipedia together with their quality judgments. We show that information-theoretic features perform as well as their linguistic counterparts even if we explore several linguistic levels at once.
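As a rough illustration of what an information-theoretic feature can look like, the sketch below computes the Shannon entropy of a text's word distribution. This is an assumption made for illustration only: the abstract does not list the features the authors used, and the function name word_entropy and the whitespace tokenization are hypothetical choices, not taken from the paper.

```python
import math
from collections import Counter


def word_entropy(text: str) -> float:
    """Shannon entropy (in bits) of the empirical word distribution of `text`.

    Hypothetical example feature; tokenization is a plain lowercase
    whitespace split, which a real system would likely replace.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0  # define the entropy of an empty text as zero
    counts = Counter(tokens)
    total = len(tokens)
    # H = -sum over word types of p(w) * log2 p(w)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(word_entropy("the cat sat on the mat"))
```

A text that repeats a single word has entropy 0, while a text of n distinct, equally frequent words has entropy log2(n). A value like this could serve as one input among many to a readability classifier.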

Cited by

Cited 1 time. No articles citing this publication are available in WikiPapers.