Identifying featured articles in Wikipedia: Writing style matters
|Identifying featured articles in Wikipedia: Writing style matters|
|Author(s)||Lipka N., Stein B.|
|Published in||Proceedings of the 19th International Conference on World Wide Web, WWW '10|
|Keyword(s)||domain transfer, information quality, wikipedia (Extra: Classification tasks, Experiment design, F-measure, Information quality assessment, Machine-learning, Meta-features, Transfer information, Wikipedia, Writing style, Information analysis, World Wide Web)|
Identifying featured articles in Wikipedia: Writing style matters is a 2010 conference paper written in English by Lipka N. and Stein B., published in the Proceedings of the 19th International Conference on World Wide Web (WWW '10).
Wikipedia provides an information quality assessment model with criteria that human peer reviewers use to identify featured articles. For the classification task "Is an article featured or not?" we present a machine learning approach that exploits an article's character trigram distribution. Our approach differs from existing research in that it targets writing style rather than evaluating meta features such as the edit history. The approach is robust, straightforward to implement, and outperforms existing solutions. We support these claims with an experimental design in which, among other aspects, domain transferability is analyzed. The achieved F-measure values for featured articles are 0.964 within a single Wikipedia domain and 0.880 in a domain transfer situation.
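The two quantities named in the abstract, a character trigram distribution and the F-measure, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `char_trigram_distribution` and `f_measure` are hypothetical, the normalization to relative frequencies is an assumption, and the F-measure is assumed to be the balanced variant (F1). The classifier used in the paper is not described in this record and is not reproduced here.

```python
from collections import Counter


def char_trigram_distribution(text: str) -> dict[str, float]:
    """Relative frequency of every overlapping character trigram in `text`.

    Assumption: plain relative frequencies; the paper may weight or
    preprocess the text differently.
    """
    trigrams = [text[i:i + 3] for i in range(len(text) - 2)]
    counts = Counter(trigrams)
    total = sum(counts.values())
    if total == 0:
        # Texts shorter than three characters contain no trigram.
        return {}
    return {gram: n / total for gram, n in counts.items()}


def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)
```

A distribution produced this way could serve as the feature vector of an article for any standard classifier, and `f_measure` would then be evaluated on the "featured" class from the classifier's confusion counts.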
Cited by
This publication has 1 citation. Only those publications available in WikiPapers are shown here:
Cited 10 time(s)