A corpus-based study of edit categories in featured and non-featured Wikipedia articles
|Author(s)||Daxenberger J., Gurevych I.|
|Published in||24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers|
|Keyword(s)||Collaborative writing, Quality assessment, Revision history, Wikipedia (Extra: Classification scheme, Information contents, Multi-label annotation, Spelling errors, Wikipedia articles, Computational linguistics, Websites)|
A corpus-based study of edit categories in featured and non-featured Wikipedia articles is a 2012 conference paper written in English by Daxenberger J. and Gurevych I. It was published in 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers.
In this paper, we present a study of the collaborative writing process in Wikipedia. Our work is based on a corpus of 1,995 edits obtained from 891 article revisions in the English Wikipedia. We propose a 21-category classification scheme for edits based on Faigley and Witte's (1981) model. Example edit categories include spelling error corrections and vandalism. In a manual multi-label annotation study with 3 annotators, we obtain an inter-annotator agreement of α = 0.67. We further analyze the distribution of edit categories for distinct stages in the revision history of 10 featured and 10 non-featured articles. Our results show that the information content in featured articles tends to become more stable after their promotion. In contrast, this is not true for non-featured articles. We make the resulting corpus and the annotation guidelines freely available.
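The abstract reports agreement as α = 0.67 without naming the coefficient. The sketch below assumes it refers to Krippendorff's alpha and shows how such a value can be computed for nominal labels. It is a simplification, not the authors' procedure: each annotator assigns exactly one label per edit, whereas the study is multi-label and would need a distance defined over label sets. The function name and the sample labels are hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of lists; each inner list holds the labels that the
    annotators assigned to one item (assumption: one label per annotator).
    """
    # Items rated by fewer than two annotators carry no agreement information.
    units = [u for u in units if len(u) >= 2]

    # Coincidence matrix: every ordered pair of ratings within an item
    # contributes 1 / (number of ratings for that item - 1).
    coincidences = Counter()
    for u in units:
        weight = 1.0 / (len(u) - 1)
        for a, b in permutations(u, 2):
            coincidences[(a, b)] += weight

    # Marginal totals per label and the total number of pairable ratings.
    totals = Counter()
    for (a, _), w in coincidences.items():
        totals[a] += w
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item with two or more ratings")

    # Observed and expected disagreement (nominal distance: 0 if equal, else 1).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one label was ever used, so no disagreement is possible
    return 1.0 - observed / expected


# Hypothetical ratings: four edits, two annotators each.
ratings = [
    ["spelling", "spelling"],
    ["spelling", "vandalism"],
    ["vandalism", "vandalism"],
    ["vandalism", "vandalism"],
]
print(round(krippendorff_alpha_nominal(ratings), 3))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so the reported 0.67 sits between the two.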
This publication has been cited 3 times, but none of the citing articles are available in WikiPapers.