| Vandalism detection|
|Related keyword(s)||vandalism, spam, vandal fighting|
Vandalism detection is included as a keyword or extra keyword in 0 datasets, 0 tools and 6 publications.
There are no datasets for this keyword.
There are no tools for this keyword.
|Title||Author(s)||Published in||Language||Date||Abstract||R||C|
|Multilingual Vandalism Detection using Language-Independent & Ex Post Facto Evidence||Andrew G. West
|PAN-CLEF||English||September 2011||There is much literature on Wikipedia vandalism detection. However, this writing addresses two facets given little treatment to date. First, prior efforts emphasize zero-delay detection, classifying edits the moment they are made. If classification can be delayed (e.g., compiling offline distributions), it is possible to leverage ex post facto evidence. This work describes/evaluates several features of this type, which we find to be overwhelmingly strong vandalism indicators.
Second, English Wikipedia has been the primary test-bed for research. Yet, Wikipedia has 200+ language editions and use of localized features impairs portability. This work implements an extensive set of language-independent indicators and evaluates them using three corpora (German, English, Spanish). The work then extends to include language-specific signals. Quantifying their performance benefit, we find that such features can moderately increase classifier accuracy, but significant effort and language fluency are required to capture this utility. Aside from these novel aspects, this effort also broadly addresses the task, implementing 65 total features. Evaluation produces 0.840 PR-AUC on the zero-delay task and 0.906 PR-AUC with ex post facto evidence (averaging languages). Performance matches the state-of-the-art (English), sets novel baselines (German, Spanish), and is validated by a first-place finish over the 2011 PAN-CLEF test set.
|Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features||B. Thomas Adler
Luca de Alfaro
Santiago M. Mola Velasco
Andrew G. West
|Lecture notes in computer science||English||February 2011||Wikipedia is an online encyclopedia which anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith; introducing spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatio-temporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The performance of the resulting joint system improves the state-of-the-art from all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of the three approaches, both for the task of discovering fresh vandalism, and for the task of locating vandalism in the complete set of Wikipedia revisions.||0||1|
|Vandalism detection in Wikipedia: a high-performing, feature-rich model and its reduction through Lasso||Sara Javanmardi
David W. McDonald
Cristina V. Lopes
|Crowdsourcing a Wikipedia Vandalism Corpus||Martin Potthast||SIGIR||English||2010||We report on the construction of the PAN Wikipedia vandalism corpus, PAN-WVC-10, using Amazon’s Mechanical Turk. The corpus compiles 32 452 edits on 28 468 Wikipedia articles, among which 2 391 vandalism edits have been identified. 753 human annotators cast a total of 193 022 votes on the edits, so that each edit was reviewed by at least 3 annotators, whereas the achieved level of agreement was analyzed in order to label an edit as “regular” or “vandalism.” The corpus is available free of charge.||6||1|
|Elusive vandalism detection in wikipedia: a text stability-based approach||Qinyi Wu
|Detector y corrector automático de ediciones maliciosas en Wikipedia||Emilio J. Rodríguez-Posada||Spanish||2009||The project develops AVBOT (an acronym for Anti-Vandalism BOT), a program that automatically detects and corrects malicious edits on the Spanish Wikipedia. It is written in Python and uses the pywikipediabot and python-irclib libraries.||0||0|