Santiago M. Mola Velasco


Santiago M. Mola Velasco is an author from Spain.

Publications

Only those publications related to wikis are shown here.
Title: Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features
Keyword(s): Wikipedia, Wiki, Collaboration, Vandalism, Machine learning, Metadata, Natural Language Processing, Reputation
Published in: Lecture Notes in Computer Science
Language: English
Date: February 2011
Abstract: Wikipedia is an online encyclopedia which anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, introducing spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatio-temporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The performance of the resulting joint system improves on the state of the art of all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of the three approaches, both for the task of discovering fresh vandalism and for the task of locating vandalism in the complete set of Wikipedia revisions.
R: 0
C: 1

Title: Wikipedia vandalism detection
Keyword(s): Wikipedia vandalism detection, Machine learning, Natural Language Processing, Reputation
Published in: World Wide Web
Language: English
Date: 2011
R: 0
C: 0

Title: Wikipedia Vandalism Detection Through Machine Learning: Feature Review and New Proposals
Published in: CLEF
Language: English
Date: 2010
Abstract: Wikipedia is an online encyclopedia that anyone can edit. In this open model, some people edit with the intent of harming the integrity of Wikipedia. This is known as vandalism. We extend the framework presented in (Potthast, Stein, and Gerling, 2008) for Wikipedia vandalism detection. In this approach, several vandalism-indicating features are extracted from edits in a vandalism corpus and are fed to a supervised learning algorithm. The best performing classifiers were LogitBoost and Random Forest. Our classifier, a Random Forest, obtained an AUC of 0.92236, ranking first in the PAN’10 Wikipedia vandalism detection task.
R: 4
C: 0