Automatic vandalism detection in Wikipedia with active associative classification
|Author(s)||Sumbana M., Goncalves M.A., Silva R., Almeida J., Veloso A.|
|Published in||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Keyword(s)||Unknown (Extra: Active sampling, Associative classification, Crowdsourcing, Detection methods, Dynamic patterns, Training sets, Under-sampling, Wikipedia, Classification (of information), Digital libraries, Websites)|
Automatic vandalism detection in Wikipedia with active associative classification is a 2012 conference paper written in English by Sumbana M., Goncalves M.A., Silva R., Almeida J., Veloso A. and published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).
Wikipedia and other free editing services for collaboratively generated content have quickly grown in popularity. However, the lack of editing control has made these services vulnerable to various types of malicious actions such as vandalism. State-of-the-art vandalism detection methods are based on supervised techniques, thus relying on the availability of large and representative training collections. Building such collections, often with the help of crowdsourcing, is very costly due to a natural skew towards very few vandalism examples in the available data as well as dynamic patterns. Aiming at reducing the cost of building such collections, we present a new active sampling technique coupled with an on-demand associative classification algorithm for Wikipedia vandalism detection. We show that our classifier enhanced with a simple undersampling technique for building the training set outperforms state-of-the-art classifiers such as SVMs and kNNs. Furthermore, by applying active sampling, we are able to reduce the need for training by almost 96% with only a small impact on detection results.
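The abstract mentions a "simple undersampling technique" for building the training set but does not say which one. As an illustration only, the sketch below assumes plain random undersampling of the majority (non-vandalism) class; the function name `undersample` and the parameters `ratio` and `seed` are hypothetical and do not come from the paper.

```python
import random
from typing import List, Sequence, Tuple


def undersample(
    edits: Sequence[str],
    labels: Sequence[int],
    ratio: float = 1.0,
    seed: int = 0,
) -> Tuple[List[str], List[int]]:
    """Keep every vandalism edit (label 1) and a random subset of
    regular edits (label 0).

    Assumption: random undersampling; the paper's exact procedure
    is not described in the abstract.
    `ratio` is the desired number of regular edits per vandalism edit.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]

    # Never ask for more regular edits than are available.
    keep = min(len(majority), int(round(ratio * len(minority))))
    chosen = rng.sample(majority, keep)

    # Restore the original ordering of the retained edits.
    idx = sorted(minority + chosen)
    return [edits[i] for i in idx], [labels[i] for i in idx]


# Toy data: 100 edits, 5 of them labelled as vandalism.
edits = [f"edit{i}" for i in range(100)]
labels = [1 if i % 20 == 0 else 0 for i in range(100)]
balanced_edits, balanced_labels = undersample(edits, labels)
```

With `ratio=1.0` the resulting training set holds as many regular edits as vandalism edits, which counters the class skew the abstract describes. The active sampling and associative classification steps are not sketched here, since the abstract gives no detail on how they work.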