CLEF

From WikiPapers
Conferences
CLEF · Iberocoop · MathWikis · PAN · RecentChangesCamp · SemWiki · SMWCon · Wiki Conference India · WikiAI · WikiCon · Wikimania · Wikipedia Academy · Wikipedia CPOV Conference · WikiSym · WikiViz

Publications

Only those publications related to wikis already available at WikiPapers are shown here.
Title Author(s) Keyword(s) Language Date Abstract R C
Overview of the 1st International Competition on Quality Flaw Prediction in Wikipedia Maik Anderka
Benno Stein
Information quality
Wikipedia
Quality Flaw Prediction
English 2012 The paper overviews the task "Quality Flaw Prediction in Wikipedia" of the PAN'12 competition. An evaluation corpus is introduced which comprises 1,592,226 English Wikipedia articles, of which 208,228 have been tagged to contain one of ten important quality flaws. Moreover, the performance of three quality flaw classifiers is evaluated. 0 0
Wiki Vandalysis - Wikipedia Vandalism Analysis Manoj Harpalani
Thanadit Phumprao
Megha Bassi
Michael Hart
Rob Johnson
English 2010 Wikipedia describes itself as the "free encyclopedia that anyone can edit". Along with the helpful volunteers who contribute by improving the articles, a great number of malicious users abuse the open nature of Wikipedia by vandalizing articles. Deterring and reverting vandalism has become one of the major challenges of Wikipedia as its size grows. Wikipedia editors fight vandalism both manually and with automated bots that use regular expressions and other simple rules to recognize malicious edits. Researchers have also proposed Machine Learning algorithms for vandalism detection, but these algorithms are still in their infancy and have much room for improvement. This paper presents an approach to fighting vandalism by extracting various features from the edits for machine learning classification. Our classifier uses information about the editor, the sentiment of the edit, the "quality" of the edit (i.e. spelling errors), and targeted regular expressions to capture patterns common in blatant vandalism, such as insertion of obscene words or multiple exclamations. We have successfully been able to achieve an area under the ROC curve (AUC) of 0.91 on a training set of 15000 human annotated edits and 0.887 on a random sample of 17472 edits from 317443. 0 0
Wikipedia Vandalism Detection Through Machine Learning: Feature Review and New Proposals Santiago M. Mola Velasco English 2010 Wikipedia is an online encyclopedia that anyone can edit. In this open model, some people edit with the intent of harming the integrity of Wikipedia. This is known as vandalism. We extend the framework presented in (Potthast, Stein, and Gerling, 2008) for Wikipedia vandalism detection. In this approach, several vandalism-indicating features are extracted from edits in a vandalism corpus and are fed to a supervised learning algorithm. The best performing classifiers were LogitBoost and Random Forest. Our classifier, a Random Forest, obtained an AUC of 0.92236, ranking in the first place of the PAN’10 Wikipedia vandalism detection task. 4 0
Crosslanguage retrieval based on Wikipedia statistics Andreas Juffinger
Roman Kern
Michael Granitzer
English 2009 0 0
GikiCLEF topics and Wikipedia articles: did they blend? Nuno Cardoso English 2009 0 0
GikiP at GeoCLEF 2008: joining GIR and QA forces for querying Wikipedia Diana Santos
Nuno Cardoso
Paula Carvalho
Iustin Dornescu
Sven Hartrumpf
Johannes Leveling
Yvonne Skalban
English 2009 0 0
WikiTranslate: query translation for cross-lingual information retrieval using only Wikipedia Dong Nguyen
Arnold Overwijk
Claudia Hauff
Dolf R. B. Trieschnigg
Djoerd Hiemstra
Franciska De Jong
Wikipedia
Comparable corpus
Cross-lingual information retrieval
Query translation
Word sense disambiguation
English 2009 0 0

External links