Germany

From WikiPapers

This page compiles all the information regarding Germany.

Events

This is a list of events held in this country.
Name | Type | Date | Website
Wikipedia CPOV Conference 2010 Leipzig | conference | 24 September 2010 | http://www.cpov.de
Wikimania 2005 | conference | 4 August 2005 | http://wikimania2005.wikimedia.org

Authors

This is a list of authors in this country.
Name | Affiliation | Website
Andre Köhler | |
Anja Haake | |
Christian Pentzold | |
Denny Vrandečić | |
Elena Simperl | |
Fabian Flöck | Karlsruhe Institute of Technology (KIT) |
Frank Fuchs-Kittowski | Fraunhofer ISST |
Johanna Niesyto | | http://transnationalspaces.wordpress.com/
Maik Anderka | University of Paderborn | http://maik.anderka.com
Martin Potthast | | http://www.uni-weimar.de/cms/medien/webis/people/martin-potthast.html
Oliver Ferschke | | http://www.ukp.tu-darmstadt.de/people/doctoral-researchers/oliver-ferschke
Stephan Lukosch | |
Till Schümmer | |

Publications

This is a list of publications by authors of this country.
Title Author(s) Keyword(s) Published in Language Date Abstract R C
Reverts Revisited: Accurate Revert Detection in Wikipedia Fabian Flöck
Denny Vrandečić
Elena Simperl
Wikipedia
Revert detection
Editing behavior
User modeling
Collaboration systems
Community-driven content creation
Social dynamics
Hypertext and Social Media 2012 English June 2012 Wikipedia is commonly used as a proving ground for research in collaborative systems. This is likely due to its popularity and scale, but also to the fact that large amounts of data about its formation and evolution are freely available to inform and validate theories and models of online collaboration. As part of the development of such approaches, revert detection is often performed as an important pre-processing step in tasks as diverse as the extraction of implicit networks of editors, the analysis of edit or editor features and the removal of noise when analyzing the emergence of the content of an article. The current state of the art in revert detection is based on a rather naïve approach, which identifies revision duplicates based on MD5 hash values. This is an efficient, but not very precise technique that forms the basis for the majority of research based on revert relations in Wikipedia. In this paper we prove that this method has a number of important drawbacks - it only detects a limited number of reverts, while simultaneously misclassifying too many edits as reverts, and not distinguishing between complete and partial reverts. This is very likely to hamper the accurate interpretation of the findings of revert-related research. We introduce an improved algorithm for the detection of reverts based on word tokens added or deleted that addresses these drawbacks. We report on the results of a user study and other tests demonstrating the considerable gains in accuracy and coverage by our method, and argue for a positive trade-off, in certain research scenarios, between these improvements and our algorithm’s increased runtime. 13 0
A Breakdown of Quality Flaws in Wikipedia Maik Anderka
Benno Stein
Quality Flaws
Information quality
Wikipedia
User-generated Content Analysis
2nd Joint WICOW/AIRWeb Workshop on Web Quality (WebQuality 12) English 2012 The online encyclopedia Wikipedia is a successful example of the increasing popularity of user generated content on the Web. Despite its success, Wikipedia is often criticized for containing low-quality information, which is mainly attributed to its core policy of being open for editing by everyone. The identification of low-quality information is an important task since Wikipedia has become the primary source of knowledge for a huge number of people around the world. Previous research on quality assessment in Wikipedia either investigates only small samples of articles, or else focuses on single quality aspects, like accuracy or formality. This paper targets the investigation of quality flaws, and presents the first complete breakdown of Wikipedia's quality flaw structure. We conduct an extensive exploratory analysis, which reveals (1) the quality flaws that actually exist, (2) the distribution of flaws in Wikipedia, and (3) the extent of flawed content. An important finding is that more than one in four English Wikipedia articles contains at least one quality flaw, 70% of which concern article verifiability. 0 0
FlawFinder: A Modular System for Predicting Quality Flaws in Wikipedia Oliver Ferschke
Iryna Gurevych
Marc Rittberger
PAN English 2012 With over 23 million articles in 285 languages, Wikipedia is the largest free knowledge base on the web. Due to its open nature, everybody is allowed to access and edit the contents of this huge encyclopedia. As a downside of this open access policy, quality assessment of the content becomes a critical issue and is hardly manageable without computational assistance. In this paper, we present FlawFinder, a modular system for automatically predicting quality flaws in unseen Wikipedia articles. It competed in the inaugural edition of the Quality Flaw Prediction Task at the PAN Challenge 2012 and achieved the best precision of all systems and the second place in terms of recall and F1-score. 0 0
On the Evolution of Quality Flaws and the Effectiveness of Cleanup Tags in the English Wikipedia Maik Anderka
Benno Stein
Matthias Busse
Wikipedia
Cleanup Tags
Quality Flaws
Information quality
Quality Flaw Evolution
Wikipedia Academy English 2012 The improvement of information quality is a major task for the free online encyclopedia Wikipedia. Recent studies targeted the analysis and detection of specific quality flaws in Wikipedia articles. To date, quality flaws have been exclusively investigated in current Wikipedia articles, based on a snapshot representing the state of Wikipedia at a certain time. This paper goes further, and provides the first comprehensive breakdown of the evolution of quality flaws in Wikipedia. We utilize cleanup tags to analyze the quality flaws that have been tagged by the Wikipedia community in the English Wikipedia, from its launch in 2001 until 2011. This leads to interesting findings regarding (1) the development of Wikipedia's quality flaw structure and (2) the usage and the effectiveness of cleanup tags. Specifically, we show that inline tags are more effective than tag boxes, and provide statistics about the considerable volume of rare and non-specific cleanup tags. We expect that this work will support the Wikipedia community in making quality assurance activities more efficient. 0 0
Overview of the 1st International Competition on Quality Flaw Prediction in Wikipedia Maik Anderka
Benno Stein
Information quality
Wikipedia
Quality Flaw Prediction
CLEF English 2012 The paper overviews the task "Quality Flaw Prediction in Wikipedia" of the PAN'12 competition. An evaluation corpus is introduced which comprises 1,592,226 English Wikipedia articles, of which 208,228 have been tagged to contain one of ten important quality flaws. Moreover, the performance of three quality flaw classifiers is evaluated. 0 0
Predicting Quality Flaws in User-generated Content: The Case of Wikipedia Maik Anderka
Benno Stein
Nedim Lipka
User-generated Content Analysis
Information quality
Wikipedia
Quality Flaw Prediction
One-class Classification
35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2012) English 2012 The detection and improvement of low-quality information is a key concern in Web applications that are based on user-generated content; a popular example is the online encyclopedia Wikipedia. Existing research on quality assessment of user-generated content deals with the classification as to whether the content is high-quality or low-quality. This paper goes one step further: it targets the prediction of quality flaws, this way providing specific indications in which respects low-quality content needs improvement. The prediction is based on user-defined cleanup tags, which are commonly used in many Web applications to tag content that has some shortcomings. We apply this approach to the English Wikipedia, which is the largest and most popular user-generated knowledge source on the Web. We present an automatic mining approach to identify the existing cleanup tags, which provides us with a training corpus of labeled Wikipedia articles. We argue that common binary or multiclass classification approaches are ineffective for the prediction of quality flaws and hence cast quality flaw prediction as a one-class classification problem. We develop a quality flaw model and employ a dedicated machine learning approach to predict Wikipedia's most important quality flaws. Since in the Wikipedia setting the acquisition of significant test data is intricate, we analyze the effects of a biased sample selection. In this regard we illustrate the classifier effectiveness as a function of the flaw distribution in order to cope with the unknown (real-world) flaw-specific class imbalances. The flaw prediction performance is evaluated with 10,000 Wikipedia articles that have been tagged with the ten most frequent quality flaws: provided test data with little noise, four flaws can be detected with a precision close to 1. 0 0
Wikidata: a new platform for collaborative data collection Denny Vrandečić Semantic web
Wikipedia
Linked data
DBpedia
International conference companion on World Wide Web English 2012 This year, Wikimedia starts to build a new platform for the collaborative acquisition and maintenance of structured data: Wikidata. Wikidata's prime purpose is to be used within the other Wikimedia projects, like Wikipedia, to provide well-maintained, high-quality data. The nature and requirements of the Wikimedia projects require to develop a few novel, or at least unusual features for Wikidata: Wikidata will be a secondary database, i.e. instead of containing facts it will contain references for facts. It will be fully internationalized. It will contain inconsistent and contradictory facts, in order to represent the diversity of knowledge about a given entity. 0 0
Towards a diversity-minded Wikipedia Fabian Flöck
Denny Vrandečić
Elena Simperl
Wikipedia
Diversity
Community-driven content creation
Social dynamics
Opinion mining
Sentiment analysis
WebSci Conference English June 2011 Wikipedia is a top-ten Web site providing a free encyclopedia created by an open community of volunteer contributors. As investigated in various studies over the past years, contributors have different backgrounds, mindsets and biases; however, the effects - positive and negative - of this diversity on the quality of the Wikipedia content, and on the sustainability of the overall project are yet only partially understood. In this paper we discuss these effects through an analysis of existing scholarly literature in the area and identify directions for future research and development; we also present an approach for diversity-minded content management within Wikipedia that combines techniques from semantic technologies, data and text mining and quantitative social dynamics analysis to create greater awareness of diversity-related issues within the Wikipedia community, give readers access to indicators and metrics to understand biases and their impact on the quality of Wikipedia articles, and support editors in achieving balanced versions of these articles that leverage the wealth of knowledge and perspectives inherent to large-scale collaboration. 24 1
Critical Point of View: A Wikipedia Reader Amila Akdag Salah
Nicholas Carr
Shun-ling Chen
Florian Cramer
Morgan Currie
Edgar Enyedy
Andrew Famiglietti
Heather Ford
Mayo Fuster Morell
Cheng Gao
R. Stuart Geiger
Mark Graham
Gautam John
Dror Kamir
Peter B. Kaufman
Scott Kildall
Lawrence Liang
Patrick Lichty
Geert Lovink
Hans Varghese Mathews
Johanna Niesyto
Mathieu O’Neil
Dan O’Sullivan
Joseph M. Reagle
Andrea Scharnhorst
Alan Shapiro
Christian Stegbauer
Nathaniel Stern
Krzysztof Suchecki
Nathaniel Tkacz
Maja van der Velden
Institute of Network Cultures English 2011 For millions of internet users around the globe, the search for new knowledge begins with Wikipedia. The encyclopedia’s rapid rise, novel organization, and freely offered content have been marveled at and denounced by a host of commentators. Critical Point of View moves beyond unflagging praise, well-worn facts, and questions about its reliability and accuracy, to unveil the complex, messy, and controversial realities of a distributed knowledge platform. 0 2
Detection of Text Quality Flaws as a One-class Classification Problem Maik Anderka
Benno Stein
Nedim Lipka
Information quality
Wikipedia
Quality Flaw Prediction
One-class Classification
20th ACM Conference on Information and Knowledge Management (CIKM 11) English 2011 For Web applications that are based on user generated content the detection of text quality flaws is a key concern. Our research contributes to automatic quality flaw detection. In particular, we propose to cast the detection of text quality flaws as a one-class classification problem: we are given only positive examples (= texts containing a particular quality flaw) and decide whether or not an unseen text suffers from this flaw. We argue that common binary or multiclass classification approaches are ineffective in here, and we underpin our approach by a real-world application: we employ a dedicated one-class learning approach to determine whether a given Wikipedia article suffers from certain quality flaws. Since in the Wikipedia setting the acquisition of sensible test data is quite intricate, we analyze the effects of a biased sample selection. In addition, we illustrate the classifier effectiveness as a function of the flaw distribution in order to cope with the unknown (real-world) flaw-specific class imbalances. Altogether, provided test data with little noise, four from ten important quality flaws in Wikipedia can be detected with a precision close to 1. 0 0
Towards automatic quality assurance in Wikipedia Maik Anderka
Benno Stein
Nedim Lipka
Wikipedia
Information quality
Flaw Detection
20th International Conference on World Wide Web (WWW 11) English 2011 Featured articles in Wikipedia stand for high information quality, and it has been found interesting to researchers to analyze whether and how they can be distinguished from "ordinary" articles. Here we point out that article discrimination falls far short of writer support or automatic quality assurance: Featured articles are not identified, but are made. Following this motto we compile a comprehensive list of information quality flaws in Wikipedia, model them according to the latest state of the art, and devise one-class classification technology for their identification. 0 0
Wiki-Based Maturing of Process Descriptions Business Process Management Frank Dengler
Denny Vrandečić
English 2011 Traditional process elicitation methods are expensive and time consuming. Recently, a trend toward collaborative, user-centric, on-line business process modeling can be observed. Current social software approaches, satisfying such a collaborative modeling, mostly focus on the graphical development of processes and do not consider existing textual process description like HowTos or guidelines. We address this issue by combining graphical process modeling techniques with a wiki-based light-weight knowledge capturing approach and a background semantic knowledge base. Our approach enables the collaborative maturing of process descriptions with a graphical representation, formal semantic annotations, and natural language. Existing textual process descriptions can be translated into graphical descriptions and formal semantic annotations. Thus, the textual and graphical process descriptions are made explicit and can be further processed. As a result, we provide a holistic approach for collaborative process development that is designed to foster knowledge reuse and maturing within the system. 0 0
Wikiing pro: semantic wiki-based process editor Frank Dengler
Denny Vrandečić
Elena Simperl
English 2011 Recently, a trend toward collaborative, user-centric, on-line process modeling can be observed. Unfortunately, current social software approaches mostly focus on the graphical development of processes and do not consider existing textual process description like HowTos or guidelines. We address this issue by combining graphical process modeling techniques with a wiki-based light-weight knowledge capturing approach and a background semantic knowledge base. Our approach enables the collaborative maturing of process descriptions with a graphical representation, formal semantic annotations, and natural language. By translating existing textual process descriptions into graphical descriptions and formal semantic annotations, we provide a holistic approach for collaborative process development that is designed to foster knowledge reuse and maturing within the system. 0 0
Wikipedia revision toolkit: efficiently accessing Wikipedia's edit history Oliver Ferschke
Torsten Zesch
Iryna Gurevych
HLT English 2011 0 0
Crowdsourcing a Wikipedia Vandalism Corpus Martin Potthast Wikipedia
Vandalism detection
Evaluation
Corpus
SIGIR English 2010 We report on the construction of the PAN Wikipedia vandalism corpus, PAN-WVC-10, using Amazon’s Mechanical Turk. The corpus compiles 32 452 edits on 28 468 Wikipedia articles, among which 2 391 vandalism edits have been identified. 753 human annotators cast a total of 193 022 votes on the edits, so that each edit was reviewed by at least 3 annotators, whereas the achieved level of agreement was analyzed in order to label an edit as “regular” or “vandalism.” The corpus is available free of charge. 6 1
Fixing the floating gap: The online encyclopaedia Wikipedia as a global memory place Christian Pentzold Collective memory
Consensus and contestation
Discourse
World Wide Web
Memory Studies English May 2009 The article proposes to interpret the web-based encyclopaedia Wikipedia as a global memory place. After presenting the core elements and basic characteristics of wikis and Wikipedia respectively, the article discusses four related issues of social memory studies: collective memory, communicative and cultural memory, 'memory places' and the 'floating gap'. In a third step, these theoretical premises are connected to the understanding of discourse as social cognition. Fourth, comparison is made between the potential of the World Wide Web as cyberspace for collective remembrance and the obstacles that stand in its way. On this basis, the article argues that Wikipedia presents a global memory place where memorable elements are negotiated. Its complex processes of discussion and article creation are a model of the discursive fabrication of memory. Thus, they can be viewed and analysed as the transition, the 'floating gap' between communicative and collective frames of memory. 6 2
The ESA Retrieval Model Revisited Maik Anderka
Benno Stein
32nd International ACM SIGIR Conference (SIGIR 09) English 2009 Among the retrieval models that have been proposed in the last years, the ESA model of Gabrilovich and Markovitch received much attention. The authors report on a significant improvement in the retrieval performance, which is explained with the semantic concepts introduced by the document collection underlying ESA. Their explanation appears plausible but our analysis shows that the connections are more involved and that the "concept hypothesis" does not hold. In our contribution we analyze several properties that in fact affect the retrieval performance. Moreover, we introduce a formalization of ESA, which reveals its close connection to existing retrieval models. 0 0
A Wikipedia-Based Multilingual Retrieval Model Martin Potthast
Benno Stein
Maik Anderka
30th European Conference on IR Research (ECIR 08) English 2008 This paper introduces CL-ESA, a new multilingual retrieval model for the analysis of cross-language similarity. The retrieval model exploits the multilingual alignment of Wikipedia: given a document d written in language L we construct a concept vector d for d, where each dimension i in d quantifies the similarity of d with respect to a document chosen from the “L-subset” of Wikipedia. Likewise, for a second document d′ written in language L′, we construct a concept vector d′, using from the L′-subset of the Wikipedia the topic-aligned counterparts of our previously chosen documents. Since the two concept vectors d and d′ are collection-relative representations of d and d′ they are language-independent. I.e., their similarity can directly be computed with the cosine similarity measure, for instance. We present results of an extensive analysis that demonstrates the power of this new retrieval model: for a query document d the topically most similar documents from a corpus in another language are properly ranked. Salient property of the new retrieval model is its robustness with respect to both the size and the quality of the index document collection. 0 0
Automatic Vandalism Detection in Wikipedia Martin Potthast
Benno Stein
Robert Gerling
English 2008 We present results of a new approach to detect destructive article revisions, so-called vandalism, in Wikipedia. Vandalism detection is a one-class classification problem, where vandalism edits are the target to be identified among all revisions. Interestingly, vandalism detection has not been addressed in the Information Retrieval literature by now. In this paper we discuss the characteristics of vandalism as humans recognize it and develop features to render vandalism detection as a machine learning task. We compiled a large number of vandalism edits in a corpus, which allows for the comparison of existing and new detection approaches. Using logistic regression we achieve 83% precision at 77% recall with our model. Compared to the rule-based methods that are currently applied in Wikipedia, our approach increases the F-Measure performance by 49% while being faster at the same time. 0 4
Wikipedia in the pocket: indexing technology for near-duplicate detection and high similarity search Martin Potthast English 2007 0 0
Foucault@Wiki: first steps towards a conceptual framework for the analysis of Wiki discourses Christian Pentzold
Sebastian Seidenglanz
Wiki
Wikipedia
Computer-mediated communication
Online collaboration
Foucault
Discourse theory
WikiSym English 2006 In this paper, we examine the discursive situation of Wikipedia. The primary goal is to explore principal ways of analyzing and characterizing the various forms of communicative user interaction using Foucault's discourse theory. First, the communicative situation of Wikipedia is addressed and a list of possible forms of communication is compiled. Second, the current research on the linguistic features of Wikis, especially Wikipedia, is reviewed. Third, some key issues of Foucault's theory are explored: the notion of "discourse", the discursive formation, and the methods of archaeology and genealogy, respectively. Finally, first steps towards a qualitative discourse analysis of the English Wikipedia are elaborated. The paper argues that Wikipedia can be understood as a discursive formation that regulates and structures the production of statements. Most of the discursive regularities named by Foucault are established in the collaborative writing processes of Wikipedia, too. Moreover, the editing processes can be described in Foucault's terms as discursive knowledge production. 12 1
Semantic MediaWiki (ISWC 2006) Markus Krötzsch
Denny Vrandečić
Max Völkel
ISWC English 2006 Semantic MediaWiki is an extension of MediaWiki – a widely used wiki-engine that also powers Wikipedia. Its aim is to make semantic technologies available to a broad community by smoothly integrating them with the established usage of MediaWiki. The software is already used on a number of productive installations world-wide, but the main target remains to establish “Semantic Wikipedia” as an early adopter of semantic technologies on the web. Thus usability and scalability are as important as powerful semantic features. 0 0
Wiki Communities in the Context of Work Processes Frank Fuchs-Kittowski
Andre Köhler
Ontology
Wiki
Community
Cooperative knowledge generation
Knowledge work
Work processes
Knowledge process
Process-oriented knowledge structures
WikiSym English 2005 In this article we examine the integration of communities of practice supported by a wiki into work processes. Linear structures are often inappropriate for the execution of knowledge-intensive tasks and work processes. The latter are characterized by non-linear sequences and dynamic social interaction. Communities of practice, however, often lack the „guiding light” needed to structure their work. We discuss the primary requirements for the integration of formally described knowledge-intensive processes into the dynamic social processes of knowledge generation in communities of practice and use the wiki approach for their support. We present our approach for an appropriate interface to integrate wiki communities into process structures and an information retrieval algorithm based on it to connect the process-oriented structures with community-oriented wiki structures. We show the prototypical realization of the concept by a brief example. 0 1
Wiki Templates - Adding Structure Support to Wikis on Demand Anja Haake
Stephan Lukosch
Till Schümmer
Wiki
Template
Tailoring
Structural editing and viewing
WikiSym English 2005 This paper introduces the concept of wiki templates that allows end-users to determine the structure and appearance of a wiki page. In particular, this better supports editing of structured wiki pages. Wiki templates may be adapted (defined and redefined) by end-users. They may be applied if found helpful, but need not to be used, thus maintaining the simple wiki editing way. In addition, we introduce a methodology to reuse wiki templates among different wiki instances. We show how wiki templates have been successfully used in real-world applications in our CURE wiki engine. 1 0