Germany
From WikiPapers
This page compiles all the information regarding Germany.
Events
This is a list of events held in this country.

| Name | Type | Date | Website |
|---|---|---|---|
| Wikipedia CPOV Conference 2010 Leipzig | conference | 24 September 2010 | http://www.cpov.de |
| Wikimania 2005 | conference | 4 August 2005 | http://wikimania2005.wikimedia.org |
Authors
This is a list of authors in this country.
Publications
This is a list of publications by authors of this country.

| Title | Author(s) | Keyword(s) | Published in | Language | Date | Abstract | R | C |
|---|---|---|---|---|---|---|---|---|
| Reverts Revisited: Accurate Revert Detection in Wikipedia | Fabian Flöck, Denny Vrandečić, Elena Simperl | Wikipedia Revert detection Editing behavior User modeling Collaboration systems Community-driven content creation Social dynamics | Hypertext and Social Media 2012 | English | June 2012 | Wikipedia is commonly used as a proving ground for research in collaborative systems. This is likely due to its popularity and scale, but also to the fact that large amounts of data about its formation and evolution are freely available to inform and validate theories and models of online collaboration. As part of the development of such approaches, revert detection is often performed as an important pre-processing step in tasks as diverse as the extraction of implicit networks of editors, the analysis of edit or editor features and the removal of noise when analyzing the emergence of the content of an article. The current state of the art in revert detection is based on a rather naïve approach, which identifies revision duplicates based on MD5 hash values. This is an efficient, but not very precise technique that forms the basis for the majority of research based on revert relations in Wikipedia. In this paper we prove that this method has a number of important drawbacks - it only detects a limited number of reverts, while simultaneously misclassifying too many edits as reverts, and not distinguishing between complete and partial reverts. This is very likely to hamper the accurate interpretation of the findings of revert-related research. We introduce an improved algorithm for the detection of reverts based on word tokens added or deleted to address these drawbacks. We report on the results of a user study and other tests demonstrating the considerable gains in accuracy and coverage by our method, and argue for a positive trade-off, in certain research scenarios, between these improvements and our algorithm’s increased runtime. | 13 | 0 |
| A Breakdown of Quality Flaws in Wikipedia | Maik Anderka, Benno Stein | Quality Flaws Information quality Wikipedia User-generated Content Analysis | 2nd Joint WICOW/AIRWeb Workshop on Web Quality (WebQuality 12) | English | 2012 | The online encyclopedia Wikipedia is a successful example of the increasing popularity of user generated content on the Web. Despite its success, Wikipedia is often criticized for containing low-quality information, which is mainly attributed to its core policy of being open for editing by everyone. The identification of low-quality information is an important task since Wikipedia has become the primary source of knowledge for a huge number of people around the world. Previous research on quality assessment in Wikipedia either investigates only small samples of articles, or else focuses on single quality aspects, like accuracy or formality. This paper targets the investigation of quality flaws, and presents the first complete breakdown of Wikipedia's quality flaw structure. We conduct an extensive exploratory analysis, which reveals (1) the quality flaws that actually exist, (2) the distribution of flaws in Wikipedia, and (3) the extent of flawed content. An important finding is that more than one in four English Wikipedia articles contains at least one quality flaw, 70% of which concern article verifiability. | 0 | 0 |
| FlawFinder: A Modular System for Predicting Quality Flaws in Wikipedia | Oliver Ferschke, Iryna Gurevych, Marc Rittberger | | PAN | English | 2012 | With over 23 million articles in 285 languages, Wikipedia is the largest free knowledge base on the web. Due to its open nature, everybody is allowed to access and edit the contents of this huge encyclopedia. As a downside of this open access policy, quality assessment of the content becomes a critical issue and is hardly manageable without computational assistance. In this paper, we present FlawFinder, a modular system for automatically predicting quality flaws in unseen Wikipedia articles. It competed in the inaugural edition of the Quality Flaw Prediction Task at the PAN Challenge 2012 and achieved the best precision of all systems and the second place in terms of recall and F1-score. | 0 | 0 |
| On the Evolution of Quality Flaws and the Effectiveness of Cleanup Tags in the English Wikipedia | Maik Anderka, Benno Stein, Matthias Busse | Wikipedia Cleanup Tags Quality Flaws Information quality Quality Flaw Evolution | Wikipedia Academy | English | 2012 | The improvement of information quality is a major task for the free online encyclopedia Wikipedia. Recent studies targeted the analysis and detection of specific quality flaws in Wikipedia articles. To date, quality flaws have been exclusively investigated in current Wikipedia articles, based on a snapshot representing the state of Wikipedia at a certain time. This paper goes further, and provides the first comprehensive breakdown of the evolution of quality flaws in Wikipedia. We utilize cleanup tags to analyze the quality flaws that have been tagged by the Wikipedia community in the English Wikipedia, from its launch in 2001 until 2011. This leads to interesting findings regarding (1) the development of Wikipedia's quality flaw structure and (2) the usage and the effectiveness of cleanup tags. Specifically, we show that inline tags are more effective than tag boxes, and provide statistics about the considerable volume of rare and non-specific cleanup tags. We expect that this work will support the Wikipedia community in making quality assurance activities more efficient. | 0 | 0 |
| Overview of the 1st International Competition on Quality Flaw Prediction in Wikipedia | Maik Anderka, Benno Stein | Information quality Wikipedia Quality Flaw Prediction | CLEF | English | 2012 | The paper overviews the task "Quality Flaw Prediction in Wikipedia" of the PAN'12 competition. An evaluation corpus is introduced which comprises 1,592,226 English Wikipedia articles, of which 208,228 have been tagged to contain one of ten important quality flaws. Moreover, the performance of three quality flaw classifiers is evaluated. | 0 | 0 |
| Predicting Quality Flaws in User-generated Content: The Case of Wikipedia | Maik Anderka, Benno Stein, Nedim Lipka | User-generated Content Analysis Information quality Wikipedia Quality Flaw Prediction One-class Classification | 35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2012) | English | 2012 | The detection and improvement of low-quality information is a key concern in Web applications that are based on user-generated content; a popular example is the online encyclopedia Wikipedia. Existing research on quality assessment of user-generated content deals with the classification as to whether the content is high-quality or low-quality. This paper goes one step further: it targets the prediction of quality flaws, this way providing specific indications in which respects low-quality content needs improvement. The prediction is based on user-defined cleanup tags, which are commonly used in many Web applications to tag content that has some shortcomings. We apply this approach to the English Wikipedia, which is the largest and most popular user-generated knowledge source on the Web. We present an automatic mining approach to identify the existing cleanup tags, which provides us with a training corpus of labeled Wikipedia articles. We argue that common binary or multiclass classification approaches are ineffective for the prediction of quality flaws and hence cast quality flaw prediction as a one-class classification problem. We develop a quality flaw model and employ a dedicated machine learning approach to predict Wikipedia's most important quality flaws. Since in the Wikipedia setting the acquisition of significant test data is intricate, we analyze the effects of a biased sample selection. In this regard we illustrate the classifier effectiveness as a function of the flaw distribution in order to cope with the unknown (real-world) flaw-specific class imbalances. The flaw prediction performance is evaluated with 10,000 Wikipedia articles that have been tagged with the ten most frequent quality flaws: provided test data with little noise, four flaws can be detected with a precision close to 1. | 0 | 0 |
| Wikidata: a new platform for collaborative data collection | Denny Vrandečić | Semantic web Wikipedia Linked data DBpedia | International conference companion on World Wide Web | English | 2012 | This year, Wikimedia starts to build a new platform for the collaborative acquisition and maintenance of structured data: Wikidata. Wikidata's prime purpose is to be used within the other Wikimedia projects, like Wikipedia, to provide well-maintained, high-quality data. The nature and requirements of the Wikimedia projects require to develop a few novel, or at least unusual features for Wikidata: Wikidata will be a secondary database, i.e. instead of containing facts it will contain references for facts. It will be fully internationalized. It will contain inconsistent and contradictory facts, in order to represent the diversity of knowledge about a given entity. | 0 | 0 |
| Towards a diversity-minded Wikipedia | Fabian Flöck, Denny Vrandečić, Elena Simperl | Wikipedia Diversity Community-driven content creation Social dynamics Opinion mining Sentiment analysis | WebSci Conference | English | June 2011 | Wikipedia is a top-ten Web site providing a free encyclopedia created by an open community of volunteer contributors. As investigated in various studies over the past years, contributors have different backgrounds, mindsets and biases; however, the effects - positive and negative - of this diversity on the quality of the Wikipedia content, and on the sustainability of the overall project are yet only partially understood. In this paper we discuss these effects through an analysis of existing scholarly literature in the area and identify directions for future research and development; we also present an approach for diversity-minded content management within Wikipedia that combines techniques from semantic technologies, data and text mining and quantitative social dynamics analysis to create greater awareness of diversity-related issues within the Wikipedia community, give readers access to indicators and metrics to understand biases and their impact on the quality of Wikipedia articles, and support editors in achieving balanced versions of these articles that leverage the wealth of knowledge and perspectives inherent to large-scale collaboration. | 24 | 1 |
| Critical Point of View: A Wikipedia Reader | Amila Akdag Salah, Nicholas Carr, Shun-ling Chen, Florian Cramer, Morgan Currie, Edgar Enyedy, Andrew Famiglietti, Heather Ford, Mayo Fuster Morell, Cheng Gao, R. Stuart Geiger, Mark Graham, Gautam John, Dror Kamir, Peter B. Kaufman, Scott Kildall, Lawrence Liang, Patrick Lichty, Geert Lovink, Hans Varghese Mathews, Johanna Niesyto, Mathieu O’Neil, Dan O’Sullivan, Joseph M. Reagle, Andrea Scharnhorst, Alan Shapiro, Christian Stegbauer, Nathaniel Stern, Krzysztof Suchecki, Nathaniel Tkacz, Maja van der Velden | | Institute of Network Cultures | English | 2011 | For millions of internet users around the globe, the search for new knowledge begins with Wikipedia. The encyclopedia’s rapid rise, novel organization, and freely offered content have been marveled at and denounced by a host of commentators. Critical Point of View moves beyond unflagging praise, well-worn facts, and questions about its reliability and accuracy, to unveil the complex, messy, and controversial realities of a distributed knowledge platform. | 0 | 2 |
| Detection of Text Quality Flaws as a One-class Classification Problem | Maik Anderka, Benno Stein, Nedim Lipka | Information quality Wikipedia Quality Flaw Prediction One-class Classification | 20th ACM Conference on Information and Knowledge Management (CIKM 11) | English | 2011 | For Web applications that are based on user generated content the detection of text quality flaws is a key concern. Our research contributes to automatic quality flaw detection. In particular, we propose to cast the detection of text quality flaws as a one-class classification problem: we are given only positive examples (= texts containing a particular quality flaw) and decide whether or not an unseen text suffers from this flaw. We argue that common binary or multiclass classification approaches are ineffective in here, and we underpin our approach by a real-world application: we employ a dedicated one-class learning approach to determine whether a given Wikipedia article suffers from certain quality flaws. Since in the Wikipedia setting the acquisition of sensible test data is quite intricate, we analyze the effects of a biased sample selection. In addition, we illustrate the classifier effectiveness as a function of the flaw distribution in order to cope with the unknown (real-world) flaw-specific class imbalances. Altogether, provided test data with little noise, four from ten important quality flaws in Wikipedia can be detected with a precision close to 1. | 0 | 0 |
| Towards automatic quality assurance in Wikipedia | Maik Anderka, Benno Stein, Nedim Lipka | Wikipedia Information quality Flaw Detection | 20th International Conference on World Wide Web (WWW 11) | English | 2011 | Featured articles in Wikipedia stand for high information quality, and it has been found interesting to researchers to analyze whether and how they can be distinguished from "ordinary" articles. Here we point out that article discrimination falls far short of writer support or automatic quality assurance: Featured articles are not identified, but are made. Following this motto we compile a comprehensive list of information quality flaws in Wikipedia, model them according to the latest state of the art, and devise one-class classification technology for their identification. | 0 | 0 |
| Wiki-Based Maturing of Process Descriptions Business Process Management | Frank Dengler, Denny Vrandečić | | | English | 2011 | Traditional process elicitation methods are expensive and time consuming. Recently, a trend toward collaborative, user-centric, on-line business process modeling can be observed. Current social software approaches, satisfying such a collaborative modeling, mostly focus on the graphical development of processes and do not consider existing textual process description like HowTos or guidelines. We address this issue by combining graphical process modeling techniques with a wiki-based light-weight knowledge capturing approach and a background semantic knowledge base. Our approach enables the collaborative maturing of process descriptions with a graphical representation, formal semantic annotations, and natural language. Existing textual process descriptions can be translated into graphical descriptions and formal semantic annotations. Thus, the textual and graphical process descriptions are made explicit and can be further processed. As a result, we provide a holistic approach for collaborative process development that is designed to foster knowledge reuse and maturing within the system. | 0 | 0 |
| Wikiing pro: semantic wiki-based process editor | Frank Dengler, Denny Vrandečić, Elena Simperl | | | English | 2011 | Recently, a trend toward collaborative, user-centric, on-line process modeling can be observed. Unfortunately, current social software approaches mostly focus on the graphical development of processes and do not consider existing textual process description like HowTos or guidelines. We address this issue by combining graphical process modeling techniques with a wiki-based light-weight knowledge capturing approach and a background semantic knowledge base. Our approach enables the collaborative maturing of process descriptions with a graphical representation, formal semantic annotations, and natural language. By translating existing textual process descriptions into graphical descriptions and formal semantic annotations, we provide a holistic approach for collaborative process development that is designed to foster knowledge reuse and maturing within the system. | 0 | 0 |
| Wikipedia revision toolkit: efficiently accessing Wikipedia's edit history | Oliver Ferschke, Torsten Zesch, Iryna Gurevych | | HLT | English | 2011 | | 0 | 0 |
| Crowdsourcing a Wikipedia Vandalism Corpus | Martin Potthast | Wikipedia Vandalism detection Evaluation Corpus | SIGIR | English | 2010 | We report on the construction of the PAN Wikipedia vandalism corpus, PAN-WVC-10, using Amazon’s Mechanical Turk. The corpus compiles 32 452 edits on 28 468 Wikipedia articles, among which 2 391 vandalism edits have been identified. 753 human annotators cast a total of 193 022 votes on the edits, so that each edit was reviewed by at least 3 annotators, whereas the achieved level of agreement was analyzed in order to label an edit as “regular” or “vandalism.” The corpus is available free of charge. | 6 | 1 |
| Fixing the floating gap: The online encyclopaedia Wikipedia as a global memory place | Christian Pentzold | Collective memory Consensus and contestation Discourse World Wide Web | Memory Studies | English | May 2009 | The article proposes to interpret the web-based encyclopaedia Wikipedia as a global memory place. After presenting the core elements and basic characteristics of wikis and Wikipedia respectively, the article discusses four related issues of social memory studies: collective memory, communicative and cultural memory, 'memory places' and the 'floating gap'. In a third step, these theoretical premises are connected to the understanding of discourse as social cognition. Fourth, comparison is made between the potential of the World Wide Web as cyberspace for collective remembrance and the obstacles that stand in its way. On this basis, the article argues that Wikipedia presents a global memory place where memorable elements are negotiated. Its complex processes of discussion and article creation are a model of the discursive fabrication of memory. Thus, they can be viewed and analysed as the transition, the 'floating gap' between communicative and collective frames of memory. | 6 | 2 |
| The ESA Retrieval Model Revisited | Maik Anderka, Benno Stein | | 32nd International ACM SIGIR Conference (SIGIR 09) | English | 2009 | Among the retrieval models that have been proposed in the last years, the ESA model of Gabrilovich and Markovitch received much attention. The authors report on a significant improvement in the retrieval performance, which is explained with the semantic concepts introduced by the document collection underlying ESA. Their explanation appears plausible but our analysis shows that the connections are more involved and that the "concept hypothesis" does not hold. In our contribution we analyze several properties that in fact affect the retrieval performance. Moreover, we introduce a formalization of ESA, which reveals its close connection to existing retrieval models. | 0 | 0 |
| A Wikipedia-Based Multilingual Retrieval Model | Martin Potthast, Benno Stein, Maik Anderka | | 30th European Conference on IR Research (ECIR 08) | English | 2008 | This paper introduces CL-ESA, a new multilingual retrieval model for the analysis of cross-language similarity. The retrieval model exploits the multilingual alignment of Wikipedia: given a document d written in language L we construct a concept vector **d** for d, where each dimension i in **d** quantifies the similarity of d with respect to a document chosen from the “L-subset” of Wikipedia. Likewise, for a second document d′ written in language L′, we construct a concept vector **d′**, using from the L′-subset of the Wikipedia the topic-aligned counterparts of our previously chosen documents. Since the two concept vectors **d** and **d′** are collection-relative representations of d and d′ they are language-independent. I.e., their similarity can directly be computed with the cosine similarity measure, for instance. We present results of an extensive analysis that demonstrates the power of this new retrieval model: for a query document d the topically most similar documents from a corpus in another language are properly ranked. Salient property of the new retrieval model is its robustness with respect to both the size and the quality of the index document collection. | 0 | 0 |
| Automatic Vandalism Detection in Wikipedia | Martin Potthast, Benno Stein, Robert Gerling | | | English | 2008 | We present results of a new approach to detect destructive article revisions, so-called vandalism, in Wikipedia. Vandalism detection is a one-class classification problem, where vandalism edits are the target to be identified among all revisions. Interestingly, vandalism detection has not been addressed in the Information Retrieval literature by now. In this paper we discuss the characteristics of vandalism as humans recognize it and develop features to render vandalism detection as a machine learning task. We compiled a large number of vandalism edits in a corpus, which allows for the comparison of existing and new detection approaches. Using logistic regression we achieve 83% precision at 77% recall with our model. Compared to the rule-based methods that are currently applied in Wikipedia, our approach increases the F-Measure performance by 49% while being faster at the same time. | 0 | 4 |
| Wikipedia in the pocket: indexing technology for near-duplicate detection and high similarity search | Martin Potthast | | | English | 2007 | | 0 | 0 |
| Foucault@Wiki: first steps towards a conceptual framework for the analysis of Wiki discourses | Christian Pentzold, Sebastian Seidenglanz | Wiki Wikipedia Computer-mediated communication Online collaboration Foucault Discourse theory | WikiSym | English | 2006 | In this paper, we examine the discursive situation of Wikipedia. The primary goal is to explore principal ways of analyzing and characterizing the various forms of communicative user interaction using Foucault's discourse theory. First, the communicative situation of Wikipedia is addressed and a list of possible forms of communication is compiled. Second, the current research on the linguistic features of Wikis, especially Wikipedia, is reviewed. Third, some key issues of Foucault's theory are explored: the notion of "discourse", the discursive formation, and the methods of archaeology and genealogy, respectively. Finally, first steps towards a qualitative discourse analysis of the English Wikipedia are elaborated. The paper argues that Wikipedia can be understood as a discursive formation that regulates and structures the production of statements. Most of the discursive regularities named by Foucault are established in the collaborative writing processes of Wikipedia, too. Moreover, the editing processes can be described in Foucault's terms as discursive knowledge production. | 12 | 1 |
| Semantic MediaWiki (ISWC 2006) | Markus Krötzsch, Denny Vrandečić, Max Völkel | | ISWC | English | 2006 | Semantic MediaWiki is an extension of MediaWiki – a widely used wiki-engine that also powers Wikipedia. Its aim is to make semantic technologies available to a broad community by smoothly integrating them with the established usage of MediaWiki. The software is already used on a number of productive installations world-wide, but the main target remains to establish “Semantic Wikipedia” as an early adopter of semantic technologies on the web. Thus usability and scalability are as important as powerful semantic features. | 0 | 0 |
| Wiki Communities in the Context of Work Processes | Frank Fuchs-Kittowski, Andre Köhler | Ontology Wiki Community Cooperative knowledge generation Knowledge work Work processes Knowledge process Process-oriented knowledge structures | WikiSym | English | 2005 | In this article we examine the integration of communities of practice supported by a wiki into work processes. Linear structures are often inappropriate for the execution of knowledge-intensive tasks and work processes. The latter are characterized by non-linear sequences and dynamic social interaction. Communities of practice, however, often lack the “guiding light” needed to structure their work. We discuss the primary requirements for the integration of formally described knowledge-intensive processes into the dynamic social processes of knowledge generation in communities of practice and use the wiki approach for their support. We present our approach for an appropriate interface to integrate wiki communities into process structures and an information retrieval algorithm based on it to connect the process-oriented structures with community-oriented wiki structures. We show the prototypical realization of the concept by a brief example. | 0 | 1 |
| Wiki Templates - Adding Structure Support to Wikis on Demand | Anja Haake, Stephan Lukosch, Till Schümmer | Wiki Template Tailoring Structural editing and viewing | WikiSym | English | 2005 | This paper introduces the concept of wiki templates that allows end-users to determine the structure and appearance of a wiki page. In particular, this better supports editing of structured wiki pages. Wiki templates may be adapted (defined and redefined) by end-users. They may be applied if found helpful, but need not to be used, thus maintaining the simple wiki editing way. In addition, we introduce a methodology to reuse wiki templates among different wiki instances. We show how wiki templates have been successfully used in real-world applications in our CURE wiki engine. | 1 | 0 |
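
The abstract of "Reverts Revisited" in the list above describes the common baseline for revert detection: identifying revision duplicates based on MD5 hash values. A minimal Python sketch of that baseline follows, assuming the revisions of one article are available as plain-text strings in chronological order. The function name `detect_identity_reverts` and the layout of the returned tuples are illustrative choices, not taken from the paper. This is the naïve method the authors criticize, not their improved word-token algorithm.

```python
import hashlib


def detect_identity_reverts(revisions):
    """Find reverts by matching MD5 digests of full revision texts.

    `revisions` is a list of revision texts in chronological order
    (an assumption of this sketch). Returns a list of tuples
    (reverting_index, restored_index, reverted_indices).
    """
    last_seen = {}  # digest -> index of the latest revision with that text
    reverts = []
    for i, text in enumerate(revisions):
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        if digest in last_seen:
            j = last_seen[digest]
            # Revisions strictly between j and i were undone by revision i.
            # If i directly follows j, nothing was undone (a null edit).
            if i - j > 1:
                reverts.append((i, j, list(range(j + 1, i))))
        last_seen[digest] = i
    return reverts


history = ["intro", "intro plus spam", "intro"]
print(detect_identity_reverts(history))
```

Because only exact duplicates of the full text match, a sketch like this cannot see partial reverts, which is one of the drawbacks the abstract names.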
