List of peer-reviewed publications

From WikiPapers
See also: List of publications, List of non peer-reviewed publications.

This is a list of all the peer-reviewed publications available in WikiPapers. Currently, there are 664 peer-reviewed publications.

Export: BibTeX, CSV, RDF, JSON

To create a new "publication" go to Form:Publication.


Title Author(s) Keyword(s) Published in Language Date Abstract R C
Identifying Frostquakes in Central Canada and Neighbouring Regions in the United States with Social Media Andrew C.W. Leung
William A. Gough
Yehong Shi
Social media
Collaborative mapping
Citizen Empowered Mapping English 2017 Following the ice storm of December 2013 in southern Ontario, the general public heard noises that resembled falling trees and reported these occurrences on social media. These were identified as a rare phenomenon called cryoseism, more commonly known as frostquakes. These occurrences became the first large-scale documented frostquakes in Canada. Using meteorological metrics, we were able to forecast two subsequent frostquake events in January 2014 that coincided with reports on social media. In total, six more episodes of frostquakes as well as their locations were identified in January and February of 2014. Results showed that in central Canada, frostquake occurrences ranged from Windsor, Ontario to the west to Montreal, Quebec to the east and from Niagara Falls, Ontario to the south to North Bay, Ontario to the north. In the United States, the reports came from states bordering the Great Lakes and the New England areas. Two frostquake clusters were identified, one in and around the Greater Toronto Area and the other in eastern Wisconsin. Frostquakes were most frequently heard at nighttime. We critically assess the use of social media as an observation network, including the possibility of false positives and population bias. This study demonstrates that rare phenomena such as frostquakes can be identified and assessed using data gathered through social media. 0 0
Change in access to heritage after digitization: ethnographic collections in Wikipedia Trilce Navarrete
Karol J. Borowiecki
Heritage consumption
Digital heritage
Exhibition history
Cultural Trends English 17 October 2016 Visits to museums have been studied as hedonic and utilitarian forms of cultural consumption, though limited attention has been given to the access to museum collections online. We perform a unique historic analysis of the visibility of collections in a museum of ethnographic collections and compare 100 years of onsite visits to 5 years of online visits. We find two main results: first, access to collections increased substantially online. From a selection of objects available both onsite and online, access grew from an average of 156,000 onsite visits per year to over 1.5 million views online per year. Onsite, the museum received 15.5 million visits in a span of a century while online, collections were viewed 7.9 million times in only the last 5 years. Second, we find a difference in consumer preference for type of object, favouring 3D onsite and 2D online (photographs of objects, particularly when showing them being used). Results support the understanding of online heritage consumption and emerging dynamics, particularly outside of an institutional environment, such as Wikipedia. 0 0
Similar Gaps, Different Origins? Women Readers and Editors at Greek Wikipedia Ioannis Protonotarios
Vasiliki Sarimpei
Jahna Otterbacher
Gender gap
Quantitative analysis
Tenth International AAAI Conference on Web and Social Media English 17 May 2016 As a global, multilingual project, Wikipedia could serve as a repository for the world’s knowledge on an astounding range of topics. However, questions of participation and diversity among editors continue to be burning issues. We present the first targeted study of participants at Greek Wikipedia, with the goal of better understanding their motivations. Smaller Wikipedias play a key role in fostering the project’s global character, but typically receive little attention from researchers. We developed two survey instruments, administered in Greek, based on the 2011 Wikipedia Readership and Editors Surveys. Consistent with previous studies, we found a gender gap, with women making up only 38% and 15% of readers and editors, respectively, and with men editors being much more active. Our data suggest two salient explanations: 1) women readers more often lack confidence with respect to their knowledge and technical skills as compared to men, and 2) women’s behaviors may be driven by personal motivations such as enjoyment and learning, rather than by “leaving their mark” on the community, a concern more common among men. Interestingly, while similar proportions of men and women readers use multiple language editions, more women contribute to English Wikipedia in addition to the Greek language community. Future research should consider how this impacts their participation at Greek Wikipedia. 11 0
Linguistic neighbourhoods: explaining cultural borders on Wikipedia through multilingual co-editing activity Anna Samoilenko
Fariba Karimi
Daniel Edler
Jérôme Kunegis
Markus Strohmaier
Wikipedia
Multilingual
Cultural similarity
Network
Digital language divide
Socio-linguistics
Digital humanities
Hypothesis testing
EPJ Data Science English 11 March 2016 In this paper, we study the network of global interconnections between language communities, based on shared co-editing interests of Wikipedia editors, and show that although English is discussed as a potential lingua franca of the digital space, its domination disappears in the network of co-editing similarities, and instead local connections come to the forefront. Out of the hypotheses we explored, bilingualism, linguistic similarity of languages, and shared religion provide the best explanations for the similarity of interests between cultural communities. Population attraction and geographical proximity are also significant, but much weaker factors bringing communities together. In addition, we present an approach that allows for extracting significant cultural borders from editing activity of Wikipedia users, and comparing a set of hypotheses about the social mechanisms generating these borders. Our study sheds light on how culture is reflected in the collective process of archiving knowledge on Wikipedia, and demonstrates that cross-lingual interconnections on Wikipedia are not dominated by one powerful language. Our findings also raise some important policy questions for the Wikimedia Foundation. 0 0
Competencias informacionales básicas y uso de Wikipedia en entornos educativos Jesús Tramullas Wikipedia
Information literacy
Gestión de la Innovación en Educación Superior/Journal of Innovation Management in Higher Education Spanish 2016 This paper reviews the relationship of Wikipedia with educational processes, adopting an approach grounded in information literacy. To this end it: a) reviews common criticisms of Wikipedia; b) reviews the basic concepts of information literacy; c) proposes a generic framework for integrating Wikipedia into teaching and learning. It aims to establish a blueprint for the design, implementation and development of information literacy processes and actions, integrated into the core educational process, using Wikipedia. Finally, it suggests the use of information literacy as a continuous element within the overall process of teaching and learning. 0 0
Una comparazione delle reti di ringraziamenti di Wikipedia di alcuni paesi europei Valerio Perticone
Marco Elio Tabacchi
Linguaggio, Cognizione e Società Italian December 2015 Since May 2013, the collaborative encyclopedia Wikipedia has offered every contributor the possibility of expressing appreciation to other authors for the creation or modification of a specific article. Through the thanks feature, a user can send the author a standard message by pressing the dedicated 'thank' button. The thanks system was subsequently extended to the editions in the main European languages. The set of thanks can be regarded as a social network, represented by a multigraph in which users are the nodes and thanks are the edges. Ignoring multiple edges and edge direction yields a network in which the existence of a thank establishes a relationship, as in the social network models described by Boyd and Ellison.

Studying the topology of this network can reveal information about the relationships between contributors, without having to know the users' edits in detail or to study prior interactions between them, such as edits made by users to the same articles, discussions held on community pages, or shared interests declared in their profiles. Despite the absence of an explicit social component in the writing of an encyclopedia, it is possible to hypothesize, from the way the network forms and given the evident analogies between it and the most widespread social networks, that the thanks network has a small-world and scale-free topology. The literature offers numerous natural and artificial examples of networks with this topology, which endows a network with robustness and resilience. In small-world networks, typical of symmetric social networks, nodes have a high clustering coefficient compared to a random network of equal size: whoever belongs to a circle tends to be connected to many of its other members. The average path from one node to another is also short relative to the size of the network (six degrees of separation).

Scale-free networks have a large number of nodes with few links and a small number of nodes (so-called hubs) with very many links, following the power-law distribution P(x) ∝ x^(-α), a property that can be verified using an algorithm based on the Kolmogorov-Smirnov test. In this article we verify whether the robustness and resilience and the presence of hubs typical of the networks described above are present in the thanks networks of the most widely used language editions of Wikipedia, and we comment on a peculiarity found in the German Wikipedia.
0 0
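The scale-free check described in the abstract above, fitting a power law P(x) ∝ x^(-α) and validating it with a Kolmogorov-Smirnov statistic, can be sketched roughly as follows. The sample is synthetic (drawn by inverse-transform sampling) and the exponent values are illustrative, not the authors' fitted results.

```python
import random

def ks_statistic(sample, alpha, xmin=1.0):
    """Kolmogorov-Smirnov distance between the empirical CDF of `sample`
    and a continuous power law with CDF F(x) = 1 - (x / xmin) ** (1 - alpha)."""
    xs = sorted(x for x in sample if x >= xmin)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        # Compare the model CDF against the empirical CDF just before and after x
        d = max(d, abs(i / n - model_cdf), abs((i - 1) / n - model_cdf))
    return d

# Synthetic degree sample drawn from a power law (inverse-transform sampling)
random.seed(42)
alpha = 2.5
sample = [(1.0 - random.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(2000)]

# The KS distance should be small for the true exponent, larger for a wrong one
print(ks_statistic(sample, alpha=2.5) < ks_statistic(sample, alpha=1.8))
```

In a full fit, as in the Clauset-style procedure the abstract alludes to, one would also estimate α and xmin from the data and compare the observed KS distance against distances from bootstrapped synthetic samples.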
Change in access after digitization: Ethnographic collections in Wikipedia Trilce Navarrete
Karol J. Borowiecki
Heritage consumption
Digital heritage
Access
Exhibition history
ACEI Working Paper Series English October 2015 The raison d’être of memory institutions revolves around collecting, preserving and giving access to heritage collections. Increasingly, access takes place in social networked markets characterized by communities of users that serve to select and rank content to facilitate reuse. Publication of heritage in such digital medium transforms patterns of consumption. We performed a quantitative analysis on the access to a museum collection and compared results before and after publication on Wikimedia. Analysis of the difference in access showed two main results: first, access to collections increased substantially online. From a selection of the most viewed objects, access grew from an average of 156,000 onsite visitors per year (or 15.5 million in a century) to over 1.5 million views online per year (or 7.9 million in five years). Second, we find a long tail in both mediums, where 8% of objects were exhibited onsite and 11% of available objects online were used in Wikipedia articles (representing 1% of the total collection). We further document differences in consumer preference for type of object, favouring 3D onsite and 2D online, as well as topic and language preference, favouring Wikipedia articles about geography and in English. Online publication is hence an important complement to onsite exhibitions to increase access to collections. Results shed light on online consumption of heritage content by consumers who may not necessarily visit heritage sites. 0 0
Content Volatility of Scientific Topics in Wikipedia: A Cautionary Tale Adam M. Wilson
Gene E. Likens
PLoS ONE English 14 August 2015 Wikipedia has quickly become one of the most frequently accessed encyclopedic references, despite the ease with which content can be changed and the potential for ‘edit wars’ surrounding controversial topics. Little is known about how this potential for controversy affects the accuracy and stability of information on scientific topics, especially those with associated political controversy. Here we present an analysis of the Wikipedia edit histories for seven scientific articles and show that topics we consider politically but not scientifically “controversial” (such as evolution and global warming) experience more frequent edits with more words changed per day than pages we consider “noncontroversial” (such as the standard model in physics or heliocentrism). For example, over the period we analyzed, the global warming page was edited on average (geometric mean ±SD) 1.9±2.7 times resulting in 110.9±10.3 words changed per day, while the standard model in physics was only edited 0.2±1.4 times resulting in 9.4±5.0 words changed per day. The high rate of change observed in these pages makes it difficult for experts to monitor accuracy and contribute time-consuming corrections, to the possible detriment of scientific accuracy. As our society turns to Wikipedia as a primary source of scientific information, it is vital we read it critically and with the understanding that the content is dynamic and vulnerable to vandalism and other shenanigans. (cc-by) 0 0
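The edit-rate statistic quoted above is a geometric mean of daily edit counts; as a quick illustration with made-up counts (not the paper's data), assuming zero-edit days are excluded since the logarithm of zero is undefined:

```python
import math

def geometric_mean(counts):
    """Geometric mean of strictly positive daily edit counts:
    the exponential of the average log count."""
    return math.exp(sum(math.log(c) for c in counts) / len(counts))

# Hypothetical daily edit counts for one article over a week
daily_edits = [1, 2, 1, 4, 8, 2, 1]
print(round(geometric_mean(daily_edits), 6))  # → 2.0 (the counts multiply to 2**7)
```

The geometric mean dampens the influence of occasional edit bursts, which is why it is a common choice for heavily skewed activity data like the edit histories analyzed here.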
Un'analisi preliminare della rete dei ringraziamenti su Wikipedia Valerio Perticone
Marco Elio Tabacchi
Il futuro prossimo della Scienza Cognitiva Italian July 2015 The free online encyclopedia Wikipedia gives every contributor the possibility of expressing appreciation to other authors for the creation or modification of a specific article through the thanks feature, implemented since April 2013 by means of the echo notification system: next to each edit, a 'thank' link allows a standard thank-you message to be sent to the author. The set of thanks can be seen as a directed social network, represented by a multigraph in which users are the nodes and thanks are directed edges. Ignoring multiple edges and edge direction yields a network in which the existence of a thank establishes a relationship between two contributors, as in the social network models described by Boyd and Ellison.

Studying the topology of this network can reveal information about the relationships between contributors, without having to know the users' edits in detail or to study prior interactions between them, such as edits made by users to the same articles, discussions held on community pages, or shared interests declared in their profiles.

The aim of this pilot study is to verify, with the help of the available data, the robustness and resilience and the presence of hubs of the thanks network, and possibly to formulate hypotheses about any deviation.
0 0
A Platform for Visually Exploring the Development of Wikipedia Articles Erik Borra
David Laniado
Esther Weltevrede
Michele Mauri
Giovanni Magni
Tommaso Venturini
Paolo Ciuccarelli
Richard Rogers
Andreas Kaltenbrunner
ICWSM '15 - 9th International AAAI Conference on Web and Social Media English May 2015 When looking for information on Wikipedia, Internet users generally just read the latest version of an article. However, in its back-end there is much more: associated to each article are the edit history and talk pages, which together entail its full evolution. These spaces can typically reach thousands of contributions, and it is not trivial to make sense of them by manual inspection. This issue also affects Wikipedians, especially the less experienced ones, and constitutes a barrier for new editor engagement and retention. To address these limitations, Contropedia offers its users unprecedented access to the development of an article, using wiki links as focal points. 0 0
Societal Controversies in Wikipedia Articles Erik Borra
Esther Weltevrede
Paolo Ciuccarelli
Andreas Kaltenbrunner
David Laniado
Giovanni Magni
Michele Mauri
Richard Rogers
Controversy Mapping
Social Science
Data Visualization
CHI '15 - Proceedings of the 33rd annual ACM conference on Human factors in computing systems English April 2015 Collaborative content creation inevitably reaches situations where different points of view lead to conflict. We focus on Wikipedia, the free encyclopedia anyone may edit, where disputes about content in controversial articles often reflect larger societal debates. While Wikipedia has a public edit history and discussion section for every article, the substance of these sections is difficult to fathom for Wikipedia users interested in the development of an article and in locating which topics were most controversial. In this paper we present Contropedia, a tool that augments Wikipedia articles and gives insight into the development of controversial topics. Contropedia uses an efficient language agnostic measure based on the edit history that focuses on wiki links to easily identify which topics within a Wikipedia article have been most controversial and when. 0 0
"The sum of all human knowledge": A systematic review of scholarly research on the content of Wikipedia Mostafa Mesgari
Chitu Okoli
Mohamad Mehdi
Finn Årup Nielsen
Arto Lanamäki
Systematic literature review
Information quality
Journal of the Association for Information Science and Technology English February 2015 Wikipedia may be the best-developed attempt thus far to gather all human knowledge in one place. Its accomplishments in this regard have made it a point of inquiry for researchers from different fields of knowledge. A decade of research has thrown light on many aspects of the Wikipedia community, its processes, and its content. However, due to the variety of fields inquiring about Wikipedia and the limited synthesis of the extensive research, there is little consensus on many aspects of Wikipedia's content as an encyclopedic collection of human knowledge. This study addresses the issue by systematically reviewing 110 peer-reviewed publications on Wikipedia content, summarizing the current findings, and highlighting the major research trends. Two major streams of research are identified: the quality of Wikipedia content (including comprehensiveness, currency, readability, and reliability) and the size of Wikipedia. Moreover, we present the key research trends in terms of the domains of inquiry, research design, data source, and data gathering methods. This review synthesizes scholarly understanding of Wikipedia content and paves the way for future studies. 0 0
Organización del conocimiento en entornos wiki: una experiencia de organización de información sobre lecturas académicas Jesús Tramullas
Ana I. Sánchez
Piedad Garrido-Picazo
Higher education
User generated content
Organización del conocimiento: sistemas de información abiertos. Actas del XII Congreso ISKO España y II Congreso ISKO España y Portugal Spanish 2015 This paper reviews the informational behavior of a community of university students during the development of a learning activity with a wiki. Through a case study, it analyzes the data available on the wiki and identifies patterns of content creation and organization. The wiki study is also situated within the information management framework proposed by Rowley. The findings support the conclusion that students apply the principle of economy of effort in their informational behavior, guided by the assessment requirements of the activity, and that Rowley's proposal is not suitable for analyzing and evaluating technologically mediated educational processes. 0 0
Wikipedia como objeto de investigación Jesús Tramullas Wikipedia
Research literature
Anuario ThinkEPI Spanish 2015 This short paper analyzes Wikipedia as an object of scientific research, contrasting various studies dealing with that popular encyclopedia. The conclusion is that Wikipedia, as a manifestation of collaborative production and consumption of knowledge, is a valid subject of scientific research. 0 0
Motivations for Contributing to Health-Related Articles on Wikipedia: An Interview Study Farič N
Potts HWW
Social media
Consumer health information
Journal of Medical Internet Research English 3 December 2014 Background: Wikipedia is one of the most accessed sources of health information online. The current English-language Wikipedia contains more than 28,000 articles pertaining to health.

Objective: The aim was to characterize individuals’ motivations for contributing to health content on the English-language Wikipedia.

Methods: A set of health-related articles was randomly selected and recent contributors were invited to complete an online questionnaire and follow-up interview (by Skype, by email, or face-to-face). Interviews were transcribed and analyzed using thematic analysis and a realist grounded theory approach.

Results: A total of 32 Wikipedians (31 men) completed the questionnaire and 17 were interviewed. Those completing the questionnaire had a mean age of 39 (range 12-59) years; 16 had a postgraduate qualification, 10 had or were currently studying for an undergraduate qualification, 3 had no more than secondary education, and 3 were still in secondary education. In all, 15 were currently working in a health-related field (primarily clinicians). The median period for which they had been actively editing Wikipedia was 3-5 years. Of this group, 12 were in the United States, 6 were in the United Kingdom, 4 were in Canada, and the remainder were from another 8 countries. Two-thirds spoke more than 1 language and 90% (29/32) were also active contributors in domains other than health. Wikipedians in this study were identified as health professionals, professionals with specific health interests, students, and individuals with health problems. Based on the interviews, their motivations for editing health-related content were summarized in 5 strongly interrelated categories: education (learning about subjects by editing articles), help (wanting to improve and maintain Wikipedia), responsibility (responsibility, often a professional responsibility, to provide good quality health information to readers), fulfillment (editing Wikipedia as a fun, relaxing, engaging, and rewarding activity), and positive attitude to Wikipedia (belief in the value of Wikipedia). An additional factor, hostility (from other contributors), was identified that negatively affected Wikipedians’ motivations.

Conclusions: Contributions to Wikipedia’s health-related content in this study were made by both health specialists and laypeople of varying editorial skills. Their motivations for contributing stem from an inherent drive based on values, standards, and beliefs. It became apparent that the community who most actively monitor and edit health-related articles is very small. Although some contributors correspond to a model of “knowledge philanthropists,” others were focused on maintaining articles (improving spelling and grammar, organization, and handling vandalism). There is a need for more people to be involved in Wikipedia’s health-related content.
0 0
Wikipedia in the eyes of its beholders: A systematic review of scholarly research on Wikipedia readers and readership Chitu Okoli
Mohamad Mehdi
Mostafa Mesgari
Finn Årup Nielsen
Arto Lanamäki
Systematic literature review
Journal of the Association for Information Science and Technology English December 2014 Hundreds of scholarly studies have investigated various aspects of Wikipedia. Although a number of literature reviews have provided overviews of this vast body of research, none has specifically focused on the readers of Wikipedia and issues concerning its readership. In this systematic literature review, we review 99 studies to synthesize current knowledge regarding the readership of Wikipedia and provide an analysis of research methods employed. The scholarly research has found that Wikipedia is popular not only for lighter topics such as entertainment but also for more serious topics such as health and legal information. Scholars, librarians, and students are common users, and Wikipedia provides a unique opportunity for educating students in digital literacy. We conclude with a summary of key findings, implications for researchers, and implications for the Wikipedia community. 0 1
On Measuring Malayalam Wikipedia Vasudevan T V Wikipedia
Quantitative Analysis
International Journal of Emerging Engineering Research and Technology English September 2014 Wikipedia is a popular, multilingual, free internet encyclopedia. Anyone can edit articles in it. This paper presents an overview of research in the Malayalam edition of Wikipedia. The history of Malayalam Wikipedia is explained first. Different research lines related to Wikipedia are explored next. This is followed by an analysis of Malayalam Wikipedia's fundamental components, such as articles, authors and edits, along with growth and quality. General trends are measured in comparison with Wikipedias in other languages.
0 0
Fidarsi di Wikipedia Simone Dezaiacomo Wikipedia
Teoria delle decisioni e processi cognitivi
Italian 15 July 2014 The aim of this study is to understand the phenomena underlying users' trust in the online encyclopedia Wikipedia. To do so, it is first necessary to understand and model the organization of the socio-productive processes underlying the production of Wikipedia's content, and then to empirically verify and describe its self-correction capabilities. In addition to the approaches used in this study, those treated in the literature will also be described, reporting the main studies that have addressed these topics over the years, albeit keeping them independent.

To understand the structure of the community of Wikipedia editors, the existence of a Core-Periphery model was hypothesized. To study the model, analyses were performed on data from a sample of pages of the Italian version of Wikipedia. The results obtained from the analysis of this information form the basis used to select the pages into which errors were injected, constituting a method for estimating each page's probability of self-correction. As for Wikipedia's resilience, the results were obtained using an empirical approach: errors were inserted into the sample of pages under specific methodological constraints, and then how quickly and in what way these errors were corrected was assessed.

A specific analysis was carried out to choose the types of error and the variables to consider when inserting them.

This analysis led to the definition of two distinct experiments, whose results lead to interesting conclusions both separately and in combination. Based on the results of these experiments, it was possible to discuss the system's self-correction capabilities, a key element in the study of the dynamics of trust in Wikipedia.
0 0
Brede tools and federating online neuroinformatics databases Finn Årup Nielsen Data federation
Open science
Semantic web
Neuroinformatics English 2014 As open science neuroinformatics databases, the Brede Database and Brede Wiki seek to make distribution and federation of their content as easy and transparent as possible. The databases rely on simple formats and allow other online tools to reuse their content. This paper describes the possible interconnections on different levels between the Brede tools and other databases. 0 0
Evaluation of gastroenterology and hepatology articles on Wikipedia: Are they suitable as learning resources for medical students? Samy A. Azer Gastroenterology
Learning resources
Medical education
Medical students
Self-directed learning
(Eur J Gastroenterol Hepatol. 2014 Feb;26(2):155-63) doi:10.1097/MEG.0000000000000003 2014 BACKGROUND: With the changes introduced to medical curricula, medical students use learning resources on the Internet such as Wikipedia. However, the credibility of the medical content of Wikipedia has been questioned and there is no evidence to respond to these concerns. The aim of this paper was to critically evaluate the accuracy and reliability of the gastroenterology and hepatology information that medical students retrieve from Wikipedia. METHODS: The Wikipedia website was searched for articles on gastroenterology and hepatology on 28 May 2013. Copies of these articles were evaluated by three assessors independently using an appraisal form modified from the DISCERN instrument. The articles were scored for accuracy of content, readability, frequency of updating, and quality of references. RESULTS: A total of 39 articles were evaluated. Although the articles appeared to be well cited and reviewed regularly, several problems were identified with regard to depth of discussion of mechanisms and pathogenesis of diseases, as well as poor elaboration on different investigations. Analysis of the content showed a score ranging from 15.6±0.6 to 43.6±3.2 (mean±SD). The total number of references in all articles was 1233, and the number of references varied from 4 to 144 (mean±SD, 31.6±27.3). The number of citations from peer-reviewed journals published in the last 5 years was 242 (28%); however, several problems were identified in the list of references and citations made. The readability of articles was in the range of -8.0±55.7 to 44.4±1.4; for all articles the readability was 26±9.0 (mean±SD). The concordance between the assessors on applying the criteria had mean κ scores in the range of 0.61 to 0.79. CONCLUSION: Wikipedia is not a reliable source of information for medical students searching for gastroenterology and hepatology articles. 
Several limitations, deficiencies, and scientific errors have been identified in the articles examined. 0 0
Experimental Implementation of a M2M System Controlled by a Wiki Network Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Sensor network
Social network
Applied Computing and Information Technology, Studies in Computational Intelligence English 2014 Experimental implementation of an M2M system, which is controlled by a wiki network, is discussed. This M2M system consists of mobile terminals at remote places and wiki servers on the Internet. A mobile terminal of the system consists of an Android terminal, and it may have an Arduino board with sensors and actuators. The mobile terminal can read data not only from the sensors on the Arduino board but also from wiki pages of the wiki servers. The mobile terminal can control the actuators of the Arduino board or can write sensor data to a wiki page. The mobile terminal performs such reading, writing, and controlling by periodically reading and executing commands on a wiki page, and by reading and running a program on the wiki page. In order to run the program, the mobile terminal is equipped with a data processor. After placing mobile terminals at remote places, the group of users of this system can control the M2M system by writing and updating such commands and programs on the wiki network without going to the places of the mobile terminals. This system realizes an open communication forum not only for people but also for machines. 3 0
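The control loop this abstract describes, with terminals periodically reading commands from a wiki page and acting on them, can be sketched as below. The ':cmd:' page format and the handler names are hypothetical illustrations, not the authors' actual protocol; a real terminal would fetch the page over HTTP and drive Arduino actuators instead of returning strings.

```python
import re

def parse_commands(wiki_text):
    """Extract (command, argument) pairs from page lines like ':cmd: set_led on'.
    The ':cmd:' marker is an assumed format, not the paper's real syntax."""
    return re.findall(r"^:cmd:\s+(\S+)\s*(.*)$", wiki_text, flags=re.MULTILINE)

def dispatch(commands, handlers):
    """Run each recognized command through its handler; ignore unknown ones."""
    results = []
    for name, arg in commands:
        if name in handlers:
            results.append(handlers[name](arg))
    return results

# Stand-in for a fetched wiki page; a real terminal would poll this periodically
page = """Status page for terminal-01
:cmd: set_led on
:cmd: read_sensor temperature
"""

# Hypothetical handlers standing in for actuator and sensor drivers
handlers = {
    "set_led": lambda arg: f"led={arg}",
    "read_sensor": lambda arg: f"sensor={arg}",
}

print(dispatch(parse_commands(page), handlers))  # → ['led=on', 'sensor=temperature']
```

Since the terminal only ever pulls from the wiki, this design needs no inbound connection to the remote device, which is presumably why the authors route control through wiki pages rather than direct connections.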
Improving modern art articles on Wikipedia, a partnership between Wikimédia France and Centre Georges Pompidou Sylvain Machefert Museum
Préconférence IFLA 2014 - Bibliothèques d'art French 2014 The Centre Georges Pompidou is an institution in Paris that houses the "Musée National d'Art Moderne", the largest modern art museum in Europe. Wikimédia France is a French organization that promotes Wikipedia and other Wikimedia projects, for example by organizing training sessions or establishing partnerships. The project described in this proposal has been led by the GLAM (Galleries, Libraries, Archives and Museums) working group of Wikimédia France and curators of the Pompidou museum. 3 0
La connaissance est un réseau: Perspective sur l’organisation archivistique et encyclopédique Martin Grandjean Les Cahiers du Numérique French 2014 Network analysis does not revolutionize our objects of study; it revolutionizes the researcher's perspective on them. Organized as a network, information becomes relational. It makes the creation of new knowledge potentially possible, as with an encyclopedia whose links between records weave a web that can be analyzed in terms of structural characteristics, or with an archive directory whose hierarchy is fundamentally altered by an index recomposing the information exchange network within a group of people. On the basis of two examples of tools for managing, preserving, and enhancing knowledge, the online encyclopedia Wikipedia and the archives of the Intellectual Cooperation of the League of Nations, this paper discusses the relationship between researchers and their object understood as a whole.

[Preprint version available].

Abstract (french)

L’analyse de réseau ne transforme pas nos objets d’étude, elle transforme le regard que le chercheur porte sur ceux-ci. Organisée en réseau, l’information devient relationnelle. Elle rend possible en puissance la création d’une nouvelle connaissance, à l’image d’une encyclopédie dont les liens entre les notices tissent une toile dont on peut analyser les caractéristiques structurelles ou d’un répertoire d’archives qui voit sa hiérarchie bouleversée par un index qui recompose le réseau d’échange d’information à l’intérieur d’un groupe de personnes. Sur la base de deux exemples d’outils de gestion, conservation et valorisation de la connaissance, l’encyclopédie en ligne Wikipédia et les archives de la coopération intellectuelle de la Société des Nations, cet article questionne le rapport entre le chercheur et son objet compris dans sa globalité. [Version preprint disponible].
0 0
Les jeunes, leurs enseignants et Wikipédia : représentations en tension autour d’un objet documentaire singulier Sahut Gilles Wikipedia
Information behaviour
Credibility judgement
Citation behaviour
High school
Undergraduate student
Graduate student
(Documentaliste-Sciences de l'information. 2014 June;51(2):p. 70-79) DOI: 10.3917/docsi.512.0070 2014 The collaborative encyclopedia Wikipedia is a heavily used resource, especially by high school and college students, whether for school work or personal reasons. However, for most teachers and information professionals, the jury is still out on the validity of its contents. Are young persons aware of its controversial reputation? What opinions, negative or positive, do they hold? How much confidence do they place in this information resource? This survey of high school and college students provides an opportunity to grasp the diversity of attitudes towards Wikipedia and also how these evolve as the students move up the grade ladder. More widely, this article studies the factors that condition the degree of acceptability of the contents of this unusual source of information. 0 0
Pautas para la implementación de wikis para el desarrollo colaborativo de sistemas de buenas prácticas Jesús Tramullas
Ana I. Sánchez-Casabón
Best practices
Estudios de Información, Documentación y archivos. Homenaje a la profesora Pilar Gay Molins Spanish 2014 This paper reviews the basic principles for developing collections of best practices with a wiki tool. It details several previous works and proposes a pattern for generating and developing these types of resources. 0 0
WikiWho: Precise and Efficient Attribution of Authorship of Revisioned Content Fabian Flöck
Maribel Acosta
Version control
Content modeling
Community-driven content creation
Collaborative authoring
Online collaboration
World Wide Web Conference 2014 English 2014 Revisioned text content is present in numerous collaboration platforms on the Web, most notably wikis. Tracking the authorship of text tokens in such systems has many potential applications: identifying the main authors for licensing reasons or tracing collaborative writing patterns over time, to name a few. In this context, two main challenges arise. First, it is critical for such an authorship tracking system to be precise in its attributions, so as to be reliable for further processing. Second, it has to run efficiently even on very large datasets, such as Wikipedia. As a solution, we propose a graph-based model to represent revisioned content and an algorithm over this model that tackles both issues effectively. We describe the optimal implementation and design choices when tuning it to a wiki environment. We further present a gold standard of 240 tokens from English Wikipedia articles annotated with their origin. This gold standard was created manually and confirmed by multiple independent users of a crowdsourcing platform. It is the first gold standard of this kind and quality, and our solution achieves an average of 95% precision on this dataset. We also perform a first-ever precision evaluation of the state-of-the-art algorithm for the task, exceeding it by over 10% on average. Our approach outperforms the execution time of the state-of-the-art by one order of magnitude, as we demonstrate on a sample of over 240 English Wikipedia articles. We argue that the roughly 10% increase over the baseline in the size of an optional materialization of our results is a favorable trade-off, given the large advantage in runtime performance. 0 0
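A deliberately simplified sketch of token-level origin attribution in the spirit of WikiWho: each token of the latest revision is credited to the revision that introduced it, with surviving tokens keeping their original author across revisions. The real algorithm builds a graph over paragraphs, sentences, and tokens and handles reverts; this toy version only diffs flat token lists:

```python
from difflib import SequenceMatcher

def attribute_tokens(revisions):
    """revisions: list of (author, text), oldest first.
    Returns [(token, attributed_author), ...] for the newest revision."""
    origin = []  # tokens of the current revision, each paired with its author
    for author, text in revisions:
        new_tokens = text.split()
        matcher = SequenceMatcher(a=[t for t, _ in origin], b=new_tokens)
        new_origin = []
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            if op == "equal":
                new_origin.extend(origin[i1:i2])       # token survives: keep origin
            elif op in ("replace", "insert"):
                new_origin.extend((t, author) for t in new_tokens[j1:j2])
            # "delete": token removed, nothing to attribute
        origin = new_origin
    return origin

revs = [("alice", "the quick fox"),
        ("bob",   "the quick brown fox")]
print(attribute_tokens(revs))
# → [('the', 'alice'), ('quick', 'alice'), ('brown', 'bob'), ('fox', 'alice')]
```

The precision/efficiency trade-offs the paper discusses arise exactly here: naive pairwise diffing misattributes tokens on reverts and scales poorly with revision count.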
« Citez vos sources » : archéologie d’une règle au cœur du savoir wikipédien (2002-2008) Gilles Sahut Wikipedia policy
Citation behaviour
Epistemic trust
Online community
Online ethnography
Etudes de communication French 2014 The encyclopedia Wikipedia is an innovative project based on the relationship between a socio-technical system and a community of editors. Throughout its history, the community has developed policies to evaluate and validate the knowledge that is collectively produced. This article aims to describe the evolution of citation policies in the French-language Wikipedia and the debates these policies generated between 2002 and 2008. Our anthropological approach highlights the role attributed to referencing in responding to Wikipedia's lack of epistemic trustworthiness. 0 0
Demonstration of a Loosely Coupled M2M System Using Arduino, Android and Wiki Software Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Sensor network
Social network
Message oriented middleware
The 38th IEEE Conference on Local Computer Networks (LCN) English 22 October 2013 A Machine-to-Machine (M2M) system, in which terminals are loosely coupled with Wiki software, is proposed. This system acquires sensor data from remote terminals, processes the data by remote terminals and controls actuators at remote terminals according to the processed data. The data is passed between terminals using wiki pages. Each terminal consists of an Android terminal and an Arduino board. The mobile terminal can be controlled by a series of commands which is written on a wiki page. The mobile terminal has a data processor and the series of commands may have a program which controls the processor. The mobile terminal can read data from not only the sensors of the terminal but also wiki pages on the Internet. The input data may be processed by the data processor of the terminal. The processed data may be sent to a wiki page. The mobile terminal can control the actuators of the terminal by reading commands on the wiki page or by running the program on the wiki page. This system realizes an open communication forum for not only people but also for machines. 8 0
An Inter-Wiki Page Data Processor for a M2M System Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
IIAI ESKM English September 2013 A data processor, which inputs data from wiki pages, processes the data, and outputs the processed data on a wiki page, is proposed. This data processor is designed for a Machine-to-Machine (M2M) system, which uses Arduino, Android, and Wiki software. This processor is controlled by the program which is written on a wiki page. This M2M system consists of mobile terminals and web sites with wiki software. A mobile terminal of the system consists of an Android terminal and it may have an Arduino board with sensors and actuators. The mobile terminal can read data from not only the sensors in the Arduino board but also wiki pages on the Internet. The input data may be processed by the data processor of this paper. The processed data may be sent to a wiki page. The mobile terminal can control the actuators of the Arduino board by reading commands on the wiki page or by running the program of the processor. This system realizes an open communication forum for not only people but also for machines. 2 0
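The read → process → write cycle of such an inter-wiki-page data processor can be sketched as follows; the page content, the one-number-per-line format, and the summary text produced are illustrative assumptions rather than the system's actual conventions:

```python
def parse_readings(wikitext):
    """Extract one float per line, ignoring anything that is not a number."""
    readings = []
    for line in wikitext.splitlines():
        try:
            readings.append(float(line.strip()))
        except ValueError:
            continue  # headings, markup, and notes are skipped
    return readings

def process(wikitext):
    """Summarize the readings as a line the terminal would write to a wiki page."""
    values = parse_readings(wikitext)
    avg = sum(values) / len(values) if values else float("nan")
    return f"average = {avg:.2f} (n = {len(values)})"

# Canned sensor page standing in for a fetched wiki page:
sensor_page = "21.5\n22.0\n== notes ==\n23.5\n"
print(process(sensor_page))  # → average = 22.33 (n = 3)
```

In the described system, the program driving this processing is itself stored on a wiki page, so the pipeline can be changed remotely by editing that page.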
Jointly They Edit: Examining the Impact of Community Identification on Political Interaction in Wikipedia Jessica J. Neff
David Laniado
Karolin E. Kappler
Yana Volkovich
Pablo Aragón
Andreas Kaltenbrunner
Online encyclopedias
Political parties
Social research
Political theory
Qualitative analysis
Social theory
PLOS ONE English 3 April 2013 Background

In their 2005 study, Adamic and Glance coined the memorable phrase ‘divided they blog’, referring to a trend of cyberbalkanization in the political blogosphere, with liberal and conservative blogs tending to link to other blogs with a similar political slant, and not to one another. As political discussion and activity increasingly moves online, the power of framing political discourses is shifting from mass media to social media.

Methodology/Principal Findings

Continued examination of political interactions online is critical, and we extend this line of research by examining the activities of political users within the Wikipedia community. First, we examined how users in Wikipedia choose to display their political affiliation. Next, we analyzed the patterns of cross-party interaction and community participation among those users proclaiming a political affiliation. In contrast to previous analyses of other social media, we did not find strong trends indicating a preference to interact with members of the same political party within the Wikipedia community.


Conclusions/Significance

Our results indicate that users who proclaim their political affiliation within the community tend to proclaim their identity as a ‘Wikipedian’ even more loudly. It seems that the shared identity of ‘being Wikipedian’ may be strong enough to triumph over other potentially divisive facets of personal identity, such as political affiliation.
0 0
A Malicious Bot Capturing System using a Beneficial Bot and Wiki Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Information security
Network analysis
Journal of Information Processing English February 2013 Locating malicious bots in a large network is problematic because the internal firewalls and network address translation (NAT) routers of the network unintentionally help hide the bots' host addresses and malicious packets. However, eliminating firewalls and NAT routers merely to locate bots is generally not acceptable. In the present paper, we propose an easy-to-deploy, easy-to-manage network security control system for locating a malicious host behind internal secure gateways. The proposed network security control system consists of a remote security device and a command server. The remote security device is installed as a transparent link (implemented as an L2 switch) between the subnet and its gateway in order to detect a host that has been compromised by a malicious bot in a target subnet, while minimizing the impact of deployment. The security device is controlled remotely by 'polling' the command server in order to eliminate the NAT traversal problem and to be firewall friendly. Since the remote security device acts as a transparent, remotely controlled, robust security gateway, we regard this device as a beneficial bot. We adopt a web server with wiki software as the command server in order to take advantage of its power of customization, ease of use, and ease of deployment. 5 2
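The detection step performed by the transparent security device might, in a toy version, amount to flagging internal hosts that contact known command-and-control addresses seen on the mirrored link. The blacklist, addresses, and flow representation below are entirely made up for illustration; the paper's device works at L2 on real traffic:

```python
# Documentation-only IP ranges stand in for a real C&C blacklist.
CNC_BLACKLIST = {"203.0.113.9", "198.51.100.77"}

def flag_compromised(flows):
    """flows: iterable of (src_ip, dst_ip) pairs observed on the uplink.
    Returns the sorted internal hosts that contacted a blacklisted address."""
    return sorted({src for src, dst in flows if dst in CNC_BLACKLIST})

flows = [("192.168.1.10", "93.184.216.34"),   # normal traffic
         ("192.168.1.23", "203.0.113.9"),     # bot contacting its C&C server
         ("192.168.1.23", "198.51.100.77")]
print(flag_compromised(flows))  # → ['192.168.1.23']
```

The beneficial bot would then report this list upstream on its next outbound poll of the command server, which is why no inbound hole through the NAT router or firewall is needed.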
A Wikipédia como diálogo entre universidade e sociedade: uma experiência em extensão universitária Juliana Bastos Marques
Otavio Saraiva Louvem
Anais do XIX Workshop de Informática na Escola Portuguese
2013 Resumo.

O artigo apresenta uma experiência no trabalho com o uso crítico e edição de artigos da Wikipédia lusófona no ambiente universitário, em atividades de extensão, realizado na Universidade Federal do Estado do Rio de Janeiro (Unirio) em 2012. Foram realizados diferentes tipos de atividades, desde workshops de 4h até cursos de maior duração, tanto para o público adulto geral quanto para universitários segmentados por área de estudo. O objetivo do trabalho foi exercitar competências críticas de leitura e produção de textos de divulgação, trazendo e adaptando para o usuário da Wikipédia conhecimentos ensinados em nível de graduação e pós-graduação.


The paper presents an experience with the critical reading and editing of Portuguese Wikipedia articles in the university, in extension activities conducted at the Federal University of the State of Rio de Janeiro (Unirio) in 2012. Different types of activities were offered, from 4-hour workshops to longer courses, both for the general adult public and for university students segmented by field of study. The goal of the activities was to exercise critical proficiency in reading and writing, offering and adapting for the regular Wikipedia user academic knowledge produced at undergraduate and graduate levels.
4 0
Analyzing and Predicting Quality Flaws in User-generated Content: The Case of Wikipedia Maik Anderka Information quality
Quality Flaws
Quality Flaw Prediction
Bauhaus-Universität Weimar, Germany English 2013 Web applications that are based on user-generated content are often criticized for containing low-quality information; a popular example is the online encyclopedia Wikipedia. The major points of criticism pertain to the accuracy, neutrality, and reliability of information. The identification of low-quality information is an important task, since for a huge number of people around the world it has become a habit to visit Wikipedia first in case of an information need. Existing research on quality assessment in Wikipedia either investigates only small samples of articles or else deals with the classification of content into high quality or low quality. This thesis goes further: it targets the investigation of quality flaws, thus providing specific indications of the respects in which low-quality content needs improvement. The original contributions of this thesis, which relate to the fields of user-generated content analysis, data mining, and machine learning, can be summarized as follows:

(1) We propose the investigation of quality flaws in Wikipedia based on user-defined cleanup tags. Cleanup tags are commonly used in the Wikipedia community to tag content that has some shortcomings. Our approach is based on the hypothesis that each cleanup tag defines a particular quality flaw.

(2) We provide the first comprehensive breakdown of Wikipedia's quality flaw structure. We present a flaw organization schema, and we conduct an extensive exploratory data analysis which reveals (a) the flaws that actually exist, (b) the distribution of flaws in Wikipedia, and, (c) the extent of flawed content.

(3) We present the first breakdown of Wikipedia's quality flaw evolution. We consider the entire history of the English Wikipedia from 2001 to 2012, which comprises more than 508 million page revisions, summing up to 7.9 TB. Our analysis reveals (a) how the incidence and the extent of flaws have evolved, and, (b) how the handling and the perception of flaws have changed over time.

(4) We are the first who operationalize an algorithmic prediction of quality flaws in Wikipedia. We cast quality flaw prediction as a one-class classification problem, develop a tailored quality flaw model, and employ a dedicated one-class machine learning approach. A comprehensive evaluation based on human-labeled Wikipedia articles underlines the practical applicability of our approach.
0 0
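Contribution (4) above casts flaw prediction as one-class classification: only positive examples (articles carrying a cleanup tag) are available at training time. A deliberately minimal sketch of that setup, with toy features and a centroid-plus-radius model standing in for the thesis's tailored flaw model:

```python
import math

def fit_one_class(positives):
    """Fit a centroid and covering radius on flawed training articles only.
    Returns a predicate: True means 'looks like it carries the flaw'."""
    dims = len(positives[0])
    center = [sum(x[d] for x in positives) / len(positives) for d in range(dims)]
    dist = lambda x: math.dist(x, center)
    radius = max(dist(x) for x in positives)
    return lambda x: dist(x) <= radius

# Toy features: (share of unreferenced sentences, cleanup templates per section)
flawed = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15)]
predict = fit_one_class(flawed)
print(predict((0.88, 0.12)), predict((0.1, 0.9)))  # → True False
```

The key property mirrored here is that no negative (flaw-free) training data is ever consulted, which is what distinguishes the one-class formulation from ordinary binary classification.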
Enseigner la révision à l'ère des wikis ou là où on trouve la technologie alors qu'on ne l'attendait pas Louise Brunette
Gagnon
Enseignement de la révision
Enseignement de la traduction
Revisors training
Translators training
JoSTrans, no 1, 2013 In academic teaching, there are very few experiences of collaborative wiki revision. In a Quebec university, we experimented with a wiki revision activity with translation students in their third and final year. We specifically chose to revise texts in Wikipedia because its environment shares similarities with the labor market in the language industry and because we believed that the wiki allowed us to achieve the overall objectives of the revision class, as we define them. Throughout the experience, we monitored the progress of the students' revision interventions on Wikipedia texts as well as the exchanges taking place between revisees and revisers. All our research observations were made possible by the convoluted but systematic structure of Wikipedia. Here, we report on the experiment at the Université du Québec en Outaouais and let our academic teaching readers decide whether the exercise is right for them. For us, it was convincing. RÉSUMÉ Dans l’enseignement universitaire, on dénombre très peu d’expériences de révision sur des wikis. Dans une université du Québec, nous nous sommes lancées dans une activité de révision wiki avec des étudiants de traduction en classe de terminale, soit en troisième année. Nous avons opté pour la révision d’un texte de Wikipédia en raison, entre autres, des similitudes de l’expérience avec le marché du travail et parce que nous croyions que le wiki assurait l’atteinte des objectifs généraux des cours de révision, tels que nous les définissons. Tout au cours de l’exercice, nous avons surveillé le progrès des révisions, les interventions des étudiants sur les textes de même que les échanges entre révisés et réviseurs. Toutes ces observations sont rendues possibles par la structure, alambiquée, mais systématique de Wikipédia. Nous livrons nos réflexions sur l’expérience menée à l’Université du Québec en Outaouais et laissons à nos lecteurs enseignants le soin de décider si l’exercice leur convient.
Pour nous, il a été convaincant. 0 0
Erfolgsfaktoren von Social Media: Wie "funktionieren" Wikis? Florian L. Mayer Wiki
Organizational Communication
Online collaboration
Otto-Friedrich-Universität Bamberg German 2013 When are wikis, or more generally social media, successful? When they are communicatively "alive"! This "communicative success" rests on structural principles that this work makes visible. It describes concrete structures of attention, motivation, and organization, and thus makes comprehensible both the success of flagships such as Wikipedia or Facebook and the difficulties of deploying social media in organizations and groups. With the concepts of microcommunication and microcollaboration, it also provides a description of new forms of societal communication. 0 0
Is Wikipedia a Relevant Model for E-Learning? Pierre-Carl Langlais E-learning
Collaborative learning
Social constructivism
English 2013 This article gives a global appraisal of wiki-based pedagogic projects. The growing influence of Wikipedia on students' research practices has made wikis a promising area for educational research.

A compilation of data published by 30 previous academic case studies reveals several recurrent features. Wikis are not so easily adopted: most wiki learning programs begin with a slow initial phase, marked by a general unwillingness to adapt to an unusual environment. Some sociological factors, like age and, less clearly, gender, may contribute to this initial reluctance.

In spite of this uneasiness, wikis proved to be valuable tools in one major respect: they give a vivid representation of scientific communities. Students become acquainted with valuable epistemic practices and norms, such as collaborative work and critical thought. While wikis do not significantly improve the memorization of information, they clearly enhance research abilities.

This literature review can assist teachers in determining whether the use of wikis fits their pedagogic aims.
10 0
Making Peripheral Participation Legitimate: Reader Engagement Experiments in Wikipedia Aaron Halfaker
Oliver Keyes
Dario Taraborelli
Social learning
Legitimate peripheral participation
Open production
Computer-Supported Cooperative Work English 2013 Open collaboration communities thrive when participation is plentiful. Recent research has shown that the English Wikipedia community has constructed a vast and accurate information resource primarily through the monumental effort of a relatively small number of active, volunteer editors. Beyond Wikipedia's active editor community is a substantially larger pool of potential participants: readers. In this paper we describe a set of field experiments using the Article Feedback Tool, a system designed to elicit lightweight contributions from Wikipedia's readers. Through the lens of social learning theory and comparisons to related work in open bug tracking software, we evaluate the costs and benefits of the expanded participation model and show both qualitatively and quantitatively that peripheral contributors add value to an open collaboration community as long as the cost of identifying low quality contributions remains low. 8 0
Students’ Digital Strategies and Shortcuts – Searching for Answers on Wikipedia as a Core Literacy Practice in Upper Secondary School Marte Blikstad-Balas & Rita Hvistendahl Assessment
Digital literacy
School tasks
(Nordic Journal of Digital Literacy. 2013, issue 01/02:32-48) ISSN: 1891-943X 2013 ABSTRACT: When the classroom is connected to the Internet, the number of possible sources of information is almost infinite. Nevertheless, students tend to systematically favor the online encyclopedia Wikipedia as a source of knowledge. The present study combines quantitative and qualitative data to investigate the role Wikipedia plays in the literacy practices of students working on school tasks. It also discusses how different tasks lead to different strategies. 0 0
The Rise and Decline of an Open Collaboration System: How Wikipedia's Reaction to Sudden Popularity is Causing its Decline Aaron Halfaker
R. Stuart Geiger
Jonathan Morgan
John T. Riedl
American Behavioral Scientist English 2013 Open collaboration systems like Wikipedia need to maintain a pool of volunteer contributors in order to remain relevant. Wikipedia was created through a tremendous number of contributions by millions of contributors. However, recent research has shown that the number of active contributors in Wikipedia has been declining steadily for years, and suggests that a sharp decline in the retention of newcomers is the cause. This paper presents data that show that several changes the Wikipedia community made to manage quality and consistency in the face of a massive growth in participation have ironically crippled the very growth they were designed to manage. Specifically, the restrictiveness of the encyclopedia's primary quality control mechanism and the algorithmic tools used to reject contributions are implicated as key causes of decreased newcomer retention. Further, the community's formal mechanisms for norm articulation are shown to have calcified against changes – especially changes proposed by newer editors. 22 0
The effects of perceived anonymity and anonymity states on conformity and groupthink in online communities: A Wikipedia study Michail Tsikerdekis Groupthink
Journal of the American Society for Information Science and Technology. 64(5), 1001–1015. DOI: 10.1002/asi.22795 2013 Groupthink behavior is always a risk in online groups and group decision support systems (GDSS), especially when not all potential alternatives for problem resolution are considered. It becomes a reality when individuals simply conform to the majority opinion and hesitate to suggest their own solutions to a problem. Anonymity has long been established to have an effect on conformity, but no previous research has explored the effects of different anonymity states in relation to an individual's likelihood to conform. Through a survey of randomly chosen participants from the English-language Wikipedia community, I explored the effects of anonymity on the likelihood of conforming to group opinion. In addition, I differentiated between actual states of anonymity and individuals' perceptions of anonymity. My findings indicate that although people perceive anonymity differently depending on their anonymity state, different states of anonymity do not have a strong effect on the likelihood of conforming to group opinion. Based on this evidence, I make recommendations for software engineers who have a direct hand in the design of online community platforms. 0 0
Trabalhando com a história romana na Wikipédia: uma experiência em conhecimento colaborativo na universidade Juliana Bastos Marques Wikipédia; educação e internet; conhecimento colaborativo História Hoje (ANPUH) Portuguese
2013 Resumo

Em razão de sua imensa popularidade, a Wikipédia é hoje uma fonte inescapável, ainda que muito do seu conteúdo em História se apresente fraco, impróprio ou mesmo errôneo. Este texto visa apresentar os resultados de uma experiência didática com a leitura crítica e edição de artigos da Wikipédia em sala de aula em uma disciplina de História Antiga na graduação.


Nowadays, due to its tremendous popularity, Wikipedia is an unavoidable source, even though its History content may still be weak, improper, or even erroneous. This text aims to present the results of a learning experience with critical reading and editing of Wikipedia articles in the classroom, in an Ancient History undergraduate course. Keywords: Wikipedia; education and internet; collaborative knowledge.
1 0
Wikis and Collaborative Writing Applications in Health Care: A Scoping Review Patrick M Archambault
Tom H van de Belt
Francisco J Grajales
Marjan J Faber
Craig E Kuziemsky
Susie Gagnon
Andrea Bilodeau
Simon Rioux
Willianne LDM Nelen
Marie-Pierre Gagnon
Alexis F Turgeon
Karine Aubin
Irving Gold
Julien Poitras
Gunther Eysenbach
Jan AM Kremer
France Légaré
Collaborative writing applications; collaborative authoring; knowledge management; crowdsourcing; medical informatics; ehealth; Internet; Wiki; Wikipedia; Google Docs; Google Knol; Web 2.0; knowledge translation; evidence-based medicine; participatory med (J Med Internet Res 2013;15(10):e210) doi:10.2196/jmir.2787 2013 Background: Collaborative writing applications (eg, wikis and Google Documents) hold the potential to improve the use of evidence in both public health and health care. The rapid rise in their use has created the need for a systematic synthesis of the evidence of their impact as knowledge translation (KT) tools in the health care sector and for an inventory of the factors that affect their use. Objective: Through the Levac six-stage methodology, a scoping review was undertaken to explore the depth and breadth of evidence about the effective, safe, and ethical use of wikis and collaborative writing applications (CWAs) in health care. Methods: Multiple strategies were used to locate studies. Seven scientific databases and 6 grey literature sources were queried for articles on wikis and CWAs published between 2001 and September 16, 2011. In total, 4436 citations and 1921 grey literature items were screened. Two reviewers independently reviewed citations, selected eligible studies, and extracted data using a standardized form. We included any paper presenting qualitative or quantitative empirical evidence concerning health care and CWAs. We defined a CWA as any technology that enables the joint and simultaneous editing of a webpage or an online document by many end users. We performed qualitative content analysis to identify the factors that affect the use of CWAs using the Gagnon framework and their effects on health care using the Donabedian framework. 
Results: Of the 111 studies included, 4 were experimental, 5 quasi-experimental, 5 observational, 52 case studies, 23 surveys about wiki use, and 22 descriptive studies about the quality of information in wikis. We classified them by theme: patterns of use of CWAs (n=26), quality of information in existing CWAs (n=25), and CWAs as KT tools (n=73). A high prevalence of CWA use (ie, more than 50%) is reported in 58% (7/12) of surveys conducted with health care professionals and students. However, we found only one longitudinal study showing that CWA use is increasing in health care. Moreover, contribution rates remain low and the quality of information contained in different CWAs needs improvement. We identified 48 barriers and 91 facilitators in 4 major themes (factors related to the CWA, users’ knowledge and attitude towards CWAs, human environment, and organizational environment). We also found 57 positive and 23 negative effects that we classified into processes and outcomes. Conclusions: Although we found some experimental and quasi-experimental studies of the effectiveness and safety of CWAs as educational and KT interventions, the vast majority of included studies were observational case studies about CWAs being used by health professionals and patients. More primary research is needed to find ways to address the different barriers to their use and to make these applications more useful for different stakeholders. 0 0
Mass Collaboration or Mass Amateurism? A comparative study on the quality of scientific information produced using Wiki tools and concepts Fernando Rodrigues Mass Collaboration
Collective intelligence
Information Systems
Data Quality
Encyclopaedia Britannica
Universidade Évora Portuguese December 2012 With this PhD dissertation, we intend to contribute to a better understanding of the Wiki phenomenon as a knowledge management system which aggregates private knowledge. We also wish to check to what extent information generated through anonymous and freely bestowed mass collaboration is reliable compared with the traditional approach.

In order to achieve that goal, we developed a comparative study between Wikipedia and Encyclopaedia Britannica with regard to the accuracy, depth, and detail of information in both, in order to compare the quality of the knowledge repositories they produce. This allows us to reach a conclusion about the efficacy of the business models behind them.

We used a representative random sample composed of the articles that appear in both encyclopedias. Each pair of articles was first reformatted and then graded by an expert in its subject area. At the same time, we collected a small convenience sample consisting only of Management articles. Each pair of these articles was graded by several experts in order to determine the uncertainty associated with having diverse gradings of the same article and to apply it to the evaluations carried out by just one expert. The conclusion was that the average quality of the Wikipedia articles analysed was superior to that of their counterparts, and that this difference was statistically significant.

A survey conducted within academia showed that traditional information sources were used by only a minority as a first approach when seeking information. It also made clear that trust in these sources was considerably greater than trust in information obtained through Wikipedia. This perception of quality, diametrically opposed to the results of the blind evaluation, reinforces the impartiality of the evaluating panel.

Although the chosen sample is representative of the universe under study, the results depend on the evaluators' personal opinions and chosen criteria. This means that the reproducibility of this study's conclusions with a different grading panel cannot be guaranteed. Nevertheless, this is not sufficient reason to reject results obtained through more than five hundred evaluations.

This thesis is thus an attempt to help clarify this topic and to contribute to a better perception of the quality of a tool that is used daily by millions of people, of the mass collaboration that feeds it, and of the collaborative software that supports it.
0 0
WikiPapers, una recopilación colaborativa de literatura sobre wikis usando MediaWiki y su extensión semántica Emilio J. Rodríguez-Posada
Juan Manuel Dodero-Beardo
IV Jornadas Predoctorales de la ESI Spanish December 2012 Researchers' interest in wikis, especially Wikipedia, has grown in recent years. The first edition of WikiSym, a symposium on wikis, was held in 2005, and since then a multitude of congresses, workshops, conferences and competitions have appeared in this area. The study of wikis is an emerging and prolific field. There have been several attempts, though with little success, to compile all the literature about wikis. In this article we present WikiPapers, a collaborative project to compile all the literature about wikis. As of November 2012, more than 1,700 publications and their metadata have been collected, along with documentation on related tools and datasets. 9 0
Capturing malicious bots using a beneficial bot and wiki Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Information security
Vandal bot
SIGUCCS English October 2012 Locating malicious bots in a large network is problematic because its internal firewalls and NAT routers unintentionally help to hide the bots' host addresses and malicious packets. However, eliminating firewalls and NAT routers merely to locate bots is generally not acceptable. In this paper, we propose an easy-to-deploy, easy-to-manage network security controlling system for locating a malicious host behind internal secure gateways. This network security controlling system consists of remote security devices and a command server. Each remote security device is installed as a transparent link (implemented as an L2 switch) between a subnet and its gateway, to detect a host compromised by a malicious bot in the target subnet while minimizing the impact of deployment. The security devices are remote-controlled by 'polling' the command server, in order to eliminate the NAT traversal problem and to be firewall friendly. Since the remote security device is transparent, remote-controlled and robust against security gateways, we regard it as a beneficial bot. We adopt a web server with wiki software as the command server in order to take advantage of its power of customization, ease of use and ease of deployment. 4 1
Military History on the Electronic Frontier: Wikipedia Fights the War of 1812 Richard Jensen War of 1812
Military history
The Journal of Military History October 2012 Wikipedia is written by and for the benefit of highly motivated amateurs within an anarchistic structure resembling the American frontier. Edit wars resemble frontier shootouts, and are handled by kangaroo courts. Military history is one of its strengths, with over 130,000 articles supervised by 700 well-organized volunteers who prevent mischief and work on upgrading quality. Most authors ("editors") rely on free online sources and popular books, and generally ignore historiography and scholarly monographs and articles. The military articles are old-fashioned, with an emphasis on tactics, battles, and technology, and are weak on social and cultural dimensions. This essay examines how the 14,000 word article on the “War of 1812” was worked on by 2,400 different people, with no overall coordinator or plan. Debates raged as the 1812 article attracted over 3,300 comments by 627 of the most active editors. The main dispute was over who won the war. 0 0
A M2M system using Arduino, Android and Wiki Software Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Social network
IIAI ESKM English September 2012 A Machine-to-Machine (M2M) system, which uses Arduino, Android, and wiki software, is discussed. This system consists of mobile terminals and web sites with wiki software. A mobile terminal of the system consists of an Android terminal and an Arduino board with sensors and actuators. The mobile terminal reads data from the sensors on the Arduino board and sends the data to a wiki page. The mobile terminal also reads commands on the wiki page and controls the actuators of the Arduino board. In addition, a wiki page can have a program that reads the page and outputs information such as a graph. This system realizes an open communication forum not only for people but also for machines. 4 3
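The command mechanism this abstract describes (a terminal polling a wiki page and extracting actuator commands) can be sketched as follows. The `cmd:` prefix syntax is a hypothetical assumption for illustration; the paper's actual wiki markup is not given in the abstract.

```python
# Toy wiki page text as a polling terminal might fetch it.
# The "cmd: " prefix is a hypothetical command syntax, not the paper's.
wiki_page = """
== Sensor commands ==
cmd: led on
cmd: read temperature
Ordinary page text that the terminal ignores.
"""

def parse_commands(text: str) -> list[str]:
    """Extract embedded actuator commands from wiki page text."""
    return [line[len("cmd: "):].strip()
            for line in text.splitlines()
            if line.startswith("cmd: ")]

print(parse_commands(wiki_page))  # → ['led on', 'read temperature']
```

In the real system the terminal would fetch the page over HTTP on a timer and forward each command to the Arduino board over serial; the parsing step above is the wiki-specific part.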
Wikipédia, espace fluide, espace à parcourir Rémi Mathis Wikipedia La Revue de la BNU French September 2012 Wikipédia est un espace foncièrement décentré : qui existe en plus de 280 langues, où les auteurs se comptent en centaines de milliers, qui évolue sans cesse pour coller au dernier état du savoir. Afin de faciliter la navigation, des portes d'entrée sont créées et des outils permettent de structurer cet espace. L'idée n'est toutefois pas d'imposer un parcours mais bien au contraire de favoriser la fluidité de la lecture, par des itinéraires sans cesse réinventés par les lecteurs - tendant à enrichir son expérience de découverte et l'amener vers des articles qu'ils n'aurait pas cherché par lui-même. 2 0
Citation needed: The dynamics of referencing in Wikipedia Chih-Chun Chen
Camille Roth
Collaborative system
WikiSym English August 2012 The extent to which a Wikipedia article refers to external sources to substantiate its content can be seen as a measure of its externally invoked authority. We introduce a protocol for characterising the referencing process in the context of general article editing. With a sample of relatively mature articles, we show that referencing does not occur regularly through an article’s lifetime but is associated with periods of more substantial editing, when the article has reached a certain level of maturity (in terms of the number of times it has been revised and its length). References also tend to be contributed by editors who have contributed more frequently and more substantially to an article, suggesting that a subset of more qualified or committed editors may exist for each article. 13 1
Classifying Wikipedia Articles Using Network Motif Counts and Ratios Guangyu Wu
Martin Harrigan
Pádraig Cunningham
Edit Networks
WikiSym English August 2012 Because the production of Wikipedia articles is a collaborative process, the edit network around an article can tell us something about the quality of that article. Articles that have received little attention will have sparse networks; at the other end of the spectrum, articles that are Wikipedia battlegrounds will have very crowded networks. In this paper we evaluate the idea of characterizing edit networks as a vector of motif counts that can be used in clustering and classification. Our objective is not immediately to develop a powerful classifier but to assess the signal present in network motifs. We show that this motif count vector representation is effective for classifying articles on the Wikipedia quality scale. We further show that ratios of motif counts can effectively overcome normalization problems when comparing networks of radically different sizes. 0 0
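The normalization idea in the last sentence of this abstract (comparing networks of radically different sizes via ratios rather than raw motif counts) can be illustrated with a toy sketch. The motif names and counts below are hypothetical, not taken from the paper.

```python
# Hypothetical motif counts for a small and a large edit network
small = {"chain": 10, "star": 5, "triangle": 5}
large = {"chain": 1000, "star": 500, "triangle": 500}

def motif_ratios(counts: dict[str, int]) -> dict[str, float]:
    """Normalise raw motif counts into proportions of all motifs found."""
    total = sum(counts.values())
    return {motif: count / total for motif, count in counts.items()}

# Raw counts differ by two orders of magnitude, but the ratio vectors
# coincide, so the two networks compare as structurally alike.
print(motif_ratios(small) == motif_ratios(large))  # → True
```

A classifier fed the ratio vectors sees size-invariant features, which is the property the paper exploits when placing articles on the quality scale.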
Design for Free Learning - a Case Study on Supporting a Service Design Course Teresa Consiglio
Gerrit C. van der Veer
Experience report
Open source
Cultural diversity
E- learning
Service design
Learner centered design
WikiSym August 2012 In this experience report, we provide a case study on the use of information and communication technology (ICT) in higher education, developing an open source interactive learning environment to support a blended course. Our aim is to improve the quality of adult distance learning, ultimately involving peers worldwide, by developing learning environments that are as flexible as possible regardless of the culture and context of use and of the individual learning style and age of the learners.

Our example concerns a course of Service Design where the teacher was physically present only intermittently for part of the course while in the remaining time students worked in teams using our online learning environment.

We developed a structure where students are guided through discovery learning and mutual teaching. We will show how we started from the students’ authentic goals and how we supported them by a simple structure of pacing the discovery process and merging theoretical understanding with practice in real life.

Based on these first empirical results, practical guidelines have been developed regarding improvements to the structure provided for the learning material and to the interaction facilities for students, teachers and instructional designers.
0 0
Etiquette in Wikipedia: Weening New Editors into Productive Ones Ryan Faulkner
Steven Walling
Maryana Pinchuk
WikiSym English August 2012 Currently, the greatest challenge faced by the Wikipedia community involves reversing the decline of active editors on the site – in other words, ensuring that the encyclopedia’s contributors remain sufficiently numerous to fill the roles that keep it relevant. Due to the natural drop-off of old contributors, newcomers must constantly be socialized, trained and retained. However, recent research has shown that the Wikipedia community is failing to retain a large proportion of productive new contributors, and implicates Wikipedia’s semi-automated quality control mechanisms and their interactions with these newcomers as an exacerbating factor. This paper evaluates the effectiveness of minor changes to the normative warning messages sent to newcomers from one of the most prolific of these quality control tools (Huggle) in preserving their rate of contribution. The experimental results suggest that substantial gains in newcomer participation can be attained through inexpensive changes to the wording of the first normative message that new contributors receive. 0 1
How Long Do Wikipedia Editors Keep Active? Dell Zhang
Karl Prior
Mark Levene
Social Media
User Modelling
Behaviour Mining
Survival Analysis
WikiSym English August 2012 In this paper, we use the technique of survival analysis to investigate how long Wikipedia editors remain active in editing. Our results show that although the survival function of occasional editors roughly follows a lognormal distribution, the survival function of customary editors is better described by a Weibull distribution (with a median lifetime of about 53 days). Furthermore, for customary editors, there are two critical phases (0-2 weeks and 8-20 weeks) when the hazard rate of becoming inactive increases. Finally, customary editors who are more active in editing are likely to remain active for a longer time. 0 0
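The Weibull fit described in this abstract can be sketched as below. The simulated lifetimes and parameters are hypothetical, not the paper's data, and the sketch ignores right-censoring (editors still active at the end of observation), which a full survival analysis would handle.

```python
import numpy as np
from scipy.stats import weibull_min

# Simulated editor lifetimes in days (hypothetical, not the paper's data)
rng = np.random.default_rng(42)
lifetimes = weibull_min.rvs(0.7, scale=70, size=2000, random_state=rng)

# Fit a Weibull distribution with the location fixed at zero,
# then read off the fitted median lifetime
shape, loc, scale = weibull_min.fit(lifetimes, floc=0)
median = weibull_min.median(shape, loc=loc, scale=scale)
print(f"shape={shape:.2f} scale={scale:.1f} median={median:.1f} days")
```

A shape parameter below 1 means a decreasing hazard rate overall, i.e. the longer an editor has already stayed, the less likely they are to quit in the next interval, which is consistent with the paper's observation of early critical phases.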
Identifying controversial articles in Wikipedia: A comparative study Hoda Sepehri Rad
Denilson Barbosa
WikiSym English August 2012 Wikipedia articles are the result of the collaborative editing of a diverse group of anonymous volunteer editors, who are passionate and knowledgeable about specific topics. One can argue that this plurality of perspectives leads to broader coverage of the topic, thus benefitting the reader. On the other hand, differences among editors on polarizing topics can lead to controversial or questionable content, where facts and arguments are presented and discussed to support a particular point of view. Controversial articles are manually tagged by Wikipedia editors, and span many interesting and popular topics, such as religion, history, and politics, to name a few. Recent works have been proposed on automatically identifying controversy within unmarked articles. However, to date, no systematic comparison of these efforts has been made. This is in part because the various methods are evaluated using different criteria and on different sets of articles by different authors, making it hard for anyone to verify the efficacy and compare all alternatives. We provide a first attempt at bridging this gap. We compare five different methods for modelling and identifying controversy, and discuss some of the unique difficulties and opportunities inherent to the way Wikipedia is produced. 0 0
In Search of the Ur-Wikipedia: Universality, Similarity, and Translation in the Wikipedia Inter-Language Link Network Morten Warncke-Wang
Anuradha Uduwage
Zhenhua Dong
John Riedl
Tobler's Law
First Law of Geography
WikiSym English August 2012 Wikipedia has become one of the primary encyclopaedic information repositories on the World Wide Web. It started in 2001 with a single edition in the English language and has since expanded to more than 20 million articles in 283 languages. Criss-crossing between the Wikipedias is an interlanguage link network, connecting the articles of one edition of Wikipedia to another. We describe characteristics of articles covered by nearly all Wikipedias and those covered by only a single language edition, we use the network to understand how we can judge the similarity between Wikipedias based on concept coverage, and we investigate the flow of translation between a selection of the larger Wikipedias. Our findings indicate that the relationships between Wikipedia editions follow Tobler's first law of geography: similarity decreases with increasing distance. The number of articles in a Wikipedia edition is found to be the strongest predictor of similarity, while language similarity also appears to have an influence. The English Wikipedia edition is by far the primary source of translations. We discuss the impact of these results for Wikipedia as well as user-generated content communities in general. 0 0
Manypedia: Comparing Language Points of View of Wikipedia Communities Paolo Massa
Federico Scrinzi
Cross-cultural comparison
Linguistic Point of View
Automatic translation
Web tool
Open source
WikiSym English August 2012 The 4 million articles of the English Wikipedia have been written in a collaborative fashion by more than 16 million volunteer editors. On each article, the community of editors strives to reach a neutral point of view, representing all significant views fairly, proportionately, and without bias. However, besides the English edition, there are more than 280 editions of Wikipedia in different languages, and their relatively isolated communities of editors are not forced by the platform to discuss and negotiate their points of view. So the empirical question is: do communities on different language Wikipedias develop their own diverse Linguistic Points of View (LPOV)? To answer this question we created and released as open source Manypedia, a web tool whose aim is to facilitate cross-cultural analysis of Wikipedia language communities by providing an easy way to compare automatically translated versions of their different representations of the same topic. 0 0
Mutual Evaluation of Editors and Texts for Assessing Quality of Wikipedia Articles Yu Suzuki
Masatoshi Yoshikawa
Peer review
Edit history
Link analysis
WikiSym English August 2012 In this paper, we propose a method to identify good quality Wikipedia articles by mutually evaluating editors and texts. A major approach to assessing article quality is based on the text survival ratio: when a text survives multiple edits, it is assessed as good quality. This approach assumes that poor quality texts have a high probability of being deleted by editors. However, vandals frequently delete good quality texts, so the survival ratios of good quality texts are improperly decreased, and many good quality texts are unfairly assessed as poor quality. In our method, we take editor quality into account when calculating text quality, decreasing the impact that low-quality vandals have on text quality. This improvement should increase the accuracy of the text quality assessment. An inherent problem with this idea, however, is that editor qualities are in turn calculated from text qualities. To solve this, we calculate the editor and text qualities mutually until they converge. We performed an experimental evaluation and confirmed that the proposed method can accurately assess text quality. 0 0
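The mutual calculation this abstract describes (text quality from editor quality and vice versa, iterated until convergence) can be sketched with toy data. The specific weighting scheme below is an assumption for illustration, not the authors' formula.

```python
# Toy edit history: text fragment -> (author, survived subsequent edits?)
fragments = {
    "t1": ("alice", True),
    "t2": ("alice", True),
    "t3": ("mallory", False),
    "t4": ("mallory", False),
}

editor_q = {author: 1.0 for author, _ in fragments.values()}

for _ in range(30):  # iterate until the two quality measures settle
    # A fragment's quality: survival signal weighted by its author's quality
    text_q = {t: editor_q[author] * (1.0 if survived else 0.1)
              for t, (author, survived) in fragments.items()}
    # An editor's quality: mean quality of the fragments they authored
    editor_q = {
        e: sum(q for t, q in text_q.items() if fragments[t][0] == e)
           / sum(1 for t in fragments if fragments[t][0] == e)
        for e in editor_q
    }
    # Rescale so the best editor has quality 1.0
    top = max(editor_q.values())
    editor_q = {e: q / top for e, q in editor_q.items()}

print(editor_q)
```

Under this scheme the editor whose text keeps surviving converges to quality 1.0 while the editor whose text keeps being removed decays toward 0, so a deletion by the low-quality editor would barely dent a fragment's quality, which is the vandal-resistance property the paper aims for.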
Natural Language Processing for MediaWiki: The Semantic Assistants Approach Bahar Sateli
René Witte
WikiSym English August 2012 We present a novel architecture for the integration of Natural Language Processing (NLP) capabilities into wiki systems. The vision is that of a new generation of wikis that can help developing their own primary content and organize their structure by using state-of-the-art technologies from the NLP and Semantic Computing domains. The motivation for this integration is to enable wiki users – novice or expert – to benefit from modern text mining techniques directly within their wiki environment. We implemented these ideas based on MediaWiki and present a number of real-world application case studies that illustrate the practicability and effectiveness of this approach. 0 0
Psychological processes underlying Wikipedia representations of natural and manmade disasters Michela Ferron
Paolo Massa
Collective memory
Traumatic event
Man-made disasters
Natural disasters
Automated content analysis techniques
WikiSym English August 2012 Collective memories are precious resources for society, because they help strengthen emotional bonds between community members, maintain group cohesion, and direct future behavior. Studying how people form their collective memories of emotional upheavals is important in order to better understand people's reactions and the consequences for their psychological health. Previous research investigated the effects of single traumatizing events, but few studies have compared different types of traumatic events such as natural and man-made disasters. In this paper, interpreting Wikipedia as a collective memory place, we compare articles about natural and human-made disasters employing automated natural language techniques, in order to highlight the different psychological processes underlying users' sensemaking activities. 0 1
Staying in the Loop: Structure and Dynamics of Wikipedia's Breaking News Collaborations Brian Keegan
Darren Gergle
Noshir Contractor
High-tempo collaboration
Network analysis
Breaking news
WikiSym English August 2012 Despite the fact that Wikipedia articles about current events are more popular and attract more contributions than typical articles, canonical studies of Wikipedia have only analyzed articles about pre-existing information. We expect the co-authoring of articles about breaking news incidents to exhibit high-tempo coordination dynamics which are not found in articles about historical events and information. Using 1.03 million revisions made by 158,384 users to 3,233 English Wikipedia articles about disasters, catastrophes, and conflicts since 1990, we construct “article trajectories” of editor interactions as they coauthor an article. Examining a subset of this corpus, our analysis demonstrates that articles about current events exhibit structures and dynamics distinct from those observed among articles about non-breaking events. These findings have implications for how collective intelligence systems can be leveraged to process and make sense of complex information. 0 0
Towards Content-driven Reputation for Collaborative Code Repositories Andrew G. West
Insup Lee
Code repository
Trust management
Content persistence
Code quality
WikiSym English August 2012 As evidenced by SourceForge and GitHub, code repositories now integrate Web 2.0 functionality that enables global participation with minimal barriers to entry. To prevent detrimental contributions enabled by crowdsourcing, reputation is one proposed solution. Fortunately this is an issue that has been addressed in analogous version control systems such as the *wiki* for natural language content. The WikiTrust algorithm ("content-driven reputation"), while developed and evaluated in wiki environments, operates under a possibly shared collaborative assumption: actions that "survive" subsequent edits are reflective of good authorship. In this paper we examine WikiTrust's ability to measure author quality in collaborative code development. We first define a mapping from repositories to wiki environments and use it to evaluate a production SVN repository with 92,000 updates. Analysis is particularly attentive to reputation loss events and attempts to establish ground truth using commit comments and bug tracking. A proof-of-concept evaluation suggests the technique is promising (about two-thirds of reputation loss is justified) with false positives identifying areas for future refinement. Equally important, these false positives exemplify differences in content evolution and the cooperative process between wikis and code repositories. 0 0
Wikipedia Customization through Web Augmentation Techniques Oscar Díaz
Cristóbal Arellano
Gorka Puente
Web Augmentation
WikiSym English August 2012 Wikipedia is a successful example of collaborative knowledge construction. This can be synergistically complemented with personal knowledge construction, whereby individuals are supported in sharing, experimenting with and building information in a more private setting, without the scrutiny of the whole community. Ideally, both approaches should be seamlessly integrated so that wikipedians can easily transit from the public sphere to the private sphere, and vice versa. To this end, we introduce WikiLayer, a plugin for Wikipedia that permits wikipedians to locally supplement Wikipedia articles with their own content (i.e., a layer). Layering additional content is achieved locally by seamlessly interspersing Wikipedia content with custom content. WikiLayer is driven by three main wiki principles: affordability (i.e., if you know how to edit articles, you know how to layer), organic growth (i.e., layers evolve in synchrony with the underlying articles) and shareability (i.e., layers can be shared in confidence through the wikipedian’s social network, e.g., Facebook). The paper provides motivating scenarios for readers, contributors and editors. WikiLayer is available for download at 0 0
Writing up rather than writing down: Becoming Wikipedia Literate Heather Ford
R. Stuart Geiger
New literacies
Educational technology
WikiSym English August 2012 Editing Wikipedia is certainly not as simple as learning the MediaWiki syntax and knowing where the “edit” bar is, but how do we conceptualize the cultural and organizational understandings that make an effective contributor? We draw on work of literacy practitioner and theorist Richard Darville to advocate a multi-faceted theory of literacy that sheds light on what new knowledges and organizational forms are required to improve participation in Wikipedia’s communities. We outline what Darville refers to as the “background knowledges” required to be an empowered, literate member and apply this to the Wikipedia community. Using a series of examples drawn from interviews with new editors and qualitative studies of controversies in Wikipedia, we identify and outline several different literacy asymmetries. 0 1
A Casual Network Security Monitoring System using a Portable Sensor Device and Wiki Software Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Information security
SAINT English July 2012 A casual network security monitoring system is proposed in this paper. The system is easy to deploy without reconfiguring the central network infrastructure, the firewall, and the intrusion detector system (IDS) of an organization. A virus-infected host, which is hidden by the network address translator (NAT) of a sub LAN, can be identified easily by using this monitoring system with the IDS. This monitoring system consists of a portable sensor device and a web site with wiki software. The portable sensor device, which is located on a target LAN that may have virus-infected hosts, is remote-controlled by a network manager's commands. The commands and the results are written on a wiki page. 3 2
Wikipedia de la A a la W Tomás Saorín-Pérez Wikipedia
Wikimedia projects
Editorial UOC Spanish July 2012 Wikipedia is a reality that works, even though in theory it might seem an unattainable dream. A handful of enthusiasts has redefined from scratch the classic concept of the encyclopedia and built the most widely used reference source in history. Is its quality sufficient? The answer is yes, and to justify it one must look closely at the mechanisms it is equipped with, which allow it to reach whatever level of quality is desired by combining the effort of thousands of self-organized volunteer editors. Wikipedia is at once content and people. It is time to get to know it from the inside and to strengthen its commitment to open knowledge from cultural, scientific and educational institutions. Participating in Wikipedia lets one learn from this incredible global laboratory for the social construction of organized information. 0 0
Dynamics of Conflicts in Wikipedia Taha Yasseri
Róbert Sumi
András Rung
András Kornai
János Kertész
PLoS ONE English June 2012 In this work we study the dynamical features of editorial wars in Wikipedia (WP). Based on our previously established algorithm, we build up samples of controversial and peaceful articles and analyze the temporal characteristics of the activity in these samples. On short time scales, we show that there is a clear correspondence between conflict and burstiness of activity patterns, and that memory effects play an important role in controversies. On long time scales, we identify three distinct developmental patterns for the overall behavior of the articles. We are able to distinguish cases eventually leading to consensus from those cases where a compromise is far from achievable. Finally, we analyze discussion networks and conclude that edit wars are mainly fought by few editors only. 44 1
Reverts Revisited: Accurate Revert Detection in Wikipedia Fabian Flöck
Denny Vrandečić
Elena Simperl
Revert detection
Editing behavior
User modeling
Collaboration systems
Community-driven content creation
Social dynamics
Hypertext and Social Media 2012 English June 2012 Wikipedia is commonly used as a proving ground for research in collaborative systems. This is likely due to its popularity and scale, but also to the fact that large amounts of data about its formation and evolution are freely available to inform and validate theories and models of online collaboration. As part of the development of such approaches, revert detection is often performed as an important pre-processing step in tasks as diverse as the extraction of implicit networks of editors, the analysis of edit or editor features and the removal of noise when analyzing the emergence of the content of an article. The current state of the art in revert detection is based on a rather naïve approach, which identifies revision duplicates based on MD5 hash values. This is an efficient, but not very precise technique that forms the basis for the majority of research based on revert relations in Wikipedia. In this paper we prove that this method has a number of important drawbacks - it only detects a limited number of reverts, while simultaneously misclassifying too many edits as reverts, and not distinguishing between complete and partial reverts. This is very likely to hamper the accurate interpretation of the findings of revert-related research. We introduce an improved algorithm for the detection of reverts, based on word tokens added or deleted, that addresses these drawbacks. We report on the results of a user study and other tests demonstrating the considerable gains in accuracy and coverage by our method, and argue for a positive trade-off, in certain research scenarios, between these improvements and our algorithm’s increased runtime. 13 0
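The baseline this abstract critiques, flagging a revision as a revert when its MD5 hash matches an earlier revision's full text, can be sketched on toy data (the revision texts below are illustrative, not Wikipedia data):

```python
import hashlib

def md5(text: str) -> str:
    """MD5 hex digest of a revision's full text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Toy full-text revision history of one article
revisions = ["Rome is a city.",
             "Rome is a city. POPULATION: 3M.",
             "Rome is a city.",          # identical to revision 0 -> a revert
             "Rome is a city, founded long ago."]

# Baseline method: a hash match with any earlier revision marks a revert
hashes = [md5(r) for r in revisions]
reverts = [(j, i)                       # revision i restores revision j
           for i, h in enumerate(hashes)
           for j in range(i)
           if hashes[j] == h]
print(reverts)  # → [(0, 2)]
```

Note that revision 3 removes the population sentence, partially undoing revision 1, yet produces no hash match at all; this blindness to partial reverts is exactly the drawback the paper's token-based algorithm is designed to address.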
Panorama of the wikimediasphere David Gómez-Fontanills WikiMedia
Free community good
Communities of editors
Editorial autonomy
Libre software
Digithum English
May 2012 The term wikimediasphere is proposed to refer to the group of WikiProjects, communities of editors, guidelines and organisations structured around the Wikimedia movement to generate free knowledge that is available to everyone. A description is made of the wikimediasphere, presenting the main projects and their characteristics, and its community, technological, regulatory, social and institutional dimensions are outlined. The wikimediasphere is placed in context and reference is made to its blurred boundaries. An explanation is provided of the role of the communities of editors of each project and their autonomy with respect to each other and to the Wikimedia Foundation. The author concludes by offering a panoramic view of the wikimediasphere. 10 0
The Truth of Wikipedia Nathaniel Tkacz Wikipedia
Neutral point of view
Digithum English
May 2012 What does it mean to assert that Wikipedia has a relation to truth? That there is, despite regular claims to the contrary, an entire apparatus of truth in Wikipedia? In this article, I show that Wikipedia has in fact two distinct relations to truth: one which is well known and forms the basis of existing popular and scholarly commentaries, and another which refers to equally well-known aspects of Wikipedia, but has not been understood in terms of truth. I demonstrate Wikipedia’s dual relation to truth through a close analysis of the Neutral Point of View core content policy (and one of the project’s “Five Pillars”). I conclude by indicating what is at stake in the assertion that Wikipedia has a regime of truth and what bearing this has on existing commentaries. 7 0
Wiki Loves Monuments 2011: the experience in Spain and reflections regarding the diffusion of cultural heritage Emilio J. Rodríguez-Posada
Ángel González Berdasco
Jorge A. Sierra Canduela
Santiago Navarro Sanz
Tomás Saorín-Pérez
Wiki Loves Monuments
Cultural heritage
Image bank
Wikimedia Commons
Libre knowledge
Digithum Spanish
May 2012 Wikipedia came into being in cyberspace. Its early years were marked by asynchronous work by users located all over the world who hardly ever related on a personal level outside the net. With time, some of the volunteers met at what were called wikimeetups, encounters initially aimed at tightening bonds which did not bring about any direct improvement to the project content. Face-to-face initiatives later took place that involved not just volunteers but also cultural entities. The most recent event and the one with the greatest impact was Wiki Loves Monuments 2011, a competition to photograph monuments in 18 European countries, including Spain. The high level of participation led to 160,000 photographs of monuments being taken, with Spain occupying third place in terms of number of photographs. In this paper we explore the origins, implementation, development and results of Wiki Loves Monuments. The success of the 2011 edition and requests from other countries have led to the organization of Wiki Loves Monuments 2012, which will be held at the global level. 3 0
Wikipedia's Role in Reputation Management: An Analysis of the Best and Worst Companies in the USA Marcia W. DiStaso
Marcus Messner
Reputation management
United States
Social media
Digithum English
May 2012 Being considered one of the best companies in the USA is a great honor, but this reputation does not exempt businesses from negativity in the collaboratively edited online encyclopedia Wikipedia. Content analysis of corporate Wikipedia articles for companies with the best and worst reputations in the USA revealed that negative content outweighed positive content irrespective of reputation. This is an important issue because Wikipedia is not only one of the most popular websites in the world, but is also often the first place people look when seeking corporate information. Although there was more content on corporate social responsibility in the entries for the ten companies with the best reputations, this was still overshadowed by content referring to legal issues or scandals. Ultimately, public relations professionals need to regularly monitor and request updates to their corporate Wikipedia articles regardless of what kind of company they work for. 4 0
Spamming for Science: Active Measurement in Web 2.0 Abuse Research Andrew G. West
Pedram Hayati
Vidyasagar Potdar
Insup Lee
WECSR English March 2012 Spam and other electronic abuses have long been a focus of computer security research. However, recent work in the domain has emphasized an *economic analysis* of these operations in the hope of understanding and disrupting the profit model of attackers. Such studies do not lend themselves to passive measurement techniques. Instead, researchers have become middle-men or active participants in spam behaviors, adopting methodologies that lie at an interesting juncture of legal, ethical, and human-subject (e.g., IRB) guidelines. In this work two such experiments serve as case studies: one testing a novel link spam model on Wikipedia and another using blackhat software to target blog comments and forums. Discussion concentrates on the experimental design process, especially as influenced by human-subject policy. Case studies are used to frame related work in the area, and scrutiny reveals that the computer science community requires greater consistency in evaluating research of this nature. 0 0
A Breakdown of Quality Flaws in Wikipedia Maik Anderka
Benno Stein
Quality Flaws
Information quality
User-generated Content Analysis
2nd Joint WICOW/AIRWeb Workshop on Web Quality (WebQuality 12) English 2012 The online encyclopedia Wikipedia is a successful example of the increasing popularity of user generated content on the Web. Despite its success, Wikipedia is often criticized for containing low-quality information, which is mainly attributed to its core policy of being open for editing by everyone. The identification of low-quality information is an important task since Wikipedia has become the primary source of knowledge for a huge number of people around the world. Previous research on quality assessment in Wikipedia either investigates only small samples of articles, or else focuses on single quality aspects, like accuracy or formality. This paper targets the investigation of quality flaws, and presents the first complete breakdown of Wikipedia's quality flaw structure. We conduct an extensive exploratory analysis, which reveals (1) the quality flaws that actually exist, (2) the distribution of flaws in Wikipedia, and (3) the extent of flawed content. An important finding is that more than one in four English Wikipedia articles contains at least one quality flaw, 70% of which concern article verifiability. 0 0
A Jester's Promenade: Citations to Wikipedia in Law Reviews, 2002-2008 Daniel J. Baker Wikipedia
Legal citation
Law reviews
Law journals
Legal writing
I/S: A Journal of Law and Policy for the Information Society 2012 Due to its perceived omniscience and ease-of-use, reliance on the online encyclopedia Wikipedia as a source for information has become pervasive. As a result, scholars and commentators have begun turning their attentions toward this resource and its uses. The main focus of previous writers, however, has been on the use of Wikipedia in the judicial process, whether by litigants relying on Wikipedia in their pleadings or judges relying on it in their decisions. No one, until now, has examined the use of Wikipedia in the legal scholarship context. This article intends to shine a light on the citation aspect of the Wikipedia-as-authority phenomenon by providing detailed statistics on the scope of its use and critiquing or building on the arguments of other commentators. Part II provides an overview of the debate regarding the citation of Wikipedia, beginning with a general discussion on the purposes of citation. In this Part, this article examines why some authors choose to cite to Wikipedia and explains why such citation is nonetheless problematic despite its perceived advantages. A citation analysis performed on works published by nearly 500 American law reviews between 2002 and 2008 is the focus of Part III, from a description of the methodology to an examination of the results of the analysis and any trends that may be discerned from the statistics. Finally, Part IV examines the propriety of citing to Wikipedia, culminating in a call for tighter editorial standards in law reviews. 0 0
A Simple Application Program Interface for Saving Java Program Data on a Wiki Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Advances in Software Engineering English 2012 A simple application program interface (API) for Java programs running on a wiki is implemented experimentally. A Java program with the API can run on a wiki, and the Java program can save its data on the wiki. The system consists of PukiWiki, a popular wiki in Japan, and a plug-in that starts up Java programs. A Java applet with default access privilege cannot save its data at a local host. We have constructed an API of applets for easy and unified data input and output at a remote host. We also combined the proposed API and the wiki system by introducing a wiki tag for starting Java applets. It is easy to introduce new types of applications using the proposed API. We have embedded programs such as a simple text editor, a simple music editor, a simple drawing program, and programming environments in a PukiWiki system using this API. 10 7
A practical approach to language complexity: a Wikipedia case study Taha Yasseri
András Kornai
János Kertész
Submitted to PLoS ONE English 2012 In this paper we present statistical analysis of English texts from Wikipedia (WP). We try to address the issue of language complexity empirically by comparing samples of the main English WP (Main) and the simple English WP (Simple). Simple is supposed to use a more simplified language with a limited vocabulary, and editors are explicitly requested to follow this guideline, yet in practice the vocabulary richness of both samples is at the same level. However, detailed analysis of longer units (n-grams rather than words alone) shows that the language of Simple is indeed less complex than that of Main. Comparing the two language varieties by the Gunning readability index supports this conclusion. We also report on the topical dependence of language complexity, e.g. that the language is more advanced in conceptual articles compared to person-based (biographical) and object-based articles. Finally, we investigate the relation between conflict and language complexity by analysing the content of the talk pages associated with controversial and peacefully developing articles, concluding that controversy has the effect of reducing language complexity. 0 0
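The Gunning readability index used in the abstract above to compare Main and Simple English Wikipedia can be sketched as follows. This is an illustrative implementation with a rough vowel-group syllable heuristic, not the authors' code.

```python
# Illustrative sketch of the Gunning fog readability index. The syllable
# heuristic is a common approximation, not the paper's exact implementation.
import re

def count_syllables(word):
    """Rough syllable count: runs of vowels, with a silent-e adjustment."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def gunning_fog(text):
    """Fog index = 0.4 * (words per sentence + % of complex (3+ syllable) words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))

simple = "The cat sat. The dog ran."
main = "Comprehensive lexicographical analysis demonstrates considerable variability."
assert gunning_fog(simple) < gunning_fog(main)
```

Higher scores indicate harder text, so the paper's finding corresponds to Main scoring above Simple on average.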
Automatic vandalism detection in Wikipedia with active associative classification Maria Sumbana
Goncalves M.A.
Rodrigo Silva
Jussara Almeida
Adriano Veloso
Lecture Notes in Computer Science English 2012 Wikipedia and other free editing services for collaboratively generated content have quickly grown in popularity. However, the lack of editing control has made these services vulnerable to various types of malicious actions such as vandalism. State-of-the-art vandalism detection methods are based on supervised techniques, thus relying on the availability of large and representative training collections. Building such collections, often with the help of crowdsourcing, is very costly due to a natural skew towards very few vandalism examples in the available data as well as dynamic patterns. Aiming at reducing the cost of building such collections, we present a new active sampling technique coupled with an on-demand associative classification algorithm for Wikipedia vandalism detection. We show that our classifier enhanced with a simple undersampling technique for building the training set outperforms state-of-the-art classifiers such as SVMs and kNNs. Furthermore, by applying active sampling, we are able to reduce the need for training by almost 96% with only a small impact on detection results. 0 0
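The "simple undersampling technique for building the training set" mentioned above can be illustrated roughly as below. The labels, ratio, and data are invented for the example, not taken from the paper.

```python
# Hypothetical sketch of undersampling a skewed vandalism corpus: keep all
# rare vandalism examples and down-sample the abundant regular edits.
import random

def undersample(examples, ratio=1.0, seed=42):
    """Keep all minority (vandalism) examples; sample the majority class
    down to `ratio` times the minority size."""
    vandal = [e for e in examples if e["label"] == "vandalism"]
    regular = [e for e in examples if e["label"] == "regular"]
    rng = random.Random(seed)
    keep = rng.sample(regular, min(len(regular), int(ratio * len(vandal))))
    return vandal + keep

edits = [{"label": "vandalism"}] * 30 + [{"label": "regular"}] * 970
balanced = undersample(edits)
assert len(balanced) == 60  # 30 vandalism + 30 sampled regular
```

The balanced set then feeds the classifier, which sees vandalism often enough to learn its signals.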
Biographical Social Networks on Wikipedia: A cross-cultural study of links that made history Pablo Aragón
Andreas Kaltenbrunner
David Laniado
Yana Volkovich
Social network analysis
Cross language studies
WikiSym English 2012 It is arguable whether history is made by great men and women or vice versa, but undoubtedly social connections shape history. Analysing Wikipedia, a global collective memory place, we aim to understand how social links are recorded across cultures. Starting with the set of biographies in the English Wikipedia we focus on the networks of links between these biographical articles on the 15 largest language Wikipedias. We detect the most central characters in these networks and point out culture-related peculiarities. Furthermore, we reveal remarkable similarities between distinct groups of language Wikipedias and highlight the shared knowledge about connections between persons across cultures. 0 0
Dynamics of conflicts in Wikipedia Taha Yasseri
Róbert Sumi
András Rung
András Kornai
János Kertész
Wikipedia
Editorial activity
Editors demography
Circadian patterns
To appear in PLoS ONE English 2012 In this work we study the dynamical features of editorial wars in Wikipedia (WP). Based on our previously established algorithm, we build up samples of controversial and peaceful articles and analyze the temporal characteristics of the activity in these samples. On short time scales, we show that there is a clear correspondence between conflict and burstiness of activity patterns, and that memory effects play an important role in controversies. On long time scales, we identify three distinct developmental patterns for the overall behavior of the articles. We are able to distinguish cases eventually leading to consensus from those cases where a compromise is far from achievable. Finally, we analyze discussion networks and conclude that edit wars are mainly fought by few editors only. 0 1
Emotions and dialogue in a peer-production community: the case of Wikipedia David Laniado
Carlos Castillo
Andreas Kaltenbrunner
Mayo Fuster Morell
Talk page
Gender gap
WikiSym English 2012 This paper presents a large-scale analysis of emotions in conversations among Wikipedia editors. Our focus is on the emotions expressed by editors in talk pages, measured by using the Affective Norms for English Words (ANEW).

We find evidence that to a large extent women tend to participate in discussions with a more positive tone, and that administrators are more positive than non-administrators. Surprisingly, female non-administrators tend to behave like administrators in many aspects.

We observe that replies are on average more positive than the comments they reply to, preventing many discussions from spiralling down into conflict. We also find evidence of emotional homophily: editors having similar emotional styles are more likely to interact with each other.

Our findings offer novel insights into the emotional dimension of interactions in peer-production communities, and contribute to debates on issues such as the flattening of editor growth and the gender gap.
0 0
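A toy illustration of ANEW-style valence scoring of talk-page comments, in the spirit of the measurement described in the entry above. The lexicon values are invented for illustration and differ from the real ANEW norms.

```python
# Score a comment as the mean valence of its lexicon words (ANEW-style).
# The valence values below are made up for this sketch.
ANEW_VALENCE = {"thanks": 8.0, "great": 7.8, "revert": 3.5,
                "vandalism": 2.1, "welcome": 7.5, "dispute": 3.0}

def comment_valence(comment):
    """Average valence over the words found in the lexicon; None if no hits."""
    hits = [ANEW_VALENCE[w] for w in comment.lower().split()
            if w in ANEW_VALENCE]
    return sum(hits) / len(hits) if hits else None

positive = comment_valence("thanks for the welcome")  # (8.0 + 7.5) / 2
negative = comment_valence("revert this vandalism")   # (3.5 + 2.1) / 2
assert positive > negative
```

Averaging such scores per editor group is one simple way to compare, say, the tone of administrators and non-administrators.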
FlawFinder: A Modular System for Predicting Quality Flaws in Wikipedia Oliver Ferschke
Iryna Gurevych
Marc Rittberger
PAN English 2012 With over 23 million articles in 285 languages, Wikipedia is the largest free knowledge base on the web. Due to its open nature, everybody is allowed to access and edit the contents of this huge encyclopedia. As a downside of this open access policy, quality assessment of the content becomes a critical issue and is hardly manageable without computational assistance. In this paper, we present FlawFinder, a modular system for automatically predicting quality flaws in unseen Wikipedia articles. It competed in the inaugural edition of the Quality Flaw Prediction Task at the PAN Challenge 2012 and achieved the best precision of all systems and the second place in terms of recall and F1-score. 0 1
Network Analysis of User Generated Content Quality in Wikipedia Myshkin Ingawale
Amitava Dutta
Rahul Roy
Priya Seetharaman
Network analysis
Social computing
Structural holes
User generated content
Online Information Review 2012 Social media platforms allow near-unfettered creation and exchange of User Generated Content (UGC). We use Wikipedia, which consists of interconnected user generated articles. Drawing from network science, we examine whether high and low quality UGC in Wikipedia differ in their connectivity structures. Using featured articles as a proxy for high quality, we undertake a network analysis of the revision history of six different language Wikipedias to offer a network-centric explanation for the emergence of quality in UGC. The network structure of interactions between articles and contributors plays an important role in the emergence of quality. Specifically, the analysis reveals that high quality articles cluster in hubs that span structural holes. The analysis does not capture the strength of interactions between articles and contributors. The implication of this limitation is that quality is viewed as a binary variable. Extensions to this research will relate strength of interactions to different levels of quality in user generated content. Practical implications: Our findings help harness the ‘wisdom of the crowds’ effectively. Organizations should nurture users and articles at the structural hubs, from an early stage. This can be done through appropriate design of collaborative knowledge systems and development of organizational policies to empower hubs. Originality: The network-centric perspective on quality in UGC and the use of a dynamic modeling tool are novel. The paper is of value to researchers in the area of social computing and to practitioners implementing and maintaining such platforms in organizations. 0 0
On the Evolution of Quality Flaws and the Effectiveness of Cleanup Tags in the English Wikipedia Maik Anderka
Benno Stein
Matthias Busse
Cleanup Tags
Quality Flaws
Information quality
Quality Flaw Evolution
Wikipedia Academy English 2012 The improvement of information quality is a major task for the free online encyclopedia Wikipedia. Recent studies targeted the analysis and detection of specific quality flaws in Wikipedia articles. To date, quality flaws have been exclusively investigated in current Wikipedia articles, based on a snapshot representing the state of Wikipedia at a certain time. This paper goes further, and provides the first comprehensive breakdown of the evolution of quality flaws in Wikipedia. We utilize cleanup tags to analyze the quality flaws that have been tagged by the Wikipedia community in the English Wikipedia, from its launch in 2001 until 2011. This leads to interesting findings regarding (1) the development of Wikipedia's quality flaw structure and (2) the usage and the effectiveness of cleanup tags. Specifically, we show that inline tags are more effective than tag boxes, and provide statistics about the considerable volume of rare and non-specific cleanup tags. We expect that this work will support the Wikipedia community in making quality assurance activities more efficient. 0 0
On the Use of PU Learning for Quality Flaw Prediction in Wikipedia Edgardo Ferretti
Donato Hernández Fusilier
Rafael Guzmán Cabrera
Manuel Montes y Gómez
Marcelo Errecalde
Paolo Rosso
PAN English 2012 In this article we describe a new approach to assess Quality Flaw Prediction in Wikipedia. The partially supervised method studied, called PU Learning, has been successfully applied in classification tasks with traditional corpora like Reuters-21578 or 20-Newsgroups. To the best of our knowledge, this is the first time that it is applied in this domain. Throughout this paper, we describe how the original PU Learning approach was evaluated for assessing quality flaws and the modifications introduced to get a quality flaws predictor which obtained the best F1 scores in the task “Quality Flaw Prediction in Wikipedia” of the PAN challenge. 0 1
Overview of the 1st International Competition on Quality Flaw Prediction in Wikipedia Maik Anderka
Benno Stein
Information quality
Quality Flaw Prediction
CLEF English 2012 The paper overviews the task "Quality Flaw Prediction in Wikipedia" of the PAN'12 competition. An evaluation corpus is introduced which comprises 1,592,226 English Wikipedia articles, of which 208,228 have been tagged to contain one of ten important quality flaws. Moreover, the performance of three quality flaw classifiers is evaluated. 0 0
Predicting Quality Flaws in User-generated Content: The Case of Wikipedia Maik Anderka
Benno Stein
Nedim Lipka
User-generated Content Analysis
Information quality
Quality Flaw Prediction
One-class Classification
35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2012) English 2012 The detection and improvement of low-quality information is a key concern in Web applications that are based on user-generated content; a popular example is the online encyclopedia Wikipedia. Existing research on quality assessment of user-generated content deals with the classification as to whether the content is high-quality or low-quality. This paper goes one step further: it targets the prediction of quality flaws, thereby providing specific indications of the respects in which low-quality content needs improvement. The prediction is based on user-defined cleanup tags, which are commonly used in many Web applications to tag content that has some shortcomings. We apply this approach to the English Wikipedia, which is the largest and most popular user-generated knowledge source on the Web. We present an automatic mining approach to identify the existing cleanup tags, which provides us with a training corpus of labeled Wikipedia articles. We argue that common binary or multiclass classification approaches are ineffective for the prediction of quality flaws and hence cast quality flaw prediction as a one-class classification problem. We develop a quality flaw model and employ a dedicated machine learning approach to predict Wikipedia's most important quality flaws. Since in the Wikipedia setting the acquisition of significant test data is intricate, we analyze the effects of a biased sample selection. In this regard we illustrate the classifier effectiveness as a function of the flaw distribution in order to cope with the unknown (real-world) flaw-specific class imbalances. The flaw prediction performance is evaluated with 10,000 Wikipedia articles that have been tagged with the ten most frequent quality flaws: given test data with little noise, four flaws can be detected with a precision close to 1. 0 0
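The one-class framing argued for above (train only on articles tagged with a flaw, then flag unseen articles that look similar) can be sketched minimally as a centroid-plus-radius model. The features, data, and decision rule here are illustrative stand-ins, not the paper's method.

```python
# Toy one-class classifier: model the flawed class alone as a centroid plus
# the maximum training distance, and flag anything inside that radius.
import math

def train_one_class(flawed_vectors):
    """Model = centroid of flawed articles + max training distance."""
    dims = len(flawed_vectors[0])
    centroid = [sum(v[i] for v in flawed_vectors) / len(flawed_vectors)
                for i in range(dims)]
    radius = max(math.dist(v, centroid) for v in flawed_vectors)
    return centroid, radius

def has_flaw(model, vector):
    centroid, radius = model
    return math.dist(vector, centroid) <= radius

# Hypothetical features, e.g. (references per sentence, fraction of tagged text):
flawed = [(0.0, 0.9), (0.1, 0.8), (0.05, 0.85)]
model = train_one_class(flawed)
assert has_flaw(model, (0.05, 0.88)) and not has_flaw(model, (0.9, 0.1))
```

The point of the one-class setup is that no reliable "flawless" negatives are needed for training, which matches the paper's argument about biased sample selection.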
Volunteered geographic information production as a spatial process Hardy
M.F. Goodchild
Distance decay
Geotagging
User-generated content
Volunteered geographic information
Wikipedia
International Journal of Geographical Information Science 2012 Wikipedia is a free encyclopedia that anyone can edit and a popular example of user-generated content that includes volunteered geographic information (VGI). In this article, we present three main contributions: (1) a spatial data model and collection methods to study VGI in systems that may not explicitly support geographic data; (2) quantitative methods for measuring distance between online authors and articles; and (3) empirically calibrated results from a gravity model of the role of distance in VGI production. To model spatial processes of VGI contributors, we use an invariant exponential gravity model based on article and author proximity. We define a proximity metric called a ‘signature distance’ as a weighted average distance between an article and each of its authors, and we estimate the location of 2.8 million anonymous authors through IP geolocation. Our study collects empirical data directly from 21 language-specific Wikipedia databases, spanning 7 years of contributions (2001–2008) to nearly 1 million geotagged articles. We find empirical evidence that the spatial processes of anonymous contributors fit an exponential distance decay model. Our results are consistent with the prior results on information diffusion as a spatial process, but run counter to theories that a globalized Internet neutralizes distance as a determinant of social behaviors. 0 1
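The exponential distance-decay relation the paper calibrates, contributions ~ A * exp(-b * d), can be fitted with a simple log-linear least-squares sketch. The sample counts below are synthetic, not the paper's data.

```python
# Fit ln(count) = ln(A) - b * d by ordinary least squares, recovering the
# parameters of an exponential distance-decay model.
import math

def fit_exponential_decay(distances, counts):
    """Return (A, b) for counts ~ A * exp(-b * distance)."""
    xs, ys = distances, [math.log(c) for c in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return math.exp(my - slope * mx), -slope  # A, decay rate b

# Synthetic counts generated from A=1000, b=0.002 (per km):
d = [0, 500, 1000, 2000, 4000]
c = [1000 * math.exp(-0.002 * x) for x in d]
A, b = fit_exponential_decay(d, c)
assert abs(A - 1000) < 1e-6 and abs(b - 0.002) < 1e-9
```

A positive fitted b means contribution counts fall off with author-article distance, the pattern the study reports for anonymous contributors.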
Accuracy and completeness of drug information in Wikipedia: an assessment Natalie Kupferberg
Bridget McCrate Protus
Journal of the Medical Library Association English October 2011 8 2
Autonomous Link Spam Detection in Purely Collaborative Environments Andrew G. West
Avantika Agrawal
Phillip Baker
Brittney Exline
Insup Lee
Collaborative security
Information security
Spam mitigation
Spatio- temporal features
Machine learning
Intelligent routing
WikiSym English October 2011 Collaborative models (e.g., wikis) are an increasingly prevalent Web technology. However, the open-access that defines such systems can also be utilized for nefarious purposes. In particular, this paper examines the use of collaborative functionality to add inappropriate hyperlinks to destinations outside the host environment (i.e., link spam). The collaborative encyclopedia, Wikipedia, is the basis for our analysis.

Recent research has exposed vulnerabilities in Wikipedia's link spam mitigation, finding that human editors are latent and dwindling in quantity. To this end, we propose and develop an autonomous classifier for link additions. Such a system presents unique challenges. For example, low barriers-to-entry invite a diversity of spam types, not just those with economic motivations. Moreover, issues can arise with how a link is presented (regardless of the destination).

In this work, a spam corpus is extracted from over 235,000 link additions to English Wikipedia. From this, 40+ features are codified and analyzed. These indicators are computed using "wiki" metadata, landing site analysis, and external data sources. The resulting classifier attains 64% recall at 0.5% false-positives (ROC-AUC=0.97). Such performance could enable egregious link additions to be blocked automatically with low false-positive rates, while prioritizing the remainder for human inspection. Finally, a live Wikipedia implementation of the technique has been developed.
0 0
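The operating point reported above, 64% recall at 0.5% false-positives, corresponds to measuring recall at a fixed false-positive rate. A toy computation of that quantity, with invented scores and labels:

```python
# Scan classifier thresholds and report the best spam recall achievable
# while keeping the false-positive rate at or below a budget.
def recall_at_fpr(scores, labels, max_fpr):
    """labels: 1 = spam link, 0 = legitimate link."""
    neg = [s for s, l in zip(scores, labels) if l == 0]
    pos = [s for s, l in zip(scores, labels) if l == 1]
    best = 0.0
    for t in sorted(set(scores)):
        fp = sum(s >= t for s in neg)
        tp = sum(s >= t for s in pos)
        if fp / len(neg) <= max_fpr:
            best = max(best, tp / len(pos))
    return best

scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    1,    0,    1,    0,    0,    0]
assert recall_at_fpr(scores, labels, max_fpr=0.0) == 0.75
```

Tightening the false-positive budget lowers the achievable recall, which is why the paper pairs the two numbers rather than reporting recall alone.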
What Wikipedia Deletes: Characterizing Dangerous Collaborative Content Andrew G. West
Insup Lee
User generated content
Content removal
Information security
WikiSym English October 2011 Collaborative environments, such as Wikipedia, often have low barriers-to-entry in order to encourage participation. This accessibility is frequently abused (e.g., vandalism and spam). However, certain inappropriate behaviors are more threatening than others. In this work, we study contributions which are not simply "undone" -- but *deleted* from revision histories and public view. Such treatment is generally reserved for edits which: (1) present a legal liability to the host (e.g., copyright issues, defamation), or (2) present privacy threats to individuals (i.e., contact information). Herein, we analyze one year of Wikipedia's public deletion log and use brute-force strategies to learn about privately handled redactions. This permits insight about the prevalence of deletion, the reasons that induce it, and the extent of end-user exposure to dangerous content. While Wikipedia's approach is generally quite reactive, we find that copyright issues prove most problematic of those behaviors studied. 0 1
Link Spamming Wikipedia for Profit Andrew G. West
Jian Chang
Krishna Venkatasubramanian
Oleg Sokolsky
Insup Lee
Web 2.0 spam
Collaborative security
Attack model
Measurement study
Spam economics
CEAS '11: Proc. of the 8th Annual Collaboration, Electronic Messaging, Anti-Abuse, and Spam Conference English September 2011 Collaborative functionality is an increasingly prevalent web technology. To encourage participation, these systems usually have low barriers-to-entry and permissive privileges. Unsurprisingly, ill-intentioned users try to leverage these characteristics for nefarious purposes. In this work, a particular abuse is examined -- link spamming -- the addition of promotional or otherwise inappropriate hyperlinks.

Our analysis focuses on the wiki model and the collaborative encyclopedia, Wikipedia, in particular. A principal goal of spammers is to maximize *exposure*, the quantity of people who view a link. Creating and analyzing the first Wikipedia link spam corpus, we find that existing spam strategies perform quite poorly in this regard. The status quo spamming model relies on link persistence to accumulate exposures, a strategy that fails given the diligence of the Wikipedia community. Instead, we propose a model that exploits the latency inherent in human anti-spam enforcement.

Statistical estimation suggests our novel model would produce significantly more link exposures than status quo techniques. More critically, the strategy could prove economically viable for perpetrators, incentivizing its exploitation. To this end, we address mitigation strategies.
0 0
Multilingual Vandalism Detection using Language-Independent & Ex Post Facto Evidence Andrew G. West
Insup Lee
PAN-CLEF English September 2011 There is much literature on Wikipedia vandalism detection. However, this writing addresses two facets given little treatment to date. First, prior efforts emphasize zero-delay detection, classifying edits the moment they are made. If classification can be delayed (e.g., compiling offline distributions), it is possible to leverage ex post facto evidence. This work describes/evaluates several features of this type, which we find to be overwhelmingly strong vandalism indicators.

Second, English Wikipedia has been the primary test-bed for research. Yet, Wikipedia has 200+ language editions and use of localized features impairs portability. This work implements an extensive set of language-independent indicators and evaluates them using three corpora (German, English, Spanish). The work then extends to include language-specific signals. Quantifying their performance benefit, we find that such features can moderately increase classifier accuracy, but significant effort and language fluency are required to capture this utility.

Aside from these novel aspects, this effort also broadly addresses the task, implementing 65 total features. Evaluation produces 0.840 PR-AUC on the zero-delay task and 0.906 PR-AUC with ex post facto evidence (averaging languages). Performance matches the state-of-the-art (English), sets novel baselines (German, Spanish), and is validated by a first-place finish over the 2011 PAN-CLEF test set.
0 0
The visibility of Wikipedia in scholarly publications Taemin Kim Park First Monday English 1 August 2011 Publications in the Institute of Scientific Information’s (ISI, currently Thomson Reuters) Web of Science (WoS) and Elsevier’s Scopus databases were utilized to collect data about Wikipedia research and citations to Wikipedia. The growth of publications on Wikipedia research, the most active researchers, their associated institutions, academic fields and their geographic distribution are treated in this paper. The impact and influence of Wikipedia were identified, utilizing cited work found in (WoS) and Scopus. Additionally, leading authors, affiliated institutions, countries, academic fields, and publications that frequently cite Wikipedia are identified. 18 2
Wikipédia e enciclopédia britânica: Informação confiável? Aline Luli Romero Ribeiro
Cláudio Gottschalg-Duque
Enciclopédia Britânica
Reliable information
Revista Brasileira de Biblioteconomia e Documentação Portuguese July 2011 This article presents the results of an academic study that assessed the reliability of the information in two digital, English-language reference works, Wikipedia and the Encyclopaedia Britannica, within the field of Library Science, by evaluating similar entries. With the aim of determining the level of reliability of each of these encyclopedias, and drawing on concepts from Information Architecture, it analyses whether the prohibition on citing Wikipedia in academic settings is justified. 11 0
La dimensió de les llengües a la Wikipedia i la seua relació amb els elements socials Borja Pellejero
Natxo Sorolla
Marina Nogué
Digital language community
Digithum Catalan May 2011 There would seem to be a contradiction in the fact that Catalan should have a Wikipedia with a similar number of pages to that in Chinese. There are fewer than ten million Catalan speakers, and they were marginalised in their own land for a long time, but they have still been able to produce content on the internet that in some cases matches that of China, a world economic superpower with nearly one billion Chinese speakers. Though it should be noted that the situation is not the same in China as it is in those places where Catalan is spoken. This article offers an initial look at the social, educational, technological, economic and demographic factors linked to a language’s position in the ranking of number of Wikipedia articles. This analysis is based on one key concept, that of the digital language community, and the observation that Catalan’s position on the internet is not due to the activism of its speakers, but to a position that resembles that of any other medium-sized language community. 8 0
Le copyleft appliqué à la création hors logiciel Antoine Moreau French May 2011 Copyleft is a legal notion stemming from the free software movement which, while observing the author's rights, allows copying, spreading and transforming works and forbids the exclusive enjoyment of them. It originated in the Free Software Foundation's GNU project, initiated by Richard Stallman, which produced the first free copyleft license for software: the General Public License. Our research deals with copyleft applied to non-software creation, as we initiated it in 2000 with the Free Art License. Through practicing it and observing its effects, we raise questions about the status of the author in the digital age. We discover a history, a history of art, which is no longer determined by an end but leads on to infinite creations made by an infinity of artists, both minor and consequent. We observe that copyleft is not an ordinary creation process, but a decreation process. It asserts, negatively and through the flaws, not negation or failure, but the beauty of a gesture graciously offering itself. This gesture combines ethics and aesthetics; it is « es-ethical ». We understand that with copyleft, technique serves a politics of « hyper-democratic » opening as seen in the Web's hypertext structure, which punches holes through pages and opens onto otherness. It is about articulating the singular and the plural in an ecosystem preserving the common good from the passion of power. A broadened economy exceeds, without negating it, the market alone. Copyleft works assert that political and cultural reality where art forms the freedom common to all and to each. 0 1
Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features B. Thomas Adler
Luca de Alfaro
Santiago M. Mola Velasco
Paolo Rosso
Andrew G. West
Machine learning
Natural Language Processing
Lecture Notes in Computer Science English February 2011 Wikipedia is an online encyclopedia which anyone can edit. While most edits are constructive, about 7% are acts of vandalism: modifications made in bad faith, such as the introduction of spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatio-temporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The performance of the resulting joint system improves on the state of the art of all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of the three approaches, both for the task of discovering fresh vandalism and for the task of locating vandalism in the complete set of Wikipedia revisions. 0 1
A simultaneous journal / wiki publication and dissemination of a new species description: Neobidessodes darwiniensis sp. n. from northern Australia (Coleoptera, Dytiscidae, Bidessini) Lars Hendrich
Michael Balke
Species ID
Online species pages
Sequence data
DNA barcoding
Molecular biodiversity assessment
ZooKeys English 2011 Here, we describe a new Australian species in journal format and simultaneously open the description in a wiki format. The wiki version will always link to the fixed original journal description of the taxon, while permitting future edits and additions to the species' taxonomy and biology. The diving beetle Neobidessodes darwiniensis sp. n. (Coleoptera: Dytiscidae, Bidessini) is described based on a single female, collected in a rest pool of Harriet Creek in the Darwin Area, Northern Territory. Within Neobidessodes the new species is well characterized by its elongate oval body with rounded sides, short and stout antennal segments, body length and dorsal surface coloration. In addition to external morphology, we used mitochondrial cox1 sequence data to support the generic assignment and to delineate the new species from other Australian Bidessini, including all other known Neobidessodes. Illustrations based on digital images are provided here and as online resources. A modified key is provided. Altogether ten species of the genus are now known worldwide, nine from Australia and one from New Guinea. 0 1
Automatically assigning Wikipedia articles to macro-categories Jacopo Farina
Riccardo Tasso
David Laniado
Category graph
Topic coverage
Hypertext English 2011 The online encyclopedia Wikipedia offers millions of articles which are organized in a hierarchical category structure, created and updated by users. In this paper we present a technique which leverages this rich and disordered graph to assign each article to one or more topics. We modify an existing approach, based on the shortest paths between categories, in order to account for the direction of the hierarchy. 0 0
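The shortest-path assignment this abstract describes can be sketched as an upward breadth-first search over the directed category graph, following only child-to-parent edges. The toy category graph and macro-category set below are invented for illustration; they are not the paper's data or implementation:

```python
from collections import deque

# Hypothetical toy category graph: child -> parents, directed upward as in
# Wikipedia, where each category links to its parent categories.
PARENTS = {
    "Quantum mechanics": ["Physics"],
    "Physics": ["Science"],
    "Thermodynamics": ["Physics", "Engineering"],
    "Engineering": ["Technology"],
}
MACRO = {"Science", "Technology"}  # illustrative target macro-categories

def assign_macro(categories):
    """Follow parent links upward (respecting edge direction) and return the
    macro-categories at minimum distance from any of the article's categories."""
    dist = {}
    queue = deque((c, 0) for c in categories)
    while queue:
        node, d = queue.popleft()
        if node in dist and dist[node] <= d:
            continue  # already reached at an equal or shorter distance
        dist[node] = d
        for parent in PARENTS.get(node, []):
            queue.append((parent, d + 1))
    hits = {m: dist[m] for m in MACRO if m in dist}
    if not hits:
        return set()
    best = min(hits.values())
    return {m for m, d in hits.items() if d == best}
```

An article categorized under "Thermodynamics" reaches both macro-categories at equal distance and is assigned to both, which matches the abstract's note that an article may map to one or more topics.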
Bancos de imágenes para proyectos enciclopédicos: el caso de Wikimedia Commons = Image databanks in encyclopedia context: the case of Wikimedia Commons Saorín, T.
Pastor Sánchez, J.A.
Wikipedia; Wikimedia Commons; Public domain; Encyclopedias; Image banks El profesional de la información, julio-agosto, n. 4, Spanish 2011 This paper presents the characteristics and functionalities of the Wikimedia Commons image databank shared by all Wikipedia projects. The process of finding images and illustrating Wikipedia articles is also explained, along with how to add images to the bank. The role of cultural institutions in promoting free and open cultural heritage content is highlighted. 0 0
Characterization and prediction of Wikipedia edit wars Róbert Sumi
Taha Yasseri
András Rung
András Kornai
János Kertész
WebSci Conference English 2011 We present a new, efficient method for automatically detecting conflict cases and test it on five different language Wikipedias. We discuss how the number of edits, reverts, and the length of discussions deviate in such pages from those following the general workflow. 4 2
Co-authorship 2.0: patterns of collaboration in Wikipedia David Laniado
Riccardo Tasso
Collaboration network
Online production
Social network analysis
Hypertext English 2011 The study of collaboration patterns in wikis can help shed light on the process of content creation by online communities. To turn a wiki's revision history into a collaboration network, we propose an algorithm that identifies as authors of a page the users who provided most of its relevant content, measured in terms of quantity and of acceptance by the community. The scalability of this approach allows us to study the English Wikipedia community as a co-authorship network. We find evidence of the presence of a nucleus of very active contributors, who seem to spread over the whole wiki, and to interact preferentially with inexperienced users. The fundamental role played by this elite is witnessed by the growing centrality of sociometric stars in the network. By isolating the community active around a category, it is possible to study its specific dynamics and most influential authors. 0 3
Collective memory building in Wikipedia: The case of North African uprisings Michela Ferron
Paolo Massa
Web 2.0
Collective memory
Traumatic event
North Africa
WikiSym English 2011 Since December 2010, a series of protests and uprisings have shocked North African countries such as Tunisia, Egypt, Libya, Syria, Yemen and more. In this paper, focusing mainly on the Egyptian revolution, we provide evidence of the intense edit activity that occurred during these uprisings on the related Wikipedia pages. Thousands of people contributed to the content pages and discussed improvements and disagreements on the associated talk pages as the traumatic events unfolded. We propose to interpret this phenomenon as a process of collective memory building and argue how on Wikipedia this can be studied empirically and quantitatively in real time. We explore and suggest possible directions for future research on collective memory formation of traumatic and controversial events in Wikipedia. 14 0
Credibility judgment and verification behavior of college students concerning Wikipedia Lim
S. and Simon
Wikipedia; credibility; theory of bounded rationality; verification; college students First Monday 2011 This study examines credibility judgments in relation to peripheral cues and genre of Wikipedia articles, and attempts to understand user information verification behavior based on the theory of bounded rationality. Data were collected employing both an experiment and a survey at a large public university in the midwestern United States in Spring 2010. This study shows some interesting patterns. It appears that the effect of peripheral cues on credibility judgments differed according to genre. Those who did not verify information displayed a higher level of satisficing than those who did. Students used a variety of peripheral cues of Wikipedia. The exploratory data show that peer endorsement may be more important than formal authorities for user generated information sources, such as Wikipedia, which calls for further research. 0 0
Detection of Text Quality Flaws as a One-class Classification Problem Maik Anderka
Benno Stein
Nedim Lipka
Information quality
Quality Flaw Prediction
One-class Classification
20th ACM Conference on Information and Knowledge Management (CIKM 11) English 2011 For Web applications that are based on user-generated content, the detection of text quality flaws is a key concern. Our research contributes to automatic quality flaw detection. In particular, we propose to cast the detection of text quality flaws as a one-class classification problem: we are given only positive examples (texts containing a particular quality flaw) and must decide whether or not an unseen text suffers from this flaw. We argue that common binary or multiclass classification approaches are ineffective here, and we underpin our approach with a real-world application: we employ a dedicated one-class learning approach to determine whether a given Wikipedia article suffers from certain quality flaws. Since in the Wikipedia setting the acquisition of sensible test data is quite intricate, we analyze the effects of a biased sample selection. In addition, we illustrate the classifier effectiveness as a function of the flaw distribution in order to cope with the unknown (real-world) flaw-specific class imbalances. Altogether, given test data with little noise, four of the ten important quality flaws in Wikipedia can be detected with a precision close to 1. 0 0
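The one-class setting the abstract describes (learning from flawed examples only, then deciding whether an unseen text falls inside the learned region) can be illustrated with a minimal centroid-and-radius model. Both the model and the toy feature vectors are assumptions for this sketch; the paper uses a dedicated one-class learner, not this exact scheme:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class OneClassCentroid:
    """Minimal one-class classifier: fit a region around positive (flawed)
    examples only; anything outside the region is 'does not have this flaw'."""

    def fit(self, positives):
        self.center = centroid(positives)
        # Radius: largest training distance to the centroid, slightly inflated.
        self.radius = 1.1 * max(dist(v, self.center) for v in positives)
        return self

    def predict(self, vector):
        return dist(vector, self.center) <= self.radius

# Toy feature vectors (e.g. reference density, external-link density) for
# articles known to carry an "unreferenced" flaw -- invented values.
clf = OneClassCentroid().fit([[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]])
```

A vector near the training cluster is flagged as flawed, while one far away is not; no negative examples are ever needed, which is the point of the one-class formulation.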
Don't bite the newbies: how reverts affect the quantity and quality of Wikipedia work Aaron Halfaker
Aniket Kittur
John Riedl
WikiSym English 2011 Reverts are important to maintaining the quality of Wikipedia. They fix mistakes, repair vandalism, and help enforce policy. However, reverts can also be damaging, especially to the aspiring editor whose work they destroy. In this research we analyze 400,000 Wikipedia revisions to understand the effect that reverts had on editors. We seek to understand the extent to which they demotivate users, reducing the workforce of contributors, versus the extent to which they help users improve as encyclopedia editors. Overall we find that reverts are powerfully demotivating, but that their net influence is that more quality work is done in Wikipedia as a result of reverts than is lost by chasing editors away. However, we identify key conditions – most specifically new editors being reverted by much more experienced editors – under which reverts are particularly damaging. We propose that reducing the damage from reverts might be one effective path for Wikipedia to solve the newcomer retention problem. 0 2
Experiences and perspectives of Wikipedia use in higher education Klaus Wannemacher International Journal of Management in Education 2011 University teaching is confronted with strong challenges through the emergence of new participatory web applications. While social software tools are widely used by students, instructors are often reluctant to adopt them in their teaching practice. Only recently have instructors begun to apply one of the most commonly used Web 2.0 applications, the online encyclopaedia Wikipedia, more strongly in university teaching. Based on an overview of international university projects, this contribution presents general data on the background, objectives, teaching approaches, assignment and feedback forms of Wikipedia-related courses and discusses adequate methods of enabling instructors to apply wiki systems within their teaching. 0 0
Experiences with Semantic Wikis for Architectural Knowledge Management Remco C. de Boer
Hans van Vliet
Architectural knowledge management
Semantic wiki
Experience report
WICSA English 2011 In this paper, we reflect on our experiences with using semantic wikis for architectural knowledge management in two different contexts: e-government and distributed software development. Whereas our applications of semantic wikis in e-government focus on organizing and structuring architectural knowledge for reuse, the applications in distributed software development focus on searching and querying architectural knowledge. Yet, the emerging research challenges - alignment of knowledge models, knowledge versioning, change acknowledgements - are very similar. 0 0
Exploring linguistic points of view of Wikipedia Paolo Massa
Federico Scrinzi
Linguistic point of view
Neutral point of view
Open source
WikiSym English 2011 The 3 million articles of the English Wikipedia have been written since 2001 by more than 14 million volunteers. On each article, the community of editors strives to reach a neutral point of view, representing all significant views fairly, proportionately, and without bias. However, besides the English one, there are more than 270 Wikipedias in different languages, and their relatively isolated communities of editors are not forced by the platform to discuss and negotiate their points of view. So the empirical question is: do the communities of different language editions of Wikipedia develop their own diverse Linguistic Points of View (LPOV)? To answer this question we created Manypedia, a web tool whose goal is to ease cross-cultural comparisons of Wikipedia language communities by analyzing their different representations of the same topic. 0 1
Group Size and Incentives to Contribute: A Natural Experiment at Chinese Wikipedia Xiaoquan (Michael) Zhang
Feng Zhu
Wikipedia; blocking; incentives; public goods American Economic Review 2011 The literature on the private provision of public goods suggests an inverse relationship between incentives to contribute and group size. We find, however, that after an exogenous reduction of group size at Chinese Wikipedia, the non-blocked contributors decreased their contributions by 42.8% on average. We attribute the cause to social effects: contributors receive social benefits that increase with both the amount of their contributions and group size, and the shrinking group size weakens these social benefits. Consistent with our explanation, we find that the more contributors value social benefits, the more they reduce their contributions after the block. 0 2
Imagining the Wikipedia community: what do Wikipedia authors mean when they write about their 'community'? Christian Pentzold New media & society XX(X) 1–18 2011 This article examines the way Wikipedia authors write their 'community' into being. Mobilizing concepts regarding the communicative constitution of communities, the computer-mediated conversations between editors were investigated using Grounded Theory procedures. The analysis yielded an empirically grounded theory of the users' self-understanding of the Wikipedia community as an ethos-action community. Hence, this study contributes to research on online community-building as it shifts the focus from structural criteria for communities to the discursive level of community formation. 0 0
Interlinking journal and wiki publications through joint citation: Working examples from ZooKeys and Plazi on Species-ID Lyubomir Penev
Gregor Hagedorn
Daniel Mietchen
Teodor Georgiev
Pavel Stoev
Guido Sautter
Donat Agosti
Andreas Plank
Michael Balke
Lars Hendrich
Terry Erwin
ZooKeys English 2011 Scholarly publishing and citation practices have developed largely in the absence of versioned documents. The digital age requires new practices to combine the old and the new. We describe how the original published source and a versioned wiki page based on it can be reconciled and combined into a single citation reference. We illustrate the citation mechanism by way of practical examples focusing on journal and wiki publishing of taxon treatments. Specifically, we discuss mechanisms for permanent cross-linking between the static original publication and the dynamic, versioned wiki, as well as for automated export of journal content to the wiki, to reduce the workload on authors, for combining the journal and the wiki citation and for integrating it with the attribution of wiki contributors. 9 0
Mediawikis for research, teaching and learning Daniel K. Schneider
Kalliopi Benetos
Martine Ruchat
World Conference on Educational Multimedia, Hypermedia and Telecommunications English 2011 This paper describes the design of various scenarios implemented with Mediawiki software. After discussing four case studies, we present and discuss pedagogical and technical design guidelines. This contribution is part of a long-term research and development program to design, implement and evaluate the use of ICT to support integrated scenarios for research, teaching and learning. We argue that properly configured Mediawikis are suitable tools if content production and knowledge integration are at center stage. 2 0
Modifier Wikipédia : un exercice de FLE Alexis D'Hautcourt Motivation
French as a Foreign Language
The review of inquiry and research French 2011 In this article, we present the different stages of an active-reading exercise based on authentic French texts, carried out at a Japanese university by exploiting the possibilities of the online encyclopedia Wikipedia. Students read and edited Wikipedia articles related to France or French culture. This exercise increased the students' motivation and helped them develop critical thinking. We then briefly propose other writing exercises that can be carried out on the online encyclopedia. Editing Wikipedia provides a good exercise in language education because it motivates students to read and write in a foreign language. This article details the process students followed in a Japanese university class to edit French entries in Wikipedia, the online encyclopedia. 0 0
Multilingual Ontology Matching based on Wiktionary Data Accessible via SPARQL Endpoint Feiyu Lin
Andrew Krizhanovsky
Proceedings of the 13th Russian Conference on Digital Libraries RCDL’2011 English 2011 Interoperability is a feature required by the Semantic Web. It is provided by ontology matching methods and algorithms. Ontologies are now presented not only in English but in other languages as well, so automatic translation is important for obtaining correct matching pairs in multilingual ontology matching. The translation into many languages could be based on the Google Translate API, the Wiktionary database, etc. In terms of the balance between language coverage, manually crafted translations and dictionary size, the most promising resource is Wiktionary, a collaborative project working on the same principles as Wikipedia. A parser for Wiktionary was developed and a machine-readable dictionary was designed. The data of the machine-readable Wiktionary are stored in a relational database, but with the help of the D2R server the database is presented as an RDF store. Thus, it is possible to get lexicographic information (definitions, translations, synonyms) from the web service using SPARQL requests. In the case study, the problem addressed is multilingual ontology matching based on Wiktionary data accessible via a SPARQL endpoint. Ontology matching results obtained using Wiktionary were compared with results based on the Google Translate API. 5 0
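The pipeline this abstract describes (lexicographic data queried via SPARQL, then used for multilingual label matching) might be sketched as follows. The query template, the property names `wikt:hasTranslation` and `wikt:language`, and the overlap-based matcher are illustrative placeholders, not the authors' actual schema or algorithm:

```python
def translation_query(word, lang="ru"):
    """Build a SPARQL query asking for translations of an English `word`
    into language `lang`. Property names are hypothetical placeholders."""
    return f"""
SELECT ?translation WHERE {{
  ?entry rdfs:label "{word}"@en .
  ?entry wikt:hasTranslation ?t .
  ?t wikt:language "{lang}" ; rdfs:label ?translation .
}}""".strip()

def match_labels(translations_a, translations_b):
    """Treat two ontology labels as a candidate match if the translation
    sets retrieved for them overlap -- a simplistic matching criterion."""
    return bool(set(translations_a) & set(translations_b))
```

In practice the query string would be sent to the D2R-backed SPARQL endpoint and the returned bindings fed into the matcher; the endpoint interaction is omitted here to keep the sketch self-contained.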
Posibilidades de Wikipedia en la docencia universitaria: elaboración colaborativa de conocimiento = Possibilities of Wikipedia in higher education: collaborative knowledge construction Saorín, T.
Pastor Sánchez, J.A.
De Haro de San Mateo
Informational literacy; Wikipedia; Didactic participatory strategies; Research methods; Production of scientific works Revista Ibersid Spanish 2011 This work presents a guide for incorporating direct work on the collaborative encyclopedia Wikipedia as a didactic resource in university teaching. Whereas the use of wikis in the classroom is widely documented, the educational possibilities of Wikipedia itself are not so well covered. We offer a classification of participatory activities suitable for students in the development of curricular contents. One of the most relevant aspects is the transformation of the critical and distrustful discourse about Wikipedia into direct knowledge of its scope, production process and quality-control systems. It is also a good opportunity to improve a source of information widespread among university undergraduates, with a real impact on the critical and active use of information sources. 0 0
PukiWiki-Java Connector, a simple API for saving data of Java programs on a wiki Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Java applets
Data store API
Social coding
WikiSym English 2011 We present an experimental SDK for Java programs, the PukiWiki-Java Connector, which creates the illusion that wiki pages are a persistent data store. A Java program can run on a wiki page and save its data on that page. The system consists of PukiWiki, a popular wiki in Japan, and a plug-in that starts up Java applets. A Java applet with default access privileges cannot store its data on the local host, so we constructed an API that allows applets to persist data at a remote host. We also combined the API with the wiki system by introducing a wiki plug-in and tags for starting up Java applets. Applet-generated persistent data resides alongside the wiki text. Using this connector, we have successfully ported useful programs such as a simple text editor, a simple music editor, a simple drawing program and programming environments to a PukiWiki system. 2 5
Rfam: Wikipedia, clans and the "decimal" release Gardner PP
Daub J
Tate J
Moore BL
Osuch IH
Griffiths-Jones S
Finn RD
Nawrocki EP
Kolbe DL
Eddy SR
Bateman A
Nucleic Acids Research (Database issue):D141-5 2011 The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at 0 0
Social networks of Wikipedia Paolo Massa Wikipedia
Social network
Empirical analysis
Open source
Hypertext English 2011 Wikipedia, the free online encyclopedia anyone can edit, is a live social experiment: millions of individuals volunteer their knowledge and time to collectively create it. It is hence interesting to try to understand how they do it. While most attention has concentrated on article pages, a less known share of activity happens on user talk pages, Wikipedia pages where a message can be left for a specific user. These public conversations can be studied from a Social Network Analysis perspective in order to highlight the structure of the “talk” network. In this paper we focus on this preliminary extraction step by proposing different algorithms. We then empirically validate the differences in the networks they generate on the Venetian Wikipedia against the real network of conversations extracted manually by coding every message left on all user talk pages. The comparisons show that both the algorithms and the manual process contain inaccuracies that are intrinsic to the freedom and unpredictability of Wikipedia's growth. Nevertheless, a precise description of the issues involved allows informed decisions to be made and empirical findings to be based on reproducible evidence. Our goal is to lay the foundation for a solid computational sociology of wikis. For this reason we release the scripts encoding our algorithms as open source, along with some datasets extracted from Wikipedia conversations, to let other researchers replicate and improve our initial effort. 14 2
Towards automatic quality assurance in Wikipedia Maik Anderka
Benno Stein
Nedim Lipka
Information quality
Flaw Detection
20th International Conference on World Wide Web (WWW 11) English 2011 Featured articles in Wikipedia stand for high information quality, and researchers have found it interesting to analyze whether and how they can be distinguished from "ordinary" articles. Here we point out that article discrimination falls far short of writer support or automatic quality assurance: featured articles are not identified, they are made. Following this motto we compile a comprehensive list of information quality flaws in Wikipedia, model them according to the latest state of the art, and devise one-class classification technology for their identification. 0 0
Visualization of large category hierarchies Robert P. Biuk-Aghai
Cheong-Iao Pang
Felix Hon Hou Cheang
Hierarchical data
Information visualization
Large-scale data
Visual Information Communication - International Symposium English 2011 Large data repositories such as electronic journal databases, document corpora and wikis often organise their content into categories. Librarians, researchers, and interested users who wish to know the content distribution among different categories face the challenge of analysing large amounts of data. Information visualization can assist the user by shifting the analysis task to the human visual sub-system. In this paper we describe three visualization methods we have implemented, which help users understand category hierarchies and content distribution within large document repositories, and present an evaluation of these visualizations, pointing out each of their relative strengths for communicating information about the underlying category structure. 1 0
Visualizing author contribution statistics in Wikis using an edit significance metric Peter Kin-Fong Fong
Robert P. Biuk-Aghai
Edit significance
Information visualization
Revision history
WikiSym English 2011 Wiki articles tend to be edited multiple times by multiple authors. This makes it difficult to identify individual authors’ contributions by human observation alone. We calculate an edit significance metric, using different weights for different types of edits, which reflect the different value placed on them by wiki community members. We then aggregate edit significance values and present them as visualizations to the user to aid in perceiving extent and patterns of contributions. 4 0
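The idea above, weighting edit types differently and aggregating them per author, can be sketched as a weighted sum. The edit types and weight values below are invented placeholders, not the metric's calibrated weights:

```python
# Hypothetical weights reflecting the value the wiki community might place
# on different kinds of edits (invented for illustration).
WEIGHTS = {
    "insert_text": 1.0,
    "delete_text": 0.8,
    "insert_link": 1.5,
    "format_change": 0.3,
    "revert": 0.1,
}

def edit_significance(edits):
    """Weighted significance of one revision; `edits` is a list of
    (edit_type, size) pairs. Unknown edit types contribute nothing."""
    return sum(WEIGHTS.get(kind, 0.0) * size for kind, size in edits)

def author_contributions(history):
    """Aggregate per-author significance over a revision history of
    (author, edits) pairs -- the totals that would feed a visualization."""
    totals = {}
    for author, edits in history:
        totals[author] = totals.get(author, 0.0) + edit_significance(edits)
    return totals
```

The per-author totals returned by `author_contributions` are the kind of aggregated values the paper visualizes to show the extent and patterns of contributions.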
When the Wikipedians Talk: Network and Tree Structure of Wikipedia Discussion Pages David Laniado
Riccardo Tasso
Yana Volkovich
Andreas Kaltenbrunner
ICWSM English 2011 Talk pages play a fundamental role in Wikipedia as the place for discussion and communication. In this work we use the comments on these pages to extract and study three networks, corresponding to different kinds of interactions. We find evidence of a specific assortativity profile which differentiates article discussions from personal conversations. An analysis of the tree structure of the article talk pages allows us to capture patterns of interaction, and reveals structural differences among discussions about articles from different semantic areas. 0 2
Wikipedia : an Example for Electronic Democracy? Decision, Discipline and Discourse in the Collaborative Encyclopedia Sylvain Firer-Blaess Wikipedia ; Social Theory ; Organisation ; Discipline ; Discourse Ethics ; Foucault ; Habermas Studies in Social and Political Thought 2011 Wikipedia and e-democracy projects have in common the establishment of a mass-scale decision process. The Wikipedian method of discussing and reaching consensus is described in this article by Sylvain Firer-Blaess, using the theoretical frames of Michel Foucault and Jürgen Habermas. Can this method be applied to various e-democracy projects? In part, provided that building a free encyclopedia is not the same as living the life of the city. 0 0
Wikipedia as a Data Source for Political Scientists? Accuracy and Completeness of Coverage Brown, Adam R.
Wikipedia; reliability; accuracy; politics PS: Political Science & Politics 2011 In only 10 years, Wikipedia has risen from obscurity to become the dominant information source for an entire generation. However, any visitor can edit any page on Wikipedia, which hardly fosters confidence in its accuracy. In this article, I review thousands of Wikipedia articles about candidates, elections, and officeholders to assess both the accuracy and the thoroughness of Wikipedia's coverage. I find that Wikipedia is almost always accurate when a relevant article exists, but errors of omission are extremely frequent. These errors of omission follow a predictable pattern. Wikipedia's political coverage is often very good for recent or prominent topics but is lacking on older or more obscure topics. 0 0
Wikipedia category visualization using radial layout Robert P. Biuk-Aghai
Felix Hon Hou Cheang
Information visualization
Radial layout
WikiSym English 2011 Wikipedia is a large and popular daily information source for millions of people. How are articles distributed by topic area, and what is the semantic coverage of Wikipedia? Using manual methods it is impractical to determine this. We present the design of an information visualization tool that produces overview diagrams of Wikipedia’s articles distributed according to category relationships, and show examples of visualizing English Wikipedia. 3 0
Wikipedia world map: method and application of map-like wiki visualization Cheong-Iao Pang
Robert P. Biuk-Aghai
Information visualization
Semantic coverage
WikiSym English 2011 Wikis are popular platforms for collaborative editing. In volunteer-driven wikis such as Wikipedia, which attract millions of authors editing articles on a diverse range of topics, contributors’ editing activity results in a certain semantic coverage of topic areas. Obtaining an understanding of a given wiki’s semantic coverage is not easy. To solve this problem, we have devised a method for visualizing a wiki in a way similar to a geographic map. We have applied our method to Wikipedia, and generated visualizations for several Wikipedia language editions. This paper presents our wiki visualization method and its application. 8 0
Wikipedia: A Key Tool for Global Public Health Promotion James M Heilman
Eckhard Kemmann
Michael Bonert
Anwesh Chatterjee
Brent Ragar
Graham M Beards
David J Iberri
Matthew Harvey
Brendan Thomas
Wouter Stomp
Michael F Martone
Daniel J Lodge
Andrea Vondracek
Jacob F de Wolff
Casimir Liber
Samir C Grover
Tim J Vickers
Bertalan Meskó
Michaël R. Laurent
Internet; Wikipedia; public health; health information; knowledge dissemination; patient education; medical education J Med Internet Res 2011 The Internet allows unprecedented opportunities for patients and the general public to retrieve health information from across the globe. Surveys have shown that online health information retrieval is both common and increasing [1-4]. Population-based studies have shown that 61% of American and 52% of European citizens have consulted the Internet for health-related information on at least one occasion [1,4]. Similarly, numerous cross-sectional surveys in patient populations have shown variable but considerable rates of eHealth activities [5-10]. Physicians frequently report that patients have searched the Internet regarding health issues [11,12], although patients do not always discuss these online activities with their doctors [13,14]. Among American e-patients, 44% said this information had a minor impact and 13% said it had a major impact on their decisions about health care [4]. Websites offering medical information differ widely in their quality [15]. While physicians should reasonably view trustworthy information as useful, some have voiced concerns that Internet information may undermine their authority and lead to self-treatment [13]. Furthermore, incorrect medical information could result in patient harm. Indeed, about 3% of users of health care information feel that they or someone they know has been seriously harmed by Web-based information [4]. A potential solution for these drawbacks is that physicians direct online health information seekers to quality resources. This so-called Internet prescription has been evaluated in a few randomized trials, which showed that it increases use of the recommended websites [16-18]. Despite concerns over the quality of health websites, the 2005 Health On the Net survey found that medical Internet users value information availability and ease-of-finding more than accuracy and trustworthiness [13]. 
General search engines, of which Google is the market leader in Western countries, appear to be the most common starting point for laypeople seeking health information, despite the existence of eHealth quality labels and special search engines to explore health information [4,10,13,19,20]. Search engines commonly lead seekers to Wikipedia [21]. In the 2009 Pew Internet survey on health information, 53% of e-patients had consulted Wikipedia (not necessarily related to health information) [4]. This paper examines the role of Wikipedia as a provider of online health information. 0 1
Wikipedia: Example for a future Electronic Democracy?: Decision, Discipline and Discourse in the Collaborative Encyclopaedia Sylvain Firer-Blaess Active learning strategies Studies in Social and Political Thought English 2011 This article describes the mechanisms of a successful product of the Internet involving mass collaboration, namely, the online encyclopaedia Wikipedia.

In the first part of the paper, the author analyses the decision-making process, including debates and consensus, which Wikipedia employs, and makes a connection with the Habermasian model of rational discourse. In the second part, he analyses the disciplines (in the Foucauldian sense) which underlie and permit this decision-making process. He finds that, on the theoretical plane, despite the harsh criticisms Habermas levelled at the writings of Foucault, we can see a rather complementary relation between the establishment of rational discourse in Wikipedia and the effects of its discipline. In a third part, the author shows the forms of resistance that the decision-making process and the disciplines face, and considers the reactions that have emerged against such resistance. These findings lead on to a discussion of the normativity of Foucauldian disciplines and the possibility of their heterogeneity.

Finally, the author examines the possible implementations of the Wikipedia system to electronic democracy projects.
6 0
Trust in Collaborative Web Applications Andrew G. West
Jian Chang
Krishna Venkatasubramanian
Insup Lee
Future Generation Computer Systems, special section on Trusting Software Behavior English October 2010 Collaborative functionality is increasingly prevalent in Internet applications. Such functionality permits individuals to add -- and sometimes modify -- web content, often with minimal barriers to entry. Ideally, large bodies of knowledge can be amassed and shared in this manner. However, such software also provides a medium for biased individuals, spammers, and nefarious persons to operate. By computing trust/reputation for participating agents and/or the content they generate, one can identify quality contributions. In this work, we survey the state-of-the-art for calculating trust in collaborative content. In particular, we examine four proposals from literature based on: (1) content persistence, (2) natural-language processing, (3) metadata properties, and (4) incoming link quantity. Though each technique can be applied broadly, Wikipedia provides a focal point for discussion. Finally, having critiqued how trust values are calculated, we analyze how the presentation of these values can benefit end-users and application security. 0 0
STiki: An Anti-Vandalism Tool for Wikipedia Using Spatio-Temporal Analysis of Revision Metadata Andrew G. West
Sampath Kannan
Insup Lee
Collaboration software
Information security
Intelligent routing
Spatio-temporal processing
WikiSym English July 2010 STiki is an anti-vandalism tool for Wikipedia. Unlike similar tools, STiki does not rely on natural language processing (NLP) over the article or diff text to locate vandalism. Instead, STiki leverages spatio-temporal properties of revision metadata. The feasibility of utilizing such properties was demonstrated in our prior work, which found they perform comparably to NLP efforts while being more efficient, robust to evasion, and language independent. STiki is a real-time, on-Wikipedia implementation based on these properties. It consists of: (1) a server-side processing engine that examines revisions, scoring the likelihood each is vandalism, and (2) a client-side GUI that presents likely vandalism to end-users for definitive classification (and, if necessary, reversion on Wikipedia). Our demonstration will provide an introduction to spatio-temporal properties, demonstrate the STiki software, and discuss alternative research uses for the open-source code. 0 0
Semantic tagging of and semantic enhancements to systematics papers: ZooKeys working examples Lyubomir Penev
Donat Agosti
Teodor Georgiev
Terry Catapano
Jeremy Miller
Vladimir Blagoderov
David Roberts
Vincent Smith
Irina Brake
Simon Ryrcroft
Ben Scott
Norman Johnson
Robert Morris
Guido Sautter
Vishwas Chavan
Tim Robertson
David Remsen
Pavel Stoev
Cynthia Parr
Sandra Knapp
W. John Kress
Chris Thompson
Terry Erwin
Semantic tagging
Semantic enhancements
ZooKeys English June 2010 The concept of semantic tagging and its potential for semantic enhancements to taxonomic papers is outlined and illustrated by four exemplar papers published in the present issue of ZooKeys. The four papers were created in different ways: (i) written in Microsoft Word and submitted as non-tagged manuscript (doi: 10.3897/zookeys.50.504); (ii) generated from Scratchpads and submitted as XML-tagged manuscripts (doi: 10.3897/zookeys.50.505 and doi: 10.3897/zookeys.50.506); (iii) generated from an author’s database (doi: 10.3897/zookeys.50.485) and submitted as XML-tagged manuscript. XML tagging and semantic enhancements were implemented during the editorial process of ZooKeys using the Pensoft Mark Up Tool (PMT), specially designed for this purpose. The XML schema used was TaxPub, an extension to the Document Type Definitions (DTD) of the US National Library of Medicine Journal Archiving and Interchange Tag Suite (NLM). The following innovative methods of tagging, layout, publishing and disseminating the content were tested and implemented within the ZooKeys editorial workflow: (1) highly automated, fine-grained XML tagging based on TaxPub; (2) final XML output of the paper validated against the NLM DTD for archiving in PubMedCentral; (3) bibliographic metadata embedded in the PDF through XMP (Extensible Metadata Platform); (4) PDF uploaded after publication to the Biodiversity Heritage Library (BHL); (5) taxon treatments supplied through XML to Plazi; (6) semantically enhanced HTML version of the paper encompassing numerous internal and external links and linkouts, such as: (i) visualisation of main tag elements within the text (e.g., taxon names, taxon treatments, localities, etc.); (ii) internal cross-linking between paper sections, citations, references, tables, and figures; (iii) mapping of localities listed in the whole paper or within separate taxon treatments; (v) taxon names autotagged, dynamically mapped and linked through the 
Pensoft Taxon Profile (PTP) to large international database services and indexers such as Global Biodiversity Information Facility (GBIF), National Center for Biotechnology Information (NCBI), Barcode of Life (BOLD), Encyclopedia of Life (EOL), ZooBank, Wikipedia, Wikispecies, Wikimedia, and others; (vi) GenBank accession numbers autotagged and linked to NCBI; (vii) external links of taxon names to references in PubMed, Google Scholar, Biodiversity Heritage Library and other sources. With the launching of the working example, ZooKeys becomes the first taxonomic journal to provide a complete XML-based editorial, publication and dissemination workflow implemented as a routine and cost-efficient practice. It is anticipated that XML-based workflow will also soon be implemented in botany through PhytoKeys, a forthcoming partner journal of ZooKeys. The semantic markup and enhancements are expected to greatly extend and accelerate the way taxonomic information is published, disseminated and used. 0 1
Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata Andrew G. West
Sampath Kannan
Insup Lee
Spatio-temporal reputation
Collaboration software
Content-based access control
EUROSEC English April 2010 Blatantly unproductive edits undermine the quality of the collaboratively-edited encyclopedia, Wikipedia. They not only disseminate dishonest and offensive content, but force editors to waste time undoing such acts of vandalism. Language-processing has been applied to combat these malicious edits, but as with email spam, these filters are evadable and computationally complex. Meanwhile, recent research has shown spatial and temporal features effective in mitigating email spam, while being lightweight and robust. In this paper, we leverage the spatio-temporal properties of revision metadata to detect vandalism on Wikipedia. An administrative form of reversion called rollback enables the tagging of malicious edits, which are contrasted with nonoffending edits in numerous dimensions. Crucially, none of these features require inspection of the article or revision text. Ultimately, a classifier is produced which flags vandalism at performance comparable to the natural-language efforts we intend to complement (85% accuracy at 50% recall). The classifier is scalable (processing 100+ edits a second) and has been used to locate over 5,000 manually-confirmed incidents of vandalism outside our labeled set. 9 3
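The metadata-only approach described in this abstract can be illustrated with a small sketch. The field names below are illustrative assumptions, not STiki's actual schema; the point is that every feature derives from revision metadata alone, never from the article or diff text.

```python
from datetime import datetime, timezone

def metadata_features(rev):
    """Sketch of spatio-temporal features computable from revision metadata
    alone (field names are assumptions, not STiki's real schema)."""
    ts = rev["timestamp"]                        # datetime of the edit
    return {
        "hour_of_day": ts.hour,                  # temporal: vandalism clusters by time of day
        "is_anonymous": rev["user_id"] is None,  # "spatial": IP editor vs registered account
        "comment_length": len(rev.get("comment", "")),
        "account_age_days": (
            (ts - rev["registered"]).days if rev.get("registered") else 0
        ),
    }

# Hypothetical anonymous edit made at 03:00 UTC with a terse edit summary.
example = metadata_features({
    "timestamp": datetime(2010, 4, 1, 3, 0, tzinfo=timezone.utc),
    "user_id": None,
    "comment": "rv",
    "registered": None,
})
```

A real classifier would feed such feature vectors, labeled via rollback reversions as the paper describes, into any standard supervised learner.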
A five-year study of on-campus Internet use by undergraduate biomedical students Terry Judd
Gregor Kennedy
Computers and Education 2010 This paper reports on a five-year study (2005-2009) of biomedical students' on-campus use of the Internet. Internet usage logs were used to investigate students' sessional use of key websites and technologies. The most frequented sites and technologies included the university's learning management system, Google, email and Facebook. Email was the primary method of electronic communication. However, its use declined over time, with a steep drop in use during 2006 and 2007 appearing to correspond with the rapid uptake of the social networking site Facebook. Both Google and Wikipedia gained in popularity over time while the use of other key information sources, including the library and biomedical portals, remained low throughout the study. With the notable exception of Facebook, most 'Web 2.0' technologies attracted little use. The 'Net Generation' students involved in this study were heavy users of generalist information retrieval tools and key online university services, and preferred to use externally hosted tools for online communication. These and other findings have important implications for the selection and provision of services by universities. 2010 Elsevier Ltd. All rights reserved. 0 0
A method for category similarity calculation in Wikis Cheong-Iao Pang
Robert P. Biuk-Aghai
Category similarity
WikiSym English 2010 Wikis, such as Wikipedia, allow their authors to assign categories to articles in order to better organize related content. This paper presents a method to calculate similarities between categories, illustrated by a calculation for the top-level categories in the Simple English version of Wikipedia. 5 2
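The abstract does not spell out the similarity measure, so purely as a hedged illustration of what "similarity between categories" can mean in a wiki, here is one simple candidate: Jaccard overlap between the sets of articles each category contains. The paper's actual method may differ.

```python
def category_jaccard(members_a, members_b):
    """Jaccard similarity of two categories, treating each category as the
    set of article titles it contains (an assumed, simple measure)."""
    a, b = set(members_a), set(members_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Two hypothetical categories sharing two of four distinct articles.
sim = category_jaccard({"Dog", "Cat", "Horse"}, {"Cat", "Horse", "Cow"})  # -> 0.5
```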
A semantic approach for question classification using WordNet and Wikipedia Santosh Kumar Ray
Shailendra Singh
B.P. Joshi
Pattern Recognition Letters 2010 Question Answering Systems, unlike search engines, provide answers to users' questions in succinct form, which requires prior knowledge of the expectations of the user. The question classification module of a Question Answering System plays a very important role in determining the expectations of the user. In the literature, incorrect question classification has been cited as one of the major factors for the poor performance of Question Answering Systems, which emphasizes the importance of designing the question classification module well. In this article, we have proposed a question classification method that exploits the powerful semantic features of WordNet and the vast knowledge repository of Wikipedia to describe informative terms explicitly. We have trained our system over a standard set of 5500 questions (by UIUC) and then tested it over five TREC question collections. We have compared our results with some standard results reported in the literature and observed a significant improvement in the accuracy of question classification. The question classification accuracy suggests the effectiveness of the method, which is promising in the field of open-domain question classification. Judging the correctness of the answer is an important issue in the field of question answering. In this article, we are extending question classification as one of the heuristics for answer validation. We are proposing a World Wide Web based solution for answer validation where answers returned by open-domain Question Answering Systems can be validated using online resources such as Wikipedia and Google. We have applied several heuristics for the answer validation task and tested them against some popular web-based open-domain Question Answering Systems over a collection of 500 questions collected from standard sources such as TREC, the Worldbook, and the Worldfactbook. 
The proposed method seems to be promising for the automatic answer validation task. 2010 Elsevier B.V. All rights reserved. 0 0
Academics and Wikipedia: Reframing Web 2.0+ as a disruptor of traditional academic power-knowledge arrangements H. Eijkman Campus-Wide Information Systems 2010 Purpose - There is much hype about academics' attitude to Wikipedia. This paper seeks to go beyond anecdotal evidence by drawing on empirical research to ascertain how academics respond to Wikipedia and the implications these responses have for the take-up of Web 2.0+. It aims to test the hypothesis that Web 2.0+, as a platform built around the socially constructed nature of knowledge, is inimical to conventional power-knowledge arrangements in which academics are traditionally positioned as the key gatekeepers to knowledge. Design/methodology/approach - The research relies on quantitative and qualitative data to provide an evidence-based analysis of the attitudes of academics towards the student use of Wikipedia and towards Web 2.0+. These data were provided via an online survey made available to a number of universities in Australia and abroad. As well as the statistical analysis of quantitative data, qualitative data were subjected to thematic analysis using relational coding. Findings - The data by and large demonstrate that Wikipedia continues to be a divisive issue among academics, particularly within the soft sciences. However, Wikipedia is not as controversial as popular publicity would lead one to believe. Many academics use it extensively though cautiously themselves, and therefore tend to support a cautious approach to its use by students. However, evidence supports the assertion that there is an implicit if not explicit awareness among academics that Wikipedia, and possibly by extension Web 2.0+, are disruptors of conventional academic power-knowledge arrangements. Practical implications - It is clear that academics respond differently to the disruptive effects that Web 2.0+ has on the political economy of academic knowledge construction. 
Contrary to popular reports, responses to Wikipedia are not overwhelmingly focused on resistance but encompass both cautious and creative acceptance. It is becoming equally clear that the increasing uptake of Web 2.0+ in higher education makes it inevitable that academics will have to address the political consequences of this reframing of the ownership and control of academic knowledge production. Originality/value - The paper demonstrates originality and value by providing a unique, evidence-based insight into the different ways in which academics respond to Wikipedia as an archetypal Web 2.0+ application and by positioning Web 2.0+ within the political economy of academic knowledge construction. 0 0
Accuracy estimate and optimization techniques for SimRank computation Dmitry Lizorkin
Pavel Velikhov
Maxim Grinev
Denis Turdakov
VLDB Journal 2010 The measure of similarity between objects is a very useful tool in many areas of computer science, including information retrieval. SimRank is a simple and intuitive measure of this kind, based on a graph-theoretic model. SimRank is typically computed iteratively, in the spirit of PageRank. However, existing work on SimRank lacks accuracy estimation of iterative computation and has discouraging time complexity. In this paper, we present a technique to estimate the accuracy of computing SimRank iteratively. This technique provides a way to find out the number of iterations required to achieve a desired accuracy when computing SimRank. We also present optimization techniques that improve the computational complexity of the iterative algorithm from O(n^4) in the worst case to min(O(nl), O(n^3/log_2 n)), with n denoting the number of objects, and l denoting the number of object-to-object relationships. We also introduce a threshold sieving heuristic and its accuracy estimation that further improves the efficiency of the method. As a practical illustration of our techniques, we computed SimRank scores on a subset of the English Wikipedia corpus, consisting of the complete set of articles and category links. Springer-Verlag 2009. 0 0
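For readers unfamiliar with SimRank, the naive per-pair iteration that the paper sets out to optimize can be sketched as follows. This is a minimal, unoptimized illustration over a toy graph, not the authors' optimized algorithm: two nodes are similar to the extent that their in-neighbors are similar, with s(a,a) = 1.

```python
def simrank(in_neighbors, nodes, c=0.8, iterations=10):
    """Naive iterative SimRank: s(a,b) = c * average similarity over all
    pairs of in-neighbors of a and b; O(n^2) pairs per iteration, each
    pair costing up to O(n^2), hence the O(n^4) the paper improves on."""
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[(a, b)] = 1.0
                    continue
                ia, ib = in_neighbors.get(a, []), in_neighbors.get(b, [])
                if not ia or not ib:
                    new[(a, b)] = 0.0
                    continue
                total = sum(sim[(x, y)] for x in ia for y in ib)
                new[(a, b)] = c * total / (len(ia) * len(ib))
        sim = new
    return sim

# Toy graph: pages B and C are both linked to from A (A is their in-neighbor),
# so they become similar; A itself has no in-neighbors.
links = {"B": ["A"], "C": ["A"]}
scores = simrank(links, ["A", "B", "C"])  # scores[("B", "C")] -> 0.8
```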
An evaluation of medical knowledge contained in Wikipedia and its use in the LOINC database Jeff Friedlin
Clement J McDonald
Journal of the American Medical Informatics Association: JAMIA 2010 The logical observation identifiers names and codes (LOINC) database contains 55 000 terms consisting of more atomic components called parts. LOINC carries more than 18 000 distinct parts. It is necessary to have definitions/descriptions for each of these parts to assist users in mapping local laboratory codes to LOINC. It is believed that much of this information can be obtained from the internet; the first effort was with Wikipedia. This project focused on 1705 laboratory analytes (the first part in the LOINC laboratory name). Of the 1705 parts queried, 1314 matching articles were found in Wikipedia. Of these, 1299 (98.9%) were perfect matches that exactly described the LOINC part, 15 (1.14%) were partial matches (the description in Wikipedia was related to the LOINC part, but did not describe it fully), and 102 (7.76%) were mis-matches. The current release of RELMA and LOINC include Wikipedia descriptions of LOINC parts obtained as a direct result of this project. 0 0
An inside view: credibility in Wikipedia from the perspective of editors H. Francke
O. Sundin
Information Research 2010 Introduction. The question of credibility in participatory information environments, particularly Wikipedia, has been much debated. This paper investigates how editors on Swedish Wikipedia consider credibility when they edit and read Wikipedia articles. Method. The study builds on interviews with 11 editors on Swedish Wikipedia, supported by a document analysis of policies on Swedish Wikipedia. Analysis. The interview transcripts have been coded qualitatively according to the participants' use of Wikipedia and what they take into consideration in making credibility assessments. Results. The participants use Wikipedia for purposes where it is not vital that the information is correct. Their credibility assessments are mainly based on authorship, verifiability, and the editing history of an article. Conclusions. The situations and purposes for which the editors use Wikipedia are similar to other user groups, but they draw on their knowledge as members of the network of practice of wikipedians to make credibility assessments, including knowledge of certain editors and of the MediaWiki architecture. Their assessments have more similarities to those used in traditional media than to assessments springing from the wisdom of crowds. 0 1
Analyzing the Creative Editing Behavior of Wikipedia Editors: Through Dynamic Social Network Analysis Takashi Iba
Keiichi Nemoto
Bernd Peters
Peter A. Gloor
Procedia - Social and Behavioral Sciences 2010 0 0
Automatic word sense disambiguation based on document networks D.Yu. Turdakov
S.D. Kuznetsov
Programming and Computer Software 2010 In this paper, a survey of works on word sense disambiguation is presented, and the method used in the Texterra system is described. The method is based on calculation of semantic relatedness of Wikipedia concepts. A comparison of the proposed method and existing word sense disambiguation methods on various document collections is given. 2010 Pleiades Publishing, Ltd. 0 0
Beyond the legacy of the Enlightenment? Online encyclopaedias as digital heterotopias J. Haider
O. Sundin
First Monday 2010 This article explores how we can understand contemporary participatory online encyclopaedic expressions, particularly Wikipedia, in their traditional role as continuation of the Enlightenment ideal, as well as in the distinctly different space of the Internet. Firstly we position these encyclopaedias in a historical tradition. Secondly, we assign them a place in contemporary digital networks which marks them out as sites in which Enlightenment ideals of universal knowledge take on a new shape. We argue that the Foucauldian concept of heterotopia, that is special spaces which exist within society, transferred online, can serve to understand Wikipedia and similar participatory online encyclopaedias in their role as unique spaces for the construction of knowledge, memory and culture in late modern society. 0 1
Beyond vandalism: Wikipedia trolls Pnina Shachaf
Noriko Hara
Journal of Information Science English 2010 Research on trolls is scarce, but their activities challenge online communities; one of the main challenges of the Wikipedia community is to fight against vandalism and trolls. This study identifies Wikipedia trolls’ behaviours and motivations, and compares and contrasts hackers with trolls; it extends our knowledge about this type of vandalism and concludes that Wikipedia trolls are one type of hacker. This study reports that boredom, attention seeking, and revenge motivate trolls; they regard Wikipedia as an entertainment venue, and find pleasure from causing damage to the community and other people. Findings also suggest that trolls’ behaviours are characterized as repetitive, intentional, and harmful actions that are undertaken in isolation and under hidden virtual identities, involving violations of Wikipedia policies, and consisting of destructive participation in the community.
0 1
BinRank: Scaling dynamic authority-based search using materialized subgraphs Heasoo Hwang
Andrey Balmin
Berthold Reinwald
Erik Nijkamp
IEEE Transactions on Knowledge and Data Engineering 2010 Dynamic authority-based keyword search algorithms, such as ObjectRank and personalized PageRank, leverage semantic link information to provide high quality, high recall search in databases and the Web. Conceptually, these algorithms require a query-time PageRank-style iterative computation over the full graph. This computation is too expensive for large graphs, and not feasible at query time. Alternatively, building an index of precomputed results for some or all keywords involves very expensive preprocessing. We introduce BinRank, a system that approximates ObjectRank results by utilizing a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running ObjectRank on only one of the subgraphs. BinRank generates the subgraphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive non-negligible scores. The intuition is that a subgraph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that BinRank can achieve subsecond query execution time on the English Wikipedia data set, while producing high-quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 10^8 edges, which is at least two orders of magnitude larger than what prior state-of-the-art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of BinRank. 0 0
Bridging domains using world wide knowledge for transfer learning Evan Wei Xiang
Bin Cao
Derek Hao Hu
Qiang Yang
IEEE Transactions on Knowledge and Data Engineering 2010 A major problem of classification learning is the lack of ground-truth labeled data. It is usually expensive to label new data instances for training a model. To solve this problem, domain adaptation in transfer learning has been proposed to classify target domain data by using some other source domain data, even when the data may have different distributions. However, domain adaptation may not work well when the differences between the source and target domains are large. In this paper, we design a novel transfer learning approach, called BIG (Bridging Information Gap), to effectively extract useful knowledge in a worldwide knowledge base, which is then used to link the source and target domains for improving the classification performance. BIG works when the source and target domains share the same feature space but different underlying data distributions. Using the auxiliary source data, we can extract a bridge that allows cross-domain text classification problems to be solved using standard semisupervised learning algorithms. A major contribution of our work is that with BIG, a large amount of worldwide knowledge can be easily adapted and used for learning in the target domain. We conduct experiments on several real-world cross-domain text classification tasks and demonstrate that our proposed approach can outperform several existing domain adaptation approaches significantly. 0 0
Categorising Social Tags to Improve Folksonomy-based Recommendations Ivan Cantador
Ioannis Konstas
Joemon M. Jose
Web Semantics: Science, Services and Agents on the World Wide Web Pages Accepted Manuscript 2010 0 0
Characterizing and modeling the dynamics of online popularity Jacob Ratkiewicz
Santo Fortunato
Alessandro Flammini
Filippo Menczer
Alessandro Vespignani
Physical Review Letters 2010 Online popularity has an enormous impact on opinions, culture, policy, and profits. We provide a quantitative, large scale, temporal analysis of the dynamics of online content popularity in two massive model systems: the Wikipedia and an entire country's Web space. We find that the dynamics of popularity are characterized by bursts, displaying characteristic features of critical systems such as fat-tailed distributions of magnitude and interevent time. We propose a minimal model combining the classic preferential popularity increase mechanism with the occurrence of random popularity shifts due to exogenous factors. The model recovers the critical features observed in the empirical analysis of the systems analyzed here, highlighting the key factors needed in the description of popularity dynamics. 2010 The American Physical Society. 0 3
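The class of model this abstract describes, a rich-get-richer (preferential) mechanism punctuated by random exogenous popularity shifts, can be sketched in a few lines. Parameter values and function names here are illustrative assumptions, not the authors' calibrated model.

```python
import random

def simulate_popularity(n_items=50, steps=5000, shift_prob=0.01, seed=42):
    """Minimal sketch: each step, one item gains a unit of popularity.
    With probability shift_prob an exogenous event picks a random item;
    otherwise an item is chosen proportionally to its current popularity."""
    rng = random.Random(seed)
    pop = [1] * n_items
    for _ in range(steps):
        if rng.random() < shift_prob:
            target = rng.randrange(n_items)      # exogenous shift
        else:
            total = sum(pop)
            r = rng.uniform(0, total)            # preferential choice
            acc = 0
            for i, p in enumerate(pop):
                acc += p
                if r <= acc:
                    target = i
                    break
        pop[target] += 1
    return pop

pop = simulate_popularity()
```

Running this repeatedly and inspecting the distribution of `pop` (and of gaps between an item's successive gains) is one way to see the fat tails the paper reports emerging from such a mechanism.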
Chatting in the Wiki: synchronous-asynchronous integration Robert P. Biuk-Aghai
Keng Hong Lei
Instant messaging
WikiSym English 2010 Wikis have become popular platforms for collaborative writing. The traditional production mode has been remote asynchronous and supported by wiki systems geared toward both asynchronous writing and asynchronous communication. However, many people have come to rely on synchronous communication in their daily work. This paper first discusses aspects of synchronous and asynchronous activity and communication and then proposes an integration of synchronous communication facilities in wikis. A prototype system developed by the authors is briefly presented. 1 1
Chemical Information Media in the Chemistry Lecture Hall: A Comparative Assessment of Two Online Encyclopedias L Korosec
P A Limacher
H P Luthi
M P Brandle
CHIMIA 2010 The chemistry encyclopedia Rompp Online and the German universal encyclopedia Wikipedia were assessed by first-year university students on the basis of a set of 30 articles about chemical thermodynamics. Criteria with regard to both content and form were applied in the comparison; 619 ratings (48% participation rate) were returned. While both encyclopedias obtained very good marks and performed nearly equally with regard to their accuracy, the average overall mark for Wikipedia was better than for Rompp Online, which obtained lower marks with regard to completeness and length. Analysis of the results and participants' comments shows that students attach importance to completeness, length and comprehensibility rather than accuracy, and also attribute less value to the availability of sources which validate an encyclopedia article. Both encyclopedias can be promoted as a starting reference to access a topic in chemistry. However, it is recommended that instructors should insist that students do not rely solely on encyclopedia texts, but use and cite primary literature in their reports. 0 1
Comparing Methods for Single Paragraph Similarity Analysis B. Stone
S. Dennis
P. J. Kwantes
Topics in Cognitive Science 2010 The focus of this paper is two-fold. First, similarities generated from six semantic models were compared to human ratings of paragraph similarity on two datasets: 23 World Entertainment News Network paragraphs and 50 ABC newswire paragraphs. Contrary to findings on smaller textual units such as word associations (Griffiths, Tenenbaum, & Steyvers, 2007), our results suggest that when single paragraphs are compared, simple nonreductive models (word overlap and vector space) can provide better similarity estimates than more complex models (LSA, Topic Model, SpNMF, and CSM). Second, various methods of corpus creation were explored to facilitate the semantic models' similarity estimates. Removing numeric and single characters, and also truncating document length, improved performance. Automated construction of smaller Wikipedia-based corpora proved to be very effective, even improving upon the performance of corpora that had been chosen for the domain. Model performance was further improved by augmenting corpora with dataset paragraphs. 0 0
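The two "simple nonreductive models" the abstract names can be sketched in a few lines. This is a generic illustration of word overlap and bag-of-words cosine similarity; the paper's exact preprocessing and weighting are not reproduced here.

```python
import math
from collections import Counter

def word_overlap(p1, p2):
    """Word-overlap similarity: shared word types, normalized by the
    geometric mean of the two vocabularies."""
    w1, w2 = set(p1.lower().split()), set(p2.lower().split())
    return len(w1 & w2) / math.sqrt(len(w1) * len(w2)) if w1 and w2 else 0.0

def cosine(p1, p2):
    """Vector-space similarity: cosine between raw term-frequency vectors."""
    c1, c2 = Counter(p1.lower().split()), Counter(p2.lower().split())
    dot = sum(c1[w] * c2[w] for w in c1)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

s1 = "the cat sat on the mat"
s2 = "the cat ran over the mat"
overlap, cos = word_overlap(s1, s2), cosine(s1, s2)
```

Both measures need no training corpus at all, which is part of why they are attractive baselines against reductive models such as LSA.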
Constructing Commons in the Cultural Environment MJ Madison
BM Frischmann
KJ Strandburg
Cornell Law Review, 2010 This Article sets out a framework for investigating sharing and resource-pooling arrangements for information- and knowledge-based works. We argue that adapting the approach pioneered by Elinor Ostrom and her collaborators to commons arrangements in the natural environment provides a template for examining the construction of commons in the cultural environment. The approach promises to lead to a better understanding of how participants in commons and pooling arrangements structure their interactions in relation to the environments in which they are embedded, in relation to information and knowledge resources that they produce and use, and in relation to one another. Some examples of the types of arrangements we have in mind are patent pools (such as the Manufacturer's Aircraft Association), open source software development projects (such as Linux), Wikipedia, the Associated Press, certain jamband communities, medieval guilds, and modern research universities. These examples are illustrative and far from exhaustive. Each involves a constructed cultural commons worthy of independent study, but independent studies get us only so far. A more systematic approach is needed. An improved understanding of cultural commons is critical for obtaining a more complete perspective on intellectual property doctrine and its interactions with other legal and social mechanisms for governing creativity and innovation, in particular, and information and knowledge production, conservation, and consumption, generally. We propose an initial framework for evaluating and comparing the contours of different commons arrangements. The framework will allow us to develop an inventory of structural similarities and differences among cultural commons in different industries, disciplines, and knowledge domains and shed light on the underlying contextual reasons for such differences. 
Structural inquiry into a series of case studies will provide a basis for developing theories to explain the emergence, form, and stability of the observed variety of cultural commons and, eventually, to design models to explicate and inform institutional design. The proposed approach would draw upon case studies from a wide range of disciplines. Among other things, we argue that theoretical approaches to constructed cultural commons must account for the construction and use of pooled resources, internal licensing conditions, management of external relationships, and institutional forms, along with the degree of collaboration among members, sharing of human capital, degrees of integration among participants, and any specified purpose of the arrangement. 0 0
Corrigendum to "Wikipedia workload analysis for decentralized hosting" [Computer Networks 53 (11) (2009) 1830-1845] (DOI: 10.1016/j.comnet.2009.02.019) Guido Urdaneta
Guillaume Pierre
Maarten van Steen
Computer Networks 2010 0 0
Creative Commons International The International License Porting Project Catharina Maracke Jipitec 2010 When Creative Commons (CC) was founded in 2001, the core Creative Commons licenses were drafted according to United States Copyright Law. Since their first introduction in December 2002, Creative Commons licenses have been enthusiastically adopted by many creators, authors, and other content producers, not only in the United States but in many other jurisdictions as well. Global interest in the CC licenses prompted a discussion about the need for national versions of the CC licenses. To best address this need, the international license porting project ("Creative Commons International", formerly known as "International Commons") was launched in 2003. Creative Commons International works to port the core Creative Commons licenses to different copyright legislations around the world. The porting process includes both linguistically translating the licenses and legally adapting the licenses to a particular jurisdiction such that they are comprehensible in the local jurisdiction and legally enforceable but concurrently retain the same key elements. Since its inception, Creative Commons International has found many supporters all over the world. With Finland, Brazil, and Japan as the first completed jurisdiction projects, experts around the globe have followed their lead and joined the international collaboration with Creative Commons to adapt the licenses to their local copyright. This article aims to present an overview of the international porting process, explain and clarify the international license architecture, its legal and promotional aspects, as well as its most recent challenges. 0 0
Cross-cultural analysis of the Wikipedia community Noriko Hara
Pnina Shachaf
Khe Foon Hew
Communities of practice
Cross cultural aspects
Non English languages
User behavior
Journal of the American Society for Information Science and Technology
English 2010 This article reports a cross-cultural analysis of four Wikipedias in different languages and demonstrates their roles as communities of practice (CoPs). Prior research on CoPs and on the Wikipedia community often lacks cross-cultural analysis. Despite the fact that over 75% of Wikipedia is written in languages other than English, research on Wikipedia primarily focuses on the English Wikipedia and tends to overlook Wikipedias in other languages. This article first argues that Wikipedia communities can be analyzed and understood as CoPs. Second, norms of behaviors are examined in four Wikipedia languages (English, Hebrew, Japanese, and Malay), and the similarities and differences across these four languages are reported. Specifically, typical behaviors on three types of discussion spaces (talk, user talk, and Wikipedia talk) are identified and examined across languages. Hofstede's dimensions of cultural diversity as well as the size of the community and the function of each discussion area provide lenses for understanding the similarities and differences. As such, this article expands the research on online CoPs through an examination of cultural variations across multiple CoPs and increases our understanding of Wikipedia communities in various languages. 0 4
Cross-language plagiarism detection Martin Potthast
Alberto Barrón-Cedeño
Benno Stein
Paolo Rosso
Language Resources and Evaluation 2010 0 0
Crowdsourcing: How and Why Should Libraries Do It? R. Holley D-Lib Magazine 2010 The definition and purpose of crowdsourcing and its relevance to libraries is discussed with particular reference to the Australian Newspapers service, FamilySearch, Wikipedia, Distributed Proofreaders, Galaxy Zoo and The Guardian MPs' Expenses Scandal. These services have harnessed thousands of digital volunteers who transcribe, create, enhance and correct text, images and archives. Known facts about crowdsourcing are presented and helpful tips and strategies for libraries beginning to crowdsource are given. 0 0
"Cursed with self-awareness": gender-bending Suzanne M. Daughton Subversion Women's Studies in Communication 2010 0 0
Design and Development of a Fluid Intelligence Instrument for a technology-enhanced PBL Programme Khar Thoe Ng
Soon Fook Fong
Seng Thah Soon
Global Learn Asia Pacific 2010 0 0
Development of Dermatology Resources in Wikipedia Brendan M. Thomas
Michaël R. Laurent
Michael Martone
Collaborative editing
Dermatology task force
Article quality and accuracy.
Skin & Aging, Issue 9, September Discusses the high rank Wikipedia medicine-related articles have on search engines and focuses on dermatology articles. 2010 0 0
Digital history: all contributions welcome. Nick Poyntz History Today 2010 This article looks at the opportunities and potential perils for historians brought about by the enormous growth in user-generated content on the internet. Developments such as the wiki enable the sharing of information and resources in new ways, one example being the YourArchives site provided by the National Archives since 2007. In terms of both its size and the amount of controversy it generates, Wikipedia, the online encyclopedia, surpasses all other secondary sources, and in using it historians need to be as cautious and as careful as they are when assessing the reliability of information contained in any primary source. Databases of photos and moving images such as Flickr and YouTube are certain to become essential tools for historians seeking sources on life in the early 21C but effective use of them depends on accurate written descriptions provided with the images. The system known as Captcha, which ensures comments are not generated by computer programmes, is capable of digitising a huge volume of printed primary sources. (Quotes from original text) 0 0
Distributed biomedical terminology development: from experiments to open process C G Chute Yearbook of Medical Informatics 2010 OBJECTIVE: Can social computing efforts materially alter the distributed creation and maintenance of complex biomedical terminologies and ontologies; a review of distributed authoring history and status. BACKGROUND: Social computing projects, such as Wikipedia, have dramatically altered the perception and reality of large-scale content projects and the labor required to create and maintain them. Health terminologies have become large, complex, interdependent content artifacts of increasing importance to biomedical research and the community's understanding of biology, medicine, and optimal healthcare practices. The question naturally arises as to whether social computing models and distributed authoring platforms can be applied to the voluntary, distributed authoring of high-quality terminologies and ontologies. METHODS: An historical review of distributed authoring developments. RESULTS: The trajectory of description logic-driven authoring tools, group process, and web-based platforms suggests that public distributed authoring is likely feasible and practical; however, no compelling example on the order of Wikipedia is yet extant. Nevertheless, several projects, including the Gene Ontology and the new revision of the International Classification of Disease (ICD-11), hold promise. 0 0
Consistency without concurrency control in large, dynamic systems Mihai Letia
Nuno Preguica
Marc Shapiro
SOSP Workshop on Large Scale Distributed Systems and Middleware (LADIS) 2010 Replicas of a commutative replicated data type (CRDT) eventually converge without any complex concurrency control. We validate the design of a non-trivial CRDT, a replicated sequence, with performance measurements in the context of Wikipedia. Furthermore, we discuss how to eliminate a remaining scalability bottleneck: Whereas garbage collection previously required a system-wide consensus, here we propose a flexible two-tier architecture and a protocol for migrating between tiers. We also discuss how the CRDT concept can be generalised, and its limitations. 0 0
Dynamics of social roles in a knowledge management community Isa Jahnke Computers in Human Behavior 2010 With the emergence of community-oriented Information and Communication Technology (ICT) applications, e.g., Wikipedia, the popularity of socio-technical phenomena in society has increased. This development emphasises the need to further our understanding of how computer-supported social group structures change over time and what forms emerge. This contribution presents the results of a qualitative field study of a Socio-Technical Community (STC). The STC is described from its founding (in 2001) to its sustainable development (in 2006) as well as its transformation phase (2007-2008). The design-based research approach revealed changes of social structures by social roles within the STC over time. The central conclusion is that such STCs - networks of computer-mediated communication and human interaction - evolve a specific kind of social structure, which is formal rather than informal. The results indicate that a group evolves from an informal trust-based community with few formal roles to an STC where the social mechanisms, and not the software architecture, support knowledge management processes. 0 1
E-Partnerships: Library information acquisition in the comfort of students' digital homes Eva Dobozy
Julia Gross
Global Learn Asia Pacific 2010 0 0
ELearning at a higher education institution: Exponential growth and pain Juliet Stoltenkamp
Tasneem Taliep
Norina Braaf
Okasute Kasuto
Global Learn Asia Pacific 2010 0 0
Education and consumer informatics C Boyer Yearbook of Medical Informatics 2010 OBJECTIVES: To evaluate the extent to which the Internet is accessed for health information and perceived as useful by varying groups classified primarily according to age. METHOD: Synopsis of the articles on education and consumer health informatics selected for the IMIA Yearbook of Medical Informatics 2010. RESULTS: A growing number of individuals are actively seeking health information through a varying selection of resources. The Internet is now seen as a major source of health information alongside books and other paper-based literature. However, it is not clear how the Internet is perceived by different groups, such as those from differing age groups. CONCLUSION: The papers selected attempt to obtain a better understanding of how the public perceives and uses the Internet as an information-gathering tool, especially for health information. The papers also explore how the Internet is used by different groups of people. As not all online health information is of uniform quality, it is important to access and rely on quality medical information. This issue is also dealt with, where the popularity of Wikipedia is measured against the popularity of reliable web sources such as Medline Plus and NHS Direct. 0 0
Establishing a K-12 circuit design program Mustafa M. Inceoglu IEEE Transactions on Education 2010 Outreach, as defined by Wikipedia, is an effort by an organization or group to connect its ideas or practices to the efforts of other organizations, groups, specific audiences, or the general public. This paper describes a computer engineering outreach project of the Department of Computer Engineering at Ege University, Izmir, Turkey, to a local elementary school. A group of 14 K-12 students was chosen by a four-stage selection method to participate in this project. This group was then taught discrete mathematics and logic design courses from the core curriculum of the Computer Engineering program. The two 11-week courses have a total of 132 contact hours. The course contents are conveyed through both theoretical lessons and laboratory sessions. All of the laboratory sessions were carried out by K-12 students. Volunteer teachers from the elementary school participated in the project. The evaluations carried out during and at the end of the project indicated the degree of satisfaction on the part of students and teachers. The project is still ongoing with the same methodology in its third year. 0 0
Estimating deep web data source size by capture---recapture method Jianguo Lu
Dingding Li
Information retrieval 2010 0 0
Evaluating quality control of Wikipedia's featured articles D. Lindsey First Monday 2010 The purpose of this study was to evaluate the effectiveness of Wikipedia's premier internal quality control mechanism, the "featured article" process, which assesses articles against a stringent set of criteria. To this end, scholars were asked to evaluate the quality and accuracy of Wikipedia featured articles within their area of expertise. A total of 22 usable responses were collected from a variety of disciplines. Of the Wikipedia articles assessed, only 12 of 22 were found to pass Wikipedia's own featured article criteria, indicating that Wikipedia's process is ineffective. This finding suggests both that Wikipedia must take steps to improve its featured article process and that scholars interested in studying Wikipedia should be careful not to naively believe its assertions of quality. 0 0
Experiencing a Context Aware Learning and Teaching Tool Selby Markham
Shonali Krishnaswami
John Hurst
Steven Cunningham
Behrang Saeedzadeh
Brett Gillick
Cyril Labbe
Global Learn Asia Pacific 2010 0 0
Expert-Built and Collaboratively Constructed Lexical Semantic Resources Iryna Gurevych
Elisabeth Wolf
Language and Linguistics Compass 2010 0 0
Exploring the Benefits and Challenges of Using Laptops in Higher Education Classrooms Robin Kay
Sharon Lauricella
Global Learn Asia Pacific 2010 0 0
Extracting content holes by comparing community-type content with Wikipedia Akiyo Nadamoto
Eiji Aramaki
Takeshi Abekawa
Yohei Murakami
International Journal of Web Information Systems 2010 0 0
Extraction, selection and ranking of Field Association (FA) Terms from domain-specific corpora for building a comprehensive FA terms dictionary Tshering Dorji
El sayed Atlam
Susumu Yata
Masao Fuketa
Kazuhiro Morita
Jun ichi Aoe
Knowledge and Information Systems 2010 0 0
From Town-Halls to Wikis: Exploring Wikipedia's Implications for Deliberative Democracy. NJ Klemp Journal of Public Deliberation 2010 0 0
Google Analytics for measuring website performance Beatriz Plaza Tourism Management Pages Corrected Proof 2010 0 0
Governance of Massive Multiauthor Collaboration — Linux, Wikipedia, and Other Networks: Governed by Bilateral Contracts, Partnerships, or Something in Between? Dan Wielsch Wikipedia licensing update wikis as decentralized networks Jipitec, No. 2 (2010) 96 2010 Open collaborative projects are moving to the foreground of knowledge production. Some online user communities develop into longterm projects that generate a highly valuable and at the same time freely accessible output. Traditional copyright law that is organized around the idea of a single creative entity is not well equipped to accommodate the needs of these forms of collaboration. In order to enable a peculiar network-type of interaction participants instead draw on public licensing models that determine the freedoms to use individual contributions. With the help of these access rules the operational logic of the project can be implemented successfully. However, as the case of the Wikipedia GFDL-CC license transition demonstrates, the adaptation of access rules in networks to new circumstances raises collective action problems and suffers from pitfalls caused by the fact that public licensing is grounded in individual copyright. Legal governance of open collaboration projects is a largely unexplored field. The article argues that the license steward of a public license assumes the position of a fiduciary of the knowledge commons generated under the license regime. Ultimately, the governance of decentralized networks translates into a composite of organizational and contractual elements. It is concluded that the production of global knowledge commons relies on rules of transnational private law. 0 0
Governance of Massive Multiauthor Collaboration — Linux, Wikipedia, and Other Networks: Governed by Bilateral Contracts, Partnerships, or Something in Between? Dan Wielsch Jipitec 2010 Open collaborative projects are moving to the foreground of knowledge production. Some online user communities develop into longterm projects that generate a highly valuable and at the same time freely accessible output. Traditional copyright law that is organized around the idea of a single creative entity is not well equipped to accommodate the needs of these forms of collaboration. In order to enable a peculiar network-type of interaction participants instead draw on public licensing models that determine the freedoms to use individual contributions. With the help of these access rules the operational logic of the project can be implemented successfully. However, as the case of the Wikipedia GFDL-CC license transition demonstrates, the adaptation of access rules in networks to new circumstances raises collective action problems and suffers from pitfalls caused by the fact that public licensing is grounded in individual copyright. Legal governance of open collaboration projects is a largely unexplored field. The article argues that the license steward of a public license assumes the position of a fiduciary of the knowledge commons generated under the license regime. Ultimately, the governance of decentralized networks translates into a composite of organizational and contractual elements. It is concluded that the production of global knowledge commons relies on rules of transnational private law. 0 0
How can contributors to open-source communities be trusted? On the assumption, inference, and substitution of trust P. B de Laat Ethics and Information Technology 2010 Open-source communities that focus on content rely squarely on the contributions of invisible strangers in cyberspace. How do such communities handle the problem of trusting that strangers have good intentions and adequate competence? This question is explored in relation to communities in which such trust is a vital issue: peer production of software (FreeBSD and Mozilla in particular) and encyclopaedia entries (Wikipedia in particular). In the context of open-source software, it is argued that trust was inferred from an underlying 'hacker ethic', which already existed. The Wikipedian project, by contrast, had to create an appropriate ethic along the way. In the interim, the assumption simply had to be that potential contributors were trustworthy; they were granted 'substantial trust'. Subsequently, projects from both communities introduced rules and regulations which partly substituted for the need to perceive contributors as trustworthy. They faced a design choice in the continuum between a high-discretion design (granting a large amount of trust to contributors) and a low-discretion design (leaving only a small amount of trust to contributors). It is found that open-source designs for software and encyclopaedias are likely to converge in the future towards a mid-level of discretion. In such a design the anonymous user is no longer invested with unquestioning trust. 0 1
How today's college students use Wikipedia for course-related research A.J. Head
M.B. Eisenberg
First Monday 2010 Findings are reported from student focus groups and a large-scale survey about how and why students (enrolled at six different U.S. colleges) use Wikipedia during the course-related research process. A majority of respondents frequently used Wikipedia for background information, but less often than they used other common resources, such as course readings and Google. Architecture, engineering, and science majors were more likely to use Wikipedia for course-related research than respondents in other majors. The findings suggest Wikipedia is used in combination with other information resources. Wikipedia meets the needs of college students because it offers a mixture of coverage, currency, convenience, and comprehensibility in a world where credibility is less of a given or an expectation from today's students. 0 0
How today’s college students use Wikipedia for course-related research Alison J. Head
Michael B. Eisenberg
Student use of Wikipedia First Monday, No. 3 (March 2010) 2010 Findings are reported from student focus groups and a large-scale survey about how and why students (enrolled at six different U.S. colleges) use Wikipedia during the course-related research process. A majority of respondents frequently used Wikipedia for background information, but less often than they used other common resources, such as course readings and Google. Architecture, engineering, and science majors were more likely to use Wikipedia for course-related research than respondents in other majors. The findings suggest Wikipedia is used in combination with other information resources. Wikipedia meets the needs of college students because it offers a mixture of coverage, currency, convenience, and comprehensibility in a world where credibility is less of a given or an expectation from today’s students. 0 2
INQUIRY EVALUATION. Debbie Abilock Knowledge Quest 2010 The article focuses on the series of judgment calls that end in a summative assessment of credibility, aimed at librarians teaching evaluation to students as inquiry. It presents a model for credibility assessment which is an iterative process and is based on several factors. It refers to Wikipedia's list of projects (Wikimedia 2009c) for ideas from educators on teaching evaluation. 0 0
Identifying and understanding the problems of Wikipedia's peer governance: The case of inclusionists versus deletionists V. Kostakis First Monday 2010 Wikipedia has been hailed as one of the most prominent peer projects that led to the rise of the concept of peer governance. However, criticism has been levelled against Wikipedia's mode of governance. This paper, using the Wikipedia case as a point of departure and building upon the conflict between inclusionists and deletionists, tries to identify and draw some conclusions on the problematic issue of peer governance. 0 1
Identifying the borders of mathematical knowledge F.N. Silva
B.A.N. Travencolo
M.P. Viana
L. da Fontoura Costa
Journal of Physics A: Mathematical and Theoretical 2010 Based on a divide and conquer approach, knowledge about nature has been organized into a set of interrelated facts, allowing a natural representation in terms of graphs: each `chunk' of knowledge corresponds to a node, while relationships between such chunks are expressed as edges. This organization becomes particularly clear in the case of mathematical theorems, with their intense cross-implications and relationships. We have derived a web of mathematical theorems from Wikipedia and, thanks to the powerful concept of entropy, identified its more central and frontier elements. Our results also suggest that the central nodes are the oldest theorems, while the frontier nodes are those recently added to the network. The network communities have also been identified, allowing further insights about the organization of this network, such as its highly modular structure. 0 0
Image Interpretation Using Large Corpus: Wikipedia M. Rahurkar
S.-F. Tsai
C. Dagli
T.S. Huang
Proceedings of the IEEE 2010 Image is a powerful medium for expressing one's ideas and rightly confirms the adage, "One picture is worth a thousand words." In this work, we explore the application of world knowledge in the form of Wikipedia to achieve this objective, literally. In the first part, we disambiguate and rank semantic concepts associated with ambiguous keywords by exploiting the link structure of articles in Wikipedia. In the second part, we explore an image representation in terms of keywords which reflect the semantic content of an image. Our approach is inspired by the desire to augment low-level image representation with massive amounts of world knowledge, to facilitate computer vision tasks like image retrieval based on this information. We represent an image as a weighted mixture of a predetermined set of concrete concepts whose definition has been agreed upon by a wide variety of audience. To achieve this objective, we use concepts defined by Wikipedia articles, e.g., sky, building, or automobile. An important advantage of our approach is the availability of vast amounts of highly organized human knowledge in Wikipedia. Wikipedia evolves rapidly, steadily increasing its breadth and depth over time. 0 0
Improving Science Education and Understanding through Editing Wikipedia CL Moy
JR Locke
BP Coppola
AJ McNeil
Improving Wikipedia's credibility: References and citations in a sample of history articles Brendan Luyt
Daniel Tan
Bibliographic citations
Hypermedia authoring
Information literacy
J. Am. Soc. Inf. Sci. Technol. English 2010 This study evaluates how well the authors of Wikipedia history articles adhere to the site’s policy of assuring verifiability through citations. It does so by examining the references and citations of a subset of country histories. The findings paint a dismal picture. Not only are many claims not verified through citations, those that are suffer from the choice of references used. Many of these are from only a few US government Websites or news media and few are to academic journal material. Given these results, one response would be to declare Wikipedia unsuitable for serious reference work. But another option emerges when we jettison technological determinism and look at Wikipedia as a product of a wider social context. Key to this context is a world in which information is bottled up as commodities requiring payment for access. Equally important is the problematic assumption that texts are undifferentiated bearers of knowledge. Those involved in instructional programs can draw attention to the social nature of texts to counter these assumptions and by so doing create an awareness for a new generation of Wikipedians and Wikipedia users of the need to evaluate texts (and hence citations) in light of the social context of their production and use. 11 2
Improving wikipedia's credibility: References and citations in a sample of history articles Brendan Luyt
Daniel Tan
Journal of the American Society for Information Science and Technology 2010 This study evaluates how well the authors of Wikipedia history articles adhere to the site's policy of assuring verifiability through citations. It does so by examining the references and citations of a subset of country histories. The findings paint a dismal picture. Not only are many claims not verified through citations, those that are suffer from the choice of references used. Many of these are from only a few US government Websites or news media and few are to academic journal material. Given these results, one response would be to declare Wikipedia unsuitable for serious reference work. But another option emerges when we jettison technological determinism and look at Wikipedia as a product of a wider social context. Key to this context is a world in which information is bottled up as commodities requiring payment for access. Equally important is the problematic assumption that texts are undifferentiated bearers of knowledge. Those involved in instructional programs can draw attention to the social nature of texts to counter these assumptions and by so doing create an awareness for a new generation of Wikipedians and Wikipedia users of the need to evaluate texts (and hence citations) in light of the social context of their production and use. 0 2
Individual focus and knowledge contribution L.A. Adamic
Xiao Wei
Jiang Yang
S. Gerrish
K.K. Nam
G.S. Clarkson
First Monday 2010 Before contributing new knowledge, individuals must attain requisite background knowledge or skills through schooling, training, practice, and experience. Given limited time, individuals often choose either to focus on few areas, where they build deep expertise, or to delve less deeply and distribute their attention and efforts across several areas. In this paper we measure the relationship between the narrowness of focus and the quality of contribution across a range of both traditional and recent knowledge sharing media, including scholarly articles, patents, Wikipedia, and online question and answer forums. Across all systems, we observe a small but significant positive correlation between focus and quality. 0 1
Industrial ecology 2.0 Chris Davis
Igor Nikolic
Gerard P.J. Dijkema
Journal of Industrial Ecology 2010 Summary: Industrial ecology (IE) is an ambitious field of study where we seek to understand systems using a wide perspective ranging from the scale of molecules to that of the planet. Achieving such a holistic view is challenging and requires collecting, processing, curating, and sharing immense amounts of data and knowledge. We are not capable of fully achieving this due to the current state of tools used in IE and current community practices. Although we deal with a vastly interconnected world, we are not so good at efficiently interconnecting what we learn about it. This is not a problem unique to IE, and other fields have begun to use tools supported by the World Wide Web to meet these challenges. We discuss these sets of tools and illustrate how community driven data collection, processing, curation, and sharing is allowing people to achieve more than ever before. In particular, we discuss standards that have been created to allow for interlinking of data dispersed across multiple Web sites. This is currently visible in the Linking Open Data initiative, which among others contains interlinked datasets from the U.S. and U.K. governments, biology databases, and Wikipedia. Since the types of technologies and standards involved are outside the normal scope of work by many industrial ecologists, we attempt to explain the relevance, implications, and benefits through a discussion of many real examples currently on the Web. From these, we discuss several best practices, which can be enabling factors for how IE and the community can more efficiently and effectively meet its ambitions: an agenda for Industrial Ecology 2.0. 0 0
Information seeking with Wikipedia on the iPod Touch J. Hahn Reference Services Review 2010 Purpose - The purpose of this paper is to present the results of a usability study which inquired into undergraduate student information seeking with Wikipedia on the iPod touch. Design/methodology/approach - Data are drawn from iPod search logs and student survey responses. Search log data are coded with FRBR subject entities (group 3 entity sets) for analysis. Findings - Students characterize the overall nature of information searched for with the Wikipedia app to be recreational and short factual information. Recreational searching as a way in which undergraduate students utilize mobile technology is an earlier finding of Wikipedia iPod usage, and is verified as a trend of undergraduate student search using the iPod. All undergraduate student participants of the Wikipedia app on a mobile interface report this tool as helping them become more efficient in their research. Students viewed Wikipedia articles about people and concepts more so than other article types. Originality/value - Undergraduate student mobile search log analysis over a specific type of information resource on the iPod Touch is an original usability project. Previous mobile search log analysis analyzes thousands of unknown users and millions of anonymous queries, where the devices used for searching are not always identifiable and trends about touch screens cannot be ascertained. 0 0
Informed Investors and the Internet A. Rubin
E. Rubin
Journal of Business Finance & Accounting 2010 During the last decade the Internet has become an increasingly important source for gathering company-related information. We employ Wikipedia editing frequency as an instrument that captures the degree to which the population is engaged with the processing of company-related information. We find that firms whose information is processed by the population more frequently are associated with lower analysts' forecast errors, smaller analysts' forecast dispersions, and significant changes in bid-ask spreads on analysts' recommendation days. These results indicate that information processing over the Internet is related to the degree to which investors and analysts are informed about companies. 0 0
Interactive visualization for opportunistic exploration of large document collections Simon Lehmann
Ulrich Schwanecke
Ralf Dorner
Information Systems 2010 Finding relevant information in a large and comprehensive collection of cross-referenced documents like Wikipedia usually requires a quite accurate idea of where to look for the pieces of data being sought. A user might not yet have enough domain-specific knowledge to form a precise search query to get the desired result on the first try. Another problem arises from the usually highly cross-referenced structure of such document collections. When researching a subject, users usually follow some references to get additional information not covered by a single document. With each document, more opportunities to navigate are added and the structure and relations of the visited documents get harder to understand. This paper describes the interactive visualization Wivi which enables users to intuitively navigate Wikipedia by visualizing the structure of visited articles and emphasizing relevant other topics. Combining this visualization with a view of the current article results in a custom browser specially adapted for exploring large information networks. By visualizing the potential paths that could be taken, users are invited to read up on subjects relevant to the current point of focus and thus opportunistically find relevant information. Results from a user study indicate that this visual navigation can be easily used and understood. A majority of the participants of the study stated that this method of exploration supports them in finding information in Wikipedia. © 2009 Elsevier B.V. All rights reserved. 0 0
Internet Research: The Question of Method: A Keynote Address from the YouTube and the 2008 Election Cycle in the United States Conference Richard Rogers Journal of Information Technology & Politics 2010 Digital studies on culture may be distinguished from cultural studies of the digital, at least in terms of method. This lecture takes up the question of the distinctiveness of "digital methods" for researching Internet cultures. It asks, initially, should the methods of study change, however slightly or wholesale, given the specificity of the new medium? The larger digital methods project thereby engages with "virtual methods," the current, dominant "e-science" approach to the study of the Internet, and the consequences for research of importing standard methods from the social sciences in particular. What kinds of contributions are made to digital media studies, and the Internet in particular, when traditional methods are imported from the social sciences and the humanities onto the medium? Which research opportunities are foreclosed? Second, I ask, what kinds of new approaches are worthwhile, given an emphasis on the "natively digital" as opposed to digitization? The goal is also to change the focus of humanities and humanities computing away from the opportunities afforded by transforming ink into bits. The effort is to develop the study of natively digital objects (the link, the tag, etc.) and devices (engines and other recommendation machines) that make use of them. After critically reviewing existing approaches to the study of the digital, which largely import method onto the medium, I subsequently propose research strategies that follow the medium. How can one learn from methods in the medium, and repurpose them for social and cultural research? The lecture launches a novel strand of study: digital methods. 0 0
Japanese-Chinese Information Retrieval With an Iterative Weighting Scheme C C Lin
Y C Wang
R T H Tsai
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 2010 This paper describes our Japanese-Chinese cross-language information retrieval system. We adopt a "query-translation" approach and employ both a conventional Japanese-Chinese bilingual dictionary and Wikipedia to translate query terms. We propose that Wikipedia can be regarded as a good dictionary for named entity translation. According to the nature of the Japanese writing system, we propose that query terms should be processed differently based on their written forms. We use an iterative method for weight-tuning and term disambiguation which is based on the PageRank algorithm. When evaluating on the NTCIR-5 test set, our system achieves as high as 0.2217 and 0.2276 in relaxed MAP (Mean Average Precision) measurement of T-runs and D-runs. 0 0
Keyphrase extraction based on topic relevance and term association Decong Li
Sujian Li
Wenjie Li
Congyun Gu
Yun Li
Journal of Information and Computational Science 2010 Keyphrases are concise representations of documents and usually are extracted directly from the original text. This paper proposes a novel approach to extract keyphrases. The method defines two metrics, named topic relevance and term association respectively, for determining whether a term is a keyphrase. Using Wikipedia knowledge and betweenness computation, we compute these two metrics and combine them to extract important phrases from the text. Experimental results show the effectiveness of the proposed approach for keyphrase extraction. Copyright 2010 Binary Information Press. 0 0
LEGITIMIZING WIKIPEDIA -- How US national newspapers frame and use the online encyclopedia in their coverage Marcus Messner
Jeff South
Journalism Practice 2010 Within only a few years, the collaborative online encyclopedia Wikipedia has become one of the most popular websites in the world. At the same time, Wikipedia has become the subject of much controversy because of inaccuracies and hoaxes found in some of its entries. Journalists, therefore, have remained skeptical about the reliability and accuracy of Wikipedia's information, despite the fact that research has consistently shown an overall high level of accuracy compared to traditional encyclopedias. This study analyzed the framing of Wikipedia and its use as a news source by five US national newspapers over an eight-year period. A content analysis of 1486 Wikipedia references in The New York Times, The Washington Post, The Wall Street Journal, USA Today and The Christian Science Monitor found that Wikipedia is framed predominantly as neutral and positive, and that it is increasingly used as a news source. By framing Wikipedia as credible and accurate, the newspapers help legitimize the use of the online encyclopedia. By allowing Wikipedia to influence their news agendas as a source, the newspapers confirm the growing reliability of Wikipedia. 0 1
Learning to rank with (a lot of) word features Bing Bai
Jason Weston
David Grangier
Ronan Collobert
Kunihiko Sadamasa
Yanjun Qi
Olivier Chapelle
Kilian Weinberger
Information retrieval 2010 In this article we present Supervised Semantic Indexing which defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI our models are trained from a supervised signal directly on the ranking task of interest, which we argue is the reason for our superior results. As the query and target texts are modeled separately, our approach is easily generalized to different retrieval tasks, such as cross-language retrieval or online advertising placement. Dealing with models over all pairs of word features is computationally challenging. We propose several improvements to our basic model for addressing this issue, including low rank (but diagonal preserving) representations, correlated feature hashing and sparsification. We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods. 0 0
Librarian Perception of Wikipedia: Threats or Opportunities for Librarianship? B Luyt
Y Ally
NH Low
NB Ismail
LIBRI 2010 The rapid rise of Wikipedia as an information source has placed the traditional role of librarians as information gatekeepers and guardians under scrutiny, with much of the professional literature suggesting that librarians are polarized over the issue of whether Wikipedia is a useful reference tool. This qualitative study examines the perceptions and behaviours of National Library Board (NLB) of Singapore librarians with regard to information seeking and usage of Wikipedia. It finds that instead of polarized attitudes, most librarians, although cautious about using Wikipedia in their professional capacity, hold a range of generally positive attitudes towards the online encyclopaedia, believing that it has a valid role to play in the information seeking of patrons today. This is heartening because it suggests the existence within the librarian population of attitudes that can be tapped to engage constructively with Wikipedia. Three of these in particular are briefly discussed at the end of the article: Wikipedia's ability to appeal to the so-called digital natives, its role as a source of non-Western information, and its potential to enable a revitalization of the role of librarians as public intellectuals contributing to a democratic information commons. 0 0
Logic-Based Question Answering Ulrich Furbach
Ingo Glöckner
Hermann Helbig
Björn Pelzer
KI - Künstliche Intelligenz 2010 0 0
Logoot-undo: Distributed collaborative editing system on P2P networks Stephane Weiss
Pascal Urso
Pascal Molli
IEEE Transactions on Parallel and Distributed Systems 2010 Peer-to-peer systems provide scalable content distribution cheaply and resist censorship attempts. However, P2P networks mainly distribute immutable content and provide poor support for highly dynamic content such as that produced by collaborative systems. A new class of algorithms called CRDT (Commutative Replicated Data Type), which ensures consistency of highly dynamic content on P2P networks, is emerging. However, while existing CRDT algorithms support the "edit anywhere, anytime" feature, they do not support the "undo anywhere, anytime" feature. In this paper we present the Logoot-Undo CRDT algorithm, which integrates the "undo anywhere, anytime" feature. We compare the performance of the proposed algorithm with related algorithms and measure the impact of the undo feature on the global performance of the algorithm. We prove that the cost of the undo feature remains low on a corpus of data extracted from Wikipedia. 0 0
Making History: The Changing Face of the Profession in Britain Ross McKibbin English Historical Review 2010 0 0
Mediating at the Student-Wikipedia Intersection. Angela Doucet Rand Journal of Library Administration /08/2011 2010 Wikipedia is a free online encyclopedia. The encyclopedia is openly edited by registered users. Wikipedia editors can edit their own and others' entries, and some abuse of this editorial power has been unveiled. Content authors have also been criticized for publishing less than accurate content. Educators and students acknowledge casual use of Wikipedia in spite of its perceived inaccuracies. Use of the online encyclopedia as a reference resource in scholarly papers is still debated. The increasing popularity of Wikipedia has led to an influx of research articles analyzing the validity and content of the encyclopedia. This study provides an analysis of relevant articles on academic use of Wikipedia. This analysis attempts to summarize the status of Wikipedia in relation to the scope (breadth) and depth of its contents and looks at content validity issues that are of concern to the use of Wikipedia for higher education. The study seeks to establish a reference point from which educators can make informed decisions about scholarly use of Wikipedia as a reference resource. 0 0
Meeting Student Writers Where They Are: Using Wikipedia to Teach Responsible Scholarship P. Patch Teaching English in the Two-Year College 2010 0 0
Mirage of us: a reflection on the role of the web in widening access to references on Southern African arts, culture and heritage Graham Stewart Tydskrif vir Letterkunde 2010 0 0
NORMATIVE BEHAVIOUR IN WIKIPEDIA Christopher Goldspink Information 2010 This paper examines the effect of norms and rules on editor communicative behaviour in Wikipedia. Specifically, processes of micro-coordination through speech acts are examined as a basis for norm establishment, maintenance, reinforcement and effectiveness. This is pursued by analysing discussion pages taken from a sample of controversial and featured articles. The results reveal some unexpected patterns. Despite the Wikipedia community generating a large number of rules, etiquettes and guidelines, the explicit invocation of rules and/or the use of wider social norms is rare and appears to play a very small role in influencing editor behaviour. The emergent pattern of communicative exchange is not well aligned either with rules established by Wikipedia contributors or with the characteristics of a coherent community, nor is it consistent with the behaviour needed to reach agreement on controversial topics. The paper concludes by offering some tentative hypotheses as to why this may be so and outlines possible future research which may help distinguish between alternatives. Adapted from the source document. 0 1
Negative Selection of Written Language Using Character Multiset Statistics Pöllä
Timo Honkela
Journal of Computer Science and Technology 2010 0 0
Of Descartes And Of Train Schedules: Evaluating The Encyclopedia Judaica, Wikipedia, And Other General And Jewish Studies Encyclopedias R.S. Kohn Library Review 2010 Purpose - The purpose of this paper is to discuss the second edition of the Encyclopaedia Judaica (2007) within its broader historical context of the production of encyclopedias in the twentieth and the twenty-first centuries. The paper contrasts the 2007 edition of the Encyclopaedia Judaica to the Jewish Encyclopedia published between 1901 and 1905, and to the first edition of the Encyclopaedia Judaica published in 1972; then contrasts the 2007 edition of the Encyclopaedia Judaica to Wikipedia and to other projects of online encyclopedias. Design/methodology/approach - The paper provides a personal reflective review of the sources in question. Findings - That Encyclopaedia Judaica in its latest edition does not adequately replace the original first edition in terms of depth of scholarly work. It is considered that the model offered by Wikipedia could work well for the Encyclopaedia Judaica, allowing it to retain the core of the expert knowledge, and at the same time channel the energy of volunteer editors which has made Wikipedia such a success. Practical implications - The paper is of interest to those with an interest in encyclopedia design or Jewish studies. Originality/value - This paper provides a unique reflection on the latest edition of the encyclopedia and considers future models for its publication based on traditional and non-traditional methods. 0 0
On social Web sites Won Kim
Ok-Ran Jeong
Sang-Won Lee
Information Systems 2010 Today hundreds of millions of Internet users are using thousands of social Web sites to stay connected with their friends, discover new friends, and share user-created content, such as photos, videos, social bookmarks, and blogs. There are so many social Web sites, and their features are evolving rapidly. There is controversy about the benefits of these sites, and there are social issues these sites have given rise to. There are lots of press articles, Wikipedia articles, and blogs, in varying degrees of authoritativeness, clarity, and accuracy, about some of the social Web sites, uses of the sites, and some of the social problems and business challenges faced by the sites. In this paper, we attempt to organize the status, uses, and issues of social Web sites into a comprehensive framework for discussing, understanding, using, building, and forecasting the future of social Web sites. © 2009 Elsevier B.V. All rights reserved. 0 0
Online professionalism and the mirror of social media S Ryan Greysen
Terry Kind
Katherine C Chretien
Journal of General Internal Medicine 2010 The rise of social media--content created by Internet users and hosted by popular sites such as Facebook, Twitter, YouTube, and Wikipedia, and blogs--has brought several new hazards for medical professionalism. First, many physicians may find applying principles for medical professionalism to the online environment challenging in certain contexts. Second, physicians may not consider the potential impact of their online content on their patients and the public. Third, a momentary lapse in judgment by an individual physician to create unprofessional content online can reflect poorly on the entire profession. To overcome these challenges, we encourage individual physicians to realize that as they "tread" through the World Wide Web, they leave behind a "footprint" that may have unintended negative consequences for them and for the profession at large. We also recommend that institutions take a proactive approach to engage users of social media in setting consensus-based standards for "online professionalism." Finally, given that professionalism encompasses more than the avoidance of negative behaviors, we conclude with examples of more positive applications for this technology. Much like a mirror, social media can reflect the best and worst aspects of the content placed before it for all to see. 0 0
Shirky and Sanger, or the costs of crowdsourcing O'Neil
JCOM 2010 0 0
PDBWiki: added value through community annotation of the Protein Data Bank Henning Stehr
Jose M. Duarte
Michael Lappe
Jong Bhak
Dan M. Bolser
Database 2010 The success of community projects such as Wikipedia has recently prompted a discussion about the applicability of such tools in the life sciences. Currently, there are several such 'science wikis' that aim to collect specialist knowledge from the community into centralized resources. However, there is no consensus about how to achieve this goal. For example, it is not clear how to best integrate data from established, centralized databases with that provided by 'community annotation'. We created PDBWiki, a scientific wiki for the community annotation of protein structures. The wiki consists of one structured page for each entry in the Protein Data Bank (PDB) and allows the user to attach categorized comments to the entries. Additionally, each page includes a user-editable list of cross-references to external resources. As in a database, it is possible to produce tabular reports and 'structure galleries' based on user-defined queries or lists of entries. PDBWiki runs in parallel to the PDB, separating original database content from user annotations. PDBWiki demonstrates how collaboration features can be integrated with primary data from a biological database. It can be used as a system for better understanding how to capture community knowledge in the biological sciences. For users of the PDB, PDBWiki provides a bug-tracker, discussion forum and community annotation system. To date, user participation has been modest, but is increasing. The user-editable cross-references section has proven popular, with the number of linked resources more than doubling from 17 originally to 39 today. Database URL: 0 0
Perceived Credibility of Internet Encyclopedias Ida Kubiszewski
Thomas Noordewier
Robert Costanza
Computers & Education 2010 0 1
Performing Knowledge: Cultural Discourses, Knowledge Communities, and Youth Culture. Mark W. Rectanus Telos 2010 The article discusses the destabilization of expert knowledge and the de-centering of the book in youth culture. The current fundamental shifts in the social construction of knowledge involve a number of interrelated topics such as the status of the book and scholarly publishing, the digitization and virtualization of libraries, the role of search engines and databases and projects like Google Book Search, and the creation of encyclopedic projects like Wikipedia. It also explores the development of media culture in the U.S. and Germany. 0 0
Precompetitive preclinical ADME/Tox data: set it free on the web to facilitate computational model building and assist drug development S. Ekins
J. Williams
Lab on a Chip 2010 Web-based technologies coupled with a drive for improved communication between scientists have resulted in the proliferation of scientific opinion, data and knowledge at an ever-increasing rate. The increasing array of chemistry-related computer-based resources now available provides chemists with a direct path to the discovery of information, once previously accessed via library services and limited to commercial and costly resources. We propose that preclinical absorption, distribution, metabolism, excretion and toxicity data as well as pharmacokinetic properties from studies published in the literature (which use animal or human tissues in vitro or from in vivo studies) are precompetitive in nature and should be freely available on the web. This could be made possible by curating the literature and patents, data donations from pharmaceutical companies and by expanding the currently freely available ChemSpider database of over 21 million molecules with physicochemical properties. This will require linkage to PubMed, PubChem and Wikipedia as well as other frequently used public databases that are currently used, mining the full-text publications to extract the pertinent experimental data. These data will need to be extracted using automated and manual methods, cleaned and then published to the ChemSpider or other database such that it will be freely available to the biomedical research and clinical communities. The value of the data being accessible will improve development of drug molecules with good ADME/Tox properties, facilitate computational model building for these properties and enable researchers to not repeat the failures of past drug discovery studies. 0 0
Promises unfulfilled? 'Journalism 2.0', user participation and editorial policy on newspaper websites. Franck Rebillard
Annelise Touboul
Media, Culture & Society 2010 In this article the authors reflect on the ideology surrounding Web 2.0 services for journalism. They present their analysis of the ideological assumptions regarding the effectiveness of 'Journalism 2.0', especially concerning online interaction and social networking sites. They also explore the material concretization of these assumptions, particularly among users of participatory websites like Wikipedia or YouTube, and newsmaking within a corpus of news media websites in Europe and America. 0 0
Recognizing Contributions in Wikis: Authorship Categories, Algorithms, and Visualizations O Arazy
E Stroulia
S Ruecker
C Arias
C Fiorentino
V Ganev
T Yau
Journal of the American Society for Information Science and Technology 2010 Wikis are designed to support collaborative editing, without focusing on individual contribution, such that it is not straightforward to determine who contributed to a specific page. However, as wikis are increasingly adopted in settings such as business, government, and education, where editors are largely driven by career goals, there is a perceived need to modify wikis so that each editor's contributions are clearly presented. In this paper we introduce an approach for assessing the contributions of wiki editors along several authorship categories, as well as a variety of information glyphs for visualizing this information. We report on three types of analysis: (a) assessing the accuracy of the algorithms, (b) estimating the understandability of the visualizations, and (c) exploring wiki editors' perceptions regarding the extent to which such an approach is likely to change their behavior. Our findings demonstrate that our proposed automated techniques can estimate fairly accurately the quantity of editors' contributions across various authorship categories, and that the visualizations we introduced can clearly convey this information to users. Moreover, our user study suggests that such tools are likely to change wiki editors' behavior. We discuss both the potential benefits and risks associated with solutions for estimating and visualizing wiki contributions. 0 1
Reputation in a Networked World: Revisiting the Social Foundations of Defamation Law. David S. Ardia Harvard Civil Rights-Civil Liberties Law Review 2010 The article explores the social foundations of defamation law as of 2010 and the concept of reputation amid the emergence of online platforms such as blogs, social networks and discussion forums. It recounts the definition of reputation and its importance in humans and other social species as part of a set of feedback mechanisms within human social systems and a major factor in evolution. Described is how reputational information is used, created, and disseminated by a networked society. The court case about the editing of celebrity Ron Livingston's Wikipedia entry to suggest that he is gay is also discussed. It is inferred that private online intermediaries like content hosts and search providers would be helpful in mitigating reputational harms. 0 0
Research progress on Wikipedia Fei Zhao
Tao Zhou
Liang Zhang
Ming-Hui Ma
Jin-Hu Liu
Fei Yu
Yi-Long Zha
Rui-Qi Li
Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China 2010 The rapid development of web technology has promoted the emergence and organization of collaborative Wiki systems. This paper introduces Wikipedia's history, macro-level statistical properties, evolution regularities, and so on. In particular, the application of the motivations and methods of complex network research to the analysis of Wikipedia is emphasized. Wikipedia's significance and impacts on society, economy, culture and education are also discussed. Finally, some open questions are outlined for future research, especially the connection between Wikipedia and new developments in the complexity sciences, such as the studies of complex networks and human dynamics. 0 0
Seeing similarity in the face of difference: enabling comparison of online production systems Muller-Claudia Birn
Benedikt Meuthrath
Andreas Erber
Sebastian Burkhart
Anne Baumgrass
Janette Lehmann
Robert Schmidl
Social Network Analysis and Mining 2010 0 0
Sleeping with the Enemy: Wikipedia in the College Classroom C. J. Chandler The History Teacher 2010 0 1
Social Computing. Anonymous AI Magazine 2010 The article offers information on the social computing issues discussed by the council of the Association for the Advancement of Artificial Intelligence (AAAI). It says that the social computing initiatives formulated by Martha Pollack include establishment of a Wikipedia team, a Facebook Inc. presence, a YouTube channel, and a community blog. It mentions that Carol Hamilton will conduct talks and paper lectures, which will be videotaped by VideoLectures and posted on its site. 0 0
Social Learning Qiang Yang
Zhi-Hua Zhou
Wenji Mao
Wei Li
Nathan Nan Liu
IEEE Intelligent Systems 2010 In recent years, social behavioral data have been exponentially expanding due to the tremendous success of various outlets on the social Web (aka Web 2.0) such as Facebook, Digg, Twitter, Wikipedia, and Delicious. As a result, there's a need for social learning to support the discovery, analysis, and modeling of human social behavioral data. The goal is to discover social intelligence, which encompasses a spectrum of knowledge that characterizes human interaction, communication, and collaborations. The social Web has thus become a fertile ground for machine learning and data mining research. This special issue gathers the state-of-the-art research in social learning and is devoted to exhibiting some of the best representative works in this area. 0 0
Why finding entities in Wikipedia is difficult, sometimes G. Demartini
C. Firan
T. Iofciu
R. Krestel
W. Nejdl
Information retrieval 2010 Entity Retrieval (ER), in comparison to classical search, aims at finding individual entities instead of relevant documents. Finding a list of entities therefore requires techniques different from those of classical search engines. In this paper, we present a model to describe entities more formally and show how an ER system can be built on top of it. We compare different approaches designed for finding entities in Wikipedia and report on results using standard test collections. An analysis of entity-centric queries reveals different aspects and problems related to ER and shows limitations of current systems performing ER with Wikipedia. It also indicates which approaches are suitable for which kinds of queries. 0 0
Struggles online over the meaning of 'Down's syndrome': A 'dialogic' interpretation Nicholas Cimini Health (London, England: 1997) 2010 Bakhtin's suggestion that a unified truth demands a 'multiplicity of consciousnesses' seems particularly relevant in the 'globally connected age'. At a time when the DIY/'punk ethic' seems to prevail online, and Wikipedia and blogging means that anyone with access to the Internet can enter into public deliberation, it is worth considering the potential for mass communication systems to create meaningful changes in the way that 'disability' is theorized. Based on the findings of qualitative research, this study explores competing interpretations of disability, specifically dialogue online over the meaning of Down's syndrome, from the vantage point of an approach towards language analysis that emanates from the work of the Bakhtin Circle. It will be shown that, suitably revised and supplemented, elements of Bakhtinian theory provide powerful tools for understanding online relations and changes in the notion of disability. It will also be shown that, while activists in the disabled people's movement have managed to effect modest changes to the way that disability is theorized, both online and in the 'real world', there remains a great deal still to be achieved. This study allows us to understand better the social struggles faced by disabled people and the opportunities open to them. 0 0
TAKING NOTE Elton Hall The Chronicle of the Early American Industries Association 2010 A fellow EAIA member recently told me of an online encyclopedia called Wikipedia. I had never heard of it before, so I put the name into a Google search and was immediately confronted with the first ten out of about 387,000,000 results. Every time I looked again the number had grown by a few million. Here is a site that is clearly on the move. It's a fascinating approach to gathering and disseminating information. 0 0
Technology Trends in Learning and Implications for Intercultural Exchange Bert Kimura
Curtis Ho
Global Learn Asia Pacific 2010 0 0
Testing an Integrative Theoretical Model of Knowledge-Sharing Behavior in the Context of Wikipedia Hichang Cho
Meihui Chen
Siyoung Chung
Journal of the American Society for Information Science and Technology 2010 This study explores how and why people participate in collaborative knowledge-building practices in the context of Wikipedia. Based on a survey of 223 Wikipedians, this study examines the relationship between motivations, internal cognitive beliefs, social-relational factors, and knowledge-sharing intentions. Results from structural equation modeling (SEM) analysis reveal that attitudes, knowledge self-efficacy, and a basic norm of generalized reciprocity have significant and direct relationships with knowledge-sharing intentions. Altruism (an intrinsic motivator) is positively related to attitudes toward knowledge sharing, whereas reputation (an extrinsic motivator) is not a significant predictor of attitude. The study also reveals that a social-relational factor, namely, a sense of belonging, is related to knowledge-sharing intentions indirectly through different motivational and social factors such as altruism, subjective norms, knowledge self-efficacy, and generalized reciprocity. Implications for future research and practice are discussed. 0 1
The Changing Space of Research: Web 2.0 and the Integration of Research and Writing Environments J.P. Purdy Computers and Composition 2010 Web 2.0 challenges the artificial compartmentalization of research and writing that often characterizes instruction in composition classes. In Web 2.0, writing and researching activities are increasingly integrated both spatially and conceptually. This article contends that, with this integration, Web 2.0 technologies showcase how research and writing together participate in knowledge production. Through analyzing specific technologies that incorporate Web 2.0 features, including Wikipedia, JSTOR, and ARTstor, this article argues that including Web 2.0 technologies in composition courses as objects of analysis and as writing and researching resources offers a means to bridge the gap between students' online proficiencies and academic writing tasks. 0 1
The Collective Intelligence Genome TW Malone
R Laubacher
C Dellarocas
MIT SLOAN MANAGEMENT REVIEW 2010 Google. Wikipedia. Threadless. All are platinum exemplars of collective intelligence in action. Two of them are famous. The third is getting there. Each of the three helps demonstrate how large, loosely organized groups of people can work together electronically in surprisingly effective ways, sometimes even without knowing that they are working together, as in the case of Google. In the authors' work at MIT's Center for Collective Intelligence, they have gathered nearly 250 examples of web-enabled collective intelligence. After examining these examples in depth, they identified a relatively small set of building blocks that are combined and recombined in various ways in different collective intelligence systems. This article offers a new framework for understanding those systems - and more important, for understanding how to build them. It identifies the underlying building blocks - the "genes" - that are at the heart of collective intelligence systems. It explores the conditions under which each gene is useful. And it begins to suggest the possibilities for combining and recombining these genes to not only harness crowds in general but to harness them in just the way that your organization needs. 0 0
The Framing of Political NGOs in Wikipedia through Criticism Elimination Andre Oboler
Gerald Steinberg
Rephael Stern
Journal of Information Technology & Politics 2010 This article introduces criticism elimination, a type of information removal leading to a framing effect that impairs Wikipedia's delivery of a neutral point of view (NPOV) and ultimately facilitates a new form of gatekeeping with political science and information technology implications. This article demonstrates a systematic use of criticism elimination and categorizes the editors responsible into four types. We show that some types use criticism elimination to dominate and manipulate articles to advocate political and ideological agendas. We suggest mitigation approaches to criticism elimination. The research is interdisciplinary and based on empirical analysis of the public edit histories. 0 0
The Gene Wiki: community intelligence applied to human gene annotation Jon W. Huss
Pierre Lindenbaum
Michael Martone
Donabel Roberts
Angel Pizarro
Faramarz Valafar
John B. Hogenesch
Andrew I. Su
Nucleic Acids Research (Database issue)
English 2010 Annotating the function of all human genes is a critical, yet formidable, challenge. Current gene annotation efforts focus on centralized curation resources, but it is increasingly clear that this approach does not scale with the rapid growth of the biomedical literature. The Gene Wiki utilizes an alternative and complementary model based on the principle of community intelligence. Directly integrated within the online encyclopedia, Wikipedia, the goal of this effort is to build a gene-specific review article for every gene in the human genome, where each article is collaboratively written, continuously updated and community reviewed. Previously, we described the creation of Gene Wiki ‘stubs’ for approximately 9000 human genes. Here, we describe ongoing systematic improvements to these articles to increase their utility. Moreover, we retrospectively examine the community usage and improvement of the Gene Wiki, providing evidence of a critical mass of users and editors. Gene Wiki articles are freely accessible within the Wikipedia web site, and additional links and information are available at
The Spirit of Combination David Alan Grier Computer 2010 We find new ideas by starting from where we are and asking the simple question, "Where can we go from here?" 0 0
The Wikipedia Revolution: How a Bunch of Nobodies Created the World's Greatest Encyclopedia. David Kowalsky Technical Communication 2010 The article reviews the book The Wikipedia Revolution: How a Bunch of Nobodies Created the World's Greatest Encyclopedia by Andrew Lih. 0 0
The iPad and Twenty-First-Century Humanism Stephen Marche Queen's Quarterly 2010 0 0
The perspectives of higher education faculty on Wikipedia Hsin liang Chen Electronic Library 2010 Purpose - The purpose of this paper is to investigate whether higher education instructors use information from Wikipedia for teaching and research. Design/methodology/approach - This is an explorative study to identify important factors regarding user acceptance and use of emerging information resources and technologies in the academic community. A total of 201 participants around the world answered an online questionnaire administered by a commercial provider. The questionnaire consisted of 16 Likert-scaled questions to assess participants' agreement with each question along with an optional open-ended explanation. Findings - The findings of this project confirm that internet access was related to faculty technology use. Online resources and references were ranked the first choice by the participants when searching for familiar and unfamiliar topics. The investigator found that participants' academic ranking status, frequency of e-mail use and academic discipline were related to their use of online databases, web-based information and directing students to information from the Web. Although the participants might often use online resources for research and teaching, Wikipedia's credibility was the participants' major concern. Research limitations/implications - This project is an exploratory study and more considerations are needed for this research area. Originality/value - The paper shows that participants who used online databases more often showed a negative attitude toward Wikipedia. Those participants who used Wikipedia for teaching and research also allowed students to use information from Wikipedia and were more likely to be contributors to Wikipedia. 0 1
The wiki - A virtual home base for constructivist blended learning courses S.Beercock Wiki
Collaborative learning
Language teaching
Online collaboration
Procedia - Social and Behavioral Sciences English 2010 0 0
The wizard of oz effect and a new Emerald city P. Villano On the Horizon 2010 Purpose - The purpose of this paper is to develop three key concepts to the future of knowledge work: knowledge work is a natural, ever-changing process - not something that can be certified; open education, connection and interaction are the way of the future; and the future of knowledge work hinges on enabling shared practical knowledge globally. Design/methodology/approach - The paper is filled with metaphor mixed with research from recognized knowledge management (KM) experts as well as extensive social media sources such as Wikipedia. The intent is to demonstrate as well as describe the natural process and potential of global connection and interaction. Findings - Knowledge can be found in one's own back yard (or as close as one's pocket) and one's ability to connect to the world. Open education will be increasingly available to support community-generated certification of knowledge workers. Originality/value - The paper uses a unique approach to forward a new, inclusive way of looking at knowledge worker certification. It also suggests pragmatic approaches for accomplishing community-generated certification. 0 0
Trends & Controversies: Business and Market Intelligence 2.0 Hsinchun Chen IEEE Intelligent Systems 2010 Business Intelligence (BI), a term coined in 1989, has gained much traction in the IT practitioner community and academia over the past two decades. According to Wikipedia, BI refers to "the skills, technologies, applications and practices used to support decision making". On the basis of a survey of 1,400 CEOs, the Gartner Group projected BI revenue to reach US$3 billion in 2009. Through BI initiatives, businesses are gaining insights from the growing volumes of transaction, product, inventory, customer, competitor, and industry data generated by enterprise-wide applications such as enterprise resource planning (ERP), customer relationship management (CRM), supply-chain management (SCM), knowledge management, collaborative computing, Web analytics, and so on. The same Gartner survey also showed that BI surpassed security as the top business IT priority in 2006. 0 0
Tribal Knowledge L. Henderson Applied Clinical Trials 2010 "Tribal knowledge is any unwritten information that is not commonly known by others within a company. This term is used most when referencing information that may need to be known by others in order to produce quality product or service. The information may be key to quality performance, but it may also be totally incorrect. Unlike similar forms of artisan intelligence, tribal knowledge can be converted into company property. It is often a good source of test factors during improvement efforts." That, from Wikipedia, sourced from Six Sigma, which is the methodology that companies apply to achieve optimal efficiencies and performance. It all sounds well and good until they start talking about belts. I recently heard the term tribal knowledge applied to the outsourcing process. That is, when transferring a job function to the outsourcer, the outsourced is shadowed by the outsourcer for a time to acquire said tribal knowledge. That knowledge then becomes part of the outsourcer's knowledge. And the outsourcer can apply Six Sigma and kung fu the process right up to optimal efficiency, I suppose. 0 0
Typing software articles with Wikipedia category structure Liang Xu
Hideaki Takeda
Masahiro Hamasaki
Huayu Wu
NII Technical Reports 2010 In this paper, we present a low-cost method for typing Named Entities with Wikipedia. Different from other text analysis-based approaches, our approach relies only on the structural features of Wikipedia, and the use of external linguistic resources is optional. We perform binary classification of an article by analyzing the names of its categories as well as the structure. The evaluation shows our method can be successfully applied to the 'software' category (F 80%). 0 0
Understanding Knowledge Sharing Behaviour in Wikipedia Yang
Cheng Lai-Yu
Behaviour & Information Technology 2010 Wikipedia is the world's largest multilingual free-content encyclopaedia written by users collaboratively. It is interesting to investigate why individuals are willing to spend their time and knowledge to engage in it. In this study, we try to explore the influence of self-concept-based motivation and individual attitudes toward Wikipedia on individual's knowledge sharing intention in Wikipedia. Members from Wikipedia were invited to participate in the investigation. An online questionnaire and structural equation modelling (SEM) technology were utilized to test the proposed model and hypotheses. Analytical results indicate that internal self-concept-based motivation significantly influences individual's knowledge sharing intention. Further, both information and system quality have significant effects on individual's attitude toward Wikipedia, and therefore, influence the intention to share knowledge in it. 0 0
Up on Angels Landing D. Johnson ISHN 2010 Here's how Wikipedia describes the journey: "After a series of steep switchbacks, the trail goes through a gradual ascent. Walter's Wiggles, a series of 21 steep switchbacks, are the last hurdle before Scout's Lookout. Scout's Lookout is generally the turnaround point for those who are unwilling to make the final summit push to the top of Angels Landing. The last half-mile of the trail is strenuous and littered with sharp drop-offs and narrow paths." 0 0
Users of the world, unite! The challenges and opportunities of Social Media A.M. Kaplan
M. Haenlein
Business Horizons 2010 The concept of Social Media is top of the agenda for many business executives today. Decision makers, as well as consultants, try to identify ways in which firms can make profitable use of applications such as Wikipedia, YouTube, Facebook, Second Life, and Twitter. Yet despite this interest, there seems to be very limited understanding of what the term Social Media exactly means; this article intends to provide some clarification. We begin by describing the concept of Social Media, and discuss how it differs from related concepts such as Web 2.0 and User Generated Content. Based on this definition, we then provide a classification of Social Media which groups applications currently subsumed under the generalized term into more specific categories by characteristic: collaborative projects, blogs, content communities, social networking sites, virtual game worlds, and virtual social worlds. Finally, we present 10 pieces of advice for companies which decide to utilize Social Media. 0 0
Using Podcasts to Improve Safety T. Mathis
S. Galloway
Professional Safety 2010 Communicating information is a challenge that has plagued professionals for many years. Several innovative safety managers have identified a new solution: producing podcasts. Wikipedia defines podcast as "a series of digital media files (either audio or video) that are released episodically and downloaded through Web syndication." This article discusses three major projects in which the authors discovered potential uses for podcasts. Based on experience, the authors believe podcasts can help to improve safety in several ways: 1. overcome logistical challenges; 2. ensure message uniformity; 3. eliminate message drift; 4. multiply or leverage leaders' and experts' ability to communicate; 5. facilitate international messages; 6. support traditional channels and media; and 7. reduce communication costs. 0 0
Using web sources for improving video categorization Perea-Jose M. Ortega
Montejo-Arturo Raez
Martin-M.Teresa Valdivia
Journal of Intelligent Information Systems, Number 1, 117-130 2010 In this paper, several experiments about video categorization using a supervised learning approach are presented. To this end, the VideoCLEF 2008 evaluation forum has been chosen as the experimental framework. After an analysis of the VideoCLEF corpus, it was found that video transcriptions are not the best source of information in order to identify the topic of video streams. Therefore, two web-based corpora have been generated with the aim of adding more informational sources by integrating documents from Wikipedia articles and Google searches. A number of supervised categorization experiments using the test data of VideoCLEF have been accomplished. Several machine learning algorithms have been tested to validate the effect of the corpus on the final results: Naive Bayes, k-nearest-neighbors (KNN), Support Vector Machines (SVM) and the J48 decision tree. The results obtained show that the Web can be a useful source of information for generating classification models for video data. 0 0
Using wikis as an online health information resource. Paula Younger Nursing Standard 2010 Wikis can be a powerful online resource for the provision and sharing of information, with the proviso that information found on them should be independently verified. This article defines wikis and sets them in context with recent developments on the internet. The article discusses the use of Wikipedia and other wikis as potential sources of health information for nurses. 0 0
VIRaL: Visual Image Retrieval and Localization Yannis Kalantidis
Giorgos Tolias
Yannis Avrithis
Marios Phinikettos
Evaggelos Spyrou
Phivos Mylonas
Stefanos Kollias
Multimedia Tools and Applications 2010 New applications are emerging every day exploiting the huge data volume in community photo collections. Most focus on popular subsets, e.g., images containing landmarks or associated to Wikipedia articles. In this work we are concerned with the problem of accurately finding the location where a photo is taken without needing any metadata, that is, solely by its visual content. We also recognize landmarks where applicable, automatically linking them to Wikipedia. We show that the time is right for automating the geo-tagging process, and we show how this can work at large scale. In doing so, we do exploit redundancy of content in popular locations, but unlike most existing solutions, we do not restrict to landmarks. In other words, we can compactly represent the visual content of all thousands of images depicting e.g., the Parthenon and still retrieve any single, isolated, non-landmark image like a house or a graffiti on a wall. Starting from an existing, geo-tagged dataset, we cluster images into sets of different views of the same scene. This is a very efficient, scalable, and fully automated mining process. We then align all views in a set to one reference image and construct a 2D scene map. Our indexing scheme operates directly on scene maps. We evaluate our solution on a challenging one million urban image dataset and provide public access to our service through our online application, VIRaL. 0 0
Johannes Moskaliuk
Andreas Harrer
Ulrike Cress
Information, Communication & Society 2010 This paper describes how processes of knowledge building with wikis may be visualized, citing the user-generated online encyclopedia Wikipedia as an example. The underlying theoretical basis is a framework for collaborative knowledge building with wikis that describes knowledge building as a co-evolution of individual and collective knowledge. These co-evolutionary processes may be visualized graphically, applying methods from social network analysis, especially those methods that take dynamic changes into account. For this purpose, we have undertaken to analyse, on the one hand, the temporal development of a Wikipedia article and related articles that are linked to this core article. On the other hand, we analysed the temporal development of those users who worked on these articles. The resulting graphics show an analogous process, both with regard to the articles that refer to the core article and to the users involved. These results provide empirical support for the co-evolution model. 0 0
