From WikiPapers

This is a list of 1 event held and 495 publications published in 2013.


Name City Country Date
WikiSym 2013 Hong Kong China 5 August 2013


Title Author(s) Keyword(s) Published in Language Abstract R C
(Re)triggering Backlash: Responses to news about Wikipedia's gender gap Eckert S.
Steiner L.
Antifeminism backlash
Gender difference
Gender gap
Status politics
Journal of Communication Inquiry English Wikipedia, the free encyclopedia that anyone can edit, has been enormously successful. But while it is read nearly equally by women and men, women are only 8.5 to 12.6% of those who edit or write Wikipedia articles. We analyzed coverage of Wikipedia's gender gap by 42 U.S. news organizations and blogs as well as 1,336 comments posted online by readers. We also interviewed Wikimedia Foundation executive director Sue Gardner. Commentators questioned Wikipedia's epistemology and culture and associated the gap with societal issues and/or (perceived) gender differences regarding time management, self-confidence, and expertise, as well as personality and interests. Yet, many commentators denied the gap was a problem; they blamed women for not joining, suggested it was women's choice, or mocked girly interests. The belittling of the disparity as feminist ideology arguably betrays an antifeminist backlash. 0 0
LaTeXML 2012 - A year of LaTeXML Ginev D.
Miller B.R.
Lecture Notes in Computer Science English LaTeXML, a TeX to XML converter, is being used in a wide range of MKM applications. In this paper, we present a progress report for the 2012 calendar year. Noteworthy enhancements include: increased coverage such as Wikipedia syntax; enhanced capabilities such as embeddable JavaScript and CSS resources and RDFa support; a web service for remote processing via web-sockets; along with general accuracy and reliability improvements. The outlook for an 0.8.0 release in mid-2013 is also discussed. 0 0
3D Wikipedia: Using online text to automatically label and navigate reconstructed geometry Russell B.C.
Martin-Brualla R.
Butler D.J.
Seitz S.M.
Zettlemoyer L.
3D visualization
Image-based modeling and rendering
Natural Language Processing
ACM Transactions on Graphics English We introduce an approach for analyzing Wikipedia and other text, together with online photos, to produce annotated 3D models of famous tourist sites. The approach is completely automated, and leverages online text and photo co-occurrences via Google Image Search. It enables a number of new interactions, which we demonstrate in a new 3D visualization tool. Text can be selected to move the camera to the corresponding objects, 3D bounding boxes provide anchors back to the text describing them, and the overall narrative of the text provides a temporal guide for automatically flying through the scene to visualize the world as you read about it. We show compelling results on several major tourist sites. 0 0
A Malicious Bot Capturing System using a Beneficial Bot and Wiki Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Information security
Network analysis
Journal of Information Processing English Locating malicious bots in a large network is problematic because the internal firewalls and network address translation (NAT) routers of the network unintentionally contribute to hiding the bots’ host address and malicious packets. However, eliminating firewalls and NAT routers merely for locating bots is generally not acceptable. In the present paper, we propose an easy to deploy, easy to manage network security control system for locating a malicious host behind internal secure gateways. The proposed network security control system consists of a remote security device and a command server. The remote security device is installed as a transparent link (implemented as an L2 switch), between the subnet and its gateway in order to detect a host that has been compromised by a malicious bot in a target subnet, while minimizing the impact of deployment. The security device is controlled remotely by 'polling' the command server in order to eliminate the NAT traversal problem and to be firewall friendly. Since the remote security device exists in transparent, remotely controlled, robust security gateways, we regard this device as a beneficial bot. We adopt a web server with wiki software as the command server in order to take advantage of its power of customization, ease of use, and ease of deployment of the server. 5 2
A Wiki collaborative application for teaching in manufacturing engineering Cuesta E.
Sanchez-Lasheras F.
Alvarez B.J.
Gonzalez-Madruga D.
Collaborative work
Manufacturing engineering
Wiki learning
Materials Science Forum English The present work focuses on improving the students' learning process through the use of a Wiki-like platform. In our research, the Wiki was intended as a means of making the learning project easier. During the academic year 2011/2012, the Area of Manufacturing Engineering of the University of Oviedo was involved in a project whose aim was the creation of a Wiki. This software is now used as auxiliary material for other subjects taught by the Manufacturing Engineering Area in the new Engineering degrees created to adapt the studies to the requirements of the European Higher Education Area (EHEA). According to the results obtained by the students, the higher the mark of a student's Wiki, the better his/her mark in the exam. 0 0
A Wikipedia based hybrid ranking method for taxonomic relation extraction Zhong X. Hybrid ranking method
Select best position
Taxonomic relation extraction
Lecture Notes in Computer Science English This paper proposes a hybrid ranking method for taxonomic relation extraction (or select best position) in an existing taxonomy. This method is capable of effectively combining two resources, an existing taxonomy and Wikipedia, in order to select the most appropriate position for a term candidate in the existing taxonomy. Previous methods mainly focus on complex inference methods to select the best position among all the possible positions in the taxonomy. In contrast, our algorithm, a simple but effective one, leverages two kinds of information, the expression of and the ranking information of a term candidate, to select the best position for the term candidate (the hypernym of the term candidate in the existing taxonomy). We apply our approach to the agricultural domain, and the experimental results indicate that performance is significantly improved. 0 0
A Wikipédia como diálogo entre universidade e sociedade: uma experiência em extensão universitária Juliana Bastos Marques
Otavio Saraiva Louvem
Anais do XIX Workshop de Informática na Escola Portuguese

O artigo apresenta uma experiência no trabalho com o uso crítico e edição de artigos da Wikipédia lusófona no ambiente universitário, em atividades de extensão, realizado na Universidade Federal do Estado do Rio de Janeiro (Unirio) em 2012. Foram realizados diferentes tipos de atividades, desde workshops de 4h até cursos de maior duração, tanto para o público adulto geral quanto para universitários segmentados por área de estudo. O objetivo do trabalho foi exercitar competências críticas de leitura e produção de textos de divulgação, trazendo e adaptando para o usuário da Wikipédia conhecimentos ensinados em nível de graduação e pós-graduação.


The paper presents an experience with critical reading and editing of Portuguese Wikipedia articles at the university, in extension activities conducted at the Federal University of Rio de Janeiro State (Unirio) in 2012. Different types of activities were introduced, from 4h workshops to longer-term courses, for both broader audiences and university students grouped by field of study. The goal of the activities was to exercise critical proficiency in reading and writing skills, offering and adapting for the regular Wikipedia user academic knowledge produced at undergraduate and graduate levels.
4 0
A bookmark recommender system based on social bookmarking services and wikipedia categories Yoshida T.
Inoue U.
SNPD 2013 - 14th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing English Social bookmarking services allow users to add bookmarks of web pages with freely chosen keywords as tags. Personalized recommender systems recommend new and useful bookmarks added by other users. We propose a new method to find similar users and to select relevant bookmarks in a social bookmarking service. Our method is lightweight, because it uses a small set of important tags for each user to find useful bookmarks to recommend. Our method is also powerful, because it employs the Wikipedia category database to deal with the diversity of tags among users. The evaluation using the Hatena bookmark service in Japan shows that our method significantly increases the number of relevant bookmarks recommended without a notable increase in irrelevant bookmarks. 0 0
A case study of a course including Wikipedia editing activity for undergraduate students Mori Y.
Egi H.
Ozawa S.
Information ethics
Online course
Project based learning
Proceedings of the 21st International Conference on Computers in Education, ICCE 2013 English Editing Wikipedia can increase participants' understandings of subjects, while making valuable contributions to the information society. In this study, we designed an online course for undergraduate students that included a Wikipedia editing activity. The result of a content analysis of the term papers revealed that the suggestions made by the e-mentor and the teacher were highly supportive for the students in our case study, and it is important for Japanese students to check Wikipedia in English before making their edits in Japanese. 0 0
A cloud of FAQ: A highly-precise FAQ retrieval system for the Web 2.0 Romero M.
Moreo A.
Castro J.L.
FAQ retrieval
Natural language
Tag cloud
Wikipedia concepts
Knowledge-Based Systems English FAQ (Frequently Asked Questions) lists have attracted increasing attention from companies and organizations. There is thus a need for high-precision and fast methods able to manage large FAQ collections. In this context, we present a FAQ retrieval system as part of a FAQ exploiting project. Following the growing trend towards Web 2.0, we aim to provide users with mechanisms to navigate through the domain of knowledge and to facilitate both learning and searching, beyond classic FAQ retrieval algorithms. To this end, our system involves two different modules: an efficient and precise FAQ retrieval module, and a tag cloud generation module designed to help users complete their comprehension of the retrieved information. Empirical results support the validity of our approach with respect to a number of state-of-the-art algorithms in terms of the most popular metrics in the field. 0 0
A collaboration effectiveness and Easiness Evaluation Method for RE-specific wikis based on Cognition-Behavior Consistency Decision Triangle Peng R.
Sun D.
Lai H.
Cognition-behavior consistency decision triangle
Collaboration effectiveness and easiness evaluation
RE-specific wikis
Jisuanji Xuebao/Chinese Journal of Computers Chinese Wiki technology, represented by Wikipedia, has attracted considerable attention due to its capability to support collaborative online content creation in a flexible and simple manner. Under the guidance of Wiki technology, developing specific wiki-based requirements management tools, namely RE-specific wikis, by extending various open source wikis to support distributed requirements engineering activities has become a hot research topic. Many RE-specific wikis, such as RE-Wiki, SOP-Wiki and WikiWinWin, have been developed. But how to evaluate their collaboration effectiveness and easiness still needs further study. Based on the Cognition-Behavior Consistency Decision Triangle (CBCDT), a Collaboration Effectiveness and Easiness Evaluation Method (CE3M) for evaluating RE-specific wikis is proposed. For a given RE-specific wiki, it evaluates the consistencies among three aspects: the expectations of its designers, the cognitions of its users and the behavior significations of its users. Specifically, the expectations of its designers and the cognitions of users are obtained from investigation; the behavior significations are obtained from experts' opinions on the statistical data of the users' collaboration behaviors. The consistencies are then evaluated using statistical hypothesis testing. The case study shows that CE3M is appropriate for discovering the similarities and differences among the expectations, cognitions and behaviors. The insights gained can be used as objective evidence for decisions on the evolution of RE-specific wikis. 0 0
A collaborative multi-source intelligence working environment: A systems approach Eachus P.
Short B.
Stedmon A.W.
Brown J.
Wilson M.
Lemanski L.
Collaborative working
Intelligence analysis
Lecture Notes in Computer Science English This research applies a systems approach to aid the understanding of collaborative working during intelligence analysis using a dedicated (Wiki) environment. The extent to which social interaction, and problem solving was facilitated by the use of the wiki, was investigated using an intelligence problem derived from the Vast 2010 challenge. This challenge requires "intelligence analysts" to work with a number of different intelligence sources in order to predict a possible terrorist attack. The study compared three types of collaborative working, face-to-face without a wiki, face-to-face with a wiki, and use of a wiki without face-to-face contact. The findings revealed that in terms of task performance the use of the wiki without face-to-face contact performed best and the wiki group with face-to-face contact performed worst. Measures of interpersonal and psychological satisfaction were highest in the face-to-face group not using a wiki and least in the face-to-face group using a wiki. Overall it was concluded that the use of wikis in collaborative working is best for task completion whereas face-to-face collaborative working without a wiki is best for interpersonal and psychological satisfaction. 0 0
A comparative study of academic and wikipedia ranking Shuai X.
Jiang Z.
Xiaojiang Liu
Bollen J.
Citation analysis
Scholar impact
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries English In addition to its broad popularity, Wikipedia is also widely used for scholarly purposes. Many Wikipedia pages pertain to academic papers, scholars and topics, providing a rich ecology for scholarly uses. Scholarly references and mentions on Wikipedia may thus shape the "societal impact" of a certain scholarly communication item, but it is not clear whether they shape actual "academic impact". In this paper we compare the impact of papers, scholars, and topics according to two different measures, namely scholarly citations and Wikipedia mentions. Our results show that academic and Wikipedia impact are positively correlated. Papers, authors, and topics that are mentioned on Wikipedia have higher academic impact than those that are not mentioned. Our findings validate the hypothesis that Wikipedia can help assess the impact of scholarly publications and underpin relevance indicators for scholarly retrieval or recommendation systems. 0 0
A comparison of named entity recognition tools applied to biographical texts Atdag S.
Labatut V.
2013 2nd International Conference on Systems and Computer Science, ICSCS 2013 English Named entity recognition (NER) is a popular domain of natural language processing. For this reason, many tools exist to perform this task. Amongst other points, they differ in the processing method they rely upon, the entity types they can detect, the nature of the text they can handle, and their input/output formats. This makes it difficult for a user to select an appropriate NER tool for a specific situation. In this article, we try to answer this question in the context of biographic texts. To this end, we first constitute a new corpus by annotating 247 Wikipedia articles. We then select 4 publicly available, well known and free for research NER tools for comparison: Stanford NER, Illinois NET, OpenCalais NER WS and Alias-i LingPipe. We apply them to our corpus, assess their performances and compare them. When considering overall performances, a clear hierarchy emerges: Stanford has the best results, followed by LingPipe, Illinois and OpenCalais. However, a more detailed evaluation performed relative to entity types and article categories highlights the fact that their performances are diversely influenced by those factors. This complementarity opens an interesting perspective regarding the combination of these individual tools in order to improve performance. 0 0
A computational approach to politeness with application to social factors Cristian Danescu-Niculescu-Mizil
Sudhof M.
Jurafsky D.
Leskovec J.
Potts C.
ACL 2013 - 51st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference English We propose a computational framework for identifying linguistic aspects of politeness. Our starting point is a new corpus of requests annotated for politeness, which we use to evaluate aspects of politeness theory and to uncover new interactions between politeness markers and context. These findings guide our construction of a classifier with domain-independent lexical and syntactic features operationalizing key components of politeness theory, such as indirection, deference, impersonalization and modality. Our classifier achieves close to human performance and is effective across domains. We use our framework to study the relationship between politeness and social power, showing that polite Wikipedia editors are more likely to achieve high status through elections, but, once elevated, they become less polite. We see a similar negative correlation between politeness and power on Stack Exchange, where users at the top of the reputation scale are less polite than those at the bottom. Finally, we apply our classifier to a preliminary analysis of politeness variation by gender and community. 0 0
A content analysis of wikiproject discussions: Toward a typology of coordination language used by virtual teams Morgan J.T.
Mcdonald D.W.
Gilbert M.
Mark Zachry
Content analysis
Distributed collaboration
Group dynamics
English Understanding the role of explicit coordination in virtual teams allows for a more meaningful understanding of how people work together online. We describe a new content analysis for classifying discussions within Wikipedia WikiProjects (voluntary, self-directed teams of editors), present preliminary findings, and discuss potential applications and future research directions. 0 0
A content-context-centric approach for detecting vandalism in Wikipedia Lakshmish Ramaswamy
Tummalapenta R.S.
Li K.
Calton Pu
Collaborative online social media
Top-ranked co-occurrence probability
Vandalism detection
WWW co-occurrence probability
Proceedings of the 9th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing, COLLABORATECOM 2013 English Collaborative online social media (CSM) applications such as Wikipedia have not only revolutionized the World Wide Web, but they also have had a hugely positive effect on modern free societies. Unfortunately, Wikipedia has also become a target of a wide variety of vandalism attacks. Most existing vandalism detection techniques rely upon simple textual features such as existence of abusive language or spammy words. These techniques are ineffective against sophisticated vandal edits, which often do not contain the tell-tale markers associated with vandalism. In this paper, we argue for a context-aware approach for vandalism detection. This paper proposes a content-context-aware vandalism detection framework. The main idea is to quantify how well the words contained in the edit fit into the topic and the existing content of the Wikipedia article. We present two novel metrics, called WWW co-occurrence probability and top-ranked co-occurrence probability for this purpose. We also develop efficient mechanisms for evaluating these two metrics, and machine learning-based schemes that utilize these metrics. The paper presents a range of experiments to demonstrate the effectiveness of the proposed approach. 0 0
A contextual semantic representation of learning assets in online communities of practice Berkani L.
Chikh A.
Communities of practice
Contextual semantic annotation
Knowledge reuse
Learning assets
Semantic adaptive wiki
Semantic search
International Journal of Metadata, Semantics and Ontologies English This paper presents an ontology-based framework for a contextual semantic representation of learning assets within a Community of Practice of E-learning (CoPE). The community, made up of actors from the e-learning domain (teachers, tutors, pedagogues, administrators...), is considered as a virtual space for exchanging and sharing techno-pedagogic knowledge and know-how between those actors. Our objective is to semantically describe the CoPE's learning assets using contextual semantic annotations. We consider two types of semantic annotations: (a) objective annotations, describing the learning assets with a set of context-related metadata and (b) subjective annotations, to express the members' experience and feedback regarding these same assets. The paper is illustrated with a case study related to a semantic adaptive wiki using the framework and aiming to foster the knowledge sharing and reuse between the CoPE's members. The wiki provides essentially a semantic search and a recommendation support of assets. 0 0
A distributed ontology repository management approach based on semantic wiki Rao G.
Feng Z.
Xiaolong Wang
Liu R.
Distributed ontology
Ontology inconsistency
Semantic wiki
Communications in Computer and Information Science English As the foundation of Semantic Web, the size of ontologies on the Web has developed into tens of billions. Furthermore, the creation process of the repository takes place through open collaboration. However, the problem of inconsistent repository is made even worse because of openness and collaboration. Semantic wiki provides a new approach to build large-scale, unified semantic knowledge base. This paper focuses on the relevant problems, technologies and applications of semantic wiki based ontology repository with the combination of semantic wiki technologies and distributed ontology repository. A distributed ontology repository management approach and platform based on semantic wiki is presented. The platform is divided into three layers, including distributed ontology management layer, business logic layer, and application performance layer. Self-maintenance and optimization of distributed ontology repository is implemented by the management module with technology of ontology reasoning, ontology view extraction and ontology segmentation. The unified interface of the repository to provide knowledge storage and query services to application of semantic web is provided through knowledge bus mechanism with distributed ontology encapsulated. In the business logic layer, the operations of wiki and ontology are mapped to manage the wiki pages and ontology resources through mapping the wiki entries and ontology resources. In the application performance layer, a friendly interface is provided to build repository through combining the entry information display and the semantic information extraction. 0 0
A formative evaluation of WIKI's as a learning tool in a face to face juvenile justice course Bowman S.W. Collaborative learning
Technology integration
Educational Technology Research and Development English Current literature indicates an increased pedagogical value of technology integration in university coursework. One form of technology that encourages collaborative, online teaching and learning is a wiki, an online application that allows participants to partner and direct a website. This article describes the design and formative evaluation of a semester-long wiki project that was conducted during three face-to-face juvenile justice courses. Upon completion, 61 students completed written, open-ended evaluations of the project with a focus on (a) the strengths of the project, (b) knowledge of the juvenile justice system gained through the project, and (c) suggestions to improve the overall effectiveness. NVIVO8 was used to code and analyze the results of their responses. Results indicate that the Juvenile Justice Wiki Project demonstrated a real-life (online) understanding of the juvenile justice system in a face-to-face meeting, a more comprehensive examination of the juvenile justice system compared to a more traditional book and lecture pedagogy, and a perceived value in the collaborative, constructivist approach. A formative evaluation indicates future structural and pedagogical project modifications according to student evaluations and perceptions. 0 0
A framework for benchmarking entity-annotation systems Cornolti M.
Paolo Ferragina
Massimiliano Ciaramita
Benchmark framework
Entity annotation
WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web English In this paper we design and implement a benchmarking framework for fair and exhaustive comparison of entity-annotation systems. The framework is based upon the definition of a set of problems related to the entity-annotation task, a set of measures to evaluate systems performance, and a systematic comparative evaluation involving all publicly available datasets, containing texts of various types such as news, tweets and Web pages. Our framework is easily extensible with novel entity annotators, datasets and evaluation measures for comparing systems, and it has been released to the public as open source. We use this framework to perform the first extensive comparison among all available entity annotators over all available datasets, and draw many interesting conclusions upon their efficiency and effectiveness. We also draw conclusions on academic versus commercial annotators. 0 0
A framework for detecting public health trends with Twitter Parker J.
Wei Y.
Yates A.
Frieder O.
Goharian N.
Health surveillance
Item-set mining
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013 English Traditional public health surveillance requires regular clinical reports and considerable effort by health professionals to analyze data. Therefore, a low cost alternative is of great practical use. As a platform used by over 500 million users worldwide to publish their ideas about many topics, including health conditions, Twitter provides researchers the freshest source of public health conditions on a global scale. We propose a framework for tracking public health condition trends via Twitter. The basic idea is to use frequent term sets from highly purified health-related tweets as queries into a Wikipedia article index - treating the retrieval of medically-related articles as an indicator of a health-related condition. By observing fluctuations in frequent term sets and in turn medically-related articles over a series of time slices of tweets, we detect shifts in public health conditions and concerns over time. Compared to existing approaches, our framework provides a general a priori identification of emerging public health conditions rather than a specific illness (e.g., influenza) as is commonly done. 0 0
A framework for the calibration of social simulation models Ciampaglia G.L. Agent-based model
Indirect inference
Norm emergence
Advances in Complex Systems English Simulation with agent-based models is increasingly used in the study of complex socio-technical systems and in social simulation in general. This paradigm offers a number of attractive features, namely the possibility of modeling emergent phenomena within large populations. As a consequence, often the quantity in need of calibration may be a distribution over the population whose relation with the parameters of the model is analytically intractable. Nevertheless, we can simulate. In this paper we present a simulation-based framework for the calibration of agent-based models with distributional output based on indirect inference. We illustrate our method step by step on a model of norm emergence in an online community of peer production, using data from three large Wikipedia communities. Model fit and diagnostics are discussed. 0 0
A game theoretic analysis of collaboration in Wikipedia Anand S.
Ofer Arazy
Mandayam N.B.
Oded Nov
Non-cooperative game
Peer production
Trustworthy collaboration
Lecture Notes in Computer Science English Peer production projects such as Wikipedia or open-source software development allow volunteers to collectively create knowledge-based products. The inclusive nature of such projects poses difficult challenges for ensuring trustworthiness and combating vandalism. Prior studies in the area deal with descriptive aspects of peer production, failing to capture the idea that while contributors collaborate, they also compete for status in the community and for imposing their views on the product. In this paper, we investigate collaborative authoring in Wikipedia, where contributors append and overwrite previous contributions to a page. We assume that a contributor's goal is to maximize ownership of content sections, such that content owned (i.e. originated) by her survived the most recent revision of the page. We model contributors' interactions to increase their content ownership as a non-cooperative game, where a player's utility is associated with content owned and cost is a function of effort expended. Our results capture several real-life aspects of contributors' interactions within peer-production projects. Namely, we show that at the Nash equilibrium there is an inverse relationship between the effort required to make a contribution and the survival of a contributor's content. In other words, the majority of the content that survives is necessarily contributed by experts who expend relatively less effort than non-experts. An empirical analysis of Wikipedia articles provides support for our model's predictions. Implications for research and practice are discussed in the context of trustworthy collaboration as well as vandalism. 0 0
A generalized flow-based method for analysis of implicit relationships on wikipedia Xiaodan Zhang
Yasuhito Asano
Masatoshi Yoshikawa
Generalized flow
Link analysis
Data mining
IEEE Transactions on Knowledge and Data Engineering English We focus on measuring relationships between pairs of objects in Wikipedia whose pages can be regarded as individual objects. Two kinds of relationships between two objects exist: in Wikipedia, an explicit relationship is represented by a single link between the two pages for the objects, and an implicit relationship is represented by a link structure containing the two pages. Some of the previously proposed methods for measuring relationships are cohesion-based methods, which underestimate objects having high degrees, although such objects could be important in constituting relationships in Wikipedia. The other methods are inadequate for measuring implicit relationships because they use only one or two of the following three important factors: distance, connectivity, and cocitation. We propose a new method using a generalized maximum flow which reflects all the three factors and does not underestimate objects having high degree. We confirm through experiments that our method can measure the strength of a relationship more appropriately than these previously proposed methods do. Another remarkable aspect of our method is mining elucidatory objects, that is, objects constituting a relationship. We explain that mining elucidatory objects would open a novel way to deeply understand a relationship. 0 0
A generic open world named entity disambiguation approach for tweets Habib M.B.
Van Keulen M.
Named Entity Disambiguation
Social media
IC3K 2013; KDIR 2013 - 5th International Conference on Knowledge Discovery and Information Retrieval and KMIS 2013 - 5th International Conference on Knowledge Management and Information Sharing, Proc. English Social media is a rich source of information. To make use of this information, it is sometimes required to extract and disambiguate named entities. In this paper, we focus on named entity disambiguation (NED) in Twitter messages. NED in tweets is challenging in two ways. First, the limited length of a tweet makes it hard to have enough context, while many disambiguation techniques depend on it. Second, many named entities in tweets do not exist in a knowledge base (KB). We share ideas from information retrieval (IR) and NED to propose solutions for both challenges. For the first problem we make use of the gregarious nature of tweets to get enough context needed for disambiguation. For the second problem we look for an alternative home page if no Wikipedia page represents the entity. Given a mention, we obtain a list of Wikipedia candidates from the YAGO KB in addition to top-ranked pages from the Google search engine. We use a Support Vector Machine (SVM) to rank the candidate pages to find the best representative entities. Experiments conducted on two data sets show better disambiguation results compared with the baselines and a competitor. 0 0
A history of newswork on wikipedia Brian C. Keegan Breaking news
Current events
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English Wikipedia's coverage of current events blurs the boundaries of what it means to be an encyclopedia. Drawing on Gieryn's concept of "boundary work", this paper explores how Wikipedia's response to the 9/11 attacks expanded the role of the encyclopedia to include newswork, excluded content like the 9/11 Memorial Wiki that became problematic following this expansion, and legitimized these changes through the adoption of news-related policies and routines like promoting "In the News" content on the homepage. However, a second case exploring WikiNews illustrates the pitfalls of misappropriating professional newswork norms as well as the challenges of sustaining online communities. These cases illuminate the social construction of new technologies as they confront the boundaries of traditional professional identities and also reveal how newswork is changing in response to new forms of organizing enabled by these technologies. Categories and Subject Descriptors K.2 [Computing Milieux]: History of Computing; K.4.3 [Computers and Society]: Organizational Impacts - Computer supported collaborative work General Terms Standardization, Theory. Copyright 2010 ACM. 0 0
A hybrid method for detecting outdated information in Wikipedia infoboxes Thanh Tran
Cao T.H.
Entity Search
Information extraction
Pattern Learning
Wikipedia Update
Proceedings - 2013 RIVF International Conference on Computing and Communication Technologies: Research, Innovation, and Vision for Future, RIVF 2013 English Wikipedia has grown fast and become a major information resource for users as well as for many knowledge bases derived from it. However it is still edited manually while the world is changing rapidly. In this paper, we propose a method to detect outdated attribute values in Wikipedia infoboxes by using facts extracted from the general Web. Our proposed method extracts new information by combining pattern-based approach with entity-search-based approach to deal with the diversity of natural language presentation forms of facts on the Web. Our experimental results show that the achieved accuracies of the proposed method are 70% and 82% respectively on the chief-executive-officer attribute and the number-of-employees attribute in company infoboxes. It significantly improves the accuracy of the single pattern-based or entity-search-based method. The results also reveal the striking truth about the outdated status of Wikipedia. 0 0
A likelihood-based framework for the analysis of discussion threads Gomez V.
Kappen H.J.
Litvak N.
Andreas Kaltenbrunner
Discussion threads
Information cascades
Maximum likelihood
Online conversations
Preferential attachment
World Wide Web English Online discussion threads are conversational cascades in the form of posted messages that can be generally found in social systems that comprise many-to-many interaction such as blogs, news aggregators or bulletin board systems. We propose a framework based on generative models of growing trees to analyse the structure and evolution of discussion threads. We consider the growth of a discussion to be determined by an interplay between popularity, novelty and a trend (or bias) to reply to the thread originator. The relevance of these features is estimated using a full likelihood approach and allows to characterise the habits and communication patterns of a given platform and/or community. We apply the proposed framework on four popular websites: Slashdot, Barrapunto (a Spanish version of Slashdot), Meneame (a Spanish Digg-clone) and the article discussion pages of the English Wikipedia. Our results provide significant insight into understanding how discussion cascades grow and have potential applications in broader contexts such as community management or design of communication platforms. 0 0
A linguistic consensus model for Web 2.0 communities Alonso S.
Perez I.J.
Cabrerizo F.J.
Herrera-Viedma E.
Fuzzy logic
Group decision making
Linguistic preferences
Online community
Web 2.0
Applied Soft Computing Journal English Web 2.0 communities are a quite recent phenomenon which involves large numbers of users and where communication between members is carried out in real time. Despite those good characteristics, there is still a necessity of developing tools to help users to reach decisions with a high level of consensus in those new virtual environments. In this contribution a new consensus reaching model is presented which uses linguistic preferences and is designed to minimize the main problems that this kind of organization presents (low and intermittent participation rates, difficulty of establishing trust relations and so on) while incorporating the benefits that a Web 2.0 community offers (rich and diverse knowledge due to a large number of users, real-time communication, etc.). The model includes some delegation and feedback mechanisms to improve the speed of the process and its convergence towards a solution of consensus. Its possible application to some of the decision making processes that are carried out in the Wikipedia is also shown. © 2012 Elsevier B.V. All rights reserved. 0 0
A method for recommending the most appropriate expansion of acronyms using wikipedia Choi D.
Shin J.
Lee E.
Kim P.
Acronym expansion
Information extraction
Text mining
Proceedings - 7th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, IMIS 2013 English Over the years, many researchers have studied how to detect expansions of acronyms in texts by using linguistic and syntactical approaches in order to overcome disambiguation problems. An acronym is an abbreviation composed of the initial components of single or multiple words. These initial components cause serious mistakes when a machine tries to find meaning in given texts. Detecting expansions of acronyms is not a big issue nowadays. The problem is polysemous acronyms. In order to solve this problem, this paper proposes a method to recommend the most related expansion of an acronym by analyzing co-occurring words using Wikipedia. Our goal is not to find an acronym's definition or expansion but to recommend the most appropriate expansion of given acronyms. 0 0
A multilingual and multiplatform application for medicinal plants prescription from medical symptoms Ruiz-Rico F.
Rubio-Sanchez M.-C.
Tomas D.
Vicedo J.-L.
Category ranking
Medical Subject Headings
Medicinal Plants
Text classification
SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval English This paper presents an application for medicinal plants prescription based on text classification techniques. The system receives as an input a free text describing the symptoms of a user, and retrieves a ranked list of medicinal plants related to those symptoms. In addition, a set of links to Wikipedia are also provided, enriching the information about every medicinal plant presented to the user. In order to improve the accessibility to the application, the input can be written in six different languages, adapting the results accordingly. The application interface can be accessed from different devices and platforms. 0 0
A multilingual semantic wiki based on attempto controlled english and grammatical framework Kaljurand K.
Kuhn T.
Attempto Controlled English
Controlled natural language
Grammatical Framework
Semantic wiki
Lecture Notes in Computer Science English We describe a semantic wiki system with an underlying controlled natural language grammar implemented in Grammatical Framework (GF). The grammar restricts the wiki content to a well-defined subset of Attempto Controlled English (ACE), and facilitates a precise bidirectional automatic translation between ACE and language fragments of a number of other natural languages, making the wiki content accessible multilingually. Additionally, our approach allows for automatic translation into the Web Ontology Language (OWL), which enables automatic reasoning over the wiki content. The developed wiki environment thus allows users to build, query and view OWL knowledge bases via a user-friendly multilingual natural language interface. As a further feature, the underlying multilingual grammar is integrated into the wiki and can be collaboratively edited to extend the vocabulary of the wiki or even customize its sentence structures. This work demonstrates the combination of the existing technologies of Attempto Controlled English and Grammatical Framework, and is implemented as an extension of the existing semantic wiki engine AceWiki. 0 0
A new approach for building domain-specific corpus with wikipedia Zhang X.Y.
Li X.
Ruan Z.J.
Domain-specific corpus
Kosaraju algorithm based
Multi-root method
Applied Mechanics and Materials English A domain-specific corpus can be used to build a domain ontology, which is used in many areas such as IR, NLP and web mining. We propose a multi-root based method to build a domain-specific corpus making use of Wikipedia resources. First we select some top-level nodes (Wikipedia category articles) as root nodes and traverse Wikipedia using a BFS-like algorithm. After the traversal, we get a directed Wikipedia graph (Wiki-graph). Then an algorithm mainly based on the Kosaraju algorithm is proposed to remove the cycles in the Wiki-graph. Finally, a topological sort algorithm is used to traverse the Wiki-graph, and ranking and filtering are done during the process. When computing a node's ranking score, the in-degree of itself and the out-degree of its parents are both considered. The experimental evaluation shows that our method could get a high-quality domain-specific corpus. 0 0
A new approach to detecting content anomalies in Wikipedia Sinanc D.
Yavanoglu U.
Artificial neural networks
Class mapping
Data mining
Open editing schemas
Web classification
Proceedings - 2013 12th International Conference on Machine Learning and Applications, ICMLA 2013 English The rapid growth of the web has caused to availability of data effective if its content is well organized. Despite the fact that Wikipedia is the biggest encyclopedia on the web, its quality is suspect due to its Open Editing Schemas (OES). In this study, zoology and botany pages are selected in English Wikipedia and their HTML contents are converted to text, then an Artificial Neural Network (ANN) is used for classification to prevent disinformation or misinformation. After the training phase, some irrelevant words about politics or terrorism are added to the content in proportion to the size of the text. From the time unsuitable content is added to a page until the moderators intervene, the proposed system detects the error via wrong categorization. The results show that when the added words amount to 2% of the content, the anomaly rate begins to cross the 50% border. 0 0
A new text representation scheme combining Bag-of-Words and Bag-of-Concepts approaches for automatic text classification Alahmadi A.
Joorabchi A.
Mahdi A.E.
Text Classification
2013 7th IEEE GCC Conference and Exhibition, GCC 2013 English This paper introduces a new approach to creating text representations and applies it to standard text classification collections. The approach is based on supplementing the well-known Bag-of-Words (BOW) representational scheme with a concept-based representation that utilises Wikipedia as a knowledge base. The proposed representations are used to generate a Vector Space Model, which in turn is fed into a Support Vector Machine classifier to categorise a collection of textual documents from two publicly available datasets. We report experimental results evaluating the performance of our model in comparison to a standard BOW scheme and a concept-based scheme, as well as recently reported similar text representations that are based on augmenting the standard BOW approach with concept-based representations. 0 0
A novel map-based visualization method based on liquid modelling Biuk-Aghai R.P.
Ao W.H.
Information visualization
ACM International Conference Proceeding Series English Many applications produce large amounts of data, and information visualization has been successfully applied to help make sense of this data. Recently geographic maps have been used as a metaphor for visualization, given that most people are familiar with reading maps, and several visualization methods based on this metaphor have been developed. In this paper we present a new visualization method that aims to improve on existing map-like visualizations. It is based on the metaphor of liquids poured onto a surface that expand outwards until they touch each other, forming larger areas. We present the design of our visualization method and an evaluation we have carried out to compare it with an existing visualization. Our new visualization has better usability, leading to higher accuracy and greater speed of task performance. 0 0
A portable multilingual medical directory by automatic categorization of wikipedia articles Ruiz-Rico F.
Rubio-Sanchez M.-C.
Tomas D.
Vicedo J.-L.
Category ranking
JQuery Mobile
Medical Subject Headings
Text classification
SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval English Wikipedia has become one of the most important sources of information available all over the world. However, the categorization of Wikipedia articles is not standardized and the searches are mainly performed on keywords rather than concepts. In this paper we present an application that builds a hierarchical structure to organize all Wikipedia entries, so that medical articles can be reached from general to particular, using the well known Medical Subject Headings (MeSH) thesaurus. Moreover, the language links between articles will allow using the directory created in different languages. The final system can be packed and ported to mobile devices as a standalone offline application. 0 0
A preliminary study of Croatian language syllable networks Ban K.
Ivakic I.
Mestrovic A.
2013 36th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2013 - Proceedings English This paper presents preliminary results of Croatian syllable networks analysis. We analyzed networks of syllables generated from texts collected from the Croatian Wikipedia and blogs. Different syllable networks are constructed in such a way that each node in the network is a syllable, and links are established between two syllables if they appear together in the same word (co-occurrence network) or if they appear as neighbours in a word (neighbour network). As a main tool we use network analysis methods which provide mechanisms that can reveal new patterns in a complex language structure. We aim to show that syllable networks differ from Erdős-Rényi random networks, which may indicate that language has its own rules and self-organization structure. Furthermore, our results have been compared with other studies on syllable networks of Portuguese and Chinese. The results indicate that Croatian syllable networks exhibit certain properties of small-world networks. 0 0
A preliminary study on the effects of barnstars on wikipedia editing Lim K.H.
Anwitaman Datta
Wise M.
Editing behaviour
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English This paper presents a preliminary study into the awarding of barnstars among Wikipedia editors to better understand their motivations in contributing to Wikipedia articles. We crawled the talk pages of all active Wikipedia editors and retrieved 21,299 barnstars that were awarded among 14,074 editors. In particular, we found that editors do not award and receive barnstars in equal (or similar) quantities. Also, editors were more active in editing articles before awarding or receiving barnstars. Categories and Subject Descriptors H.5.3 [Group and Organization Interfaces]: Computer-supported cooperative work General Terms Measurement. Copyright 2010 ACM. 0 0
A quick tour of BabelNet 1.1 Roberto Navigli BabelNet
Knowledge acquisition
Multilingual ontologies
Semantic networks
Lecture Notes in Computer Science English In this paper we present BabelNet 1.1, a brand-new release of the largest "encyclopedic dictionary", obtained from the automatic integration of the most popular computational lexicon of English, i.e. WordNet, and the largest multilingual Web encyclopedia, i.e. Wikipedia. BabelNet 1.1 covers 6 languages and comes with a renewed Web interface, graph explorer and programmatic API. BabelNet is available online at http://www.babelnet.org. 0 0
A semantic wiki to support knowledge sharing in innovation activities Lahoud I.
Monticolo D.
Hilaire V.
Gomes S.
Knowledge creation
Knowledge evaluation and sharing
Semantic wiki
Lecture Notes in Electrical Engineering English We will present in this paper how to ensure the creation, the validation and the sharing of ideas by using a Semantic Wiki approach. We describe the system called Wiki-I which is used by engineers to allow them to formalize their ideas during the research solutions activities. Wiki-I is based on an ontology of the innovation domain which makes it possible to structure the wiki pages and to store the knowledge posted by the engineers. In this paper, we will explain how Wiki-I ensures the reliability of the innovative ideas thanks to an idea evaluation process. After explaining the interest of the use of semantic wikis in the innovation management approach, we describe the architecture of Wiki-I with its semantic functionalities. At the end of the paper, we prove the effectiveness of Wiki-I with an idea evaluation example in the case of a student challenge for innovation. 0 0
A social contract for virtual institutions Memmi D. Social contract
Social institutions
Social organizations
Virtual institutions
AI and Society English Computer-mediated social groups, often known as virtual communities, are now giving rise to a more durable and more abstract phenomenon: the emergence of virtual institutions. These social institutions operating mostly online exhibit very interesting qualities. Their distributed, collaborative, low-cost and reactive nature makes them very useful. Yet they are also probably more fragile than classical institutions and in need of appropriate support mechanisms. We will analyze them as social institutions, and then resort to social contract theory to determine adequate support measures. We will argue that virtual institutions can be greatly helped by making explicit and publicly available online their norms, rules and procedures, so as to improve the collaboration between their members. 0 0
A study of the Sudanese students' use of collaborative tools within Moodle Learning Management System Elmahadi I.
Osman I.
Computer Supported Collaborative Learning
2013 IST-Africa Conference and Exhibition, IST-Africa 2013 English This study aims to investigate the use of Moodle Learning Management System by Sudanese students, particularly forum and wiki collaborative tools. The participants for this study were 92 undergraduate students from University of Khartoum in Sudan, where face to face collaboration is a common indigenous way of learning. The students took part in a Software Engineering blended learning course during the first semester of 2010-2011 academic year. The students' use was assessed using Moodle activity report tool, wiki entries, forum transcripts and students' final examination marks. Pearson product moment correlation coefficient was used to test for the relationship between using forum and wiki tools and the students' performance in the course. A detailed description of the students' use of the tools is provided. The study also showed a moderate correlation between participating in discussion forum and the students' performance in the course, and a low correlation between wiki participation and course performance. 0 0
A support framework for argumentative discussions management in the web Cabrio E.
Villata S.
Fabien Gandon
Lecture Notes in Computer Science English On the Web, wiki-like platforms allow users to provide arguments in favor of or against issues proposed by other users. The increasing content of these platforms as well as the high number of revisions of the content through pros and cons arguments make it difficult for community managers to understand and manage these discussions. In this paper, we propose an automatic framework to support the management of argumentative discussions in wiki-like platforms. Our framework is composed of (i) a natural language module, which automatically detects the arguments in natural language returning the relations among them, and (ii) an argumentation module, which provides the overall view of the argumentative discussion under the form of a directed graph highlighting the accepted arguments. Experiments on the history of Wikipedia show the feasibility of our approach. 0 0
A triangulated investigation of using wiki for project-based learning in different undergraduate disciplines Chu E.H.Y.
Chan C.K.
Michele Notari
Chu S.K.W.
Chen K.
Wu W.W.Y.
Project-based learning
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English This study investigates the use of wiki to support project-based learning (PBL) in 3 undergraduate courses of different disciplines: English Language Studies, Information Management, and Mechanical Engineering. This study takes a methodological triangulation approach that employs the use of questionnaires, interviews, and wiki activity logs. The level of activities and the types of core actions captured on wiki varied among the three groups of students. Students generally rated positively on the use of wiki to support PBL, while significant differences were found on 9 items (especially in the "Motivation" and "Knowledge Management" dimensions of the questionnaire) among students in the three different disciplines. Interviews revealed that these differences may be attributable to the variations in the natures and scopes of the PBL, as well as in the different emphases that students placed on the work presented on the wiki. This study may provide directions on the use of wiki in PBL in undergraduate courses. Categories and Subject Descriptors K.3.1 [Computers and Education]: Computer Uses in Education - collaborative learning. General Terms Experimentation, Human Factors. Copyright 2010 ACM. 0 0
A verification method for MASOES Perozo N.
Aguilar J.
Teran O.
Molina H.
Emergent systems
Fuzzy cognitive maps (FCMs)
Multiagent systems
Wisdom of crowds
IEEE Transactions on Cybernetics English MASOES is a multiagent architecture for designing and modeling self-organizing and emergent systems. This architecture describes the elements, relationships, and mechanisms, both at the individual and the collective levels, that favor the analysis of the self-organizing and emergent phenomenon without mathematically modeling the system. In this paper, a method is proposed for verifying MASOES from the point of view of design in order to study the self-organizing and emergent behaviors of the modeled systems. The verification criteria are set according to what is proposed in MASOES for modeling self-organizing and emerging systems and the principles of the wisdom of crowd paradigm and the fuzzy cognitive map (FCM) theory. The verification method for MASOES has been implemented in a tool called FCM Designer and has been tested to model a community of free software developers that works under the bazaar style as well as a Wikipedia community in order to study their behavior and determine their self-organizing and emergent capacities. 0 0
A virtual player for "Who Wants to Be a Millionaire?" based on Question Answering Molino P.
Pierpaolo Basile
Santoro C.
Pasquale Lops
De Gemmis M.
Giovanni Semeraro
Lecture Notes in Computer Science English This work presents a virtual player for the quiz game "Who Wants to Be a Millionaire?". The virtual player demands linguistic and common sense knowledge and adopts state-of-the-art Natural Language Processing and Question Answering technologies to answer the questions. Wikipedia articles and DBpedia triples are used as knowledge sources and the answers are ranked according to several lexical, syntactic and semantic criteria. Preliminary experiments carried out on the Italian version of the board game prove that the virtual player is able to challenge human players. 0 0
A wiki-based assessment system towards social-empowered collaborative learning environment Kao B.C.
Chen Y.H.
Collaborative learning
Past exam
Social learning
Social network
Lecture Notes in Electrical Engineering English Social networks have been a very popular research area in recent years. Many people have at least one social network account and use it to keep in touch with other people on the internet and to build their own small social network. The effect and strength of social networks therefore run deep, and it is worth working out the information delivery path and applying it to digital learning. In this age of Web 2.0, sharing knowledge is the mainstream internet activity: people share and exchange information and knowledge every day, and collaborate with other users to build specific knowledge domains on knowledge database websites like Wikipedia. This learning behavior is also called co-writing or collaborative learning, and it opens a new path for future distance learning. However, it is hard to evaluate performance in a co-writing learning activity, and researchers are still looking for more accurate methods that can measure and normalize a learner's performance, provide the result to the teacher, and assess student learning performance in the social dimension. Our lab's previous research proposed several technologies in the distance learning area. Building on this background, we build a wiki-based website that provides past exam questions to examinees and helps them collect resources for their target college or license exam. Moreover, examinees can post a question on their own social network, discuss it with friends, and resolve it together; the system collects the path of these discussions and analyzes the information to improve the efficiency of collaborative learning assessment in the social learning field. 0 0
A wiki-based teaching material development environment with enhanced particle swarm optimization Lin Y.-T.
Lin Y.-C.
Huang Y.-M.
Cheng S.-C.
Material design
Particle swarm optimization
Wiki-based revision
Educational Technology and Society English One goal of e-learning is to enhance the interoperability and reusability of learning resources. However, current e-learning systems do little to adequately support this. In order to achieve this aim, the first step is to consider how to assist instructors in re-organizing the existing learning objects. However, when instructors are dealing with a large number of existing learning objects, manually re-organizing them into appropriate teaching materials is very laborious. Furthermore, in order to organize well-structured teaching materials, the instructors also have to take more than one factor or criterion into account simultaneously. To cope with this problem, this study develops a wiki-based teaching material development environment by employing enhanced particle swarm optimization and wiki techniques to enable instructors to create and revise teaching materials. The results demonstrated that the proposed approach is efficient and effective in forming custom-made teaching materials by organizing existing and relevant learning objects that satisfy specific requirements. Finally, a questionnaire and interviews were used to investigate teachers' perceptions of the effectiveness of the environment. The results revealed that most of the teachers accepted the quality of the teaching material development results and appreciated the proposed environment. 0 0
Academic Staff in Traditional Universities: Motivators and Demotivators in the Adoption of E-learning Mackeogh K.
Fox S.
Bologna declaration
Course development
Distance education
Distance learning
Lisbon declaration
Open and distance learning (ODL)
Professional development
Quality assurance
Quality learning resources
Quality management
Student feedback
Teaching and learning
Ubiquitous learning
Virtual learning spaces
Web 2.0
Distance and E-Learning in Transition: Learning Innovation, Technology and Social Challenges English [No abstract available] 0 0
Access and Efficiency in the Development of Distance Education and E-Learning Hulsmann T. Blended learning
Computer-based training (CBT)
Course development
Distance education
Distance learning
Drop out
Flexible learning
Information and communication technologies (ICT)
Instructional design
Open and distance learning (ODL)
Open educational resources (OER)
Real simple syndication (RSS)
Scale economies
Synchronous communication
Teaching and learning process
Virtual learning spaces
Visual representation
Web 2.0
Distance and E-Learning in Transition: Learning Innovation, Technology and Social Challenges English [No abstract available] 0 0
Accessible online content creation by end users Kuksenok K.
Brooks M.
Mankoff J.
User generated content
Conference on Human Factors in Computing Systems - Proceedings English Like most online content, user-generated content (UGC) poses accessibility barriers to users with disabilities. However, the accessibility difficulties pervasive in UGC warrant discussion and analysis distinct from other kinds of online content. Content authors, community culture, and the authoring tool itself all affect UGC accessibility. The choices, resources available, and strategies in use to ensure accessibility are different than for other types of online content. We contribute case studies of two UGC communities with accessible content: Wikipedia, where authors focus on access to visual materials and navigation, and an online health support forum where users moderate the cognitive accessibility of posts. Our data demonstrate real world moderation strategies and illuminate factors affecting success, such as community culture. We conclude with recommended strategies for creating a culture of accessibility around UGC. 0 0
Acronym-expansion recognition based on knowledge map system Jeong D.-H.
Myunggwon Hwang
Jihie Kim
Hanmin Jung
Sung W.-K.
Acronym-Expansion recognition
Instance mapping
Knowledge map
Linked Open Data
URI resolution
Information (Japan) English In this paper, we present a method for instance mapping and URI resolving to merge two heterogeneous resources and construct a new semantic network from the viewpoint of acronym-expansion. Acronym-expansion information extracted from two unstructured large datasets can be remapped by using linkage information between instances and measuring string similarity. Finally, we evaluate the acronym discrimination performance based on the proposed knowledge map system. The results showed that the noun-phrase-based feature selection method achieved 89.6% micro-averaged precision, outperforming the single-noun-based one by 20.1%. The acronym-expansion recognition experiment suggests that interoperability between heterogeneous databases is possible. 0 0
Adaptive semantics-aware management for web caches and wikis Roque C.
Ferreira P.
Veiga L.
Cache management
Replacement strategies
Web cache
Proceedings of the 12th International Workshop on Adaptive and Reflective Middleware, ARM 2013 - Co-located with ACM/IFIP/USENIX 14th International Middleware Conference, Middleware 2013 English In today's caching and replicated distributed systems, there is a clear need to minimize the amount of data transmitted. This is because (i) the size of web objects that can be cached is increasing, and (ii) the continually growing usage of these systems means that a page can be edited and viewed simultaneously by several users. This entails that any modifications to data have to be propagated to many users, thus increasing the use of the network, regardless of the level of interest each one has in such modifications. In this paper, we describe how the current web and wiki systems perform caching and manage replication, and offer an alternative approach by adopting a consistency algorithm, enhanced with users' preferences and a notion of inter-document distance, to the web and wiki environments. 0 0
Aemoo: Exploring knowledge on the Web Nuzzolese A.G.
Valentina Presutti
Aldo Gangemi
Alberto Musetti
Paolo Ciancarini
Proceedings of the 3rd Annual ACM Web Science Conference, WebSci 2013 English Aemoo is a Semantic Web application supporting knowledge exploration on the Web. Through a keyword-based search interface, users can gather an effective summary of the knowledge about an entity, according to Wikipedia, Twitter, and Google News. Summaries are designed by applying lenses based on a set of empirically discovered knowledge patterns. Copyright 2013 ACM. 0 0
Affordances and constraints of a wiki for primary-school students' group projects Fu H.
Samuel Chu
Kang W.
Collaborative learning
Group project
Primary school
Educational Technology and Society English This study examined a wiki as a computer-supported collaborative learning (CSCL) environment at upper primary level. A total of 388 Hong Kong Primary-five (P5) students in four Chinese primary schools used a wiki platform within the context of their group projects in General Studies (GS) classes. Adopting a mixed-methods design, qualitative and quantitative data were collected from focus group interviews, survey and wiki entries. Findings showed that the wiki platform provided educational, technological, and social affordances for the P5 students' collaborative learning. At the same time, constraints were found to be related to technological factors and users' dispositions, which may be counterbalanced by providing scaffolding and selecting wiki variants. Students' attitudes towards the pedagogical value of the wiki were found to be strongly positive after the group project implementation. Overall, this research contributes to the literature on the use of wikis in primary education. 0 0
Affordances of wikispaces for collaborative learning and knowledge management Singh A.K.J.
Harun R.N.S.R.
Fareed W.
Collaborative learning
Information and communication technology (ICT) affordances model
Knowledge management
GEMA Online Journal of Language Studies English Wikispaces, a wiki, is a Web 2.0 technology tool. It is utilised as a peer editing platform on which students correct errors in their essay writing. The purpose of this article is to find out how the affordances of Wikispaces encourage collaborative learning and knowledge management in correcting errors in L2 students' essays. The experience of using Wikispaces throughout the peer editing context and its affordances are described extensively based on three perspectives: pedagogical, social and technological. Data were obtained from online-writing records (students' essays), field notes, a questionnaire, a reflective research diary, and a feedback form. The qualitative data are analysed thematically and then triangulated. In terms of pedagogical affordances, Wikispaces supports dual applications of pedagogical approaches (teaching and learning). In relation to social affordances, Wikispaces promotes a variety of interactions (peer-peer and students-teacher interactions), and the dynamics of the activities involved (individual, group and whole-class work) provide a safe and comfortable environment for social interaction and make asynchronous communication possible through discussion forums and personal messaging. With respect to the technological affordances, it is a free, user-friendly, and easily accessible web-based tool provided via the Internet. Wikispaces secures backups and supports a flexible learning environment through two features: page reverting and autosave. This study implies the need to consider the three mentioned affordances of wikis (pedagogical, social and technological) as a platform for teaching and learning. 0 0
An Evaluation of Mental Health Wiki: A Consumer Guide to Mental Health Information on the Internet Reavley N.J.
Morgan A.J.
Jorm D.
Jorm A.F.
Mental health
Journal of Consumer Health on the Internet English The aim of this project was to establish and evaluate a wiki (www.mentalhealthwiki.org), which contained information on mental health topics, with contributors restricted to those with some qualification to write on such topics. The site was established in October 2009 with seed content from consumer guides to depression and anxiety disorders. Between October 2009 and May 2012, there were almost 60,000 site visitors and 130 registrants. However, there were no contributions to content other than those of the research team. The dominance of Wikipedia, a wiki to which anyone can contribute, regardless of appropriate professional qualifications, and the relative lack of interest in contributing to Mental Health Wiki, suggests that the open approach is more successful. 0 0
An Inter-Wiki Page Data Processor for a M2M System Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
IIAI ESKM English A data processor, which inputs data from wiki pages, processes the data, and outputs the processed data on a wiki page, is proposed. This data processor is designed for a Machine-to-Machine (M2M) system, which uses Arduino, Android, and Wiki software. This processor is controlled by the program which is written on a wiki page. This M2M system consists of mobile terminals and web sites with wiki software. A mobile terminal of the system consists of an Android terminal and it may have an Arduino board with sensors and actuators. The mobile terminal can read data from not only the sensors in the Arduino board but also wiki pages on the Internet. The input data may be processed by the data processor of this paper. The processed data may be sent to a wiki page. The mobile terminal can control the actuators of the Arduino board by reading commands on the wiki page or by running the program of the processor. This system realizes an open communication forum for not only people but also for machines. 2 0
An agile method for teaching agile in business schools Marija Cubric Agile
Experiential learning
Project management
International Journal of Management Education English The aim of this paper is to describe, evaluate and discuss a new method for teaching agile project management and similar subjects in higher education. Agile is not only a subject domain in this work; the teaching method itself is based on Scrum, a popular agile methodology mostly used in software development projects. The method is supported by wikis, a natural platform for simulation of software development environments. The findings from the evaluation indicate that the method enables the creation of "significant learning", which prepares students for life-long learning and increases their employability. However, the knowledge gains resulting from wiki interactions are found to be more quantitative than qualitative. The results also imply that despite the active promotion of agile values of communication and feedback, issues regarding the teamwork are still emerging. The engagement of the teacher in the learning and teaching process was discovered to be a motivational factor for the team cohesion. This paper could be of interest to anyone planning to teach agile in the higher education settings, but also to a wider academic community interested in applying agile methods in their own teaching practice. 0 0
An approach for deriving semantically related category hierarchies from Wikipedia category graphs Hejazy K.A.
El-Beltagy S.R.
Category hierarchy
Graph analysis
Hierarchy extraction
Semantic relatedness
Semantic similarity
Advances in Intelligent Systems and Computing English Wikipedia is the largest online encyclopedia known to date. Its rich content and semi-structured nature have made it a very valuable research tool used for classification, information extraction, and semantic annotation, among others. Many applications can benefit from the presence of a topic hierarchy in Wikipedia. However, what Wikipedia currently offers is a category graph built through hierarchical category links, the semantics of which are undefined. Because of this lack of semantics, a sub-category in Wikipedia does not necessarily comply with the concept of a sub-category in a hierarchy. Instead, all it signifies is that there is some sort of relationship between the parent category and its sub-category. As a result, traversing the category links of any given category can often produce surprising results. For example, following the category of "Computing" down its sub-category links, the totally unrelated category of "Theology" appears. In this paper, we introduce a novel algorithm that, through measuring the semantic relatedness between any given Wikipedia category and nodes in its sub-graph, is capable of extracting a category hierarchy containing only nodes that are relevant to the parent category. The algorithm has been evaluated by comparing its output with a gold standard data set. The experimental setup and results are presented. 0 0
An approach for restructuring text content Aversano L.
Canfora G.
De Ruvo G.
Tortorella M.
Concept Location
Reverse Engineering
Proceedings - International Conference on Software Engineering English Software engineers have successfully used Natural Language Processing for refactoring source code. Conversely, in this paper we investigate the possibility of applying software refactoring techniques to textual content. Just as a procedural program is composed of functions calling each other, a document can be modeled as content fragments connected to each other through links. Inspired by software engineering refactoring strategies, we propose an approach for refactoring wiki content. The approach has been applied to the EMF category of Eclipsepedia with encouraging results. 0 0
An approach for using wikipedia to measure the flow of trends across countries Tinati R.
Tiropanis T.
Leslie Carr
Social machines
Web observatories
Web science
WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web English Wikipedia has grown to become the most successful online encyclopedia on the Web, containing over 24 million articles, offered in over 240 languages. In just over 10 years Wikipedia has transformed from being just an encyclopedia of knowledge, to a wealth of facts and information, from articles discussing trivia, political issues, geographies and demographics, to popular culture, news articles, and social events. In this paper we explore the use of Wikipedia for identifying the flow of information and trends across the world. We start with the hypothesis that, given that Wikipedia is a resource that is globally available in different languages across countries, access to its articles could be a reflection of human activity. To explore this hypothesis we try to establish metrics on the use of Wikipedia in order to identify potential trends and to establish whether or how those trends flow from one country to another. We subsequently compare the outcome of this analysis to that of more established methods that are based on online social media or traditional media. We explore this hypothesis by applying our approach to a subset of Wikipedia articles and also a specific worldwide social phenomenon that occurred during 2012; we investigate whether access to relevant Wikipedia articles correlates to the viral success of the South Korean pop song, "Gangnam Style" and the associated artist "PSY" as evidenced by traditional and online social media. Our analysis demonstrates that Wikipedia can indeed provide a useful measure for detecting social trends and events, and that in the case we studied, it could have been possible to identify the specific trend more quickly than with other established trend identification services such as Google Trends. 0 0
An approach of filtering wrong-type entities for entity ranking Jinghua Zhang
Qu Y.
Gong S.
Tian S.
Sun H.
Entity ranking
Related entity finding
Type filtering
IEICE Transactions on Information and Systems English Entity is an important information carrier in Web pages. Users would like to directly get a list of relevant entities instead of a list of documents when they submit a query to the search engine. So the research of related entity finding (REF) is a meaningful work. In this paper we investigate the most important task of REF: Entity Ranking. The wrong-type entities which don't belong to the target-entity type will pollute the ranking result. We propose a novel method to filter wrong-type entities. We focus on the acquisition of seed entities and automatically extracting the common Wikipedia categories of target-entity type. Also we demonstrate how to filter wrong-type entities using the proposed model. The experimental results show our method can filter wrong-type entities effectively and improve the results of entity ranking. 0 0
An automatic approach for ontology-based feature extraction from heterogeneous textual resources Vicient C.
Sanchez D.
Moreno A.
Feature extraction
Information extraction
Engineering Applications of Artificial Intelligence English Data mining algorithms such as data classification or clustering methods exploit features of entities to characterise, group or classify them according to their resemblance. In the past, many feature extraction methods focused on the analysis of numerical or categorical properties. In recent years, motivated by the success of the Information Society and the WWW, which has made available enormous amounts of textual electronic resources, researchers have proposed semantic data classification and clustering methods that exploit textual data at a conceptual level. To do so, these methods rely on pre-annotated inputs in which text has been mapped to their formal semantics according to one or several knowledge structures (e.g. ontologies, taxonomies). Hence, they are hampered by the bottleneck introduced by the manual semantic mapping process. To tackle this problem, this paper presents a domain-independent, automatic and unsupervised method to detect relevant features from heterogeneous textual resources, associating them to concepts modelled in a background ontology. The method has been applied to raw text resources and also to semi-structured ones (Wikipedia articles). It has been tested in the Tourism domain, showing promising results. © 2012 Elsevier Ltd. All rights reserved. 0 0
An efficient incentive compatible mechanism to motivate wikipedia contributors Pramod M.
Mukhopadhyay S.
Gosh D.
Advances in Intelligent Systems and Computing English Wikipedia is the world's largest collaboratively edited repository of encyclopedic information, consisting of almost 1.5 million articles and more than 90,000 contributors. Although the number of contributors has been large since its inception in 2001, a study made in 2009 found that members (contributors) may initially contribute to the site for pleasure or out of an internal drive to share their knowledge, but later they are not motivated to edit the related articles so that the quality of the articles could be improved [1] [5]. In our paper we address the above problem from an economics perspective. Here we propose a novel scheme to motivate the contributors of Wikipedia using mechanism design theory, an emerging tool for addressing situations in which data is privately held by the agents. 0 0
An empirical study on faculty perceptions and teaching practices of wikipedia Llados J.
Eduard Aibar
Lerga M.
Meseguer A.
Minguillon J.
Faculty perceptions
Online collaborative environments
Open resources
Web 2.0
Proceedings of the European Conference on e-Learning, ECEL English Some faculty members from different universities around the world have begun to use Wikipedia as a teaching tool in recent years. These experiences show, in most cases, very satisfactory results and a substantial improvement in various basic skills, as well as a positive influence on the students' motivation. Nevertheless and despite the growing importance of e-learning methodologies based on the use of the Internet for higher education, the use of Wikipedia as a teaching resource remains scarce among university faculty. Our investigation tries to identify which are the main factors that determine acceptance or resistance to that use. We approach the decision to use Wikipedia as a teaching tool by analyzing both the individual attributes of faculty members and the characteristics of the environment where they develop their teaching activity. From a specific survey sent to all faculty of the Universitat Oberta de Catalunya (UOC), pioneer and leader in online education in Spain, we have tried to infer the influence of these internal and external elements. The questionnaire was designed to measure different constructs: perceived quality of Wikipedia, teaching practices involving Wikipedia, use experience, perceived usefulness and use of 2.0 tools. Control items were also included for gathering information on gender, age, teaching experience, academic rank, and area of expertise. Our results reveal that academic rank, teaching experience, age or gender, are not decisive factors in explaining the educational use of Wikipedia. Instead, the decision to use it is closely linked to the perception of Wikipedia's quality, the use of other collaborative learning tools, an active attitude towards web 2.0 applications, and connections with the professional non-academic world. 
Situational context is also very important, since the use is higher when faculty members have got reference models in their close environment and when they perceive it is positively valued by their colleagues. As far as these attitudes, practices and cultural norms diverge in different scientific disciplines, we have also detected clear differences in the use of Wikipedia among areas of academic expertise. As a consequence, a greater application of Wikipedia both as a teaching resource and as a driver for teaching innovation would require much more active institutional policies and some changes in the dominant academic culture among faculty members. 0 0
An exploration of discussion threads in social news sites: A case study of the Reddit community Weninger T.
Zhu X.A.
Jangwhan Han
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013 English Social news and content aggregation Web sites have become massive repositories of valuable knowledge on a diverse range of topics. Millions of Web-users are able to leverage these platforms to submit, view and discuss nearly anything. The users themselves exclusively curate the content with an intricate system of submissions, voting and discussion. Furthermore, the data on social news Web sites is extremely well organized by its user-base, which opens the door for opportunities to leverage this data for other purposes just like Wikipedia data has been used for many other purposes. In this paper we study a popular social news Web site called Reddit. Our investigation looks at the dynamics of its discussion threads, and asks two main questions: (1) to what extent do discussion threads resemble a topical hierarchy? and (2) can discussion threads be used to enhance Web search? We show interesting results for these questions on a very large snapshot of several sub-communities of the Reddit Web site. Finally, we discuss the implications of these results and suggest ways by which social news Web sites can be used to perform other tasks. Copyright 2013 ACM. 0 0
An index for efficient semantic full-text search Holger Bast
Buchhold B.
Query processing
Semantic full-text search
International Conference on Information and Knowledge Management, Proceedings English In this paper we present a novel index data structure tailored towards semantic full-text search. Semantic full-text search, as we call it, deeply integrates keyword-based full-text search with structured search in ontologies. Queries are SPARQL-like, with additional relations for specifying word-entity co-occurrences. In order to build such queries the user needs to be guided. We believe that incremental query construction with context-sensitive suggestions in every step serves that purpose well. Our index has to answer queries and provide such suggestions in real time. We achieve this through a novel kind of posting lists and query processing, avoiding very long (intermediate) result lists and expensive (non-local) operations on these lists. In an evaluation of 8000 queries on the full English Wikipedia (40 GB XML dump) and the YAGO ontology (26.6 million facts), we achieve average query and suggestion times of around 150ms. Copyright is held by the owner/author(s). 0 0
An initial analysis of semantic wikis Gil Y.
Knight A.
Zhang K.
Lei Zhang
Sethi R.
Semantic web
Semantic wiki
Social knowledge collection
International Conference on Intelligent User Interfaces, Proceedings IUI English Semantic wikis augment wikis with semantic properties that can be used to aggregate and query data through reasoning. Semantic wikis are used by many communities, for widely varying purposes such as organizing genomic knowledge, coding software, and tracking environmental data. Although wikis have been analyzed extensively, there has been no published analysis of the use of semantic wikis. We carried out an initial analysis of twenty semantic wikis selected for their diverse characteristics and content. Based on the number of property edits per contributor, we identified several patterns to characterize community behaviors that are common to groups of wikis. 0 0
An inter-wiki page data processor for a M2M system Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Sensor Network
Social Network
Proceedings - 2nd IIAI International Conference on Advanced Applied Informatics, IIAI-AAI 2013 English A data processor, which inputs data from wiki pages, processes the data, and outputs the processed data on a wiki page, is proposed. This data processor is designed for a Machine-to-Machine (M2M) system, which uses Arduino, Android, and Wiki software. This processor is controlled by the program which is written on a wiki page. This M2M system consists of mobile terminals and web sites with wiki software. A mobile terminal of the system consists of an Android terminal and it may have an Arduino board with sensors and actuators. The mobile terminal can read data from not only the sensors in the Arduino board but also wiki pages on the Internet. The input data may be processed by the data processor of this paper. The processed data may be sent to a wiki page. The mobile terminal can control the actuators of the Arduino board by reading commands on the wiki page or by running the program of the processor. This system realizes an open communication forum for not only people but also for machines. 0 0
An investigation of the relationship between the amount of extra-textual data and the quality of Wikipedia articles Himoro M.Y.
Hanada R.
Marco Cristo
Pimentel M.D.G.C.
Content quality
Extra-textual data
WebMedia 2013 - Proceedings of the 19th Brazilian Symposium on Multimedia and the Web English Wikipedia, a web-based collaboratively maintained free encyclopedia, is emerging as one of the most important websites on the internet. However, its openness raises many concerns about the quality of the articles and how to assess it automatically. In the Portuguese-speaking Wikipedia, articles can be rated by bots and by the community. In this paper, we investigate the correlation between these ratings and the count of media items (namely images and sounds) through a series of experiments. Our results show that article ratings and the count of media items are correlated. 0 0
An open conceptual framework for operationalising collective awareness and social sensing Di Maio P.
Ure J.
Systems engineering
ACM International Conference Proceeding Series English Substantial EU resources are being invested in research and practice emerging from the socio-technical convergence of networked technologies and social clusters, increasingly referred to as 'collective awareness' and 'social sensing' platforms. Novel concepts and tools are being developed to stimulate and promote technologies and environments, requiring some level of shared conceptualisation of the domain. This position paper identifies the need to capture and represent the knowledge and information in 'social sensing and collective awareness platforms' with minimal formalisms. It proposes steps toward the development of tools for collective development of shared conceptual models, to facilitate communication, knowledge sharing and collaboration in this emerging, and highly interdisciplinary research field. 0 0
An open-source toolkit for mining Wikipedia Milne D.
Witten I.H.
Ontology extraction
Semantic relatedness
Artificial Intelligence English The online encyclopedia Wikipedia is a vast, constantly evolving tapestry of interlinked articles. For developers and researchers it represents a giant multilingual database of concepts and semantic relations, a potential resource for natural language processing and many other research areas. This paper introduces the Wikipedia Miner toolkit, an open-source software system that allows researchers and developers to integrate Wikipedia's rich semantics into their own applications. The toolkit creates databases that contain summarized versions of Wikipedia's content and structure, and includes a Java API to provide access to them. Wikipedia's articles, categories and redirects are represented as classes, and can be efficiently searched, browsed, and iterated over. Advanced features include parallelized processing of Wikipedia dumps, machine-learned semantic relatedness measures and annotation features, and XML-based web services. Wikipedia Miner is intended to be a platform for sharing data mining techniques. © 2012 Elsevier B.V. All rights reserved. 0 1
Analysis and forecasting of trending topics in online media streams Althoff T.
Borth D.
Hees J.
Andreas Dengel
Lifecycle forecast
Social media analysis
Trending topics
MM 2013 - Proceedings of the 2013 ACM Multimedia Conference English Among the vast information available on the web, social media streams capture what people currently pay attention to and how they feel about certain topics. Awareness of such trending topics plays a crucial role in multimedia systems such as trend aware recommendation and automatic vocabulary selection for video concept detection systems. Correctly utilizing trending topics requires a better understanding of their various characteristics in different social media streams. To this end, we present the first comprehensive study across three major online and social media streams, Twitter, Google, and Wikipedia, covering thousands of trending topics during an observation period of an entire year. Our results indicate that depending on one's requirements one does not necessarily have to turn to Twitter for information about current events and that some media streams strongly emphasize content of specific categories. As our second key contribution, we further present a novel approach for the challenging task of forecasting the life cycle of trending topics in the very moment they emerge. Our fully automated approach is based on a nearest neighbor forecasting technique exploiting our assumption that semantically similar topics exhibit similar behavior. We demonstrate on a large-scale dataset of Wikipedia page view statistics that forecasts by the proposed approach are about 9-48k views closer to the actual viewing statistics compared to baseline methods and achieve a mean average percentage error of 45-19% for time periods of up to 14 days. 0 0
Analysis of cluster structure in large-scale English Wikipedia category networks Klaysri T.
Fenner T.
Lachish O.
Mark Levene
Papapetrou P.
Connected component
Graph structure analysis
Large-scale social network analysis
Wikipedia category network
Lecture Notes in Computer Science English In this paper we propose a framework for analysing the structure of a large-scale social media network, a topic of significant recent interest. Our study is focused on the Wikipedia category network, where nodes correspond to Wikipedia categories and edges connect two nodes if the nodes share at least one common page within the Wikipedia network. Moreover, each edge is given a weight that corresponds to the number of pages shared between the two categories that it connects. We study the structure of category clusters within the three complete English Wikipedia category networks from 2010 to 2012. We observe that category clusters appear in the form of well-connected components that are naturally clustered together. For each dataset we obtain a graph, which we call the t-filtered category graph, by retaining just a single edge linking each pair of categories for which the weight of the edge exceeds some specified threshold t. Our framework exploits this graph structure and identifies connected components within the t-filtered category graph. We studied the large-scale structural properties of the three Wikipedia category networks using the proposed approach. We found that the number of categories, the number of clusters of size two, and the size of the largest cluster within the graph all appear to follow power laws in the threshold t. Furthermore, for each network we found the value of the threshold t for which increasing the threshold to t + 1 caused the "giant" largest cluster to diffuse into two or more smaller clusters of significant size and studied the semantics behind this diffusion. 0 0
Analysis of students' behaviour based on participation and results achieved in wiki-based team assignments Putnik Z.
Budimac Z.
Ivanovic M.
Bothe K.
ACM International Conference Proceeding Series English In this paper, we present part of our experience with the use of Web 2.0, and in particular Wiki technology, in education. We present evidence that we had no significant problem in introducing Wikis for scheduling and organizing students' work in the "assignment solving" part of the course, and that our students embraced and gladly accepted this element of Web 2.0 that we added to teaching. The analysis of our students' attitudes and behavior presented in this paper also changed some of our opinions and expectations about students' actions and manners, but we hope that those will only help us in further improvement of our course. Copyright 2013 ACM. 0 0
Analysis of students' participation patterns and learning presence in a wiki-based project Roussinos D.
Jimoyiannis A.
Computer-supported collaborative learning
Higher education
Project-based learning
Educational Media International English The educational applications of wikis are becoming very popular among instructors and researchers and they have captured their attention and imagination. This paper reports on the investigation of a wiki project designed to support university students' collaborative authoring and learning. The design framework of the wiki-based project is outlined and an analysis framework is proposed as the result of combining analysis of students' collaborative actions, e.g. edits and posts in the wiki pages. The framework was applied to investigate students' engagement, their contribution to the wiki content and the patterns of collaboration and content co-creation they followed during the project timeline. Our findings revealed different patterns of students' contribution to their group wiki as well as their different roles. The paper concludes with suggestions for future development of the framework and research in the field of wiki learning design. 0 0
Analyzing and Predicting Quality Flaws in User-generated Content: The Case of Wikipedia Maik Anderka Information quality
Quality Flaws
Quality Flaw Prediction
Bauhaus-Universität Weimar, Germany English Web applications that are based on user-generated content are often criticized for containing low-quality information; a popular example is the online encyclopedia Wikipedia. The major points of criticism pertain to the accuracy, neutrality, and reliability of information. The identification of low-quality information is an important task since for a huge number of people around the world it has become a habit to first visit Wikipedia in case of an information need. Existing research on quality assessment in Wikipedia either investigates only small samples of articles, or else deals with the classification of content into high-quality or low-quality. This thesis goes further, it targets the investigation of quality flaws, thus providing specific indications of the respects in which low-quality content needs improvement. The original contributions of this thesis, which relate to the fields of user-generated content analysis, data mining, and machine learning, can be summarized as follows:

(1) We propose the investigation of quality flaws in Wikipedia based on user-defined cleanup tags. Cleanup tags are commonly used in the Wikipedia community to tag content that has some shortcomings. Our approach is based on the hypothesis that each cleanup tag defines a particular quality flaw.

(2) We provide the first comprehensive breakdown of Wikipedia's quality flaw structure. We present a flaw organization schema, and we conduct an extensive exploratory data analysis which reveals (a) the flaws that actually exist, (b) the distribution of flaws in Wikipedia, and, (c) the extent of flawed content.

(3) We present the first breakdown of Wikipedia's quality flaw evolution. We consider the entire history of the English Wikipedia from 2001 to 2012, which comprises more than 508 million page revisions, summing up to 7.9 TB. Our analysis reveals (a) how the incidence and the extent of flaws have evolved, and, (b) how the handling and the perception of flaws have changed over time.

(4) We are the first who operationalize an algorithmic prediction of quality flaws in Wikipedia. We cast quality flaw prediction as a one-class classification problem, develop a tailored quality flaw model, and employ a dedicated one-class machine learning approach. A comprehensive evaluation based on human-labeled Wikipedia articles underlines the practical applicability of our approach.
0 0
Analyzing multi-dimensional networks within mediawikis Brian C. Keegan
Ceni A.
Smith M.A.
Data analysis
Network analysis
Social media
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English The MediaWiki platform supports popular socio-technical systems such as Wikipedia as well as thousands of other wikis. This software encodes and records a variety of relationships about the content, history, and editors of its articles such as hyperlinks between articles, discussions among editors, and editing histories. These relationships can be analyzed using standard techniques from social network analysis; however, extracting relational data from Wikipedia has traditionally required specialized knowledge of its API, information retrieval, network analysis, and data visualization that has inhibited scholarly analysis. We present a software library called the NodeXL MediaWiki Importer that extracts a variety of relationships from the MediaWiki API and integrates with the popular NodeXL network analysis and visualization software. This library allows users to query and extract a variety of multidimensional relationships from any MediaWiki installation with a publicly-accessible API. We present a case study examining the similarities and differences between different relationships for the Wikipedia articles about "Pope Francis" and "Social media." We conclude by discussing the implications this library has for both theoretical and methodological research as well as community management and outline future work to expand the capabilities of the library. Categories and Subject Descriptors H.4 [Information Systems Applications]: Miscellaneous; D.2.8 [Software Engineering]: Metrics - complexity measures, performance measures General Terms System. Copyright 2010 ACM. 0 0
Analyzing task and technology characteristics for enterprise architecture management tool support Hauder M.
Fiedler M.
Florian Matthes
Wust B.
Enterprise architecture management
Enterprise wiki
Tool support
Proceedings - IEEE International Enterprise Distributed Object Computing Workshop, EDOC English Adequate tool support for Enterprise Architecture (EA) and its respective management function is crucial for the success of the discipline in practice. However, currently available tools used in organizations focus on structured information neglecting the collaborative effort required for developing and planning the EA. As a result, utilization of these tools by stakeholders is often not sufficient and availability of EA products in the organization is limited. We investigate the integration of existing EA tools and Enterprise Wikis to tackle these challenges. We will describe how EA initiatives can benefit from the use and integration of an Enterprise Wiki with an existing EA tool. Main goal of our research is to increase the utilization of EA tools and enhance the availability of EA products by incorporating unstructured information content in the tools. For this purpose we analyze task characteristics that we revealed from the processes and task descriptions of the EA department of a German insurance organization and align them with technology characteristics of EA tools and Enterprise Wikis. We empirically evaluated these technology characteristics using an online survey with results from 105 organizations in previous work. We apply the technology-to-performance chain model to derive the fit between task and technology characteristics for EA management (EAM) tool support in order to evaluate our hypotheses. 0 0
Arabic WordNet semantic relations enrichment through morpho-lexical patterns Boudabous M.M.
Chaaben Kammoun N.
Khedher N.
Belguith L.H.
Sadat F.
Arabic WordNet
Morpho-lexical patterns
NooJ grammars
2013 1st International Conference on Communications, Signal Processing and Their Applications, ICCSPA 2013 English The Arabic WordNet (AWN) ontology is one of the most interesting lexical resources for Modern Standard Arabic. Although its development is based on Princeton WordNet, it suffers from some weaknesses such as the absence of some words and some semantic relations between synsets. In this paper we propose a linguistic method based on morpho-lexical patterns to add semantic relations between synsets in order to improve the AWN performance. This method relies on two steps: morpho-lexical pattern definition and semantic relation enrichment. We will take advantage of the defined patterns to propose a hybrid method for building an Arabic ontology based on Wikipedia. 0 0
Arguments about deletion: How experience improves the acceptability of arguments in ad-hoc online task groups Jodi Schneider
Samp K.
Alexandre Passant
Stefan Decker
Argumentation schemes
Collaboration and conflict
Critical questions
Online argumentation
Peer production
English Increasingly, ad-hoc online task groups must make decisions about jointly created artifacts such as open source software and Wikipedia articles. Time-consuming and laborious attention to textual discussions is needed to make such decisions, for which computer support would be beneficial. Yet there has been little study of the argumentation patterns that distributed ad-hoc online task groups use in evaluation and decision-making. In a corpus of English Wikipedia deletion discussions, we investigate the argumentation schemes used, the role of the arguer's experience, and which arguments are acceptable to the audience. We report three main results: First, the most prevalent patterns are the Rules and Evidence schemes from Walton's catalog of argumentation schemes [34], which comprise 36% of arguments. Second, we find that familiarity with community norms correlates with the novices' ability to craft persuasive arguments. Third, acceptable arguments use community-appropriate rhetoric that demonstrates knowledge of policies and community values, while problematic arguments are based on personal preference and inappropriate analogy to other cases. Copyright 2013 ACM. 0 0
Art History on Wikipedia, a Macroscopic Observation Doron Goldfarb
Max Arends
Josef Froschauer
Dieter Merkl
ArXiv English How are articles about art historical actors interlinked within Wikipedia? Led by this question, we seek an overview of the link structure of a domain-specific subset of Wikipedia articles. We use an established domain-specific person name authority, the Getty Union List of Artist Names (ULAN), in order to externally identify relevant actors. Besides containing consistent biographical person data, this database also provides associative relationships between its person records, serving as a reference link structure for comparison. As a first step, we use mappings between the ULAN and English DBpedia provided by the Virtual International Authority File (VIAF). This way, we are able to identify 18,002 relevant person articles. Examining the link structure between these resources reveals interesting insights into the high-level structure of art historical knowledge as it is represented on Wikipedia. 4 1
Assessing adoption of wikis in a Singapore secondary school: Using the UTAUT model Toh C.H. Social media
Technology adoption
Proceedings of the 2013 IEEE 63rd Annual Conference International Council for Education Media, ICEM 2013 English This quantitative study explores students' motivation towards the use of wikis to encourage self-directed learning (SDL) and collaborative learning (CoL). SDL and CoL are the goals for Singapore's Ministry of Education Information and Communication Technology Masterplan 3. Wikis were used in the project to support reflection and communication within groups. Five classes consisting of 181 Secondary Two students from a Singapore secondary school were involved in this project. The participants were selected based on their mandatory involvement in an integrated 5-month project initiated by the school. As the participation in the study was voluntary, 144 of the 181 students responded. Sixty nine of the students had no prior experience with wikis. Among the 75 students who had prior experience, most of them used wikis to obtain information while 46 of them shared information using wikis and 51 of them used it to work on collaborative projects with others. The variance explained by Unified Theory of Acceptance and Use of Technology (UTAUT) was 32.4 percent. The results showed that performance expectancy and facilitating condition were found to have a significant relationship with behavioural intention; while effort expectancy and social influence did not, contrary to many prior studies. Modifying the original UTAUT to include three other factors, attitude, trust and comfort level increased the variance explained to 37 percent. However, trust and comfort level were found to have a significant relationship with behavioural intention in the modified UTAUT. This study contributes to UTAUT's theoretical validity and empirical applicability and to the management of technology based initiatives in education. The findings provide insights to educators and schools considering the use of wikis and other forms of social media into their lessons. 0 0
Assessing individual learning and group knowledge in a wiki environment: An empirical analysis Agrifoglio R.
Metallo C.
Varriale L.
Ferrara M.
Casalino N.
De Marco M.
Educational processes
Individual learning
Knowledge sharing
Online collaborative learning
IASTED Multiconferences - Proceedings of the IASTED International Conference on Web-Based Education, WBE 2013 English The aim of this study was to investigate collaborative learning in an online environment in order to assess the role of technology in determining the individual learning of students. It describes the benefits of using a wiki in education and how it can allow students to work together to reach a common goal, giving them a sense of how writing can be effectively performed in collaboration. In collaborative learning with a wiki, students need to agree on the structure, the contents, and the methods that are necessary to accomplish cooperative activities. The technology investigated is PBworks Education (PBwiki Edu), a collaborative tool that offers a variety of powerful information sharing and collaboration features in order to improve students' learning activities. Compared with a traditional in-class course, PBwiki Edu facilitates communication and encourages collaborative finding, shaping and sharing of knowledge, all of which are essential properties for students' learning process. A survey methodology was used with undergraduate students of a "Management Information Systems" course who used PBwiki Edu to write four reports on case studies covering specific lesson topics. With regard to these topics, we measured the individual learning of students before (traditional learning) and after (online learning) each case study and compared these results using a t-test. Findings have shown significant differences between learning before and after case studies, pointing out the contribution of PBwiki Edu to students' learning. 0 0
Assessing quality score of wikipedia articles using mutual evaluation of editors and texts Yu Suzuki
Masatoshi Yoshikawa
Edit history
Peer review
International Conference on Information and Knowledge Management, Proceedings English In this paper, we propose a method for assessing quality scores of Wikipedia articles by mutually evaluating editors and texts. Survival ratio based approach is a major approach to assessing article quality. In this approach, when a text survives beyond multiple edits, the text is assessed as good quality, because poor quality texts have a high probability of being deleted by editors. However, many vandals, low quality editors, delete good quality texts frequently, which improperly decreases the survival ratios of good quality texts. As a result, many good quality texts are unfairly assessed as poor quality. In our method, we consider editor quality score for calculating text quality score, and decrease the impact on text quality by vandals. Using this improvement, the accuracy of the text quality score should be improved. However, an inherent problem with this idea is that the editor quality scores are calculated by the text quality scores. To solve this problem, we mutually calculate the editor and text quality scores until they converge. In this paper, we prove that the text quality score converges. We did our experimental evaluation, and confirmed that our proposed method could accurately assess the text quality scores. Copyright is held by the owner/author(s). 0 0
Assessing trustworthiness in collaborative environments Segall J.
Mayhew M.J.
Atighetchi M.
Greenstadt R.
Collaborative trust
Cyber analytics
ACM International Conference Proceeding Series English Collaborative environments, specifically those concerning information creation and exchange, increasingly demand notions of trust and accountability. In the absence of explicit authority, the quality of information is often unknown. Using Wikipedia edit sequences as a use case scenario, we detail experiments in the determination of community-based user and document trust. Our results show success in answering the first of many research questions: Provided a user's edit history, is a given edit to a document positively contributing to its content? We detail how the ability to answer this question provides a preliminary framework towards a better model for collaborative trust and discuss subsequent areas of research necessary to broaden its utility and scope. Copyright 2012 ACM. 0 0
Attributing authorship of revisioned content Luca de Alfaro
Shavlovsky M.
Revisioned content
WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web English A considerable portion of web content, from wikis to collaboratively edited documents, to code posted online, is revisioned. We consider the problem of attributing authorship to such revisioned content, and we develop scalable attribution algorithms that can be applied to very large bodies of revisioned content, such as the English Wikipedia. Since content can be deleted, only to be later re-inserted, we introduce a notion of authorship that requires comparing each new revision with the entire set of past revisions. For each portion of content in the newest revision, we search the entire history for content matches that are statistically unlikely to occur spontaneously, thus denoting common origin. We use these matches to compute the earliest possible attribution of each word (or each token) of the new content. We show that this "earliest plausible attribution" can be computed efficiently via compact summaries of the past revision history. This leads to an algorithm that runs in time proportional to the sum of the size of the most recent revision, and the total amount of change (edit work) in the revision history. This amount of change is typically much smaller than the total size of all past revisions. The resulting algorithm can scale to very large repositories of revisioned content, as we show via experimental data over the English Wikipedia. Copyright is held by the International World Wide Web Conference Committee (IW3C2). 0 0
Automated Decision support for human tasks in a collaborative system: The case of deletion in wikipedia Gelley B.S.
Suel T.
Automating human tasks
Collaborative system
Decision support
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English Wikipedia's low barriers to participation have the unintended effect of attracting a large number of articles whose topics do not meet Wikipedia's inclusion standards. Many are quickly deleted, often causing their creators to stop contributing to the site. We collect and make available several datasets of deleted articles, heretofore inaccessible, and use them to create a model that can predict with high precision whether or not an article will be deleted. We report precision of 98.6% and recall of 97.5% in the best case and high precision with lower, but still useful, recall in the most difficult case. We propose to deploy a system utilizing this model on Wikipedia as a set of decision-support tools to help article creators evaluate and improve their articles before posting, and to help new article patrollers make more informed decisions about which articles to delete and which to improve. Categories and Subject Descriptors H.5.3. Collaborative Computing; Computer Supported Collaborative Work General Terms Measurement, Performance, Human Factors. Copyright 2010 ACM. 0 0
Automated non-content word list generation using hLDA Krug W.
Tomlinson M.T.
FLAIRS 2013 - Proceedings of the 26th International Florida Artificial Intelligence Research Society Conference English In this paper, we present a language-independent method for the automatic, unsupervised extraction of non-content words from a corpus of documents. This method permits the creation of word lists that may be used in place of traditional function word lists in various natural language processing tasks. As an example we generated lists of words from a corpus of English, Chinese, and Russian posts extracted from Wikipedia articles and Wikipedia Wikitalk discussion pages. We applied these lists to the task of authorship attribution on this corpus to compare the effectiveness of lists of words extracted with this method to expert-created function word lists and frequent word lists (a common alternative to function word lists). hLDA lists perform comparably to frequent word lists. The trials also show that corpus-derived lists tend to perform better than more generic lists, and both sets of generated lists significantly outperformed the expert lists. Additionally, we evaluated the performance of an English expert list on machine translations of our Chinese and Russian documents, showing that our method also outperforms this alternative. Copyright © 2013, Association for the Advancement of Artificial Intelligence. All rights reserved. 0 0
Automated query learning with Wikipedia and genetic programming Pekka Malo
Pyry Siitari
Ankur Sinha
Automatic indexing
Concept recognition
Genetic programming
Information filtering
Query definition
Artificial Intelligence English Most of the existing information retrieval systems are based on bag-of-words model and are not equipped with common world knowledge. Work has been done towards improving the efficiency of such systems by using intelligent algorithms to generate search queries, however, not much research has been done in the direction of incorporating human-and-society level knowledge in the queries. This paper is one of the first attempts where such information is incorporated into the search queries using Wikipedia semantics. The paper presents Wikipedia-based Evolutionary Semantics (Wiki-ES) framework for generating concept based queries using a set of relevance statements provided by the user. The query learning is handled by a co-evolving genetic programming procedure. To evaluate the proposed framework, the system is compared to a bag-of-words based genetic programming framework as well as to a number of alternative document filtering techniques. The results obtained using Reuters newswire documents are encouraging. In particular, the injection of Wikipedia semantics into a GP-algorithm leads to improvement in average recall and precision, when compared to a similar system without human knowledge. A further comparison against other document filtering frameworks suggests that the proposed GP-method also performs well when compared with systems that do not rely on query-expression learning. © 2012 Elsevier B.V. All rights reserved. 0 1
Automatic extraction of Polish language errors from text edition history Grundkiewicz R. Error corpora
Language errors detection
Data mining
Lecture Notes in Computer Science English There are no large error corpora for a number of languages, despite the fact that they have multiple applications in natural language processing. The main reason underlying this situation is the high cost of manual corpus creation. In this paper we present methods for the automatic extraction of various kinds of errors, such as spelling, typographical, grammatical, syntactic, semantic, and stylistic ones, from text edition histories. By applying these methods to Wikipedia's article revision history, we created a large and publicly available corpus of naturally-occurring language errors for Polish, called PlEWi. Finally, we analyse and evaluate the detected error categories in our corpus. 0 0
Automatic keyphrase annotation of scientific documents using Wikipedia and genetic algorithms Joorabchi A.
Mahdi A.E.
Genetic algorithms
Keyphrase annotation
Keyphrase indexing
Metadata generation
Scientific digital libraries
Subject metadata
Text mining
Journal of Information Science English Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents to both human readers and information retrieval systems. This article describes a machine learning-based keyphrase annotation method for scientific documents that utilizes Wikipedia as a thesaurus for candidate selection from documents' content. We have devised a set of 20 statistical, positional and semantical features for candidate phrases to capture and reflect various properties of those candidates that have the highest keyphraseness probability. We first introduce a simple unsupervised method for ranking and filtering the most probable keyphrases, and then evolve it into a novel supervised method using genetic algorithms. We have evaluated the performance of both methods on a third-party dataset of research papers. Reported experimental results show that the performance of our proposed methods, measured in terms of consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised and unsupervised methods. 0 0
Automatic readability classification of crowd-sourced data based on linguistic and information-theoretic features Zahurul Islam
Alexander Mehler
Evaluation of features
Information transmission
Text readability
Computacion y Sistemas English This paper presents a classifier of text readability based on information-theoretic features. The classifier was developed based on a linguistic approach to readability that explores lexical, syntactic and semantic features. For this evaluation we extracted a corpus of 645 articles from Wikipedia together with their quality judgments. We show that information-theoretic features perform as well as their linguistic counterparts even if we explore several linguistic levels at once. 0 0
Automatic summarization of events from social media Chua F.C.T.
Asur S.
Proceedings of the 7th International Conference on Weblogs and Social Media, ICWSM 2013 English Social media services such as Twitter generate phenomenal volume of content for most real-world events on a daily basis. Digging through the noise and redundancy to understand the important aspects of the content is a very challenging task. We propose a search and summarization framework to extract relevant representative tweets from a time-ordered sample of tweets to generate a coherent and concise summary of an event. We introduce two topic models that take advantage of temporal correlation in the data to extract relevant tweets for summarization. The summarization framework has been evaluated using Twitter data on four real-world events. Evaluations are performed using Wikipedia articles on the events as well as using Amazon Mechanical Turk (MTurk) with human readers (MTurkers). Both experiments show that the proposed models outperform traditional LDA and lead to informative summaries. Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. 0 0
Automatic topic ontology construction using semantic relations from wordnet and wikipedia Subramaniyaswamy V. Open directory project (ODP)
Semantic web
Topic ontology
Web ontology language (OWL)
International Journal of Intelligent Information Technologies English Due to the explosive growth of web technology, a huge amount of information is available as web resources over the Internet. Therefore, in order to access the relevant content from the web resources effectively, considerable attention is paid to the semantic web for efficient knowledge sharing and interoperability. A topic ontology is a hierarchy of a set of topics that are interconnected using semantic relations, and it is increasingly used in web mining techniques. Reviews of past research reveal that semiautomatic ontology construction is not capable of handling high usage. This shortcoming prompted the authors to develop an automatic topic ontology construction process. However, in the past many attempts have been made by other researchers to utilize the automatic construction of ontology, which turned out to be challenging due to time, cost and maintenance. In this paper, the authors propose a novel corpus-based approach to enrich the set of categories in the ODP by automatically identifying the concepts and their associated semantic relationships with corpus-based external knowledge resources, such as Wikipedia and WordNet. This topic ontology construction approach relies on concept acquisition and semantic relation extraction. A Jena API framework has been developed to organize the set of extracted semantic concepts, while Protégé provides the platform to visualize the automatically constructed topic ontology. To evaluate the performance, web documents were classified using an SVM classifier based on ODP and the topic ontology. The topic ontology based classification produced better accuracy than ODP. 0 0
Automatically building templates for entity summary construction Li P.
Yafang Wang
Jian Jiang
Pattern mining
Summary template
Information Processing and Management English In this paper, we propose a novel approach to automatic generation of summary templates from given collections of summary articles. We first develop an entity-aspect LDA model to simultaneously cluster both sentences and words into aspects. We then apply frequent subtree pattern mining on the dependency parse trees of the clustered and labeled sentences to discover sentence patterns that well represent the aspects. Finally, we use the generated templates to construct summaries for new entities. Key features of our method include automatic grouping of semantically related sentence patterns and automatic identification of template slots that need to be filled in. Also, we implement a new sentence compression algorithm which uses dependency trees instead of parse trees. We apply our method to five Wikipedia entity categories and compare our method with three baseline methods. Both quantitative evaluation based on human judgment and qualitative comparison demonstrate the effectiveness and advantages of our method. © 2012 Elsevier Ltd. All rights reserved. 0 0
Automating document annotation using open source knowledge Apoorv Singhal
Kasturi R.
Srivastava J.
Document summarization
Global context
Google Scholar
Proceedings - 2013 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2013 English Annotating documents with relevant and comprehensive keywords offers invaluable assistance to readers who want a quick overview of a document. The problem of document annotation is addressed in the literature under two broad classes of techniques, namely key phrase extraction and key phrase abstraction. In this paper, we propose a novel approach to generate summary phrases for research documents. Given the dynamic nature of scientific research, it has become important to incorporate new and popular scientific terminologies in document annotations. For this purpose, we have used crowd-sourced knowledge bases like Wikipedia and WikiCFP (an open source information source for calls for papers) for automating key phrase generation. Also, we have taken into account the lack of availability of the document's content (due to protective policies) and developed a global context based key-phrase identification approach. We show that given only the title of a document, the proposed approach generates its global context information using academic search engines like Google Scholar. We evaluated the performance of the proposed approach on a real-world dataset obtained from a computer science research document corpus. We quantitatively evaluated the performance of the proposed approach and compared it with two baseline approaches. 0 0
Beyond Responsive Regulation: The expanding role of non-state actors in the regulatory process Grabosky P. Citizen participation
Private regulation
Regulatory pluralism
Regulation and Governance English This comment extends the vision of Responsive Regulation by noting subsequent developments in regulatory pluralism, in particular those occurring under private auspices. The apparent weakening or withdrawal of state regulatory institutions has inspired considerable regulatory activity on the part of non-state actors. In addition, the concurrent growth and pervasiveness of digital technology have greatly facilitated the involvement of individual citizens in non-state regulatory activity. However, the full implications of what might be called "wiki-regulation" remain to be seen. The risks that accompany private regulation may include the lack of accountability of non-state regulatory actors, and the possibility of their failure. There is also a risk that with the increasing salience of what Vogel calls "civil regulation," state regulatory institutions may atrophy, or fail to develop at all. 0 0
Beyond open source software: Framework and implications for open content research Chitu Okoli
Carillo K.D.A.
Creative Commons
Free cultural works
Libre software
Open content
Open knowledge
Open Source Software
ECIS 2013 - Proceedings of the 21st European Conference on Information Systems English The same open source philosophy that has been traditionally applied to software development can be applied to the collaborative creation of non-software information products, such as books, music and video. Such products are generically referred to as open content. Due largely to the success of large projects such as Wikipedia and the Creative Commons, open content has gained increasing attention not only in the popular media, but also in scholarly research. It is important to investigate the workings of the open source process in these new media of expression. This paper introduces the scope of emerging research on the open content phenomenon beyond open source software. We develop a framework for categorizing copyrightable works as utilitarian, factual, aesthetic or opinioned works. Based on these categories, we review some key theory-driven findings from open source software research and assess the applicability of extending their implications to open content. We present a research agenda that integrates the findings and proposes a list of research topics that can help lay a solid foundation for open content research. 0 0
Blogs, wikis and social networking sites: A cross institutional survey amongst Greek students Ponis S.T.
Gioti H.
Greek students
Social networking sites
Web 2.0
Int. J. Web Based Communities English In this paper, we attempt to explore the penetration of Web 2.0 technologies amongst Greek students, determine their level of usage and explore the students' opinions and perceptions regarding their usefulness for learning and educational purposes. In that context, we present the initial results of a survey-based cross-institutional study, conducted between September 15 and October 30, 2010, on a sample of five hundred undergraduate students from the two oldest university establishments in Greece. Our survey results reveal that social networking sites, despite being by far the most popular Web 2.0 service amongst Greek students, present the lowest perceived value with regard to the service's usefulness for educational and study support purposes. On the other hand, blogs and wikis, which are considered educationally more useful by students, present low percentages of systematic use and content contribution and even lower percentages of ownership. Following the initial descriptive analysis of our cross-institutional survey data presented in this paper, we are in the process of conducting a series of statistical tests for identifying significant correlations between variables and testing a set of prescribed research hypotheses. 0 0
BlueFinder: Recommending wikipedia links using DBpedia properties Torres D.
Hala Skaf-Molli
Pascal Molli
Diaz A.
Proceedings of the 3rd Annual ACM Web Science Conference, WebSci 2013 English The DBpedia knowledge base has been built from data extracted from Wikipedia. However, many existing relations among resources in DBpedia are missing as links among articles in Wikipedia. In some cases, adding these links to Wikipedia will enrich Wikipedia content and therefore enable better navigation. In previous work, we proposed the PIA algorithm, which predicts the best link to connect two articles in Wikipedia corresponding to those related by a semantic property in DBpedia while respecting the Wikipedia convention. PIA calculates this link as a path query. After PIA results were introduced into Wikipedia, most of them were accepted by the Wikipedia community. However, some were rejected because PIA predicts path queries that are too general. In this paper, we report the BlueFinder collaborative filtering algorithm, which fixes PIA's miscalculation. It is sensitive to the specificity of the resource types. Our experiments show that BlueFinder is a better solution than PIA because it solves more cases with a better recall. Copyright 2013 ACM. 0 0
Bookmark recommendation in social bookmarking services using Wikipedia Yoshida T.
Inoue U.
2013 IEEE/ACIS 12th International Conference on Computer and Information Science, ICIS 2013 - Proceedings English Social bookmarking systems allow users to attach freely chosen keywords as tags to bookmarks of web pages. These tags are used to recommend relevant bookmarks to other users. However, because of the diversity of tags, there is no guarantee that every user gets enough recommended bookmarks. In this paper, we propose a personalized recommender system using Wikipedia. Our system extends a tag set to find similar users and relevant bookmarks by using the Wikipedia category database. The experimental results show a significant increase in relevant bookmarks recommended without a notable increase in noise. 0 0
Boosting cross-lingual knowledge linking via concept annotation Zhe Wang
Jing-Woei Li
Tang J.
IJCAI International Joint Conference on Artificial Intelligence English Automatically discovering cross-lingual links (CLs) between wikis can largely enrich the cross-lingual knowledge and facilitate knowledge sharing across different languages. In most existing approaches for cross-lingual knowledge linking, the seed CLs and the inner link structures are two important factors for finding new CLs. When there are insufficient seed CLs and inner links, discovering new CLs becomes a challenging problem. In this paper, we propose an approach that boosts cross-lingual knowledge linking by concept annotation. Given a small number of seed CLs and inner links, our approach first enriches the inner links in wikis by using concept annotation method, and then predicts new CLs with a regression-based learning model. These two steps mutually reinforce each other, and are executed iteratively to find as many CLs as possible. Experimental results on the English and Chinese Wikipedia data show that the concept annotation can effectively improve the quantity and quality of predicted CLs. With 50,000 seed CLs and 30% of the original inner links in Wikipedia, our approach discovered 171,393 more CLs in four runs when using concept annotation. 0 0
Boot-strapping language identifiers for short colloquial postings Goldszmidt M.
Najork M.
Paparizos S.
Language Identification
Lecture Notes in Computer Science English There is tremendous interest in mining the abundant user generated content on the web. Many analysis techniques are language dependent and rely on accurate language identification as a building block. Even though there is already research on language identification, it has focused on very 'clean' editorially managed corpora, on a limited number of languages, and on relatively large-sized documents. These are not the characteristics of the content to be found in, say, Twitter or Facebook postings, which are short and riddled with vernacular. In this paper, we propose an automated, unsupervised, scalable solution based on publicly available data. To this end we thoroughly evaluate the use of Wikipedia to build language identifiers for a large number of languages (52) and a large corpus and conduct a large scale study of the best-known algorithms for automated language identification, quantifying how accuracy varies in correlation to document size, language (model) profile size and number of languages tested. Then, we show the value in using Wikipedia to train a language identifier directly applicable to Twitter. Finally, we augment the language models and customize them to Twitter by combining our Wikipedia models with location information from tweets. This method provides a massive amount of automatically labeled data that acts as a bootstrapping mechanism, which we empirically show boosts the accuracy of the models. With this work we provide a guide and a publicly available tool [1] to the mining community for language identification on web and social data. 0 0
Bounding Boundaries: The Construction of Geoengineering on Wikipedia Nils Markusson
Andreas Kaltenbrunner
David Laniado
Tommaso Venturini
Climate Geoengineering Governance Working Paper Series: 00 5 Definitions and classifications of geoengineering are fluid and contested. Wikipedia offers an opportunity to study how people negotiate and construct these definitions. Geoengineering related Wikipedia articles were identified in an overall data set of climate change related articles, with data on both article inter-linkage and the commenting activity of article editors. This enabled analysis of how geoengineering is constructed on Wikipedia, in itself and in relation to wider climate change discourse. The main finding is that a distinction is made on Wikipedia between two groups of geoengineering methods. On the one hand, there is a group of land-based sequestration technologies, strongly related to adaptation and mitigation discourse, and on the other hand a set of geoengineering technologies, including solar radiation management, ocean iron fertilisation, weather modification and planetary engineering, that is relatively separate from the overall climate change discourse on Wikipedia. 0 0
Building, maintaining, and using knowledge bases: A report from the trenches Deshpande O.
Lamba D.S.
Tourn M.
Sanmay Das
Subramaniam S.
Rajaraman A.
Harinarayan V.
Doan A.
Data integration
Human curation
Information extraction
Knowledge base
Social media
Proceedings of the ACM SIGMOD International Conference on Management of Data English A knowledge base (KB) contains a set of concepts, instances, and relationships. Over the past decade, numerous KBs have been built, and used to power a growing array of applications. Despite this flurry of activities, however, surprisingly little has been published about the end-to-end process of building, maintaining, and using such KBs in industry. In this paper we describe such a process. In particular, we describe how we build, update, and curate a large KB at Kosmix, a Bay Area startup, and later at WalmartLabs, a development and research lab of Walmart. We discuss how we use this KB to power a range of applications, including query understanding, Deep Web search, in-context advertising, event monitoring in social media, product search, social gifting, and social mining. Finally, we discuss how the KB team is organized, and the lessons learned. Our goal with this paper is to provide a real-world case study, and to contribute to the emerging direction of building, maintaining, and using knowledge bases for data management applications. 0 0
C Arsan T.
Sen R.
Ersoy B.
Devri K.K.
Lecture Notes in Electrical Engineering English In this paper, we design and implement a novel all-in-one Media Center that can be directly connected to a high-definition television (HDTV). C# programming is used to develop a modularly structured media center for home entertainment. It is therefore possible and easy to add an unlimited number of new modules and software components. Most importantly, the user interface is designed around two factors: simplicity and tidiness. The proposed media center lets users listen to music/radio, watch TV, connect to the Internet, watch online Internet videos, edit videos, look up the pharmacy on duty online, check weather conditions, view song lyrics, burn CDs/DVDs, and connect to Wikipedia. All the modules and design steps are explained in detail for a user-friendly, cost-effective all-in-one media center. 0 0
COLLEAP - COntextual Language LEArning Pipeline Wloka B.
Werner Winiwarter
Language learning
Natural Language Processing
Web crawling
Lecture Notes in Computer Science English In this paper we present a concept as well as a prototype of a tool pipeline to utilize the abundant information available on the World Wide Web for contextual, user-driven creation and display of language learning material. The approach is to capture Wikipedia articles of the user's choice by crawling, to analyze the linguistic aspects of the text via natural language processing and to compile the gathered information into a visually appealing presentation of enriched language information. The tool is designed to address the Japanese language, with a focus on kanji, the pictographic characters used in the Japanese script. 0 0
Can a Wiki be used as a knowledge service platform? Lin F.-R.
Wang C.-R.
Huang H.-Y.
Knowledge activity map
Knowledge service
Advances in Intelligent Systems and Computing English Many knowledge services have been developed as a matching platform for knowledge demanders and providers. However, most of these knowledge services have a common drawback that they cannot provide a list of experts corresponding to the knowledge demanders' need. Knowledge demanders have to post their questions in a public area and then wait patiently until corresponding knowledge providers appear. In order to facilitate knowledge demanders to acquire knowledge, this study proposes a knowledge service system based on Wikipedia to actively inform potential knowledge providers on behalf of knowledge demanders. This study also developed a knowledge activity map system used for the knowledge service system to identify Wikipedians' knowledge domains. The experimental evaluation results show that the knowledge service system is acceptable by leader users on Wikipedia, in which their domain knowledge can be identified and represented on their knowledge activity maps. 0 0
Can the Web turn into a digital library? Maurer H.
Mueller H.
Digital library
Information consolidation
WWW library
International Journal on Digital Libraries English There is no doubt that the enormous amounts of information on the WWW are influencing how we work, live, learn and think. However, information on the WWW is in general too chaotic and not reliable enough, and specific material is often too difficult to locate, for it to be considered a serious digital library. In this paper we concentrate on the question of how we can retrieve reliable information from the Web, a task that is fraught with problems, but essential if the WWW is supposed to be used as a serious digital library. It turns out that the use of search engines has many dangers. We will point out some of the possible ways in which those dangers can be reduced and dangerous traps can be avoided. Another approach to finding useful information on the Web is to use "classical" resources of information like specialized dictionaries, lexica or encyclopaedias in electronic form, such as the Britannica. Although it seemed for a while that such resources might more or less disappear from the Web due to attempts such as Wikipedia, some of the classical encyclopaedias and specialized offerings have picked up steam again and should not be ignored. They do sometimes suffer from what we will call the "wishy-washy" syndrome explained in this paper. It is interesting to note that Wikipedia, which is also larger than all other encyclopaedias (at least the English version), is less afflicted by this syndrome, yet has some other serious drawbacks. We discuss how those could be avoided and present a system that is halfway between prototype and production system that does take care of many of the aforementioned problems and hence may be a model for further undertakings in turning (part of) the Web into a useable digital library. 0 0
Capturing intra-operative safety information using surgical wikis Edwards M.
Agha R.
Coughlan J.
Intra-operative care
Surgical procedures
Informatics for Health and Social Care English Background: Expert surgeons use a mass of intra-operative information, as well as pre- and post-operative information, to complete operations safely. Trainees acquired this intra-operative knowledge at the operating table, now largely diminished by the working time directive. Wikis offer unexplored approaches to capturing and disseminating expert knowledge to further promote safer surgery for the trainee. Methods: Grafting an abdominal aortic aneurysm represents a potentially high-risk operation demanding extreme safety measures. Operative details were presented on a surgical wiki in the form of a script and content analysed to classify types of safety information. Results: The intra-operative part of the script contained 2,743 items of essential surgical information, comprising 21 sections, 405 steps and 2,317 items of back-up information; 155 (5.7%) of them were also specific intra-operative safety checks. Best case scenarios consisted of 1,077 items of intra-operative information, 69 of which were safety checks. Worst case and rare scenarios required a further 1,666 items of information, including 86 safety checks. Conclusions: Wikis are relevant to surgical practice specifically as a platform for knowledge sharing and optimising the available operating time of trainees, as a very large amount of minutely detailed information essential for a safe major operation can be captured. 0 0
Characterizing and curating conversation threads: Expansion, focus, volume, re-entry Backstrom L.
Kleinberg J.
Lena Lee
Cristian Danescu-Niculescu-Mizil
Comment threads
Feed ranking
On-line communities
Social network
User generated content
WSDM 2013 - Proceedings of the 6th ACM International Conference on Web Search and Data Mining English Discussion threads form a central part of the experience on many Web sites, including social networking sites such as Facebook and Google Plus and knowledge creation sites such as Wikipedia. To help users manage the challenge of allocating their attention among the discussions that are relevant to them, there has been a growing need for the algorithmic curation of on-line conversations, that is, the development of automated methods to select a subset of discussions to present to a user. Here we consider two key sub-problems inherent in conversational curation: length prediction (predicting the number of comments a discussion thread will receive) and the novel task of re-entry prediction (predicting whether a user who has participated in a thread will later contribute another comment to it). The first of these sub-problems arises in estimating how interesting a thread is, in the sense of generating a lot of conversation; the second can help determine whether users should be kept notified of the progress of a thread to which they have already contributed. We develop and evaluate a range of approaches for these tasks, based on an analysis of the network structure and arrival pattern among the participants, as well as a novel dichotomy in the structure of long threads. We find that for both tasks, learning-based approaches using these sources of information. 0 0
Chinese text filtering based on domain keywords extracted from Wikipedia Xiaolong Wang
Hua Li
Jia Y.
Jin S.
Text filtering
User profile
Lecture Notes in Electrical Engineering English Several machine learning and information retrieval algorithms have been used for text filtering. All these methods have a common ground that they need positive and negative examples to build user profile. However, not all applications can get good training documents. In this paper, we present a Wikipedia based method to build user profile without any other training documents. The proposed method extracts keywords of a special category from Wikipedia taxonomy and computes the weights of the extracted keywords based on Wikipedia pages. Experiment results on Chinese news text dataset SogouC show that the proposed method achieves good performance. 0 0
Classification of scientific publications according to library controlled vocabularies: A new concept matching-based approach Joorabchi A.
Mahdi A.E.
Automatic classification
Concept matching
Dewey Decimal Classification (DDC)
FAST subject headings
Information retrieval
Metadata generation
Scientific digital libraries and repositories
Subject indexing
Subject metadata
Library Hi Tech English Purpose: This paper aims to report on the design and development of a new approach for automatic classification and subject indexing of research documents in scientific digital libraries and repositories (DLR) according to library controlled vocabularies such as DDC and FAST. Design/methodology/approach: The proposed concept matching-based approach (CMA) detects key Wikipedia concepts occurring in a document and searches the OPACs of conventional libraries via querying the WorldCat database to retrieve a set of MARC records which share one or more of the detected key concepts. Then the semantic similarity of each retrieved MARC record to the document is measured and, using an inference algorithm, the DDC classes and FAST subjects of those MARC records which have the highest similarity to the document are assigned to it. Findings: The performance of the proposed method in terms of the accuracy of the DDC classes and FAST subjects automatically assigned to a set of research documents is evaluated using standard information retrieval measures of precision, recall, and F1. The authors demonstrate the superiority of the proposed approach in terms of accuracy performance in comparison to a similar system currently deployed in a large scale scientific search engine. Originality/value: The proposed approach enables the development of a new type of subject classification system for DLR, and addresses some of the problems similar systems suffer from, such as the problem of imbalanced training data encountered by machine learning-based systems, and the problem of word-sense ambiguity encountered by string matching-based systems. 0 0
Clustering editors of wikipedia by editor's biases Nakamura A.
Yu Suzuki
Ishikawa Y.
Edit histories
Peer reviews
Proceedings - 2013 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2013 English Wikipedia is an Internet encyclopedia where any user can edit articles. Because editors act on their own judgments, editors' biases are reflected in edit actions. When editors' biases are reflected in articles, the articles should have low credibility. However, it is difficult for users to judge which parts of articles are biased. In this paper, we propose a method of clustering editors by their biases, so that the biases of texts can be distinguished using the biases of their editors and users are aided in judging the credibility of each description. If each text is distinguished, for example by color, users can use this to judge the credibility of the text. Our system makes use of the relationships between editors: agreement and disagreement. We assume that editors leave texts written by editors that they agree with, and delete texts written by editors that they disagree with. In addition, we can consider that editors who agree with each other have similar biases, and editors who disagree with each other have different biases. Hence, the relationships between editors make it possible to classify editors by biases. In experimental evaluation, we verify that our proposed method is useful in clustering editors by biases. Additionally, we validate that considering the dependency between editors improves the clustering performance. 0 0
Co-constructing an essay: Collaborative writing in class and on wiki Ansarimoghaddam S.
Tan B.H.
Collaborative authoring
Face-to-face interaction
Individual essay
3L: Language, Linguistics, Literature English This paper compares the quality of students' individually written essays resulting from both collaborative writing through wiki and face-to-face collaborative writing. Face-to-face collaborative writing refers to in class meeting of students for writing essays collaboratively. The study employed a counterbalance research design. Participants of the study were thirty tertiary ESL students from one class. They were divided into two experiment groups with each comprising 15 students. Before the experiment, each participant wrote an essay. After that they were given two treatments of collaborative writing through wiki and face-to-face. The order of giving the two treatments was different for the two groups to eliminate any practice effect. After an introduction to the collaborative process, the participants wrote two argumentative essays in groups, and wrote two essays individually. After the experiment, a semi-structured interview was conducted as a triangulation measure. Results suggest that collaborative writing using the wiki software can be more effective, and more enjoyable, than collaboration resulting from face-to-face meeting. 0 0
Collaboration at the Troy University Libraries Boyd E.E.
Casey O.
Elder R.
Slay J.
Multi-campus library system
Online manual
Technical services
Web 2.0
Cataloging and Classification Quarterly English With relatively new staff in all the Troy University campus libraries technical services departments, it was critical to collaborate on policies and procedures for consistency. Developing an online manual housed on a wiki that could be used and contributed to by staff on all three campuses was essential to this goal. Multi-campus meetings and online discussions are additional methods we use to promote collaboration. This article will include a literature review of collaboration and wikis along with methods the Troy University Libraries Technical Services departments are using to establish communication across the campuses. 0 0
Collaborative development of a semantic wiki on forest management decision support Marques A.F.
Rosset C.
Rasinmaki J.
Vacik H.
Gordon S.
Nobre S.
Falcao A.
Weber D.
Michael Granitzer
Eriksson L.O.
Decision support
Forest management
Knowledge management
Semantic MediaWiki
Scandinavian Journal of Forest Research English Semantic wikis support collaboratively editing, categorising, interlinking and retrieving web pages for a group of experts working in a certain domain. The use of semantic technologies allows the expression of wiki content in a more structured way, which increases its potential use. This contribution presents an overview of the development process towards a semantic wiki related to a repository of forest decision support systems, including models, methods and data used, as well as case studies and lessons learned. An international group of experts took part in the conceptualisation of the semantic wiki (i.e. identification of wiki properties and forms), provided content and developed queries to analyse the information gathered. The resulting ForestDSS wiki gives an overview of the current use, development and application of forest decision support systems worldwide. Based on the experiences gathered during the process, some challenges are reported and conclusions on further developments are made. 0 0
Collaborative development of data curation profiles on a wiki platform: Experience from free and open source software projects and communities Sowe S.K.
Koji Zettsu
Cloud computing
Data curation
Data curation profiles
Floss communities
Open collaboration
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English Wiki technologies have proven to be versatile and successful in aiding collaborative authoring of web content. A multitude of users can collaboratively add, edit, and revise wiki pages on the fly, with ease. This functionality makes wikis ideal platforms to help research communities curate data. However, without appropriate customization and a model to support collaborative editing of pages, wikis will fall short in providing the functionalities needed to support collaborative work. In this paper, we present the architecture and design of a wiki platform, as well as a model that allows scientific communities, especially disaster response scientists, to collaboratively edit and append data to their wiki pages. Our experience in the implementation of the platform on MediaWiki demonstrates how wiki technologies can be used to support data curation, and how the dynamics of the FLOSS development process, its user and developer communities are increasingly informing our understanding about supporting collaboration and coordination on wikis. Copyright 2010 ACM. 0 0
Collecting interaction traces in distributed semantic wikis Le A.-H.
Lefevre M.
Cordier A.
Hala Skaf-Molli
Distributed semantic wikis
Interaction traces
Model of trace
Trace collection process
Trace-based reasoning
User assistance
ACM International Conference Proceeding Series English In the Kolow project, our general objective is to develop an assistance engine suitable for distributed applications. In order to provide contextualized and relevant assistance, we feed the assistance engine with interaction traces. Interaction traces record events occurring while users are interacting with applications. These traces become containers of valuable knowledge for providing assistance. Collecting interaction traces is a challenging issue that has been thoroughly studied in the context of local applications. In contrast, few approaches focus on collecting interaction traces in distributed applications. Yet, when applications are distributed, collecting interaction traces is even more challenging because new difficulties arise, such as data synchronization and multi-synchronous collaboration. In this paper, we propose a model and a tool for collecting traces in a distributed environment. The originality of the model is that it is tailored to fit distributed applications. We implemented the model in Collectra, a tool to collect interaction traces in distributed web applications. Collectra collects interaction traces and stores them in a dedicated trace-base management system. We report on the experiments we have conducted in order to evaluate the performance of Collectra (both response time and memory space). Results of the experiments show that Collectra performs well and that it can be used to support the assistance tasks carried out by the assistance engine. 0 0
Collective action towards enhanced knowledge management of neglected and underutilised species: Making use of internet opportunities Hermann M.
Kwek M.J.
Khoo T.K.
Amaya K.
Google Books
Wikimedia Commons
Acta Horticulturae English The disproportionate use of crops - with a few species accounting for most of global food production - is being reinforced by the considerable research, breeding and development efforts that make global crops so competitive vis-à-vis "neglected and underutilised species" (NUS). NUS promotional rhetoric, preaching to the converted, complaints about the discrimination of the "food of the poor" and the loss of traditional dietary habits are unlikely to reverse the neglect of the vast majority of crop species. We need to lessen the supply and demand constraints that affect the production and consumption of NUS. NUS attributes relevant to consumers, nutrition and climate change need to be substantiated, demand for NUS stimulated, discriminating agricultural and trade policies amended, and donors convinced to make greater investments in NUS research and development. Much fascinating NUS research and development is underway, but much of this is dissipated amongst countries, institutions and taxa. Researchers operate in unsupportive environments and are often unaware of each other's work. Their efforts remain unrecognised as addressing global concerns. We suggest that the much-needed enhancement of NUS knowledge management should be at the centre of collective efforts of the NUS community. This will underpin future research and development advances as well as inform the formulation and advocacy of policies. This paper recommends that the NUS community make greater use of Internet knowledge repositories to deposit research results, publications and images into the public domain. As examples for such a low-cost approach, we assess the usefulness of Wikipedia, Google Books and Wikimedia Commons for the documentation and dissemination of NUS knowledge. We urge donors and administrators to promote and encourage the use of these and other public and electronically accessible repositories as sources of verification for the achievement of project and research outputs. 0 0
Collective learning paradigm for rapidly evolving curriculum: Facilitating student and content engagement via social media Agarwal N.
Ahmed F.
Classroom learning
Collective learning
Social media
Team learning
19th Americas Conference on Information Systems, AMCIS 2013 - Hyperconnected World: Anything, Anywhere, Anytime English Curriculum in the information systems discipline has been rapidly evolving. This is not only challenging for the instructors to cope with the velocity of change in the curriculum, but also for the students. This paper illustrates a model that leverages the integrated use of social media technologies to facilitate collective learning in a university teaching/learning environment. However, the model could be adapted to other organizational environments. The model demonstrates how various challenges encountered in collective learning can be addressed with the help of social media technologies. A case study is presented to demonstrate the model's applicability, feasibility, utility, and success in a senior-level social computing course at the University of Arkansas at Little Rock. An evolving, non-linear, and self-sustaining wiki portal is developed to encourage engagement between the content, students, and instructor. We further outline the student-centric, content-centric, and learning-centric advantages of the proposed model for the next generation learning environment. 0 0
Combining lexical and semantic features for short text classification Yang L.
Chenliang Li
Ding Q.
Li L.
Feature selection
Short text
Topic model
Procedia Computer Science English In this paper, we propose a novel approach to classify short texts by combining both their lexical and semantic features. We present an improved measurement method for lexical feature selection and furthermore obtain the semantic features with the background knowledge repository which covers target category domains. The combination of lexical and semantic features is achieved by mapping words to topics with different weights. In this way, the dimensionality of feature space is reduced to the number of topics. We here use Wikipedia as background knowledge and employ Support Vector Machine (SVM) as classifier. The experiment results show that our approach has better effectiveness compared with existing methods for classifying short texts. 0 0
Communities, artifacts, interaction and contribution on the web Eleni Stroulia Computer-supported collaboration
Social network
Virtual worlds
Web-based collaborative platforms
Lecture Notes in Computer Science English Today, most of us are members of multiple online communities, in the context of which we engage in a multitude of personal and professional activities. These communities are supported by different web-based platforms and enable different types of collaborative interactions. Through our experience with the development of and experimentation with three different such platforms in support of collaborative communities, we recognized a few core research problems relevant across all such tools, and we developed SociQL, a language, and a corresponding software framework, to study them. 0 0
Community detection from signed networks Sugihara T.
Xiaojiang Liu
Murata T.
Community detection
Signed network
Transactions of the Japanese Society for Artificial Intelligence English Many real-world complex systems can be modeled as networks, and most of them exhibit community structures. Community detection from networks is one of the important topics in link mining. In order to evaluate the goodness of detected communities, Newman modularity is widely used. In the real world, however, many complex systems can be modeled as signed networks composed of positive and negative edges. Community detection from signed networks is not an easy task, because the conventional detection methods for normal networks cannot be applied directly. In this paper, we extend Newman modularity for signed networks. We also propose a method for optimizing our modularity, which is an efficient hierarchical agglomeration algorithm for detecting communities from signed networks. Our method enables us to detect communities from large-scale real-world signed networks which represent relationships between users on websites such as Wikipedia, Slashdot and Epinions. 0 0
Comparing expert and non-expert conceptualisations of the land: An analysis of crowdsourced land cover data Comber A.
Brunsdon C.
Linda See
Steffen Fritz
Ian McCallum
Geographically Weighted Kernel
Land Cover
Volunteered Geographical Information (VGI)
Lecture Notes in Computer Science English This research compares expert and non-expert conceptualisations of land cover data collected through a Google Earth web-based interface. In so doing it seeks to determine the impacts of varying landscape conceptualisations held by different groups of VGI contributors on decisions that may be made using crowdsourced data, in this case to select the best global land cover dataset in each location. Whilst much other work has considered the quality of VGI, as yet little research has considered the impact of varying semantics and conceptualisations on the use of VGI in formal scientific analyses. This study found that conceptualisation of cropland varies between experts and non-experts. A number of areas for further research are outlined. 0 0
Competitive Intelligence 2.0 Tools Deschamps C. Audio and video watch
Competitive Intelligence 2.0 Tools
Crowdsourcing and RSS
Micro-blogging services
Personalized portals and widgets
Social bookmarking services
Social network
Competitive Intelligence 2.0: Organization, Innovation and Territory English [No abstract available] 0 0
Complementary information for Wikipedia by comparing multilingual articles Fujiwara Y.
Yu Suzuki
Konishi Y.
Akiyo Nadamoto
Lecture Notes in Computer Science English Information of many articles is lacking in Wikipedia because users can create and edit the information freely. We specifically examined the multilinguality of Wikipedia and proposed a method to complement information of articles which lack information based on comparing different language articles that have similar contents. However, much non-complementary information is unrelated to a user's browsing article in the results. Herein, we propose improvement of the comparison area based on the classified complementary target. 0 0
Computing semantic relatedness from human navigational paths on wikipedia Singer P.
Niebler T.
Strohmaier M.
Hotho A.
Semantic relatedness
WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web English This paper presents a novel approach for computing semantic relatedness between concepts on Wikipedia by using human navigational paths for this task. Our results suggest that human navigational paths provide a viable source for calculating semantic relatedness between concepts on Wikipedia. We also show that we can improve accuracy by intelligent selection of path corpora based on path characteristics indicating that not all paths are equally useful. Our work makes an argument for expanding the existing arsenal of data sources for calculating semantic relatedness and to consider the utility of human navigational paths for this task. 0 0
Computing semantic relatedness using Wikipedia features Hadj Taieb M.A.
Ben Aouicha M.
Ben Hamadou A.
Semantic analysis
Semantic relatedness
Wikipedia category graph
Word relatedness
Knowledge-Based Systems English Measuring semantic relatedness is a critical task in many domains such as psychology, biology, linguistics, cognitive science and artificial intelligence. In this paper, we propose a novel system for computing semantic relatedness between words. Recent approaches have exploited Wikipedia as a huge semantic resource and have shown good performance. Therefore, we utilized the Wikipedia features (articles, categories, Wikipedia category graph and redirection) in a system combining this Wikipedia semantic information in its different components. The approach is preceded by a pre-processing step to provide, for each category pertaining to the Wikipedia category graph, a semantic description vector including the weights of stems extracted from articles assigned to the target category. Next, for each candidate word, we collect its set of categories using an algorithm for category extraction from the Wikipedia category graph. Then, we compute the semantic relatedness degree using existing vector similarity metrics (Dice, Overlap and Cosine) and a newly proposed metric that performed as well as the cosine formula. The basic system is followed by a set of modules that exploit Wikipedia features to quantify the semantic relatedness between words as well as possible. We evaluate our measure based on two tasks: comparison with human judgments using five datasets and a specific application, "solving choice problem". The resulting system performs well and sometimes outperforms the ESA (Explicit Semantic Analysis) and TSA (Temporal Semantic Analysis) approaches. © 2013 Elsevier B.V. All rights reserved. 0 0
Computing semantic relatedness using word frequency and layout information of wikipedia Chan P.
Hijikata Y.
Nishida S.
Layout information
Semantic relatedness
Wikipedia article
Word frequency
Proceedings of the ACM Symposium on Applied Computing English Computing the semantic relatedness between two words or phrases is an important problem for fields such as information retrieval and natural language processing. One state-of-the-art approach to solve the problem is Explicit Semantic Analysis (ESA). ESA uses the word frequency in Wikipedia articles to estimate the relevance, so the relevance of words with low frequency cannot always be well estimated. To improve the relevance estimate of the low frequency words, we use not only word frequency but also layout information in Wikipedia articles. Empirical evaluation shows that on the low frequency words, our method achieves better estimate of semantic relatedness over ESA. Copyright 2013 ACM. 0 0
Constructing a focused taxonomy from a document collection Olena Medelyan
Manion S.
Broekstra J.
Divoli A.
Huang A.-L.
Witten I.H.
Lecture Notes in Computer Science English We describe a new method for constructing custom taxonomies from document collections. It involves identifying relevant concepts and entities in text; linking them to knowledge sources like Wikipedia, DBpedia, Freebase, and any supplied taxonomies from related domains; disambiguating conflicting concept mappings; and selecting semantic relations that best group them hierarchically. An RDF model supports interoperability of these steps, and also provides a flexible way of including existing NLP tools and further knowledge sources. From 2000 news articles we construct a custom taxonomy with 10,000 concepts and 12,700 relations, similar in structure to manually created counterparts. Evaluation by 15 human judges shows the precision to be 89% and 90% for concepts and relations respectively; recall was 75% with respect to a manually generated taxonomy for the same domain. 0 0
Construction of a Japanese gazetteers for Japanese local toponym disambiguation Yoshioka M.
Fujiwara T.
Proceedings of the 7th Workshop on Geographic Information Retrieval, GIR 2013 English When processing toponym information in natural language text, it is crucial to have a good gazetteers. There are several well-organized gazetteers for English text, but they do not cover Japanese local toponyms. In this paper, we introduce a Japanese gazetteers based on Open Data (e.g., the Toponym database distributed by Japanese ministries, Wikipedia, and GeoNames) and propose a toponym disambiguation framework that uses the constructed gazetteers. We also evaluate our approach based on a blog corpus that contains place names with high ambiguity. 0 0
Consuming and creating: Early-adopting science teachers' perceptions and use of a wiki to support professional development Donnelly D.F.
Boniface S.
Knowledge building
Professional development
Teacher practice
Technology integration
Computers and Education English Many teachers have little opportunity to share and discuss their practice in the course of a normal school day beyond chance meetings in the staff room. Such a lack of opportunity can leave many teachers feeling isolated. However, online resources are continuously providing teachers with greater opportunities to engage with other teachers. This research studied early-adopting New Zealand science teachers' perceptions and integration of one such online resource, a wiki, for professional development. The wiki was developed to support teacher portfolios consisting of mediums called Content Representations (CoRes) and Pedagogical and Professional-experience Repertoires (PaP-eRs). Initial interviews were conducted with six teachers and were followed by case studies of three of these teachers. Data included pre/post interviews, field notes from feedback on observations, and teachers' use of the wiki. Findings discuss important factors organised around three themes in relation to teacher perceptions and engagement in knowledge sharing on a wiki: technology competence, technology utility, and technology resourcing. © 2013 Elsevier Ltd. All rights reserved. 0 0
Contributor profiles, their dynamics, and their importance in five Q&A sites Furtado A.
Andrade N.
Oliveira N.
Brasileiro F.
Datamining and machine learning
Empirical methods
Q&A sites
Studies of wikipedia/web
English Q&A sites currently enable large numbers of contributors to collectively build valuable knowledge bases. Naturally, these sites are the product of contributors acting in different ways - creating questions, answers or comments and voting in these -, contributing in diverse amounts, and creating content of varying quality. This paper advances present knowledge about Q&A sites using a multifaceted view of contributors that accounts for diversity of behavior, motivation and expertise to characterize their profiles in five sites. This characterization resulted in the definition of ten behavioral profiles that group users according to the quality and quantity of their contributions. Using these profiles, we find that the five sites have remarkably similar distributions of contributor profiles. We also conduct a longitudinal study of contributor profiles in one of the sites, identifying common profile transitions, and finding that although users change profiles with some frequency, the site composition is mostly stable over time. Copyright 2013 ACM. 0 0
Could someone please translate this? - Activity analysis of wikipedia article translation by non-experts Ari Hautasaari Activity analysis
English Wikipedia translation activities aim to improve the quality of the multilingual Wikipedia through article translation. We performed an activity analysis of the translation work done by individual English to Chinese non-expert translators, who translated linguistically complex Wikipedia articles in a laboratory setting. From the analysis, which was based on Activity Theory, and which examined both information search and translation activities, we derived three translation strategies that were used to inform the design of a support system for human translation activities in Wikipedia. Copyright 2013 ACM. 0 0
Crawling deep web entity pages He Y.
Xin D.
Ganti V.
Rajaraman S.
Shah N.
Deep-web crawl
Web data
WSDM 2013 - Proceedings of the 6th ACM International Conference on Web Search and Data Mining English Deep-web crawl is concerned with the problem of surfacing hidden content behind search interfaces on the Web. While many deep-web sites maintain document-oriented textual content (e.g., Wikipedia, PubMed, Twitter, etc.), which has traditionally been the focus of the deep-web literature, we observe that a significant portion of deep-web sites, including almost all online shopping sites, curate structured entities as opposed to text documents. Although crawling such entity-oriented content is clearly useful for a variety of purposes, existing crawling techniques optimized for document-oriented content are not best suited for entity-oriented sites. In this work, we describe a prototype system we have built that specializes in crawling entity-oriented deep-web sites. We propose techniques tailored to tackle important subproblems including query generation, empty page filtering and URL deduplication in the specific context of entity-oriented deep-web sites. These techniques are experimentally evaluated and shown to be effective. 0 0
Cross language prediction of vandalism on wikipedia using article views and revisions Tran K.-N.
Christen P.
Lecture Notes in Computer Science English Vandalism is a major issue on Wikipedia, accounting for about 2% (350,000+) of edits in the first 5 months of 2012. The majority of vandalism is caused by humans, who can leave traces of their malicious behaviour through access and edit logs. We propose detecting vandalism using a range of classifiers in a monolingual setting, and evaluate their performance when using them across languages on two data sets: the relatively unexplored hourly count of views of each Wikipedia article, and the commonly used edit history of articles. Within the same language (English and German), these classifiers achieve up to 87% precision, 87% recall, and F1-score of 87%. Applying these classifiers across languages achieves similarly high results of up to 83% precision, recall, and F1-score. These results show characteristic vandal traits can be learned from view and edit patterns, and models built in one language can be applied to other languages. 0 0
Cross lingual entity linking with bilingual topic model Zhang T.
Kang Liu
Jun Zhao
IJCAI International Joint Conference on Artificial Intelligence English Cross lingual entity linking means linking an entity mention in a background source document in one language with the corresponding real world entity in a knowledge base written in the other language. The key problem is to measure the similarity score between the context of the entity mention and the document of the candidate entity. This paper presents a general framework for doing cross lingual entity linking by leveraging a large scale and bilingual knowledge base, Wikipedia. We introduce a bilingual topic model that mines bilingual topics from this knowledge base with the assumption that the same Wikipedia concept documents of two different languages share the same semantic topic distribution. The extracted topics have two types of representation, with each type corresponding to one language. Thus both the context of the entity mention and the document of the candidate entity can be represented in a space using the same semantic topics. We use these topics to do cross lingual entity linking. Experimental results show that the proposed approach can obtain competitive results compared with the state-of-the-art approach. 0 0
Cross-media topic mining on wikipedia Xiaolong Wang
Yuanyuan Liu
Dingquan Wang
Fei Wu
Cross media
Topic modeling
MM 2013 - Proceedings of the 2013 ACM Multimedia Conference English As a collaborative wiki-based encyclopedia, Wikipedia provides a huge amount of articles of various categories. In addition to their text corpus, Wikipedia also contains plenty of images which makes the articles more intuitive for readers to understand. To better organize these visual and textual data, one promising area of research is to jointly model the embedding topics across multi-modal data (i.e., cross-media) from Wikipedia. In this work, we propose to learn the projection matrices that map the data from heterogeneous feature spaces into a unified latent topic space. Different from previous approaches, by imposing the ℓ1 regularizers to the projection matrices, only a small number of relevant visual/textual words are associated with each topic, which makes our model more interpretable and robust. Furthermore, the correlations of Wikipedia data in different modalities are explicitly considered in our model. The effectiveness of the proposed topic extraction algorithm is verified by several experiments conducted on real Wikipedia datasets. Copyright 0 0
Crowd-sourced open courseware authoring with slidewiki.org Sören Auer
Khalili A.
Tarasowa D.
International Journal of Emerging Technologies in Learning English While many Learning Content Management Systems are available, the collaborative, community-based creation of rich e-learning content is still not sufficiently well supported. Few attempts have been made to apply crowd-sourcing and wiki-approaches for the creation of elearning content. In this article, we showcase SlideWiki - an Open Courseware Authoring platform supporting the crowdsourced creation of richly structured learning content. 0 0
DFT-extractor: A system to extract domain-specific faceted taxonomies from wikipedia Wei B.
Liu J.
Jun Ma
Zheng Q.
Weinan Zhang
Feng B.
Faceted taxonomy
Network motif
WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web English Extracting faceted taxonomies from the Web has received increasing attention in recent years from the web mining community. We demonstrate in this study a novel system called DFT-Extractor, which automatically constructs domain-specific faceted taxonomies from Wikipedia in three steps: 1) It crawls domain terms from Wikipedia by using a modified topical crawler. 2) Then it exploits a classification model to extract hyponym relations with the use of motif-based features. 3) Finally, it constructs a faceted taxonomy by applying a community detection algorithm and a group of heuristic rules. DFT-Extractor also provides a graphical user interface to visualize the learned hyponym relations and the tree structure of taxonomies. 0 0
Damage Detection and Mitigation in Open Collaboration Applications Andrew G. West University of Pennsylvania English Collaborative functionality is changing the way information is amassed, refined, and disseminated in online environments. A subclass of these systems characterized by "open collaboration" uniquely allow participants to *modify* content with low barriers-to-entry. A prominent example and our case study, English Wikipedia, exemplifies the vulnerabilities: 7%+ of its edits are blatantly unconstructive. Our measurement studies show this damage manifests in novel socio-technical forms, limiting the effectiveness of computational detection strategies from related domains. In turn this has made much mitigation the responsibility of a poorly organized and ill-routed human workforce. We aim to improve all facets of this incident response workflow.

Complementing language based solutions we first develop content agnostic predictors of damage. We implicitly glean reputations for system entities and overcome sparse behavioral histories with a spatial reputation model combining evidence from multiple granularity. We also identify simple yet indicative metadata features that capture participatory dynamics and content maturation. When brought to bear over damage corpora our contributions: (1) advance benchmarks over a broad set of security issues ("vandalism"), (2) perform well in the first anti-spam specific approach, and (3) demonstrate their portability over diverse open collaboration use cases.

Probabilities generated by our classifiers can also intelligently route human assets using prioritization schemes optimized for capture rate or impact minimization. Organizational primitives are introduced that improve workforce efficiency. The whole of these strategies are then implemented into a tool ("STiki") that has been used to revert 350,000+ damaging instances from Wikipedia. These uses are analyzed to learn about human aspects of the edit review process, properties including scalability, motivation, and latency. Finally, we conclude by measuring practical impacts of work, discussing how to better integrate our solutions, and revealing outstanding vulnerabilities that speak to research challenges for open collaboration security. 0 0
Decentering Design: Wikipedia and Indigenous Knowledge Maja van der Velden International Journal of Human-Computer Interaction English This article is a reflection on the case of Wikipedia, the largest online reference site with 23 million articles, with 365 million readers, and without a page called Indigenous knowledge. A Postcolonial Computing lens, extended with the notion of decentering, is used to find out what happened with Indigenous knowledge in Wikipedia. Wikipedia's ordering technologies, such as policies and templates, play a central role in producing knowledge. Two designs, developed with and for Indigenous communities, are introduced to explore whether another design for Wikipedia is possible. 0 0
Defining, Understanding, and Supporting Open Collaboration: Lessons From the Literature Andrea Forte
Cliff Lampe
Open collaboration
Open source software
Peer production
American Behavioral Scientist English In this short introductory piece, we define open collaboration and contextualize the diverse articles in this special issue in a common vocabulary and history. We provide a definition of open collaboration and situate the phenomenon within an interrelated set of scholarly and ideological movements. We then examine the properties of open collaboration systems that have given rise to research and review major areas of scholarship. We close with a summary of consistent findings in open collaboration research to date. 0 0
Demonstration of a Loosely Coupled M2M System Using Arduino, Android and Wiki Software Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Sensor network
Social network
Message oriented middleware
The 38th IEEE Conference on Local Computer Networks (LCN) English A Machine-to-Machine (M2M) system, in which terminals are loosely coupled with Wiki software, is proposed. This system acquires sensor data from remote terminals, processes the data by remote terminals and controls actuators at remote terminals according to the processed data. The data is passed between terminals using wiki pages. Each terminal consists of an Android terminal and an Arduino board. The mobile terminal can be controlled by a series of commands which is written on a wiki page. The mobile terminal has a data processor and the series of commands may have a program which controls the processor. The mobile terminal can read data from not only the sensors of the terminal but also wiki pages on the Internet. The input data may be processed by the data processor of the terminal. The processed data may be sent to a wiki page. The mobile terminal can control the actuators of the terminal by reading commands on the wiki page or by running the program on the wiki page. This system realizes an open communication forum for not only people but also for machines. 8 0
Design and implementation of wiki content transformations and refactorings Hannes Dohrn
Dirk Riehle
Wiki markup
Wiki object model
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English The organic growth of wikis requires constant attention by contributors who are willing to patrol the wiki and improve its content structure. However, most wikis still only offer textual editing and even wikis which offer WYSIWYG editing do not assist the user in restructuring the wiki. Therefore, "gardening" a wiki is a tedious and error-prone task. One of the main obstacles to assisted restructuring of wikis is the underlying content model which prohibits automatic transformations of the content. Most wikis use either a purely textual representation of content or rely on the representational HTML format. To allow rigorous definitions of transformations we use and extend a Wiki Object Model. With the Wiki Object Model installed we present a catalog of transformations and refactorings that helps users to easily and consistently evolve the content and structure of a wiki. Furthermore we propose XSLT as language for transformation specification and provide working examples of selected transformations to demonstrate that the Wiki Object Model and the transformation framework are well designed. We believe that our contribution significantly simplifies wiki "gardening" by introducing the means of effortless restructuring of articles and groups of articles. It furthermore provides an easily extensible foundation for wiki content transformations. Categories and Subject Descriptors H.4 [Information Systems]: Information Systems Applications; I.7 [Computing Methodologies]: Document and Text Processing; D.2 [Software]: Software Engineering General Terms Design, Languages. Copyright 2010 ACM. 0 0
Designing a chat-bot that simulates an historical figure Haller E.
Rebedea T.
Conversational Agent
Information extraction
Proceedings - 19th International Conference on Control Systems and Computer Science, CSCS 2013 English There are many applications that incorporate a human appearance and intend to simulate human dialog, but in most cases the knowledge of the conversational bot is stored in a database created by human experts. However, very few studies have investigated the idea of creating a chat-bot with an artificial character and personality starting from web pages or plain text about a certain person. This paper describes an approach to the idea of identifying the most important facts in texts describing the life (including the personality) of an historical figure for building a conversational agent that could be used in middle-school CSCL scenarios. 0 0
Detecting collaboration from behavior Bauer T.
Garcia D.
Colbaugh R.
Glass K.
IEEE ISI 2013 - 2013 IEEE International Conference on Intelligence and Security Informatics: Big Data, Emergent Threats, and Decision-Making in Security Informatics English This paper describes a method for inferring when a person might be coordinating with others based on their behavior. We show that, in Wikipedia, editing behavior is more random when coordinating with others. We analyzed this using both entropy and conditional entropy. These algorithms rely only on timestamped events associated with entities, making them broadly applicable to other domains. In this paper, we will discuss previous research on this topic, how we adapted that research to the problem of Wikipedia edit behavior, describe how we extended it, and discuss our results. 0 0
Detecting controversy on the web Dori-Hacohen S.
Allan J.
Controversy detection
Critical literacy
Sentiment analysis
International Conference on Information and Knowledge Management, Proceedings English A useful feature to facilitate critical literacy would alert users when they are reading a controversial web page. This requires solving a binary classification problem: does a given web page discuss a controversial topic? We explore the feasibility of solving the problem by treating it as supervised k-nearest-neighbor classification. Our approach (1) maps a webpage to a set of neighboring Wikipedia articles which were labeled on a controversiality metric; (2) coalesces those labels into an estimate of the webpage's controversiality; and finally (3) converts the estimate to a binary value using a threshold. We demonstrate the applicability of our approach by validating it on a set of webpages drawn from seed queries. We show absolute gains of 22% in F0.5 on our test set over a sentiment-based approach, highlighting that detecting controversy is more complex than simply detecting opinions. Copyright is held by the owner/author(s). 0 0
Detection of article qualities in the chinese wikipedia based on c4.5 decision tree Xiao K.
Li B.
He P.
Yang X.-H.
Application of supervised learning
Article quality
Data mining
Decision tree
Lecture Notes in Computer Science English The number of articles in Wikipedia is growing rapidly. It is important for Wikipedia to provide users with high quality and reliable articles. However, the quality assessment metrics provided by Wikipedia are inefficient, and other mainstream quality detection methods focus only on the quality of English Wikipedia articles and usually analyze the text content of articles, which is also a time-consuming process. In this paper, we propose a method for detecting the article qualities of the Chinese Wikipedia based on the C4.5 decision tree. The problem of quality detection is transformed into a classification problem of high-quality and low-quality articles. By using the fields from the tables in the Chinese Wikipedia database, we built the decision trees to distinguish high-quality articles from low-quality ones. 0 0
Determinants of collective intelligence quality: Comparison between Wiki and Q&A services in English and Korean users Joo J.
Normatov I.
Collective intelligence
Collective intelligence quality
Q&A service
Wiki service
Service Business English Although web-enabled collective intelligence (CI) plays a critical role in organizational innovation and collaboration, the dubious quality of CI is still a substantial problem faced by many CI services. Thus, it is important to identify determinants of CI quality and to analyze the relationship between CI quality and its usefulness. One of the most successful services of web-enabled CI is Wikipedia, accessible all over the world. Another type of CI service is Naver KnowledgeiN, a typical and popular CI site offering question and answer (Q&A) services in Korea. Wikipedia is a multilingual and web-based encyclopedia. Thus, it is necessary to study the influence relationships among CI quality, its determinants, and CI usefulness according to different CI types and languages. In this paper, we propose a new research model reflecting multi-dimensional factors related to CI quality from the user's perspective. To test a total of 15 hypotheses drawn from the research model, a total of 691 responses were collected from Wikipedia and KnowledgeiN users in South Korea and the US. Expertise of contributors, community size, and diversity of contributors were identified as determinants of perceived CI quality. Perceived CI quality significantly influenced perceived CI usefulness from the user's perspective. CI type and language partially play the role of moderators. The expertise of contributors plays a more important role in CI quality in the case of Q&A services such as KnowledgeiN compared to Wiki services such as Wikipedia. This implies that a Q&A service requires more expertise and experience in particular areas than a Wiki service does to improve service quality. The relationship between community size and perceived CI quality differed according to CI type. The community size has a greater effect on CI quality in the case of a Wiki service than in that of a Q&A service. The number of contributors in Wikipedia is important because Wiki is an encyclopedia service which is edited and revised repeatedly by many contributors, while the answer given in Naver KnowledgeiN cannot be edited by others. Finally, CI quality has a greater effect on its usefulness in the case of a Wiki service than of a Q&A service. In this paper, we suggested implications for practitioners and theorists. 0 0
Determining leadership in contentious discussions Jain S.
Hovy E.
Contentious discussion
Discussion leader discovery
Discussion participant role
Natural Language Processing
Social multimedia
Electronic Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2013 English Participants in online decision making environments assume different roles. Especially in contentious discussions, the outcome often depends critically on the discussion leader(s). Recent work on automated leadership analysis has focused on collaborations where all the participants have the same goal. In this paper we focus on contentious discussions, in which the participants have different goals based on their opinion, which makes the notion of leader very different. We analyze discussions on the Wikipedia Articles for Deletion (AfD) forum. We define two complementary models, Content Leader and SilentOut Leader. The models quantify the basic leadership qualities of participants and assign leadership points to them. We compare the correlation between the leaders' rank produced by the two models using the Spearman Coefficient. We also propose a method to verify the quality of the leaders identified by each model. 0 0
Determining relation semantics by mapping relation phrases to knowledge base Liu F.
Yuanyuan Liu
Guangyou Zhou
Kang Liu
Jun Zhao
Open Information Extraction
Relation Mapping
Wikipedia Infobox
Proceedings - 2nd IAPR Asian Conference on Pattern Recognition, ACPR 2013 English 0 0
Development and evaluation of an ensemble resource linking medications to their indications Wei W.-Q.
Cronin R.M.
Xu H.
Lasko T.A.
Bastarache L.
Denny J.C.
Journal of the American Medical Informatics Association English Objective: To create a computable MEDication Indication resource (MEDI) to support primary and secondary use of electronic medical records (EMRs). Materials and methods: We processed four public medication resources, RxNorm, Side Effect Resource (SIDER) 2, MedlinePlus, and Wikipedia, to create MEDI. We applied natural language processing and ontology relationships to extract indications for prescribable, single-ingredient medication concepts and all ingredient concepts as defined by RxNorm. Indications were coded as Unified Medical Language System (UMLS) concepts and International Classification of Diseases, 9th edition (ICD9) codes. A total of 689 extracted indications were randomly selected for manual review for accuracy using dual-physician review. We identified a subset of medication-indication pairs that optimizes recall while maintaining high precision. Results: MEDI contains 3112 medications and 63 343 medication-indication pairs. Wikipedia was the largest resource, with 2608 medications and 34 911 pairs. For each resource, estimated precision and recall, respectively, were 94% and 20% for RxNorm, 75% and 33% for MedlinePlus, 67% and 31% for SIDER 2, and 56% and 51% for Wikipedia. The MEDI high-precision subset (MEDI-HPS) includes indications found within either RxNorm or at least two of the three other resources. MEDI-HPS contains 13 304 unique indication pairs regarding 2136 medications. The mean±SD number of indications for each medication in MEDI-HPS is 6.22±6.09. The estimated precision of MEDI-HPS is 92%. Conclusions: MEDI is a publicly available, computable resource that links medications with their indications as represented by concepts and billing codes. MEDI may benefit clinical EMR applications and reuse of EMR data for research. 0 0
Development and evaluation of wiki collaboration space for e-Learning Esichaikul V.
Aung W.M.
Bechter C.
Rehman M.
Collaboration space
Collaborative learning
Collaborative tools
Web technologies
Journal of Enterprise Information Management English Purpose: The purpose of this paper is to define standard guidelines for the development of a wiki collaboration space for e-Learning, in order to provide collaborative activities among students, and between instructors and students. Design/methodology/approach: The general requirements and extended features of wiki collaboration space were determined by conducting a requirement study and discussion with major stakeholders, i.e. students and tutors. Then, the wiki collaboration space was developed based on an open source wiki system. Finally, a wiki collaboration space was evaluated in terms of usability and collaboration effectiveness. Findings: A comparison was performed between the wiki collaboration space and the original wiki in students' works in an online course. The results showed that the effectiveness of collaboration and usefulness of wiki collaboration space were higher than original wiki in collaborative assignment. Practical implications: As for practical implications, e-Learning developers/managers can use the outcome of this study as a guideline to integrate wiki and/or other social software to supplement e-Learning systems for better collaboration. Originality/value: There is a need to define standard guidelines that provide the necessary features for wiki in e-Learning. In this study, extended features of wiki as collaborative learning tool were identified and evaluated to meet the needs of students in e-Learning environment. 0 0
Digital histories for the digital age: Collaborative writing in large lecture courses Soh L.-K.
Nobel Khandaker
Thomas W.G.
Digital History
Digital Humanities
Proceedings of the International Conference e-Learning 2013 English The digital environment has had an immense effect on American society, learning, and education: we have more sources available at our fingertips than any previous generation. Teaching and learning with these new sources, however, has been a challenging transition. Students are confronted with an ocean of digital objects and need skills to navigate the World Wide Web and numerous proprietary databases. Writing and disciplinary habits of mind are more important than ever in this environment, so how do we teach these in the digital age? This paper examines the current digital environment that humanities faculty face in their teaching and explores new tools that might support collaborative writing and digital skills development for students. In particular, this paper considers the effectiveness of a specially configured multi-agent wiki system for writing in a large lecture humanities course and explores the results of its deployment over two years. 0 0
Digital services in immersive urban virtual environments Meira C.
Freitas J.
Barbosa L.
Melo M.
Bessa M.
Magalhaes L.
Digital Services
Immersive Virtual Environments
Iberian Conference on Information Systems and Technologies, CISTI Portuguese Virtual Environments (VE) systems may provide a new way to deliver information and services in many areas, for example in tourism, urban planning and education. In urban VE there is a close link between the virtual environment and the urban environment that it is intended to represent. These VE can be an intuitive way to access a set of services with a direct association to the real object or entity to which they are related. In this article, we describe a case study that aimed at exploring the possibility of using new interfaces to exploit and use services in urban VE with a greater sense of immersiveness. The results indicate that the VE interfaces are a natural and intuitive means of access to digital services. While users felt greater difficulty in performing some of the tasks in the immersive scenario, the majority considered that this scenario provided a greater sense of immersion and realism. 0 0
Disambiguation to Wikipedia: A language and domain independent approach Nguyen T.-V.T. Lecture Notes in Computer Science English Disambiguation to Wikipedia (D2W) is the task of linking mentions of concepts in text to their corresponding Wikipedia articles. Traditional approaches to D2W have focused either on only one language (e.g. English) or on formal texts (e.g. news articles). In this paper, we present a multilingual framework with a set of new features that can be obtained purely from the online encyclopedia, without the need of any natural language specific tool. We analyze these features with different languages and different domains. The approach is shown to be fully language-independent and has been applied successfully to English, Italian, and Polish, with a consistent improvement. We show that only a sufficient number of Wikipedia articles is needed for training. When trained on real-world data sets for English, our new features yield substantial improvement compared to current local and global disambiguation algorithms. Finally, the adaption to the Bridgeman query logs in digital libraries shows the robustness of our approach even in the lack of disambiguation context. Also, as no natural language specific tool is needed, the method can be applied to other languages in a similar manner with little adaptation. 0 0
Discovering details and scene structure with hierarchical iconoid shift Weyand T.
Leibe B.
Hierarchical clustering
Image clustering
Medoid shift
Scale space
Semantic labelling
Proceedings of the IEEE International Conference on Computer Vision English Current landmark recognition engines are typically aimed at recognizing building-scale landmarks, but miss interesting details like portals, statues or windows. This is because they use a flat clustering that summarizes all photos of a building facade in one cluster. We propose Hierarchical Iconoid Shift, a novel landmark clustering algorithm capable of discovering such details. Instead of just a collection of clusters, the output of HIS is a set of dendrograms describing the detail hierarchy of a landmark. HIS is based on the novel Hierarchical Medoid Shift clustering algorithm that performs a continuous mode search over the complete scale space. HMS is completely parameter-free, has the same complexity as Medoid Shift and is easy to parallelize. We evaluate HIS on 800k images of 34 landmarks and show that it can extract an often surprising amount of detail and structure that can be applied, e.g., to provide a mobile user with more detailed information on a landmark or even to extend the landmark's Wikipedia article. 0 0
Discovering missing semantic relations between entities in Wikipedia Xu M.
Zhe Wang
Bie R.
Jing-Woei Li
Zheng C.
Ke W.
Zhou M.
Linked data
Lecture Notes in Computer Science English Wikipedia's infoboxes contain rich structured information of various entities, which have been explored by the DBpedia project to generate large scale Linked Data sets. Among all the infobox attributes, those attributes having hyperlinks in their values identify semantic relations between entities, which are important for creating RDF links between DBpedia's instances. However, quite a few hyperlinks have not been annotated by editors in infoboxes, which causes many relations between entities to be missing in Wikipedia. In this paper, we propose an approach for automatically discovering the missing entity links in Wikipedia's infoboxes, so that the missing semantic relations between entities can be established. Our approach first identifies entity mentions in the given infoboxes, and then computes several features to estimate the possibilities that a given attribute value might link to a candidate entity. A learning model is used to obtain the weights of different features, and predict the destination entity for each attribute value. We evaluated our approach on the English Wikipedia data; the experimental results show that our approach can effectively find the missing relations between entities, and it significantly outperforms the baseline methods in terms of both precision and recall. 0 0
Discovering stakeholders' interests in Wiki-based architectural documentation Nicoletti M.
Diaz-Pace J.A.
Schiaffino S.
Architectural documentation
Software architecture
Text mining
User profiling
CIbSE 2013: 16th Ibero-American Conference on Software Engineering - Memorias de la 16th Conferencia Iberoamericana de Ingenieria de Software, CIbSE 2013 English The Software Architecture Document (SAD) is an important artifact in the early stages of software development, as it serves to share and discuss key design and quality-attribute concerns among the stakeholders of the project. Nowadays, architectural documentation is commonly hosted in Wikis in order to favor communication and interactions among stakeholders. However, the SAD is still a large and complex document, in which stakeholders often have difficulties in finding information that is relevant to their interests or daily tasks. We argue that the discovery of stakeholders' interests is helpful to tackle this information overload problem, because a recommendation tool can leverage those interests to provide each stakeholder with SAD sections that match his/her profile. In this work, we propose an approach to infer stakeholders' interests, based on applying a combination of Natural Language Processing and User Profiling techniques. The interests are partially inferred by monitoring the stakeholders' behavior as they browse a Wiki-based SAD. A preliminary evaluation of our approach has shown its potential for making recommendations to stakeholders with different profiles and supporting them in architectural tasks. 0 0
Discovering unexpected information on the basis of popularity/unpopularity analysis of coordinate objects and their relationships Tsukuda K.
Hiroaki Ohshima
Michihiro Yamamoto
Hirotoshi Iwasaki
Katsumi Tanaka
Coordinate term
Unexpected information
Proceedings of the ACM Symposium on Applied Computing English Although many studies have addressed the problem of finding Web pages seeking relevant and popular information from a query, very few have focused on the discovery of unexpected information. This paper provides and evaluates methods for discovering unexpected information for a keyword query. For example, if the user inputs "Michael Jackson," our system first discovers the unexpected related term "karate" and then returns the unexpected information "Michael Jackson is good at karate." Discovering unexpected information is useful in many situations. For example, when a user is browsing a news article on the Web, unexpected information about a person associated with the article can pique the user's interest. If a user is sightseeing or driving, providing unexpected, additional information about a building or the region is also useful. Our approach collects terms related to a keyword query and evaluates the degree of unexpectedness of each related term for the query on the basis of (i) the relationships of coordinate terms of both the keyword query and related terms, and (ii) the degree of popularity of each related term. Experimental results show that considering these two factors is effective for discovering unexpected information. Copyright 2013 ACM. 0 0
Discussing the factors contributing to students' involvement in an EFL collaborative wiki project Lee H.-C.
Wang P.-L.
Higher education
Language learning
Peer collaboration
ReCALL English A growing number of researchers have acknowledged the potential for using wikis in online collaborative language learning. While researchers appreciate the wiki platform for engaging students in virtual team work and authentic language learning, many also have recognized the limitations of using wikis to promote student collaboration (Alyousef & Picard, 2011; Arnold, Ducate & Kost, 2009; Coniam & Kit, 2008; Judd, Kennedy & Cropper, 2010; Warschauer, 2010). The current study aims to examine what factors facilitated or hindered student collaboration when a wiki environment was used to engage 103 Taiwanese students from two universities in an online picture book production project. Divided into 17 groups of four to six members, the students spent approximately one academic year forming online communities, learning to conduct peer editing, and collaboratively completing a final learning product, an online picture book. A variety of data, including the electronically archived versions of the wiki pages, students' responses to retrospective surveys, and focused follow-up interviews were collected and analysed. The findings suggested that the nature of the learning tasks, students' constant communication and appreciation of different opinions, the difficulties they encountered when communicating asynchronously, and students' expectations toward English learning affected to what extent they were involved in the online collaboration. 0 0
Distant supervision learning of DBPedia relations Zajac M.
Przepiorkowski A.
Distant supervision learning
Information extraction
Ontology construction
Semantic web
Lecture Notes in Computer Science English This paper presents DBPediaExtender, an information extraction system that aims at extending an existing ontology of geographical entities by extracting information from text. The system uses distant supervision learning - the training data is constructed on the basis of matches between values from infoboxes (taken from the Polish DBPedia) and Wikipedia articles. For every relevant relation, a sentence classifier and a value extractor are trained; the sentence classifier selects sentences expressing a given relation and the value extractor extracts values from selected sentences. The results of manual evaluation for several selected relations are reported. 0 0
Diversifying Query Suggestions by using Topics from Wikipedia Hu H.
Maoyuan Zhang
He Z.
Pu Wang
Weiping Wang
Query suggestion diversification
Proceedings - 2013 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2013 English Diversifying query suggestions has emerged recently, by which the recommended queries can be both relevant and diverse. Most existing works diversify suggestions by query log analysis; however, for structured data, not all query logs are available. To this end, this paper studies the problem of suggesting diverse query terms by using topics from Wikipedia. Wikipedia is a successful online encyclopedia, and has high coverage of entities and concepts. We first obtain all relevant topics from Wikipedia, and then map each term to these topics. As the mapping is a nontrivial task, we leverage information from both Wikipedia and structured data to semantically map each term to topics. Finally, we propose a fast algorithm to efficiently generate the suggestions. Extensive evaluations are conducted on a real dataset, and our approach yields promising results. 0 0
Document analytics through entity resolution Santos J.
Martins B.
Batista D.S.
Entity Resolution
Information extraction
Text Mining
Lecture Notes in Computer Science English We present a prototype system for resolving named entities, mentioned in textual documents, into the corresponding Wikipedia entities. This prototype can aid in document analysis, by using the disambiguated references to provide useful information in context. 0 0
Document listing on versioned documents Claude F.
Munro J.I.
Lecture Notes in Computer Science English Representing versioned documents, such as Wikipedia history, web archives, genome databases, and backups, is challenging when we want to support searching for an exact substring and retrieve the documents that contain the substring. This problem is called document listing. We present an index for the document listing problem on versioned documents. Our index is the first one based on grammar-compression. This allows for good results on repetitive collections, whereas standard techniques cannot achieve competitive space for solving the same problem. Our index can also be adapted to work in a more standard way, allowing users to search for word-based phrase queries and conjunctive queries at the same time. Finally, we discuss extensions that may be possible in the future, for example, supporting ranking capabilities within the index itself. 0 0
Documenting software using adaptive software artifacts Correia F.F. Documentation
SPLASH 2013 - Proceedings of the 2013 Companion Publication for Conference on Systems, Programming, and Applications: Software for Humanity English Creating and using software documentation presents numerous challenges, namely in what concerns the expression of knowledge structures, consistency maintenance and classification. Adaptive Software Artifacts is a flexible approach to expressing structured contents that tackles these concerns, and that is being realized in the context of a Software Forge. Copyright © 2013 by the Association for Computing Machinery, Inc. (ACM). 0 0
Does formal authority still matter in the age of wisdom of crowds: Perceived credibility, peer and professor endorsement in relation to college students' wikipedia use for academic purposes Sook Lim Academic use
Peer endorsement
Proceedings of the ASIST Annual Meeting English This study explores whether or not formal authority still matters for college students using Wikipedia by examining the variables of individual perceived credibility, peer endorsement and professor endorsement in relation to students' academic use of Wikipedia. A web survey was used to collect data in fall 2011. A total of 142 students participated in the study, of which a total of 123 surveys were useable for this study. The findings show that the more professors approved of Wikipedia, the more students used it for academic purposes. In addition, the more students perceived Wikipedia as credible, the more they used it for academic purposes. The results indicate that formal authority still influences students' use of user-generated content (UGC) in their formal domain, academic work. The results can be applicable to other UGC, which calls attention to educators' active intervention to appropriate academic use of UGC. Professors' guidelines for UGC would benefit students. 0 0
Dynamic information retrieval using, constructing concepts maps with SW principles Nalini T. Concept maps Visual notation
Dynamic information retrieval
Middle - East Journal of Scientific Research English Concept maps are a straightforward way to remember a topic; the visual image is the major part that is focused on here. This paper attempts to demonstrate the concept map of a Wikipedia page. The highlight of the work is the design of an algorithm that retrieves the information dynamically from the Wikipedia page, and concept maps are drawn by considering the principles of visual notations in software engineering. The system is implemented in such a way that, on a mobile device with a small screen on which a lot of content cannot be read, the content can be viewed as a concept map; the sub-topics of the content are shown as its branches, and these branches can also be developed into a new concept map for that particular word as per the user's need. 0 0
E-learning and the Quality of Knowledge in a Globalized World Van De Bunt-Kokhuis S. Blended learning
Digital competence
Distance education
Flexible learning
Information and communication technologies (ICT)
Learning management systems (LMS)
Social network
Target groups of learners
Teaching and learning
Teaching and learning process
Virtual learning spaces
Virtual Open Initiatives and Resources project (AVOIR)
Distance and E-Learning in Transition: Learning Innovation, Technology and Social Challenges English [No abstract available] 0 0
Early Prediction of Movie Box Office Success Based on Wikipedia Activity Big Data Márton Mestyán
Taha Yasseri
János Kertész
PLoS ONE English Use of socially generated "big data" to access information about collective states of the minds in human societies has become a new paradigm in the emerging field of computational social science. A natural application of this would be the prediction of the society's reaction to a new product in the sense of popularity and adoption rate. However, bridging the gap between "real time monitoring" and "early predicting" remains a big challenge. Here we report on an endeavor to build a minimalistic predictive model for the financial success of movies based on collective activity data of online users. We show that the popularity of a movie can be predicted much before its release by measuring and analyzing the activity level of editors and viewers of the corresponding entry to the movie in Wikipedia, the well-known online encyclopedia. 0 0
Education in Health Research Methodology: Use of a Wiki for Knowledge Translation Hamm M.P.
Klassen T.P.
Scott S.D.
Moher D.
Hartling L.
PLoS ONE English Introduction: A research-practice gap exists between what is known about conducting methodologically rigorous randomized controlled trials (RCTs) and what is done. Evidence consistently shows that pediatric RCTs are susceptible to high risk of bias; therefore novel methods of influencing the design and conduct of trials are required. The objective of this study was to develop and pilot test a wiki designed to educate pediatric trialists and trainees in the principles involved in minimizing risk of bias in RCTs. The focus was on preliminary usability testing of the wiki. Methods: The wiki was developed through adaptation of existing knowledge translation strategies and through tailoring the site to the identified needs of the end-users. The wiki was evaluated for usability and user preferences regarding the content and formatting. Semi-structured interviews were conducted with 15 trialists and systematic reviewers, representing varying levels of experience with risk of bias or the conduct of trials. Data were analyzed using content analysis. Results: Participants found the wiki to be well organized, easy to use, and straightforward to navigate. Suggestions for improvement tended to focus on clarification of the text or on esthetics, rather than on the content or format. Participants liked the additional features of the site that were supplementary to the text, such as the interactive examples, and the components that focused on practical applications, adding relevance to the theory presented. While the site could be used by both trialists and systematic reviewers, the lack of a clearly defined target audience caused some confusion among participants. Conclusions: Participants were supportive of using a wiki as a novel educational tool. The results of this pilot test will be used to refine the risk of bias wiki, which holds promise as a knowledge translation intervention for education in medical research methodology. 0 0
Efficient feature integration with Wikipedia-based semantic feature extraction for Turkish text summarization Güran A.
Bayazit N.G.
Gürbüz M.Z.
Analytical hierarchical process
Artificial bee colony algorithm
Latent semantic analysis
Turkish text summarization
Turkish Wikipedia
Turkish Journal of Electrical Engineering and Computer Sciences English This study presents a novel hybrid Turkish text summarization system that combines structural and semantic features. The system uses 5 structural features, 1 of which is newly proposed, and 3 semantic features whose values are extracted from Turkish Wikipedia links. The features are combined using the weights calculated by 2 novel approaches. The first approach makes use of an analytical hierarchical process, which depends on a series of expert judgments based on pairwise comparisons of the features. The second approach makes use of the artificial bee colony algorithm for automatically determining the weights of the features. To confirm the significance of the proposed hybrid system, its performance is evaluated on a new Turkish corpus that contains 110 documents and 3 human-generated extractive summary corpora. The experimental results show that exploiting all of the features by combining them results in a better performance than exploiting each feature individually. 0 0
Effectiveness of shared leadership in Wikipedia Haiping Zhu
Kraut R.E.
Aniket Kittur
Aversive leadership
Directive leadership
Online community
Person-based leadership
Shared leadership
Transactional leadership
Human Factors English Objective: The objective of the paper is to understand leadership in an online community, specifically, Wikipedia. Background: Wikipedia successfully aggregates millions of volunteers' efforts to create the largest encyclopedia in human history. Without formal employment contracts and monetary incentives, one significant question for Wikipedia is how it organizes individual members with differing goals, experience, and commitment to achieve a collective outcome. Rather than focusing on the role of the small set of people occupying a core leadership position, we propose a shared leadership model to explain the leadership in Wikipedia. Members mutually influence one another by exercising leadership behaviors, including rewarding, regulating, directing, and socializing one another. Method: We conducted a two-phase study to investigate how distinct types of leadership behaviors (transactional, aversive, directive, and person-focused), the legitimacy of the people who deliver the leadership, and the experience of the people who receive the leadership influence the effectiveness of shared leadership in Wikipedia. Results: Our results highlight the importance of shared leadership in Wikipedia and identify trade-offs in the effectiveness of different types of leadership behaviors. Aversive and directive leadership increased contribution to the focal task, whereas transactional and person-focused leadership increased general motivation. We also found important differences in how newcomers and experienced members responded to leadership behaviors from peers. Application: These findings extend shared leadership theories, contribute new insight into the important underlying mechanisms in Wikipedia, and have implications for practitioners who wish to design more effective and successful online communities. 0 0
Effects of implicit positive ratings for quality assessment of Wikipedia articles Yu Suzuki Edit history
Journal of Information Processing English In this paper, we propose a method to identify high-quality Wikipedia articles by using implicit positive ratings. One of the major approaches for assessing Wikipedia articles is based on the text survival ratio. In this approach, when a text survives multiple edits, the text is assessed as high quality. However, the problem is that many low-quality articles are misjudged as high quality, because editors do not always read the whole article. If there is a low-quality text at the bottom of a long article, and the text has not been seen by the other editors, then the text survives many edits and is assessed as high quality. To solve this problem, we use a section or a paragraph as the unit instead of a whole page. In our method, if an editor edits an article, the system considers that the editor gives positive ratings to the section or the paragraph that the editor edits. From experimental evaluation, we confirmed that the proposed method could improve the accuracy of quality values for articles. 0 0
Effects of peer feedback on contribution: A field experiment in Wikipedia Haiping Zhu
Zhang A.
He J.
Kraut R.E.
Aniket Kittur
Field experiment
Online community
Peer feedback
Conference on Human Factors in Computing Systems - Proceedings English One of the most significant challenges for many online communities is increasing members' contributions over time. Prior studies on peer feedback in online communities have suggested its impact on contribution, but have been limited by their correlational nature. In this paper, we conducted a field experiment on Wikipedia to test the effects of different feedback types (positive feedback, negative feedback, directive feedback, and social feedback) on members' contribution. Our results characterize the effects of different feedback types, and suggest trade-offs in the effects of feedback between the focal task and general motivation, as well as differences in how newcomers and experienced editors respond to peer feedback. This research provides insights into the mechanisms underlying peer feedback in online communities and practical guidance to design more effective peer feedback systems. 0 0
Enabling e-Collaboration and e-Pedagogy at an academic institution in the UAE Tarazi J.
Akre V.L.
Academic Collaboration
Microsoft SharePoint
Proceedings of the 2013 International Conference on Current Trends in Information Technology, CTIT 2013 English Academic institutions have come a long way from the time when teachers taught concepts using chalk and blackboards while students listened to the lecture and rapidly took down notes in their notebooks. The world has witnessed a sea change in academic delivery and the pedagogy of teaching. Another notable change that has been observed recently is the use of technology in teacher collaboration and academic administration. Paper attendance sheets have given way to electronic attendance musters, where teachers can mark students' presence, absence or even enter late marks, which can instantaneously be viewed across the system by different department chairs and administrators. Many universities have started using Microsoft SharePoint as an academic e-Collaboration tool. Weblogs, or blogs, are increasingly being used as collaborative and business intelligence tools by corporate organizations. Wikis represent flexible tools functioning as open-ended environments for collaboration while also offering support for group writing. This paper aims to portray the efforts undertaken by a leading academic institution in the United Arab Emirates (UAE) to incorporate technologies such as Microsoft SharePoint, blogs and wikis to reinforce its academic processes as well as provide for effective collaboration among its faculty and administration. It attempts to sketch a detailed account of how the institution tried to adopt e-Collaboration using Microsoft SharePoint and to develop an e-Pedagogy using Microsoft SharePoint, wikis and blogs using a structured methodology. 0 0
Encoding local correspondence in topic models Mehdi R.E.
Mohamed Q.
Mustapha A.
Automatic Image Annotation
Local Influence
Probabilistic Graphical Models
Topic Models
Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI English Exploiting label correlations is a challenging and crucial problem, especially in the multi-label learning context. Label correlations are not necessarily shared by all instances and generally have a local definition. This paper introduces LOC-LDA, a latent variable model that addresses the problem of modeling annotated data by locally exploiting correlations between annotations. In particular, we explicitly represent local dependencies to define the correspondence between specific objects, i.e. regions of images, and their annotations. We conducted experiments on a collection of pictures provided by the Wikipedia 'Picture of the day' website, and evaluated our model on the task of 'automatic image annotation'. The results validate the effectiveness of our approach. 0 0
English nominal compound detection with Wikipedia-based methods Nagy T. I.
Veronika Vincze
Multiword expressions
MWE detection
Nominal compounds
Silver standard corpus
Lecture Notes in Computer Science English Nominal compounds (NCs) are lexical units that consist of two or more elements that exist on their own, function as a noun and have a special added meaning. Here, we present the results of our experiments on how the growth of Wikipedia added to the performance of our dictionary labeling methods for detecting NCs. We also investigated how the size of an automatically generated silver standard corpus can affect the performance of our machine learning-based method. The results we obtained demonstrate that the bigger the dataset, the better the performance will be. 0 0
Enhancing learning environments by integrating external applications Alario-Hoyos C.
Bote-Lorenzo M.L.
Gomez-Sanchez E.
Asensio-Perez J.I.
Vega-Gorgojo G.
Ruiz-Calleja A.
External applications
Learning environments
Bulletin of the Technical Committee on Learning Technology English This paper discusses the lightweight integration of external applications in different learning environments like LMSs, PLEs or MOOCs using the GLUE! architecture. Also, the current status of GLUE! is presented, describing the particularities of integrating external applications in Moodle, LAMS and MediaWiki. Finally, the paper gives instructions to those interested in trying GLUE! or contributing to the integration of new applications or environments. 0 0
Enhancing successful outcomes of wiki-based collaborative writing: a state-of-the-art review of facilitation frameworks Stoddart A.
Chan J.Y.-Y.
Liu G.-Z.
Web 2.0
Interactive Learning Environments English This state-of-the-art review research undertook a survey of a variety of studies regarding wiki-based collaborative writing projects and from this body of work extracted the best practices tenets of facilitation. Wiki-based collaborative writing projects are becoming more common in second language (L2) pedagogy. Such projects have multiple aims. These include, among other benefits, L2 acquisition, P2P learning, collaboration and immersion in new technologies that will inform the social and professional lives of the students. By mining a variety of wiki-based collaborative writing projects for the specific meta and secondary facilitation practices, the researchers were able to develop a general framework that will assist instructors of university or advanced high school students who wish to engage their students in such projects. The attributes of good facilitation that the researchers have isolated are by no means exhaustive, nor are they a guarantee of successful outcomes. These attributes do, however, provide a good starting point for any teacher or instructional designer who wants to provide an environment that fosters student satisfaction, motivation and learning. 0 0
Enriching patent search with external keywords: A feasibility study Nikolova I.
Temnikova I.
Angelova G.
International Conference Recent Advances in Natural Language Processing, RANLP English This article presents a feasibility study for retrieving Wikipedia articles matching patents' topics. The long term motivation behind it is to facilitate patent search by enriching patent indexing with relevant keywords found in external (terminological) resources, with their monolingual synonyms and multilingual translations. The similarity between patents and Wikipedia articles is measured using various filtering techniques and patent document sections. The most similar Wikipedia articles happen to be the closest ones to the respective patent in 33% of the cases, otherwise they are within the top 12 ranked articles. 0 0
Enseigner la révision à l'ère des wikis ou là où on trouve la technologie alors qu'on ne l'attendait pas Brunette
Louise et Gagnon
Enseignement de la révision
Enseignement de la traduction
Revisors training
Translators training
JoSTrans, , no 1 In academic teaching, there are very few experiences of collaborative wiki revision. In a Quebec university, we experimented with a wiki revision activity with translation students in their third and final year. We specifically chose to revise texts in Wikipedia because its environment shares similarities with the labor market in the language industry and because we believed that the wiki allowed us to achieve the overall objectives of the revision class, as we define them. Throughout the experience, we monitored the progress of students' revision interventions on Wikipedia texts as well as exchanges taking place between revisees and revisers. All our research observations were made possible by the convoluted but systematic structure of Wikipedia. Here, we report on the experiment at the Université du Québec en Outaouais and let our academic teaching readers decide whether the exercise is right for them. For us, it was convincing. 0 0
Enterprise wikis and enterprise modelling environments: An integrative framework for purposefully using both Bittmann S.
Michael Fellmann
Oliver Thomas
Collaborative Modelling
Enterprise Modelling
Enterprise Wikis
Operative Information Modelling
Proceedings - 2013 IEEE International Conference on Business Informatics, IEEE CBI 2013 English Enterprise wikis enable communication, consolidation and sharing of knowledge. Recently they have gained wide acceptance. In contrast, enterprise modelling provides a holistic view of the enterprise from a managerial point of view. It does not focus on the knowledge and experience that employees obtain from executing the operative processes for which they are responsible. In this paper, we develop an integrative approach to combine these two fields. A framework is proposed for the joint usage of enterprise wikis and enterprise modelling. The framework was applied in a case study. The results and experiences of applying the framework are presented during the discussion of the case study. 0 0
Entityclassifier.eu: Real-time classification of entities in text with Wikipedia Dojchinovski M.
Kliegr T.
Lecture Notes in Computer Science English Targeted Hypernym Discovery (THD) performs unsupervised classification of entities appearing in text. A hypernym mined from the free text of the Wikipedia article describing the entity is used as a class. The type as well as the entity are cross-linked with their representation in DBpedia, and enriched with additional types from the DBpedia and YAGO knowledge bases, providing semantic web interoperability. The system, available as a web application and web service at entityclassifier.eu, currently supports English, German and Dutch. 0 0
Erfolgsfaktoren von Social Media: Wie "funktionieren" Wikis? Florian L. Mayer Wiki
Organizational Communication
Online collaboration
Otto-Friedrich-Universität Bamberg German When are wikis, or more generally social media, successful? When they are communicatively "alive"! This "communicative success" rests on structural principles that this work makes visible. It describes concrete structures of attention, motivation and organization, and thereby makes understandable both the success of flagships such as Wikipedia or Facebook and the difficulties of using social media in organizations and groups. With the concepts of microcommunication and microcollaboration, it also provides a description of new forms of societal communication. 0 0
Escaping the trap of too precise topic queries Libbrecht P. Learning resources
Mathematical documents search
Mathematics classifications
Mathematics subjects
Search user interface
Topics search
Web mathematics library
Lecture Notes in Computer Science English At the very center of digital mathematics libraries lie controlled vocabularies which qualify the topic of the documents. These topics are used when submitting a document to a digital mathematics library and to perform searches in a library. The latter are refined by the use of these topics, as they allow a precise classification of the mathematics area a document addresses. However, there is a major risk that users employ overly precise topics to specify their queries: they may be employing a topic that is only "close-by" and so fail to match the right resource. We call this the topic trap. Indeed, since 2009, this issue has appeared frequently on the i2geo.net platform. Other mathematics portals experience the same phenomenon. An approach to solve this issue is to introduce tolerance in the way queries are understood. One such approach is to include fuzzy matches, but this introduces noise which may prevent the user from understanding the function of the search engine. In this paper, we propose a way to escape the topic trap by employing navigation between related topics and the count of search results for each topic. This supports the user in that a search for close-by topics is a click away from a previous search. This approach was realized with the i2geo search engine and is described in detail; the relation of being related is computed by textual analysis of the definitions of the concepts fetched from the Wikipedia encyclopedia. 0 0
Establishing an innovative plant learning platform with expandable learning materials using wiki software Cheng S.-C.
Shao C.-M.
Plant search system
Workshop Proceedings of the 21st International Conference on Computers in Education, ICCE 2013 English Currently, plant education in elementary schools is an insignificant part of Nature courses, and students learn only the basic knowledge of plants, rather than profound knowledge. This study aims to establish an innovative plant learning platform, based on an instructional website built on a wiki engine, to help students gain knowledge of plants. Drawing on the characteristics of wikis, it invites scholars in plant studies to edit plant data and design related tests on the platform. Students can check their knowledge of plants on this system from various platforms, such as computers or mobile phones. The keywords can be the characteristics of leaves, flowers, and names of plants. In the experiment, a pretest is conducted on students using the items proposed by scholars, and a posttest is conducted after the students have used the proposed system. The results of the two tests are compared. This study anticipates that the proposed system can give students a higher interest in learning about plants, and thus more knowledge of plants. 0 0
Evaluating article quality and editor reputation in Wikipedia Lu Y.
Lei Zhang
Jing-Woei Li
Editor reputation
Factor graph
Quality evaluation
Communications in Computer and Information Science English We study a novel problem of quality and reputation evaluation for Wikipedia articles. We pose a difficult and interesting question: how can a reasonable article quality score and editor reputation be generated in one framework at the same time? In this paper, we propose a dual wing factor graph (DWFG) model, which utilizes the mutual reinforcement between articles and editors to generate article quality and editor reputation. To learn the proposed factor graph model, we further design an efficient algorithm. We conduct experiments to validate the effectiveness of the proposed model. By leveraging the belief propagation between articles and editors, our approach obtains significant improvement over several alternative methods (SVM, LR, PR, CRF). 0 0
Evaluating entity linking with wikipedia Ben Hachey
Will Radford
Joel Nothman
Matthew Honnibal
Curran J.R.
Information extraction
Named Entity Linking
Semi-structured resources
Artificial Intelligence English Named Entity Linking (NEL) grounds entity mentions to their corresponding node in a Knowledge Base (KB). Recently, a number of systems have been proposed for linking entity mentions in text to Wikipedia pages. Such systems typically search for candidate entities and then disambiguate them, returning either the best candidate or NIL. However, comparison has focused on disambiguation accuracy, making it difficult to determine how search impacts performance. Furthermore, important approaches from the literature have not been systematically compared on standard data sets. We reimplement three seminal NEL systems and present a detailed evaluation of search strategies. Our experiments find that coreference and acronym handling lead to substantial improvement, and search strategies account for much of the variation between systems. This is an interesting finding, because these aspects of the problem have often been neglected in the literature, which has focused largely on complex candidate ranking algorithms. © 2012 Elsevier B.V. All rights reserved. 0 0
Evaluation of ILP-based approaches for partitioning into colorful components Bruckner S.
Huffner F.
Komusiewicz C.
Niedermeier R.
Lecture Notes in Computer Science English The NP-hard Colorful Components problem is a graph partitioning problem on vertex-colored graphs. We identify a new application of Colorful Components in the correction of Wikipedia interlanguage links, and describe and compare three exact and two heuristic approaches. In particular, we devise two ILP formulations, one based on Hitting Set and one based on Clique Partition. Furthermore, we use the recently proposed implicit hitting set framework [Karp, JCSS 2011; Chandrasekaran et al., SODA 2011] to solve Colorful Components. Finally, we study a move-based and a merge-based heuristic for Colorful Components. We can optimally solve Colorful Components for Wikipedia link correction data; while the Clique Partition-based ILP outperforms the other two exact approaches, the implicit hitting set is a simple and competitive alternative. The merge-based heuristic is very accurate and outperforms the move-based one. The above results for Wikipedia data are confirmed by experiments with synthetic instances. 0 0
Evaluation of WikiTalk - User studies of human-robot interaction Anastasiou D.
Kristiina Jokinen
Graham Wilcock
Multimodal human-robot interaction
Lecture Notes in Computer Science English The paper concerns the evaluation of Nao WikiTalk, an application that enables a Nao robot to serve as a spoken open-domain knowledge access system. With Nao WikiTalk the robot can talk about any topic the user is interested in, using Wikipedia as its knowledge source. The robot suggests some topics to start with, and the user shifts to related topics by speaking their names after the robot mentions them. The user can also switch to a totally new topic by spelling the first few letters. As well as speaking, the robot uses gestures, nods and other multimodal signals to enable clear and rich interaction. The paper describes the setup of the user studies and reports on the evaluation of the application, based on various factors reported by the 12 users who participated. The study compared the users' expectations of the robot interaction with their actual experience of the interaction. We found that the users were impressed by the lively appearance and natural gesturing of the robot, although in many respects they had higher expectations regarding the robot's presentation capabilities. However, the results are positive enough to encourage research on these lines. 0 0
Evaluation of named entity recognition tools on microposts Dlugolinsky S.
Marek Ciglan
Laclavik M.
INES 2013 - IEEE 17th International Conference on Intelligent Engineering Systems, Proceedings English In this paper we evaluate eight well-known Information Extraction (IE) tools on the task of Named Entity Recognition (NER) in microposts. We have chosen six NLP tools and two Wikipedia concept extractors for the evaluation. Our intent was to see how these tools would perform on the relatively short texts of microposts. The evaluation dataset was adopted from the MSM 2013 IE Challenge. This dataset contains manually annotated microposts with classification restricted to four entity types: PER, LOC, ORG and MISC. 0 0
Every move you make I'll be watching you: Geographical focus detection on Twitter Peregrino F.S.
Tomas D.
Llopis F.
Geographic information retrieval
Geographical focus
Language models
Social network
Proceedings of the 7th Workshop on Geographic Information Retrieval, GIR 2013 English On-line social networks have rapidly increased in popularity since their creation, providing a huge amount of data which can be leveraged to extract useful information related to commercial and social human behaviours. One of the most useful kinds of information that can be extracted is geographical. This paper shows an approach to detect the geographical focus of Twitter users at city level based on the text of the tweets that users have sent and external information from Wikipedia. The main goal of this work is to show how important external formal text resources such as Wikipedia can be when it comes to resolving the geographical focus in short pieces of informal natural language text. In order to accomplish this objective, we have assessed our system with a language model system, comparing the results using only the informal pieces of text (tweets) and merging them with formal text coming from Wikipedia. In our experiments, we found that the aid of formal pieces of text, such as those obtained from the Wikipedia articles and links, could be useful when the existing amount of data is rather limited. 0 0
Evolution of peer production system based on limited matching and preferential selection Li X.
Li S.-W.
Computational experiments
Limited matching
Peer production system
Preferential selection
Shanghai Ligong Daxue Xuebao/Journal of University of Shanghai for Science and Technology Chinese Taking Wikipedia as a classic peer production system in which many users take part in editing, two characteristics of the editing process were considered: preferential selection and limited matching. Two rules for "preferential selection" and "limited matching" and an evolving model of the peer production system were presented. The analysis was based on computational experiments on the number of page edits, the status variation of pages and users, the effect of matching degree on the number of page edits, etc. The computational experiments show that the Wikipedia system evolves to a stable status under the action of the two rules. In the stable status, the number of page edits follows a power-law distribution; the difference between user status and page status (i.e. the matching degree) tends toward zero; and the larger the matching degree of user and page, the smaller the power index of the power-law distribution, and so the longer its tail. 0 0
Experiences of Wiki topic teaching in postgraduate orthodontics: What do the learners think? Ireland A.J.
Atack N.E.
Sandy J.R.
European Journal of Dental Education English Introduction: Traditionally, the academic content of many 3-year full-time postgraduate courses in orthodontics in the UK has been delivered using tutorial and lecture-based teaching. This is often teacher led rather than learner centred. Even teaching delivered through modules on the national virtual learning environment, although well liked by students, is still often teacher led. An alternative on-line approach to learner-centred teaching is to use Wikis. Materials and methods: Nine postgraduate students in the first term of their full-time 3-year specialist training programme at Bristol Dental School were divided into three groups and wrote a Wiki on three interrelated topics. This process was repeated in the second term using three different, but still interrelated topics. Following each, they were asked to give detailed feedback on their Wiki topic teaching. Results and discussion: The results showed that students felt writing the Wikis was useful for team work, provided a more learner-centred approach, created a body of work in a live format that would be useful for revision and was a welcome variation on traditional teaching methods. The biggest problem encountered was the IT platform used to create the Wikis. The students also felt the Wikis should be assessed as a piece of group work rather than as separate individuals. Conclusions: Wiki topic teaching is a useful tool in the teaching of postgraduate orthodontics providing variation and a more learner-centred approach. Further exploration of the available IT platforms is required. 0 0
Exploiting the Arabic Wikipedia for semi-automatic construction of a lexical ontology Boudabous M.M.
Belguith L.H.
Sadat F.
Arabic lexical ontology
Arabic ontology
Arabic Wikipedia
Morpho-lexical patterns
Lexical ontology
Semantic relations
International Journal of Metadata, Semantics and Ontologies English In this paper, we propose a hybrid (numerical/linguistic) method to build a lexical ontology for the Arabic language. This method is based on the Arabic Wikipedia. It consists of two phases: analysing the description section in order to build a core ontology, and then using the physical structure of Wikipedia articles (info-boxes, category pages and redirect links) and their contents to enrich the core ontology. The building phase of the core ontology is implemented via the TBAO system. The obtained core ontology contains more than 200,000 concepts. 0 0
Exploiting the category structure of Wikipedia for entity ranking Rianne Kaptein
Jaap Kamps
Category structure
Entity ranking
Link structure
Artificial Intelligence English The Web has not only grown in size, but also changed its character, due to collaborative content creation and an increasing amount of structure. Current Search Engines find Web pages rather than information or knowledge, and leave it to the searchers to locate the sought information within the Web page. A considerable fraction of Web searches contains named entities. We focus on how the Wikipedia structure can help rank relevant entities directly in response to a search request, rather than retrieve an unorganized list of Web pages with relevant but also potentially redundant information about these entities. Our results demonstrate the benefits of using topical and link structure over the use of shallow statistics. Our main findings are the following. First, we examine whether Wikipedia category and link structure can be used to retrieve entities inside Wikipedia as is the goal of the INEX (Initiative for the Evaluation of XML retrieval) Entity Ranking task. Category information proves to be a highly effective source of information, leading to large and significant improvements in retrieval performance on all data sets. Secondly, we study how we can use category information to retrieve documents for ad hoc retrieval topics in Wikipedia. We study the differences between entity ranking and ad hoc retrieval in Wikipedia by analyzing the relevance assessments. Considering retrieval performance, also on ad hoc retrieval topics we achieve significantly better results by exploiting the category information. Finally, we examine whether we can automatically assign target categories to ad hoc and entity ranking queries. Guessed categories lead to performance improvements that are not as large as when the categories are assigned manually, but they are still significant. We conclude that the category information in Wikipedia is a useful source of information that can be used for entity ranking as well as other retrieval tasks. 
© 2012 Elsevier B.V. All rights reserved. 0 0
Exploring the Cautionary Attitude Toward Wikipedia in Higher Education: Implications for Higher Education Institutions Bayliss G. Collaboratively produced knowledge
Information literacy
Web 2.0
New Review of Academic Librarianship English This article presents the research findings of a small-scale study which aimed to explore the cautionary attitude toward the use of Wikipedia in the process of learning. A qualitative case study approach was taken, using literature review, institutional documentation, and semi-structured interviews with five members of academic teaching staff from a UK Business School. Analysis found the reasons for the cautionary attitude were due to a lack of understanding of Wikipedia, a negative attitude toward collaborative knowledge produced outside academia, and the perceived detrimental effects of the use of Web 2.0 applications not included in the university suite. 0 0
Extending BCDM to cope with proposals and evaluations of updates Anselma L.
Bottrighi A.
Montani S.
Terenziani P.
Database design
Database semantics
Modeling and management
Temporal databases
IEEE Transactions on Knowledge and Data Engineering English The cooperative construction of data/knowledge bases has recently had a significant impulse (see, e.g., Wikipedia [1]). In cases in which data/knowledge quality and reliability are crucial, proposals of update/insertion/deletion need to be evaluated by experts. To the best of our knowledge, no theoretical framework has been devised to model the semantics of update proposal/evaluation in the relational context. Since time is an intrinsic part of most domains (as well as of the proposal/evaluation process itself), semantic approaches to temporal relational databases (specifically, Bitemporal Conceptual Data Model (henceforth, BCDM) [2]) are the starting point of our approach. In this paper, we propose BCDMPV, a semantic temporal relational model that extends BCDM to deal with multiple update/insertion/deletion proposals and with acceptances/rejections of proposals themselves. We propose a theoretical framework, defining the new data structures, manipulation operations and temporal relational algebra and proving some basic properties, namely that BCDMPV is a consistent extension of BCDM and that it is reducible to BCDM. These properties ensure consistency with most relational temporal database frameworks, facilitating implementations. 0 0
Extending inter-professional learning through the use of a multi-disciplinary Wiki Stephens M.
Robinson L.
McGrath D.
Shared learning
Nurse Education in Practice English This paper reports our experiences of a student learning activity which employed a Wiki for student radiographers and nurses to build on an inter-professional learning event. The aim of the Wiki was to facilitate inter-professional learning for students who, having met face-to-face once for a classroom based activity, would not be timetabled to meet again. It was designed to allow students from differing disciplines to: construct knowledge together, learn from and about one another, and collaboratively produce a textual learning resource. 150 nursing and radiography undergraduates were provided with a PBL trigger related to the acute presentation of stroke. The students met once (5 mixed-discipline groups) to discuss the role of the professions and the outcomes for the trigger scenario. Further learning was enabled through the provision of a Wiki for each group. At week 4, all Wikis were made visible for group peer assessment. Wiki editing skills were provided by student 'Wiki champions', who cascaded training to their peers. We report and reflect on the students' evaluations of both the Wiki as process and outcome and discuss the value of Wikis for inter-professional learning. Findings show that, in addition to being an enjoyable and flexible learning experience, the Wiki satisfied its intended aims. There was a variation in the level and quality of student participation, the causes of which are discussed. Ground rules for effective Wiki use are proposed. 0 0
Extending the possibilities for collaborative work with TEI/XML through the usage of a wiki system Entrup B.
Binder F.
Lobin H.
Collaborative work
ACM International Conference Proceeding Series English This paper presents and discusses an integrated project-specific working environment for editing TEI/XML files and linking entities of interest to a dedicated wiki system. This working environment has been specifically tailored to the workflow in our interdisciplinary digital humanities project "GeoBib". It addresses some challenges that arose while working with person-related data and geographical references in a growing collection of TEI/XML files. While our current solution provides some essential benefits, we also discuss several critical issues and challenges that remain. 0 0
Extracting PROV provenance traces from Wikipedia history pages Missier P.
Zheng Chen
Design
ACM International Conference Proceeding Series English Wikipedia History pages contain provenance metadata that describes the history of revisions of each Wikipedia article. We have developed a simple extractor which, starting from a user-specified article page, crawls through the graph of its associated history pages, and encodes the essential elements of those pages according to the PROV data model. The crawling is performed on the live pages using the Wikipedia REST interface. The resulting PROV provenance graphs are stored in a graph database (Neo4J), where they can be queried using the Cypher graph query language (proprietary to Neo4J), or traversed programmatically using the Neo4J Java Traversal API. 0 0
Extracting complementary information from Wikipedia articles of different languages Akiyo Nadamoto
Fujiwara Y.
Konishi Y.
Yu Suzuki
Complementary information
International Journal of Business Intelligence and Data Mining English In Wikipedia, users can create and edit information freely. Few editors take responsibility for editing the articles. Therefore, information of many Wikipedia articles is lacking. Furthermore, Wikipedia has different levels of value of its information depending on the language version of the site. In this paper, we propose the extraction of complementary information from different language Wikipedia and its automatic presentation. The important points of our method are: 1) extraction of comparison articles from different language Wikipedia; 2) extraction of complementary information; 3) presentation of complementary information. 0 0
Extracting event-related information from article updates in Wikipedia Georgescu M.
Kanhabua N.
Krause D.
Wolfgang Nejdl
Siersdorfer S.
Lecture Notes in Computer Science English Wikipedia is widely considered the largest and most up-to-date online encyclopedia, with its content being continuously maintained by a supporting community. In many cases, real-life events like new scientific findings, resignations, deaths, or catastrophes serve as triggers for collaborative editing of articles about affected entities such as persons or countries. In this paper, we conduct an in-depth analysis of event-related updates in Wikipedia by examining different indicators for events including language, meta annotations, and update bursts. We then study how these indicators can be employed for automatically detecting event-related updates. Our experiments on event extraction, clustering, and summarization show promising results towards generating entity-specific news tickers and timelines. 0 0
Extracting knowledge from Wikipedia articles through distributed semantic analysis Hieu N.T.
Di Francesco M.
Yla-Jaaski A.
Distributed computing
Semantic analysis
Wikipedia knowledge
Word relatedness
ACM International Conference Proceeding Series English Computing semantic word similarity and relatedness requires access to vast amounts of semantic space for effective analysis. As a consequence, it is time-consuming to extract useful information from a large amount of data on a single workstation. In this paper, we propose a system, called Distributed Semantic Analysis (DSA), that integrates a distributed-based approach with semantic analysis. DSA builds a list of concept vectors associated with each word by exploiting the knowledge provided by Wikipedia articles. Based on such lists, DSA calculates the degree of semantic relatedness between two words through the cosine measure. The proposed solution is built on top of the Hadoop MapReduce framework and the Mahout machine learning library. Experimental results show two major improvements over the state of the art, with particular reference to the Explicit Semantic Analysis method. First, our distributed approach significantly reduces the computation time to build the concept vectors, thus enabling the use of larger inputs that is the basis for more accurate results. Second, DSA obtains a very high correlation of computed relatedness with reference benchmarks derived by human judgements. Moreover, its accuracy is higher than solutions reported in the literature over multiple benchmarks. 0 0
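The cosine measure mentioned in the abstract above can be illustrated with a minimal sketch. This is not the DSA implementation (which the abstract says runs on Hadoop MapReduce and Mahout); the function name `cosine_relatedness`, the sparse dictionary representation, and the example concept weights are all assumptions made for illustration.

```python
import math


def cosine_relatedness(vec_a, vec_b):
    """Cosine similarity of two sparse concept vectors.

    Each vector is a dict mapping a concept name (for instance a
    Wikipedia article title) to a non-negative weight. This layout
    is an assumption, not taken from the paper.
    """
    # Only concepts present in both vectors contribute to the dot product.
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[c] * vec_b[c] for c in shared)
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


# Hypothetical concept vectors for two words.
cat = {"Felidae": 0.8, "Pet": 0.6}
dog = {"Canidae": 0.8, "Pet": 0.6}
score = cosine_relatedness(cat, dog)
```

Here the two words share only the "Pet" concept, so the score is well below 1; identical vectors would score 1, and vectors with no shared concept would score 0.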
Extracting protein terminologies in literatures Gim J.
Kim D.J.
Myunggwon Hwang
Song S.-K.
Jeong D.-H.
Hanmin Jung
Keyword refinement
Protein terminologies
Wikipedia terminologies
Proceedings - 2013 IEEE International Conference on Green Computing and Communications and IEEE Internet of Things and IEEE Cyber, Physical and Social Computing, GreenCom-iThings-CPSCom 2013 English Key terminologies in the literature have recently come to play an important role in analyzing and predicting research trends. Extracting the terminologies used in researchers' papers has therefore become a major issue in a variety of fields. To extract them, a dictionary-based approach has been applied. Wikipedia can also be considered a dictionary, since it has abundant terminologies and the power of collective intelligence: its terminologies are continuously modified and extended every day. It could thus serve as an answer set to compare with the terminologies in the literature. However, such an approach can hardly extract terminologies that are newly defined and coined by researchers. To address this issue, we propose a method that derives a set of terminology candidates by comparing terminologies in the literature and in Wikipedia. The candidate set extracted by the method showed an accuracy of about 64.33%, which is a good result for an initial study. 0 0
Extracting term relationships from Wikipedia Mathiak B.
Pena V.M.M.
Wira-Alam A.
Ontology matching
Relationship extraction
Lecture Notes in Business Information Processing English When looking at the relationship between two terms, we are interested not only in how much they are related, but also in how we may explain this relationship to the user. This is an open problem in ontology matching, but also in other tasks, from information retrieval to lexicography. In this paper, we propose a solution based on snippets taken from Wikipedia. These snippets are found by looking for connectors between the two terms, e.g. the terms themselves, but also terms that occur often in both articles or terms that link to both articles. With a user study, we establish that this is particularly useful when dealing with little-known relationships but well-known concepts. The users learned more about the relationship and were able to grade it accordingly. On real-life data, there are some issues with near synonyms, which are not detected well, and with terms from different communities, but aside from that we get usable and useful explanations of the term relationships. 0 0
Extracting traffic information from web texts with a D-S evidence theory based approach Qiu P.
Lu F.
Haisu Zhang
D-S evidence theory
Text clustering
Traffic state
Web texts
International Conference on Geoinformatics English Web texts, such as web pages, BBS, or microblogs, usually contain a great amount of real-time traffic information, which can be expected to become an important data source for city traffic collection. However, due to the characteristics of ambiguity and uncertainty in the description of traffic condition with natural language, and the difference of description quality for web texts among various publishers and text types, there may exist much inconsistency, or even contradiction for the traffic condition on similar spatial-temporal contexts. An efficient information fusion process is crucial to take advantage of the mass web sources for real-time traffic collection. In this paper, we propose a traffic state extraction approach from massive web texts based on D-S evidence theory to solve the above problem. Firstly, an evaluation index system for the traffic state information collected from the web texts is built with the help of semantic similarity based on Wikipedia, to eliminate ambiguity. Then, D-S evidence theory is adopted to judge and fuse the extracted traffic state information, with evidence combination and decision, which can solve the problem of uncertainty and difference. An experiment shows that the presented approach can effectively judge the traffic state information contained in massive web texts, and can fully utilize the data from different websites. Meanwhile, the proposed approach is arguably more accurate than the traditional text clustering algorithm. 0 0
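The evidence combination step named in the abstract above is commonly done with Dempster's rule of combination. The following is a minimal sketch of that standard rule, not the authors' system; the function name `combine`, the two-state frame ("jam", "free"), and the mass values for the two hypothetical web sources are all assumptions.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function is a dict mapping a frozenset of hypotheses
    to a mass, with the masses summing to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is the conflict K.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize by 1 - K so the combined masses sum to 1 again.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical evidence about one road segment from two web sources.
JAM = frozenset({"jam"})
FREE = frozenset({"free"})
EITHER = frozenset({"jam", "free"})  # ignorance: mass on the whole frame

source_a = {JAM: 0.6, EITHER: 0.4}
source_b = {JAM: 0.7, FREE: 0.1, EITHER: 0.2}
fused = combine(source_a, source_b)
```

With these assumed masses, both sources lean toward "jam", so the fused mass on "jam" is higher than either source's own; the small amount of contradictory mass is removed by the renormalization.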
Extraction of biographical data from Wikipedia Viseur R.
Biography
Open data
Text mining
DATA 2013 - Proceedings of the 2nd International Conference on Data Technologies and Applications English Using the content of Wikipedia articles is common in academic research. However the practicalities are rarely analysed. Our research focuses on extracting biographical information about personalities from Belgium. Our research is divided into three sections. The first section describes the state of the art for data extraction from Wikipedia. A second section presents the case study about data extraction for biographies of Belgian personalities. Different solutions are discussed and the solution adopted is implemented. In the third section, the quality of the extraction is discussed. Practical recommendations for researchers wishing to use Wikipedia are also proposed on the basis of our case study. 0 0
Extraction of linked data triples from Japanese Wikipedia text of ukiyo-e painters Kimura F.
Mitsui K.
Maeda A.
Linked data
Proceedings - 2013 International Conference on Culture and Computing, Culture and Computing 2013 English DBpedia provides Linked Data extracted from infoboxes in Wikipedia articles. Extraction is easier from an infobox than from text because an infobox has a fixed-format table to represent structured information. To provide more Linked Data, we propose a method for Linked Data triple extraction from Wikipedia text. In this study, we conducted an experiment to extract Linked Data triples from Wikipedia text of ukiyo-e painters and achieved precision of 0.605. 0 0
Factors that determine the level of interaction in wikis, blogs and forums in private virtual communities Valerio G.
Rangel I.
Flores M.
Virtual communities
Proceedings of the IADIS International Conference Web Based Communities and Social Media 2013, Proceedings of the IADIS International Conference Collaborative Technologies 2013 English Transformative changes due to globalization and the revolution of knowledge are forcing organizations to constantly innovate and create new capabilities in order to meet the pressure of increasing performance. One of these innovations is the creation of virtual communities as mechanisms to facilitate knowledge transfer, especially when companies have distributed experts around the globe. CEMEX, a multinational Mexican company from the construction sector, deployed the IBM Connections social media platform, which internally is called SHIFT. With the general objective of identifying the factors that determine the level of interaction throughout Wikis, Blogs and Forums, a mixed-methods study was performed. In the qualitative part of the research, 990 activities were observed and analyzed inside the platform. Different uses given to each of the tools were identified as well as the potential factors that could affect the interaction. Also, the quantitative statistical analysis determined which of these factors had the greatest impact on the interaction. Overall, the results showed that each tool has different determinant factors that impact the interaction level. 0 0
Filling the gaps among DBpedia multilingual chapters for question answering Cojan J.
Cabrio E.
Fabien Gandon
Linked data
Property alignment
Question answering
Proceedings of the 3rd Annual ACM Web Science Conference, WebSci 2013 English To publish information extracted from multilingual pages of Wikipedia in a structured way, the Semantic Web community has started an effort of internationalization of DBpedia. Multilingual chapters of DBpedia can in fact contain different information with respect to the English version, in particular they provide more specificity on certain topics, or fill information gaps. DBpedia multilingual chapters are well connected through instance interlinking, extracted from Wikipedia. An alignment between properties is also carried out by DBpedia contributors as a mapping from the terms used in Wikipedia to a common ontology, enabling the exploitation of information coming from the multilingual chapters of DBpedia. However, the mapping process is currently incomplete, it is time consuming since it is manually performed, and may lead to the introduction of redundant terms in the ontology, as it becomes difficult to navigate through the existing vocabulary. In this paper we propose an approach to automatically extend the existing alignments, and we integrate it in a question answering system over linked data. We report on experiments carried out applying the QAKiS (Question Answering wiKiframework-based) system on the English and French DBpedia chapters, and we show that the use of such approach broadens its coverage. Copyright 2013 ACM. 0 0
Finding relevant missing references in learning courses Siehndel P.
Kawase R.
Hadgu A.T.
Herder E.
Linked data
WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web English Reference sites play an increasingly important role in learning processes. Teachers use these sites in order to identify topics that should be covered by a course or a lecture. Learners visit online encyclopedias and dictionaries to find alternative explanations of concepts, to learn more about a topic, or to better understand the context of a concept. Ideally, a course or lecture should cover all key concepts of the topic that it encompasses, but often time constraints prevent complete coverage. In this paper, we propose an approach to identify missing references and key concepts in a corpus of educational lectures. For this purpose, we link concepts in educational material to the organizational and linking structure of Wikipedia. Identifying missing resources enables learners to improve their understanding of a topic, and allows teachers to investigate whether their learning material covers all necessary concepts. 0 0
Formal and informal context factors as contributors to student engagement in a guided discovery-based program of game design learning Rebecca Reynolds
Chiu M.M.
Design-based research
Digital divide
Digital literacy
Educational technology
Evidence-based practice
Game design
Informal learning
Learning, Media and Technology English This paper explored informal (after-school) and formal (elective course in-school) learning contexts as contributors to middle-school student attitudinal changes in a guided discovery-based and blended e-learning program in which students designed web games and used social media and information resources for a full school year. Formality of the program context did not substantially influence attitude changes but did appear to influence learning outcomes. While intrinsic motivation did not change in the aggregate from pre- to post-program among students, positive changes in intrinsic motivation were found to be associated with engagement in almost all areas of student engagement in Globaloria, with several at-home engagement changes measured. This finding challenges critiques of discovery-based learning as being de-motivating. Lower parent education among students was associated with positive changes in self-efficacy for online research indicating that disadvantaged students may stand to benefit from programs like this one. The study offers support for the need to more definitively explicate instructional design and context factors in educational technology research when investigating influences upon learning outcomes. The study holds implications for designing effective digital literacy interventions, and contributes to theory in the learning sciences and socio-technical systems research. 0 0
Frequency and types of revision made in wiki assisted writing classroom Shah Yusoff Z.
Mat Daud N.
Computer Mediated Communication
Process writing
World Applied Sciences Journal English Some writing classrooms have included the use of computer-mediated communication (CMC) online tools to facilitate and enhance students' writing process. This case study investigates how one of the tools, namely wiki can be used in writing as a process (versus writing as a product) activity in providing feedback to students' academic report. Data consisted of the feedback given and revisions made by ESL engineering students via wiki. The students' first and final drafts were also evaluated for writing improvement. In addition, they were interviewed using semi-structured interview technique. Findings show that they made various surface level revisions after receiving feedback via wiki and these revisions resulted in an improvement in their final draft. The findings of the study suggest that wiki provides a platform for process writing activities without the need for face-to-face interaction. However, a supportive teaching and learning environment is essential to ensure a greater impact of the tool on process writing activities. 0 0
From Machu-Picchu to "rafting the urubamba river": Anticipating information needs via the entity-query graph Bordino I.
De Francisci Morales G.
Ingmar Weber
Bonchi F.
Entity extraction
Implicit search
Query suggestions
WSDM 2013 - Proceedings of the 6th ACM International Conference on Web Search and Data Mining English We study the problem of anticipating user search needs, based on their browsing activity. Given the current web page p that a user is visiting we want to recommend a small and diverse set of search queries that are relevant to the content of p, but also non-obvious and serendipitous. We introduce a novel method that is based on the content of the page visited, rather than on past browsing patterns as in previous literature. Our content-based approach can be used even for previously unseen pages. We represent the topics of a page by the set of Wikipedia entities extracted from it. To obtain useful query suggestions for these entities, we exploit a novel graph model that we call EQGraph (Entity-Query Graph), containing entities, queries, and transitions between entities, between queries, as well as from entities to queries. We perform Personalized PageRank computation on such a graph to expand the set of entities extracted from a page into a richer set of entities, and to associate these entities with relevant query suggestions. We develop an efficient implementation to deal with large graph instances and suggest queries from a large and diverse pool. We perform a user study that shows that our method produces relevant and interesting recommendations, and outperforms an alternative method based on reverse IR. 0 0
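The Personalized PageRank computation mentioned in the abstract above can be sketched with plain power iteration. This is a generic textbook formulation, not the authors' efficient implementation; the toy graph, the node names, the damping value `alpha=0.85`, and the handling of dangling nodes are assumptions chosen for illustration.

```python
def personalized_pagerank(graph, seeds, alpha=0.85, iters=50):
    """Power-iteration Personalized PageRank.

    graph: dict mapping each node to a list of its out-neighbours
           (every neighbour must also appear as a key).
    seeds: set of nodes the random walk restarts from, for example
           the entities extracted from the page being visited.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        # With probability 1 - alpha the walk jumps back to a seed.
        nxt = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = alpha * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: send its mass back to the seeds (assumption).
                for s in seeds:
                    nxt[s] += alpha * rank[n] / len(seeds)
        rank = nxt
    return rank


# Hypothetical toy graph mixing entity nodes and one query node.
toy_graph = {
    "Machu Picchu": ["Urubamba River", "Cusco"],
    "Urubamba River": ["rafting the urubamba river"],
    "Cusco": ["Machu Picchu"],
    "rafting the urubamba river": [],
}
scores = personalized_pagerank(toy_graph, seeds={"Machu Picchu"})
```

Nodes reachable from the seed entity, including the query node, receive non-zero scores, which is the property a suggestion step could rank on.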
Gene wiki reviews: Marrying crowdsourcing with traditional peer review Su A.I.
Good B.M.
Van Wijnen A.J.
Gene Wiki
Gene English [No abstract available] 0 0
Generating web-based corpora for video transcripts categorization Perea-Ortega J.M.
Montejo-Raez A.
Teresa Martin-Valdivia M.
Alfonso Urena-Lopez L.
Automatic Speech Recognition (ASR)
Video tagging
Video transcripts categorization
Web-based corpora generation
Expert Systems with Applications English This paper proposes the use of the Internet as a rich source of information in order to generate learning corpora for video transcripts categorization systems. Our main goal in this work has been to study the behavior of different learning corpora generated from the Internet and analyze some of their features. Specifically, Wikipedia, Google and the blogosphere have been employed to generate these learning corpora, using the VideoCLEF 2008 track as the evaluation framework for the different experiments carried out. Based on this evaluation framework, we conclude that the proposed approach is a promising strategy for the video classification task using the transcripts of the videos. The different sizes of the corpora generated could lead one to believe that better results are achieved when the corpus size is larger, but we demonstrate that this feature may not always be a reliable indicator of the behavior of the learning corpus. The obtained results show that the integration of knowledge from the blogosphere or Google allows generating more reliable corpora for this task than those based on Wikipedia. © 2012 Elsevier Ltd. All rights reserved. 0 0
Geowiki + Route analysis = Improved transportation planning Masli M.
Owen A.
Bouma L.
Loren Terveen
Geographic wikis
Route analysis
Transportation planning
English This poster describes the design of a novel route analysis tool based on a community-driven, geographic wiki to assist transportation planners to make better decisions. We highlight the advantages of our tool over other, similar ones-gained due to the use of a wiki-based platform-through a real-life usage scenario. Copyright © 2012 by the Association for Computing Machinery, Inc. (ACM). 0 0
Getting to the source: Where does wikipedia get its information from? Heather Ford
Shilad Sen
Musicant D.R.
Nora Miller
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English We ask what kinds of sources Wikipedians value most and compare Wikipedia's stated policy on sources to what we observe in practice. We find that primary data sources developed by alternative publishers are both popular and persistent, despite policies that present such sources as inferior to scholarly secondary sources. We also find that Wikipedians make almost equal use of information produced by associations such as nonprofits as from scholarly publishers, with a significant portion coming from government information sources. Our findings suggest the rise of new influential sources of information on the Web but also reinforce the traditional geographic patterns of scholarly publication. This has a significant effect on the goal of Wikipedians to represent "the sum of all human knowledge." Categories and Subject Descriptors: H.3.4 [Information Systems]: Systems and Software - Information Networks; H.5.3 [Information Systems]: Group and Organization Interfaces - computer-supported collaborative work. General Terms: Human Factors, Measurement. Copyright 2010 ACM. 0 0
HCI aspects of social media in collaboration of software developers Savkovic M.
Stavljanin V.
Minovic M.
Social media
International Journal of Engineering Education English While collaborating using social networks, software developers are stimulated not only to consume content but to create it as well. Software developers are often geographically dispersed and therefore work in different time zones. Besides collaborating using standard means of communication they are often engaged in a very interactive process involving not only their immediate colleagues but also other members of social networks as well. HCI aspects of social media in collaborating environments are still to be explored. Latest mobile devices (smart phones and tablets) with high-resolution displays and impressive specifications offer possibilities for HCI change when it comes to social media and Web 2.0 applications. Software developers began using forums then Wikis and now are relying more and more on micro-blogging and social networks. They are stimulated to consume as well as create new content and their status changes when they solve problems and help others. 0 0
Harmonizing and combining existing land cover/land use datasets for cropland area monitoring at the African continental scale Vancutsem C.
Marinho E.
Kayitakire F.
Linda See
Steffen Fritz
Crop mask
Food security
Remote Sensing English Mapping cropland areas is of great interest in diverse fields, from crop monitoring to climate change and food security. Recognizing the value of a reliable and harmonized crop mask that entirely covers the African continent, the objectives of this study were to (i) consolidate the best existing land cover/land use datasets, (ii) adapt the Land Cover Classification System (LCCS) for harmonization, (iii) assess the final product, and (iv) compare the final product with two existing datasets. Ten datasets were compared and combined through an expert-based approach in order to create the derived map of cropland areas at 250 m covering the whole of Africa. The resulting cropland mask was compared with two recent cropland extent maps at 1 km: one derived from MODIS and one derived from five existing products. The accuracy of the three products was assessed against a validation sample of 3,591 pixels of 1 km regularly distributed over Africa and interpreted using high resolution images, which were collected using the Geo-Wiki tool. The comparison of the resulting crop mask with existing products shows that it has a greater agreement with the expert validation dataset, in particular for places where the cropland represents more than 30% of the area of the validation pixel. 0 0
Harvesting models from web 2.0 databases Diaz O.
Gorka Puente
Canovas Izquierdo J.L.
Garcia Molina J.
Data re-engineering
Model-driven engineering
Software and Systems Modeling English Data rather than functionality are the sources of competitive advantage for Web 2.0 applications such as wikis, blogs and social networking websites. This valuable information might need to be capitalized by third-party applications or be subject to migration or data analysis. Model-Driven Engineering (MDE) can be used for these purposes. However, MDE first requires obtaining models from the wiki/blog/website database (a.k.a. model harvesting). This can be achieved through SQL scripts embedded in a program. However, this approach leads to laborious code that exposes the iterations and table joins that serve to build the model. By contrast, a Domain-Specific Language (DSL) can hide these "how" concerns, leaving the designer to focus on the "what", i.e. the mapping of database schemas to model classes. This paper introduces Schemol, a DSL tailored for extracting models out of databases which considers Web 2.0 specifics. Web 2.0 applications are often built on top of general frameworks (a.k.a. engines) that set the database schema (e.g., MediaWiki, Blojsom). Hence, table names offer little help in automating the extraction process. In addition, Web 2.0 data tend to be annotated. User-provided data (e.g., wiki articles, blog entries) might contain semantic markups which provide helpful hints for model extraction. Unfortunately, these data end up being stored as opaque strings. Therefore, there exists a considerable conceptual gap between the source database and the target metamodel. Schemol offers extractive functions and view-like mechanisms to confront these issues. Examples using Blojsom as the blog engine are available for download. 0 0
'He's gone and wrote over it': The use of wikis for collaborative report writing in a primary school classroom Doult W.
Walker S.A.
Education 3-13 English Wikis (websites that can be edited quickly by multiple authors) were used with upper-primary school children to write group reports on a science topic. Two teachers observed the children working, and their observations were used alongside the texts from the wikis and group interviews with children to explore the question of whether using wikis would lead to a change in writing practices and attitudes. This study found that although children often felt proprietorial about their texts, there was some evidence of negotiation and of joint content building. There was also evidence of peer-supported learning of information and communications technology (ICT) skills. Furthermore, the quality and quantity of writing were greater when using wikis than in conventional writing contexts, and the groups which engaged in more discussion produced more text. 0 0
Hot Off the Wiki: Structures and Dynamics of Wikipedia's Coverage of Breaking News Events Brian Keegan
Darren Gergle
Noshir Contractor
Social network
American Behavioral Scientist English Wikipedia's coverage of breaking news and current events dominates editor contributions and reader attention in any given month. Collaborators on breaking news articles rapidly synthesize content to produce timely information in spite of steep coordination demands. Wikipedia's coverage of breaking news events thus presents a case to test theories about how open collaborations coordinate complex, time-sensitive, and knowledge-intensive work in the absence of central authority, stable membership, clear roles, or reliable information. Using the revision history from Wikipedia articles about over 3,000 breaking news events, we investigate the structure of interactions between editors and articles. Because breaking article collaborations unfold more rapidly and involve more editors than most Wikipedia articles, they potentially regenerate prior forms of organizing. We analyze whether the structures of breaking and nonbreaking article networks are (a) similarly structured over time, (b) exhibit features of organizational regeneration, and (c) have similar collaboration dynamics over time. Breaking and nonbreaking articles exhibit similarities in their structural characteristics over the long run, and there is less evidence of organizational regeneration on breaking articles than nonbreaking articles. However, breaking articles emerge into well-connected collaborations more rapidly than nonbreaking articles, suggesting early contributors play a crucial role in supporting these high-tempo collaborations. 0 0
How do metrics of link analysis correlate to quality, relevance and popularity in Wikipedia? Hanada R.T.S.
Marco Cristo
Pimentel M.D.G.C.
Information retrieval
Link analysis
Quality of content
WebMedia 2013 - Proceedings of the 19th Brazilian Symposium on Multimedia and the Web English Many links between Web pages can be viewed as indicative of the quality and importance of the pages they point to. Accordingly, several studies have proposed metrics based on links to infer web page content quality. However, as far as we know, the only work that has examined the correlation between such metrics and content quality consisted of a limited study that left many open questions. In spite of these metrics having been shown successful in the task of ranking pages which were provided as answers to queries submitted to search engines, it is not possible to determine the specific contribution of factors such as quality, popularity, and importance to the results. This difficulty is partially due to the fact that such information is hard to obtain for Web pages in general. Unlike ordinary Web pages, the quality, importance and popularity of Wikipedia articles are evaluated by human experts or might be easily estimated. Thus, it is feasible to verify the relation between link analysis metrics and such factors in Wikipedia articles, our goal in this work. To accomplish that, we implemented several link analysis algorithms and compared their resulting rankings with the ones created by human evaluators regarding factors such as quality, popularity and importance. We found that the metrics are more correlated to quality and popularity than to importance, and the correlation is moderate. 0 0
How much information is geospatially referenced? Networks and cognition Hahmann S.
Burghardt D.
Cognition of geographic information
Geographic information retrieval
Geospatial reference
Scale-free networks
International Journal of Geographical Information Science English The aim of this article is to provide a basis in evidence for (or against) the much-quoted assertion that 80% of all information is geospatially referenced. For this purpose, two approaches are presented that are intended to capture the portion of geospatially referenced information in user-generated content: a network approach and a cognitive approach. In the network approach, the German Wikipedia is used as a research corpus. It is considered a network with the articles being nodes and the links being edges. The Network Degree of Geospatial Reference (NDGR) is introduced as an indicator to measure the network approach. We define NDGR as the shortest path between any Wikipedia article and the closest article within the network that is labeled with coordinates in its headline. An analysis of the German Wikipedia employing this approach shows that 78% of all articles have a coordinate themselves or are directly linked to at least one article that has geospatial coordinates. The cognitive approach is manifested by the categories of geospatial reference (CGR): direct, indirect, and non-geospatial reference. These are categories that may be distinguished and applied by humans. An empirical study including 380 participants was conducted. The results of both approaches are synthesized with the aim to (1) examine correlations between NDGR and the human conceptualization of geospatial reference and (2) to separate geospatial from non-geospatial information. From the results of this synthesis, it can be concluded that 56-59% of the articles within Wikipedia can be considered to be directly or indirectly geospatially referenced. The article thus describes a method to check the validity of the '80%-assertion' for information corpora that can be modeled using graphs (e.g., the World Wide Web, the Semantic Web, and Wikipedia). 
For the corpus investigated here (Wikipedia), the '80%-assertion' cannot be confirmed, but would need to be reformulated as a '60%-assertion'. 0 0
How much is said in a tweet? A multilingual, information-theoretic perspective Neubig G.
Kevin Duh
AAAI Spring Symposium - Technical Report English This paper describes a multilingual study on how much information is contained in a single post of microblog text from Twitter in 26 different languages. In order to answer this question in a quantitative fashion, we take an information-theoretic approach, using entropy as our criterion for quantifying "how much is said" in a tweet. Our results find that, as expected, languages with larger character sets such as Chinese and Japanese contain more information per character than other languages. However, we also find that, somewhat surprisingly, information per character does not have a strong correlation with information per microblog post, as authors of microblog posts in languages with more information per character do not necessarily use all of the space allotted to them. Finally, we examine the relative importance of a number of factors that contribute to whether a language has more or less information content in each character or post, and also compare the information content of microblog text with more traditional text from Wikipedia. 0 0
Is Wikipedia a Relevant Model for E-Learning? Pierre-Carl Langlais E-learning
Collaborative learning
Social constructivism
English This article gives a global appraisal of wiki-based pedagogic projects. The growing influence of Wikipedia on students’ research practices has made these a promising area for educational research.

A compilation of data published by 30 previous academic case studies reveals several recurrent features. Wikis are not so easily adopted: most wiki learning programs begin with a slow initial phase, marked by a general unwillingness to adapt to an unusual environment. Some sociological factors, like age and, less clearly, gender, may increase this initial reluctance.

Despite this initial reluctance, wikis proved valuable tools in one major respect: they give a vivid representation of scientific communities. Students get acquainted with some valuable epistemic practices and norms, such as collaborative work and critical thought. While not significantly improving the memorization of information, wikis clearly enhance research abilities.

This literature review can assist teachers in determining whether the use of wikis fits their pedagogic aims.
10 0
ISICIL: Semantics and social networks for business intelligence Michel Buffa
Delaforge N.
Ereteo G.
Fabien Gandon
Giboin A.
Limpens F.
Business intelligence
Semantic wiki
Social network
Social network analysis
Social semantic web
Lecture Notes in Computer Science English The ISICIL initiative (Information Semantic Integration through Communities of Intelligence onLine) mixes viral new web applications with formal semantic web representations and processes to integrate them into corporate practices for technological watch, business intelligence and scientific monitoring. The resulting open source platform proposes three functionalities: (1) a semantic social bookmarking platform monitored by semantic social network analysis tools, (2) a system for semantically enriching folksonomies and linking them to corporate terminologies and (3) semantically augmented user interfaces, activity monitoring and reporting tools for business intelligence. 0 0
Identifying multilingual wikipedia articles based on cross language similarity and activity Tran K.-N.
Christen P.
International Conference on Information and Knowledge Management, Proceedings English Wikipedia is an online free and open access encyclopedia available in many languages. Wikipedia articles across over 280 languages are written by millions of editors. However, the growth of articles and their content is slowing, especially within the largest Wikipedia language: English. The stabilization of articles presents opportunities for multilingual Wikipedia editors to apply their translation skills to add articles and content to smaller Wikipedia languages. In this poster, we propose similarity and activity measures of Wikipedia articles across two languages: English and German. These measures allow us to evaluate the distribution of articles based on their knowledge coverage and their activity across languages. We show the state of Wikipedia articles as of June 2012 and discuss how these measures allow us to develop recommendation and verification models for multilingual editors to enrich articles and content in Wikipedia languages with relatively smaller knowledge coverage. Copyright 2013 ACM. 0 0
Identifying, understanding and detecting recurring, harmful behavior patterns in collaborative wikipedia editing - Doctoral proposal Flock F.
Elena Simperl
Rettinger A.
Collaboration systems
Collective intelligence
Editing behavior
Social dynamics
User modeling
Web science
WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web English In this doctoral proposal, we describe an approach to identify recurring, collective behavioral mechanisms in the collaborative interactions of Wikipedia editors that have the potential to undermine the ideals of quality, neutrality and completeness of article content. We outline how we plan to parametrize these patterns in order to understand their emergence and evolution and measure their effective impact on content production in Wikipedia. On top of these results we intend to build end-user tools to increase the transparency of the evolution of articles and equip editors with more elaborated quality monitors. We also sketch out our evaluation plans and report on already accomplished tasks. 0 0
Impact of Web 2.0 technologies on academic libraries: A survey of ARL libraries Mahmood K.
Richardson J.V.
Academic libraries
Electronic Library English Purpose - The paper aims to present the results of a survey of academic libraries about the adoption and perceived impact of Web 2.0 technologies. Design/methodology/approach - A total of 67 US academic libraries participated among the members of the Association of Research Libraries. Findings - It was found that each library was using some form of technology, such as RSS, blogs, social networking sites, wikis and instant messaging. On a Likert-type scale the participant librarians significantly preferred the advantages of Web 2.0 over its disadvantages. There was a significant positive correlation between the extent of Web 2.0 adoption in libraries and librarians' opinion about their advantages. Originality/value - The paper is useful for future planning of the use of Web 2.0 technologies in academic libraries. Copyright © 2013 Emerald Group Publishing Limited. All rights reserved. 0 0
Impact of Wikipedia on citation trends Marashi S.-A.
Hosseini-Nami S.M.A.
Alishah K.
Hadi M.
Karimi A.
Hosseinian S.
Ramezanifard R.
Mirhassani R.S.
Hosseini Z.
Shojaie Z.
EXCLI Journal English It has been suggested that the "visibility" of an article influences its citation count. More specifically, it is believed that the social media can influence article citations. Here we tested the hypothesis that inclusion of scholarly references in Wikipedia affects the citation trends. To perform this analysis, we introduced a citation "propensity" measure, which is inspired by the concept of amino acid propensity for protein secondary structures. We show that although citation counts generally increase over time, the citation "propensity" does not increase after inclusion of a reference in Wikipedia. 0 0
Impact of wikipedia on market information environment: Evidence on management disclosure and investor reaction Xu S.X.
Zhang X.M.
Financial market
Information aggregation
Information environment
Management disclosure
Social media
MIS Quarterly: Management Information Systems English In this paper, we seek to determine whether a typical social media platform, Wikipedia, improves the information environment for investors in the financial market. Our theoretical lens leads us to expect that information aggregation about public companies on Wikipedia may influence how management's voluntary information disclosure reacts to market uncertainty with respect to investors' information about these companies. Our empirical analysis is based on a unique data set collected from financial records, management disclosure records, news article coverage, and a Wikipedia modification history of public companies. On the supply side of information, we find that information aggregation on Wikipedia can moderate the timing of managers' voluntary disclosure of companies' earnings disappointments, or bad news. On the demand side of information, we find that Wikipedia's information aggregation moderates investors' negative reaction to bad news. Taken together, these findings support the view that Wikipedia improves the information environment in the financial market and underscore the value of information aggregation through the use of information technology. 0 0
Implementing organizational self awareness: A semantic mediawiki based enterprise ontology management approach Aveiro D.
Pinto D.
Abstract syntax
Concrete syntax
Enterprise engineering
Semantic web
IC3K 2013; KEOD 2013 - 5th International Conference on Knowledge Engineering and Ontology Development, Proceedings English In this paper we present a solution currently being developed to enable collaborative enterprise ontology model management using the Semantic MediaWiki as a base tool. This solution is solidly grounded on the theoretical foundations of Organizational Self-Awareness and φ-theory of enterprise ontology and is a valuable contribution to facilitate general and distributed enterprise model management and also concrete and abstract syntax specification, i.e., the specification of a language's meta-model. This allows flexibility and ease of use in creation and adaptation of organizational models and also the use of semantic queries to detect and inform users of any violation of meta-model rules. 0 0
Improved concept-based query expansion using Wikipedia Yuvarani M.
Iyengar N.Ch.S.N.
Kannan A.
Direction finder
Index terms
Query expansion
Query formulation
Web search
International Journal of Communication Networks and Distributed Systems English Query formulation has always been a challenge for users. In this paper, we propose a novel interactive query expansion methodology that identifies and presents the potential directions (generalised concepts) for the given query, enabling the user to explore the topic of interest further. The methodology proposed is the concept-based direction (CoD) finder, which relies on an external knowledge repository for finding the directions. Wikipedia, the most important non-profit crowdsourcing project, is considered as the external knowledge repository for the CoD finder methodology. CoD finder identifies the concepts for the given query and derives the generalised direction for each of the concepts, based on the content of the Wikipedia article and the categories it belongs to. The CoD finder methodology has been evaluated in the crowdsourcing marketplace Amazon's Mechanical Turk to measure the quality of the identified potential directions. The evaluation result shows that the potential directions identified by the CoD finder methodology produce better precision and recall for the given queries. 0 0
Improved text annotation with wikipedia entities Makris C.
Plegas Y.
Theodoridis E.
Semantic data linking on web
Semantics and ontologies in data integration
Text semantic annotation
Wikipedia entities
Proceedings of the ACM Symposium on Applied Computing English Text annotation is the procedure of first identifying, in a segment of text, a set of words that are dominant in meaning and then attaching to them extra information (usually drawn from a concept ontology, implemented as a catalog) that expresses their conceptual content in the current context. Attaching additional semantic information and structure helps to represent, in a machine interpretable way, the topic of the text and is a fundamental preprocessing step to many Information Retrieval tasks like indexing, clustering, classification, text summarization and cross-referencing content on web pages, posts, tweets etc. In this paper, we deal with automatic annotation of text documents with entities of Wikipedia, the largest online knowledge base; a process that is commonly known as Wikification. As in previous approaches, the cross-reference of words in the text to Wikipedia articles is based on local compatibility between the text around the term and textual information embedded in the article. The main contribution of this paper is a set of disambiguation techniques that enhance previously published approaches by employing both the WordNet lexical database and the Wikipedia article's PageRank scores in the disambiguation process. The experimental evaluation shows that the exploitation of these additional semantic information sources leads to more accurate Text Annotation. Copyright 2013 ACM. 0 0
Improving large-scale search engines with semantic annotations Fuentes-Lorenzo D.
Fernandez N.
Fisteus J.A.
Sanchez L.
Click-through data
Collaborative tagging
Ranking algorithm
Semantic annotation
Semantic search
Expert Systems with Applications English Traditional search engines have become the most useful tools to search the World Wide Web. Even though they are good for certain search tasks, they may be less effective for others, such as satisfying ambiguous or synonym queries. In this paper, we propose an algorithm that, with the help of Wikipedia and collaborative semantic annotations, improves the quality of web search engines in the ranking of returned results. Our work is supported by (1) the logs generated after query searching, (2) semantic annotations of queries and (3) semantic annotations of web pages. The algorithm makes use of this information to elaborate an appropriate ranking. To validate our approach we have implemented a system that can apply the algorithm to a particular search engine. Evaluation results show that the number of relevant web resources obtained after executing a query with the algorithm is higher than the one obtained without it. © 2012 Elsevier Ltd. All rights reserved. 0 0
Improving revision in wiki-based writing: Coordination pays off Wichmann A.
Rummel N.
Collaboration scripts
Collaborative authoring
Computers and Education English Wiki-based writing possesses a great deal of educational potential, yet students face difficulties while writing a shared document. Revising a shared document, in particular, seems to be a demanding activity for students. This study investigated whether collaboration scripts can help to improve students' revision activities and overall text quality. We compared scripted (script+) with unscripted (script-) collaboration in a wiki-based writing setting that was adapted for educational purposes. Students from two university courses participated in a one-week collaborative writing activity. Results showed that students in the scripted condition outperformed students in the unscripted condition with respect to revision behavior and text coherence. Furthermore, we found that students' revision behavior correlated positively with text coherence. Results from analyzing students' discussions during the writing activity revealed more frequent coordination with respect to task division and increased communication frequency for students in the scripted condition. Results also indicate that collaboration scripts can foster coordination. Our findings suggest that collaboration scripts are promising means of structuring collaboration during wiki-based writing. © 2012 Elsevier Ltd. All rights reserved. 0 0
Improving semi-supervised text classification by using wikipedia knowledge Zhang Z.
Hong Lin
Li P.
Haofen Wang
Lu D.
Clustering Based Classification
Semi-supervised Text Classification
Lecture Notes in Computer Science English Semi-supervised text classification uses both labeled and unlabeled data to construct classifiers. The key issue is how to utilize the unlabeled data. Clustering based classification method outperforms other semi-supervised text classification algorithms. However, its achievements are still limited because the vector space model representation largely ignores the semantic relationships between words. In this paper, we propose a new approach to address this problem by using Wikipedia knowledge. We enrich document representation with Wikipedia semantic features (concepts and categories), propose a new similarity measure based on the semantic relevance between Wikipedia features, and apply this similarity measure to clustering based classification. Experiment results on several corpora show that our proposed method can effectively improve semi-supervised text classification performance. 0 0
Improving students' summary writing ability through collaboration: A comparison between online wiki group and conventional face-to-face group Wichadee S. Online collaboration
Writing ability
Turkish Online Journal of Educational Technology English Wikis, as one of the Web 2.0 social networking tools, have been increasingly integrated into second language (L2) instruction to promote collaborative writing. The current study examined and compared summary writing abilities between students learning by wiki-based collaboration and students learning by traditional face-to-face collaboration. The experimental research was conducted with students enrolled in the EN 111 course in the first semester of academic year 2011. The instruments employed in the study were summary writing tests, a questionnaire, and products of summary writing. Data were analyzed by using means, standard deviations, percentages, and t-tests. The results indicate that the post-test scores of both groups were significantly higher than the pre-test scores (p<.05). However, no significant difference was found between the two groups' writing mean scores and satisfaction with the learning methods. In addition, the writing products which students in both groups submitted were not different in quality. Although there were minor drawbacks, a lot of advantages were identified, showing students' positive attitudes towards learning through wiki. 0 0
Improving text categorization with semantic knowledge in wikipedia Xiaolong Wang
Jia Y.
Chen K.
Fan H.
Zhou B.
Document representation
Semantic matrix
Text categorization
IEICE Transactions on Information and Systems English Text categorization, especially short text categorization, is a difficult and challenging task since the text data is sparse and multidimensional. In traditional text classification methods, document texts are represented with Bag of Words (BOW) text representation schema, which is based on word co-occurrence and has many limitations. In this paper, we mapped document texts to Wikipedia concepts and used the Wikipedia-concept-based document representation method to take the place of traditional BOW model for text classification. In order to overcome the weakness of ignoring the semantic relationships among terms in document representation model and utilize rich semantic knowledge in Wikipedia, we constructed a semantic matrix to enrich Wikipedia-concept-based document representation. Experimental evaluation on five real datasets of long and short text shows that our approach outperforms the traditional BOW method. 0 0
Improving the transcription of academic lectures for information retrieval Mbogho A.
Marquard S.
Proceedings - 2013 12th International Conference on Machine Learning and Applications, ICMLA 2013 English Recording university lectures through lecture capture systems is increasingly common, generating large amounts of audio and video data. Transcribing recordings greatly enhances their usefulness by making them easy to search. However, the number of recordings accumulates rapidly, rendering manual transcription impractical. Automatic transcription, on the other hand, suffers from low levels of accuracy, partly due to the special language of academic disciplines, which standard language models do not cover. This paper looks into the use of Wikipedia to dynamically adapt language models for scholarly speech. We propose Ranked Word Correct Rate as a new metric better aligned with the goals of improving transcript search ability and specialist word recognition. The study shows that, while overall transcription accuracy may remain low, targeted language modelling can substantially improve search ability, an important goal in its own right. 0 0
… further results