From WikiPapers

This is a list of 7 events held and 923 publications published in 2011.


Name City Country Date
Iberocoop 2011 Buenos Aires Argentina 24 June 2011
RecentChangesCamp 2011 Boston Boston United States 11 March 2011
RecentChangesCamp 2011 Canberra Canberra Australia 28 January 2011
Wiki Conference India 2011 Mumbai India 18 November 2011
Wiki Loves Monuments 2011 Europe September 2011
WikiSym 2011 Mountain View United States 3 October 2011
Wikimania 2011 Haifa Israel 4 August 2011


Title Author(s) Keyword(s) Published in Language Abstract R C
"How should I go from-to-without getting killed?" Motivation and benefits in open collaboration Katherine Panciera
Masli M.
Loren Terveen
Open content
WikiSym 2011 Conference Proceedings - 7th Annual International Symposium on Wikis and Open Collaboration English Many people rely on open collaboration projects to run their computer (Linux), browse the web (Mozilla Firefox), and get information (Wikipedia). While these projects are successful, many such efforts suffer from lack of participation. Understanding what motivates users to participate and the benefits they perceive from their participation can help address this problem. We examined these issues through a survey of contributors and information consumers in the Cyclopath geographic wiki. We analyzed subject responses to identify a number of key motives and perceived benefits. Based on these results, we articulate several general techniques to encourage more and new forms of participation in open collaboration communities. Some of these techniques have the potential to engage information consumers more deeply and productively in the life of open collaboration communities. 0 0
"Lexicon of Love": Genre Description of Popular Music Is Not as Simple as ABC Kulczak D.E.
Lennertz Jetton L.
Encyclopedia of Popular Music
Popular music
Subject headings
Music Reference Services Quarterly English In 2007, the University of Arkansas Libraries received a large donation of 5,295 popular music recordings. This gift nearly doubled the existing CD holdings and greatly altered the collection's emphasis. The present authors sought to enhance retrieval in the local catalog, concentrating particularly on subject access. To this end, they analyzed popular music genre terminology used by three well-known online resources: allmusic.com, Wikipedia, and The Encyclopedia of Popular Music. This article reports their findings about the genres that each assigned to the specific artists in the collection and whether the genres were applied consistently among the three resources. 0 0
0-Step K-means for clustering Wikipedia search results Szymanski J.
Wegrzynowicz K.
INISTA 2011 - 2011 International Symposium on INnovations in Intelligent SysTems and Applications English This article describes an improvement for K-means algorithm and its application in the form of a system that clusters search results retrieved from Wikipedia. The proposed algorithm eliminates K-means disadvantages and allows one to create a cluster hierarchy. The main contributions of this paper include the following: (1) The concept of an improved K-means algorithm and its application for hierarchical clustering. (2) Description of the WikiClusterSearch system that employs the proposed algorithm to organize Wikipedia search results into clusters. 0 0
2nd international workshop on intelligent user interfaces for developing regions: IUI4DR Agarwal S.K.
Rajput N.
Thies B.
Paek T.
Developing countries
User interfaces
International Conference on Intelligent User Interfaces, Proceedings IUI English Information Technology (IT) has had significant impact on the society and has touched all aspects of our lives. Up and until now computers and expensive devices have fueled this growth. It has resulted in several benefits to the society. The challenge now is to take this success of IT to its next level where IT services can be accessed by the users in developing regions. The focus of the workshop in 2011 is to identify the alternative sources of intelligence and use them to ease the interaction process with information technology. We would like to explore the different modalities, their usage by the community, the intelligence that can be derived by the usage, and finally the design implications on the user interface. We would also like to explore ways in which people in developing regions would react to collaborative technologies and/or use collaborative interfaces that require community support to build knowledge bases (example Wikipedia) or to enable effective navigation of content and access to services. 0 0
5th Workshop on Wikis for Software Engineering Ademar Aguiar
Paulo Merson
Software development
Wikis for software engineering
WikiSym 2011 Conference Proceedings - 7th Annual International Symposium on Wikis and Open Collaboration English Using a wiki in software engineering settings dates back to its first usage in 1995. In fact, that was the motivation for Ward Cunningham to create the first wiki. Due to its simplicity, attractiveness and effectiveness for collaborative authoring and knowledge management, wikis are now massively disseminated and used in different domains. This workshop focuses on wikis for the specific domain of software engineering. It aims at bringing together researchers, practitioners, and enthusiasts interested on researching, exploring and learning how wikis can be improved, customized and used to better support software projects. Based on lessons learned and obstacles identified, a research agenda will be defined with key opportunities and challenges. 0 0
959 Nematode Genomes: a semantic wiki for coordinating sequencing projects Sujai Kumar
Philipp H. Schiffer
Mark Blaxter
English Genome sequencing has been democratized by second-generation technologies, and even small labs can sequence metazoan genomes now. In this article, we describe '959 Nematode Genomes'-a community-curated semantic wiki to coordinate the sequencing efforts of individual labs to collectively sequence 959 genomes spanning the phylum Nematoda. The main goal of the wiki is to track sequencing projects that have been proposed, are in progress, or have been completed. Wiki pages for species and strains are linked to pages for people and organizations, using machine- and human-readable metadata that users can query to see the status of their favourite worm. The site is based on the same platform that runs Wikipedia, with semantic extensions that allow the underlying taxonomy and data storage models to be maintained and updated with ease compared with a conventional database-driven web site. The wiki also provides a way to track and share preliminary data if those data are not polished enough to be submitted to the official sequence repositories. In just over a year, this wiki has already fostered new international collaborations and attracted newcomers to the enthusiastic community of nematode genomicists. www.nematodegenomes.org. 0 0
A Characterization of Wikipedia Content Based on Motifs in the Edit Graph Guangyu Wu
Martin Harrigan
Pádraig Cunningham
SMUC '11: Proceedings of the 3rd international workshop on Search and mining user-generated contents English Good Wikipedia articles are authoritative sources due to the collaboration of a number of knowledgeable contributors. This is the many eyes idea. The edit network associated with a Wikipedia article can tell us something about its quality or authoritativeness. In this paper we explore the hypothesis that the characteristics of this edit network are predictive of the quality of the corresponding article's content. We characterize the edit network using a profile of network motifs and we show that this network motif profile is predictive of the Wikipedia quality classes assigned to articles by Wikipedia editors. We further show that the network motif profile can identify outlier articles particularly in the 'Featured Article' class, the highest Wikipedia quality class. 8 0
A DSL for corporate wiki initialization Lecture Notes in Computer Science English 0 0
A Distributed Wiki System Based on Peer-to-Peer File Sharing Principles Alexander Craig
Alan Davoust
Babak Esfandiari
Graph queries
WI-IAT English 0 0
A Framework for Adopting Collaboration 2.0 Tools for Virtual Group Decision Making Turban E.
Liang T.-P.
Wu S.P.J.
Collaborative decision making
Collective intelligence
Discussion forums
Enterprise 2.0
Group support systems
Social network
Social software
Web 2.0
Virtual teams
Group Decision and Negotiation English Decision making in virtual teams is gaining momentum due to globalization, mobility of employees, and the need for collective and rapid decision making by members who are in different locations. These factors resulted in a proliferation of virtual team software support tools for decision making, the latest of which is social software (also known as collaboration 2.0), which includes tools such as wikis, blogs, microblogs, discussion forums, and social networking platforms. This paper describes the potential use of collaboration 2.0 software for improving the process and the specific tasks in virtual group decision making. The paper proposes a framework for exploring the fitness between social software and the major activities in the group decision making process and how such tools can be successfully adopted. Specifically, we use a fit-viability model to help assessing whether social software fit a decision task and what organizational factors are important for such tools to be effective. Representative research issues related to the use of such tools are also presented. © 2010 Springer Science+Business Media B.V. 0 0
A RESTful technique for collaborative learning content transclusion by Wiki-style mashups Tosic M.
Manic M.
End user programming
Proceedings - 2011 5th IEEE International Conference on E-Learning in Industrial Electronics, ICELIE 2011 English In this paper we propose a simple pragmatic technique, called fladget, for enabling end-users to mashup multimedia content within Wiki pages of their community peers. Since the fladget considers Wiki as a content as well as mashup repository service, Wiki RESTful API is proposed. The fladget extends functionality of existing plugin mechanism, so it can use rich-client technology for interaction with distributed multimedia content, but in a pragmatic Wiki-like manner. The presented concept is illustrated by a hypothetical Linked Active Learning Community example demonstrating how the presented mechanism can be used at the community interaction level. 0 0
A Research for the Centrality of Article Edit Collective in Wikipedia Dongjie Zhao
Haitao Yang
Jian Jiang
Deyi Li
Haisu Zhang
Article edit interaction network
Networked data mining
Collective intelligence
ICM English 0 0
A Study of using collaborative mode to construct researcher knowledge Shieh J.-C.
Wu C.-T.
Knowledge management
Knowledge sharing
Subject map
Journal of Educational Media and Library Science Chinese; English When researchers take their first step into an academic field to explore knowledge and do research, the most difficult part is that they do not know what domain knowledge exists. Without definite targets and keywords, it is very difficult to find the required information correctly through search engines in limited time. To resolve this problem, this research proposes the concept and mechanism of a collaborative mode that lets researchers jointly create domain knowledge and the corresponding knowledge structure. The purpose is to preserve their research knowledge and to help researchers query and browse the domain knowledge structure and share their research experiences so that they can complete their own work smoothly. This research takes the domain of information architecture as a case study for constructing a knowledge structure. By sharing knowledge on the Wiki collaborative platform, researchers can share what they are reading, write subject knowledge with others, quote reference correlations, etc. Additionally, the knowledge contents and marked concept keywords that researchers share can be reorganized and analyzed, and may then become the elements of knowledge construction and classification. After defining semantic relations between keywords and describing linkages between knowledge concepts, we apply subject map technology to construct the knowledge structure of the information architecture domain. The resulting structure will help researchers accomplish their research work completely and conveniently. 0 0
A Technological Reinvention of the Textbook: A Wikibooks Project Patrick M. O’Shea
James C. Onderdonk
Douglas Allen
Dwight W. Allen
Journal of Digital Learning in Teacher Education English Education traditionally has been defined as a one-way relationship between teacher and learner. However, new technologies are dramatically changing that relationship in a multitude of ways. In this article, the authors describe some of these changes and explore one example of the intersection between technology and pedagogy, describing a college course in which students compose the course text using the wiki platform. The process described proceeds from the premise that the needs and capacity of learners in the information age have been transformed and discusses one way that using an appropriate technology may address them. For this wikibook, the creators of the content become the prime users of the content as well. The authors discuss both the philosophical underpinnings and practical implications of this approach. Evaluation of the project suggests that the methodology produces an active, credible learning process. This study explores the advantages and disadvantages of this wiki process to provide context concerning the efficacy and utility of employing particular types of Web 2.0 tools. The course development rationale points to its potential for radically changing how students and teachers interact with the phenomenon of ubiquitous learning. 2 0
A Wikipedia Literature Review Owen S. Martin English This paper was originally designed as a literature review for a doctoral dissertation focusing on Wikipedia. This exposition gives the structure of Wikipedia and the latest trends in Wikipedia research. 0 2
A bounded confidence approach to understanding user participation in peer production systems Ciampaglia G.L. Lecture Notes in Computer Science English Commons-based peer production does seem to rest upon a paradox. Although users produce all contents, at the same time participation is commonly on a voluntary basis, and largely incentivized by achievement of project's goals. This means that users have to coordinate their actions and goals, in order to keep themselves from leaving. While this situation is easily explainable for small groups of highly committed, like-minded individuals, little is known about large-scale, heterogeneous projects, such as Wikipedia. In this contribution we present a model of peer production in a large online community. The model features a dynamic population of bounded confidence users, and an endogenous process of user departure. Using global sensitivity analysis, we identify the most important parameters affecting the lifespan of user participation. We find that the model presents two distinct regimes, and that the shift between them is governed by the bounded confidence parameter. For low values of this parameter, users depart almost immediately. For high values, however, the model produces a bimodal distribution of user lifespan. These results suggest that user participation to online communities could be explained in terms of group consensus, and provide a novel connection between models of opinion dynamics and commons-based peer production. 0 0
A capstone wiki knowledge base: A case study of an online tool designed to promote life-long learning through engineering literature research Issues in Science and Technology Librarianship English 0 0
A case study analysis of a constructionist knowledge building community with activity theory Ang C.S.
Panayiotis Zaphiris
Wilson S.
Activity theory
Game community
Knowledge building
Behaviour and Information Technology English This article investigates how activity theory can help research a constructionist community. We present a constructionist activity model called CONstructionism Through ACtivity Theory (CONTACT) model and explain how it can be used to analyse the constructionist activity in knowledge building communities. We then illustrate the model through its application to analysing the Wiki-supported community associated with a computer game. Our analysis focuses mainly on two perspectives: individual and collective actions, as well as individual and collective mediations. Experiences and challenges from the analysis are reported to demonstrate how CONTACT is helpful in analysing such communities. 0 0
A category-driven approach to deriving domain specific subset of Wikipedia Korshunov A.
Denis Turdakov
Jeong J.
Lee M.
Moon C.
CEUR Workshop Proceedings English While many researchers attempt to build up different kinds of ontologies by means of Wikipedia, the possibility of deriving a high-quality domain-specific subset of Wikipedia using its own category structure still remains undervalued. We prove the necessity of such processing in this paper and also propose an appropriate technique. As a result, the size of the knowledge base for our text processing framework has been reduced by more than an order of magnitude, while the precision of disambiguating musical metadata (ID3 tags) has decreased from 98% to 64%. 0 0
A cloud-based semantic wiki for user training in healthcare process management Studies in Health Technology and Informatics English 0 0
A co-writing development approach to wikis: Pedagogical issues and implications Hadjerrouit S. Co-writing development approach
Socio-constructivist epistemology
World Academy of Science, Engineering and Technology English Wikis are promoted as collaborative writing tools that allow students to transform a text into a collective document by information sharing and group reflection. However, despite the promising collaborative capabilities of wikis, their pedagogical value regarding collaborative writing is still questionable. Wiki alone cannot make collaborative writing happen, and students do not automatically become more active, participate, and collaborate with others when they use wikis. To foster collaborative writing and active involvement in wiki development there is a need for a systematic approach to wikis. The main goal of this paper is to propose and evaluate a co-writing approach to the development of wikis, along with the study of three wiki applications to report on pedagogical implications of collaborative writing in higher education. 0 0
A collaborative, wiki-based organic chemistry project incorporating free chemistry software on the Web Journal of Chemical Education English 0 0
A comparative assessment of answer quality on four question answering sites Fichman P. Community question answering
Information quality
Q&A sites
Social Q&A
Social reference
Journal of Information Science English Question answering (Q&A) sites, where communities of volunteers answer questions, may provide faster, cheaper, and better services than traditional institutions. However, like other Web 2.0 platforms, user-created content raises concerns about information quality. At the same time, Q&A sites may provide answers of different quality because they have different communities and technological platforms. This paper compares answer quality on four Q&A sites: Askville, WikiAnswers, Wikipedia Reference Desk, and Yahoo! Answers. Findings indicate that: (1) similar collaborative processes on these sites result in a wide range of outcomes, and significant differences in answer accuracy, completeness, and verifiability were evident; (2) answer multiplication does not always result in better information; it yields more complete and verifiable answers but does not result in higher accuracy levels; and (3) a Q&A site's popularity does not correlate with its answer quality, on all three measures. 0 0
A comparison of four association engines in divergent thinking support systems on wikipedia Kobkrit Viriyayudhakorn
Susumu Kunifuji
Mizuhito Ogawa
KICSS English 0 0
A empirical study on application of Wiki-based collaborative lesson-preparing Yingjie Ren
Chaohua Gong
Collaborative lesson-preparing
Knowledge management
Proceedings - 2011 International Conference of Information Technology, Computer Engineering and Management Sciences, ICM 2011 English Lesson-preparing is an important stage in the field of teaching activity. The aim of this paper was to explore the use of Eduwiki as a new effective collaborative lesson-preparing platform to support teachers' collaboration and teaching. Furthermore, to verify and explore how to integrate Eduwiki into teachers' daily lesson-preparing activities, a single-group post-test and interview were used in the experiments. The study showed that Eduwiki was effective in motivating peer-supported collaborative lesson-preparing activity, as well as for teachers' mutual development. School leaders' support was the most important motivator for implementing the experiments in Experiment 1, and interest in the collaboration support environment was the most important motivator for implementing the experiments in Experiment 2. The external condition for teachers participating in collaborative lesson-preparing was ease of operation. It showed that experienced teachers passed their experiences on to novices using Eduwiki, and helped the novices achieve high performance through collaborative lesson-preparing. 0 0
A framework for integrating DBpedia in a multi-modality ontology news image retrieval system Khalid Y.I.A.
Noah S.A.
Image Retrieval
Multi-Modality Ontology and Sport News
Text Retrieval
2011 International Conference on Semantic Technology and Information Retrieval, STAIR 2011 English Knowledge sharing communities like Wikipedia and automated extraction like DBpedia enable a large construction of machine processing knowledge bases with relational fact of entities. These options give a great opportunity for researcher to use it as a domain concept between low-level features and high-level concepts for image retrieval. The collection of images attached to entities, such as on-line news articles with images, are abundant on the Internet. Still, it is difficult to retrieve accurate information on these entities. Using entity names in a search engine yields large lists, but often results in imprecise and unsatisfactory outcomes. Our goal is to populate a knowledge base with on-line image news resources in the BBC sport domain. This system will yield high precision, a high recall and include diverse sports photos for specific entities. A multi-modality ontology retrieval system, with relational facts about entities for generating expanded queries, will be used to retrieve results. DBpedia will be used as a domain sport ontology description, and will be integrated with a textual description and a visual description, both generated by hand. To overcome semantic interoperability between ontologies, automated ontology alignment is used. In addition, visual similarity measures based on MPEG7 descriptions and SIFT features, are used for higher diversity in the final rankings. 0 0
A framework for personalized and collaborative clustering of search results Anastasiu D.C.
Gao B.J.
Buttler D.
Document clustering
Information retrieval
Mass collaboration
Personalized clustering
Search result clustering
Search result organization
Social tagging
International Conference on Information and Knowledge Management, Proceedings English How to organize and present search results plays a critical role in the utility of search engines. Due to the unprecedented scale of the Web and diversity of search results, the common strategy of ranked lists has become increasingly inadequate, and clustering has been considered as a promising alternative. Clustering divides a long list of disparate search results into a few topic-coherent clusters, allowing the user to quickly locate relevant results by topic navigation. While many clustering algorithms have been proposed that innovate on the automatic clustering procedure, we introduce ClusteringWiki, the first prototype and framework for personalized clustering that allows direct user editing of the clustering results. Through a Wiki interface, the user can edit and annotate the membership, structure and labels of clusters for a personalized presentation. In addition, the edits and annotations can be shared among users as a mass-collaborative way of improving search result organization and search engine utility. 0 0
A generalized method for word sense disambiguation based on wikipedia Chenliang Li
Aixin Sun
Anwitaman Datta
Context pruning
Word sense disambiguation
ECIR English 0 0
A gripe suína na Wikipédia em português: análise da dinâmica de edições e qualificação do conteúdo de dois artigos Bernardo Esteves Gonçalves da Costa
Carlos Frederico de Brito d’Andréa
Swine flu
Intexto Portuguese This article intends to analyze and compare the collaborative edition of two articles about pandemic influenza A (H1N1) — or swine flu — in the Portuguese-language edition of Wikipedia. We have monitored the edits made in those articles during one month after they were created on April 25, 2009. We have characterized the edition of the texts and the dynamics of interactions among the editors. Additionally, we have analyzed their contents according to three criteria: authority, verifiability and timeliness. 11 0
A lexicon for processing archaic language: the case of XIXth century Slovene Tomaž Erjavec
Christoph Ringlstetter
Maja Žorga
Annette Gotscharek
WoLeR 2011: International Workshop on Lexical Resources English The paper presents a lexicon to support computational processing of historical Slovene texts. Historical Slovene texts are being increasingly digitised and made available on the internet but are still underutilised as no language technology support is offered for their processing. Appropriate tools and resources would enable full-text searching with modern-day lemmas, modernisation of archaic language to make it more accessible to today's readers, and automatic OCR correction. We discuss the lexicon needed to support tokenisation, modernisation, lemmatisation and part-of-speech tagging of historical texts. The process of lexicon acquisition relies on a proof-read corpus, a large lexicon of contemporary Slovene, and tools to map historical forms to their contemporary equivalents via a set of rewrite rules, and to provide an editing environment for lexicon construction. The lexicon, currently work in progress, will be made publicly available; it should help not only in making digital libraries more accessible but also provide a quantitative basis for linguistic explorations of historical Slovene texts and a prototype electronic dictionary of archaic Slovene. 1 0
A lightweight approach to enterprise architecture modeling and documentation Buckl S.
Florian Matthes
Christian Neubert
Schweda C.M.
Collaborative modeling
Enterprise architecture management
Information model
Wisdom of the crowds
Lecture Notes in Business Information Processing English Not quite a few enterprise architecture (EA) management endeavors start with the design of an information model covering the EA-related interests of the various stakeholders. In the design of this model, the enterprise architects resort to prominent frameworks, but often create what would be called an "ivory tower" model. This model at best case misses if not ignores the knowledge of the people that are responsible for business processes, applications, services etc. In this paper, we describe how the wisdom of the crowds can be used to develop information models. Making use of Web 2.0 techniques, wikis, and an open templating mechanism, our approach ties together the EA relevant information in a way, which is accessible to both humans and applications. We demonstrate how the ivory tower syndrome can be cured, typical pitfalls can be avoided, and employees can be empowered to contribute their expert knowledge to EA modeling and documentation. 0 0
A link-based visual search engine for Wikipedia David N. Milne
Ian H. Witten
Exploratory search
Information retrieval
Information visualization
Semantic relatedness
JCDL English 0 0
A meta-reflective wiki for collaborative design Li Zhu
Ivan Vaghi
Barbara Rita Barricelli
Hive-Mind Space model
Boundary objects
End-user development
Habitable environment
WikiSym English 0 0
A methodology to discover semantic features from textual resources Vicient C.
Sanchez D.
Moreno A.
Feature discovery
Information extraction
Proceedings - 2011 6th International Workshop on Semantic Media Adaptation and Personalization, SMAP 2011 English Data analysis algorithms focused on processing textual data rely on the extraction of relevant features from text and the appropriate association to their formal semantics. In this paper, a method to assist this task, annotating extracted textual features with concepts from a background ontology, is presented. The method is automatic and unsupervised and it has been designed in a generic way, so it can be applied to textual resources ranging from plain text to semi-structured resources (like Wikipedia articles). The system has been tested with tourist destinations and Wikipedia articles showing promising results. 0 0
A multimethod study of information quality in wiki collaboration Gerald C. Kane Web 2.0
Electronic collaboration
Electronic communities
Information quality
Multimethod studies
Virtual teams
ACM Trans. Manage. Inf. Syst. English 0 0
A named entity mining method based on transfer learning Zhai H.-J.
Guo Y.
Guo J.-F.
Cheng X.-Q.
Named entity mining
One class learning
Transfer learning
Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University Chinese This paper addresses the problem of mining named entities from query logs. A novel scheme was introduced based on transfer learning, which trains classifier for target category by leveraging Wikipedia data source. In this way it can greatly make use of supervised learning and also deal with the large scale labeling problem. The experiment results show the effectiveness of the novel scheme based on transfer learning. 0 0
A new approach for Arabic text classification using Arabic field-association terms Atlam E.-S.
Kazuhiro Morita
Masao Fuketa
Aoe J.-I.
Journal of the American Society for Information Science and Technology English Field-association (FA) terms give us the knowledge to identify document fields using a limited set of discriminating terms. Although many earlier methods tried to extract automatically relevant FA terms to build a comprehensive dictionary, the problem lies in the lack of an effective method to extract automatically relevant FA terms to build a comprehensive dictionary. Moreover, all previous studies are based on FA terms in English and Japanese, and the extension of FA terms to other languages such as Arabic could benefit future research in the field. We present a new method to build a comprehensive Arabic dictionary using part-of-speech, pattern rules, and corpora in Arabic language. Experimental evaluation is carried out for various fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhayah news, yielding an average of 2,825 FA terms (single and compound) per field. From the experimental results, recall and precision are 84% and 79%, respectively. We propose an amended text classification methodology based on field association terms. Our approach is compared with Naive Bayes (NB) and kNN classifiers on 5,959 documents from Wikipedia dumps and Alhayah news. The new approach achieved a precision of 80.65% followed by NB (72.79%) and kNN (36.15%). 0 0
A novel approach to sentence alignment from comparable corpora Li M.-H.
Vitaly Klyuev
Wu S.-H.
Information retrieval
Text mining
Proceedings of the 6th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, IDAACS'2011 English This paper introduces a new technique to select candidate sentences for alignment from bilingual comparable corpora. Tests were done utilizing Wikipedia as a source for bilingual data. Our test languages are English and Chinese. A high quality of sentence alignment is illustrated by a machine translation application. 0 0
A probabilistic XML merging tool Talel Abdessalem
Lamine Ba M.
Pierre Senellart
Probabilistic XML
Tree merge
XML merge
ACM International Conference Proceeding Series English This demonstration paper presents a probabilistic XML data merging tool that represents the outcome of semi-structured document integration as a probabilistic tree. The system is fully automated and integrates methods to evaluate the uncertainty (modeled as probability values) of the result of the merge. It is based on the two-way tree-merge technique and an uncertain data model defined using probabilistic event variables. The resulting probabilistic repository can be queried using a subset of the XPath query language. The demonstration application is based on revisions of the Wikipedia encyclopedia: a Wikipedia article is no longer considered as the latest valid revision but as the merge of all possible revisions, some of which are uncertain. 0 0
A probabilistic approach to semantic collaborative filtering using world knowledge Lee J.-W.
Lee S.-G.
Kim H.-J.
Bayesian belief network
Semantic collaborative filtering
World knowledge
Journal of Information Science English Collaborative filtering, which is a popular approach for developing recommendation systems, exploits the exact match of items that users have accessed. If the users access different items, they are considered as unlike-minded users even though they may actually be semantically like-minded. To solve this problem, we propose a semantic collaborative filtering model that represents the semantics of users' preferences and items with their corresponding concepts. In this work, we extend the Bayesian belief network (BBN)-based model because it provides a clear formalism for representing users' preferences and items with concepts. Because the conventional BBN-based model regards the index terms derived from items as concepts, it does not exploit domain knowledge. We have therefore extended this conventional model to exploit concepts derived from domain knowledge. A practical approach to exploiting domain knowledge is to use world knowledge such as the Open Directory Project web directory or the Wikipedia encyclopaedia. Through experiments, we show that our model outperforms other conventional collaborative filtering models while comparing the recommendation quality when using different world knowledge. 0 0
A quantitative examination of the impact of featured articles in Wikipedia Antonio J. Reinoso
Jesús M. González-Barahona
Rocío Muñoz Mansilla
Israel Herraiz
Featured articles
Usage patterns
Traffic characterization
Quantitative analysis
ICSOFT English This paper presents a quantitative examination of the impact of the presentation of featured articles as quality content on the main page of several Wikipedia editions. Moreover, the paper also presents the analysis performed to determine the number of visits received by the articles promoted to featured status. We have analyzed the visits not only in the month when articles were awarded the promotion or were included on the main page, but also in the previous and following ones. The main aim of this is to assess the attention attracted by the featured content and the different dynamics exhibited by each community of users with respect to the promotion process. The main results of this paper are twofold: it shows how to extract relevant information related to the use of Wikipedia, which is an emerging research topic, and it analyzes whether the featured articles mechanism succeeds in attracting more attention. 3 0
A query expansion technique using the EWC semantic relatedness measure Vitaly Klyuev
Haralambous Y.
Query expansion
Relatedness measure
Search engine
Word net
Informatica (Ljubljana) English This paper analyses the efficiency of the EWC semantic relatedness measure in an ad-hoc retrieval task. This measure combines the Wikipedia-based Explicit Semantic Analysis (ESA) measure, the WordNet path measure and the mixed collocation index. EWC considers encyclopaedic, ontological, and collocational knowledge about terms. This advantage of EWC is a key factor to find precise terms for automatic query expansion. In the experiments, the open source search engine Terrier is utilised as a tool to index and retrieve data. The proposed technique is tested on the NTCIR data collection. The experiments demonstrated superiority of EWC over ESA. 0 0
A repository of real-world examples for students and academics Choubey B. Community developed teaching tools
Internet based teaching aids
Online repository
Teaching of electronic circuits
International Conference on Information Society, i-Society 2011 English This paper reports the development of an online repository of real-world examples related to concepts taught in a typical undergraduate curriculum. Designed as a moderated wiki, the repository allows academics to upload as well as download such examples for use in their teaching. Simultaneously, it provides students with a large database of applications to correlate with their studies. In addition, it also provides an insight into the university curriculum for parents as well as the general public. 0 0
A resource-based method for named entity extraction and classification Gamallo P.
Garcia M.
Lecture Notes in Computer Science English We propose a resource-based Named Entity Classification (NEC) system, which combines named entity extraction with simple language-independent heuristics. Large lists (gazetteers) of named entities are automatically extracted making use of semi-structured information from Wikipedia, namely infoboxes and category trees. Language-independent heuristics are used to disambiguate and classify entities that have already been identified (or recognized) in text. We compare the performance of our resource-based system with that of a supervised NEC module implemented for the FreeLing suite, which was the winning system in the CoNLL-2002 competition. Experiments were performed over Portuguese text corpora taking into account several domains and genres. 0 0
A review of the millipede genus Sinocallipus Zhang, 1993 (Diplopoda, Callipodida, Sinocallipodidae), with notes on gonopods monotony vs. peripheral diversity in millipedes Pavel Stoev
Enghoff H.
Functional anatomy
Gonopod monotony
Identification key
Pensoft Wiki Convertor
Southeast Asia
ZooKeys English The millipede genus Sinocallipus is reviewed, with four new cave-dwelling species, S. catba, S. deharvengi, S. jaegeri and S. steineri, being described from caves in Laos and Vietnam. With the new records the number of species in the genus reaches six and the genus range is extended to Central Vietnam and North and Central Laos. Both S. jaegeri from Khammouan Province in Laos and S. simplipodicus Zhang, 1993 from Yunnan, China, show a high level of reduction of eyes, which has not been recorded in other Callipodida. Peripheral characters such as the relative lengths of antennomeres, the number of ocelli, the number of pleurotergites or even the shape of paraprocts and the coloration seem to provide more information for the distinction of the species than do the relatively uniform gonopods. The differences in gonopods mainly concern the shape and length of cannula, the length and shape of coxal processes g and k, and the number of the acicular projections of the femoroid. An explanation is offered for the function of the trochanteral lobe of the 9th leg-pair. It provides mechanical support for the cannula and seems to assist sperm charge and insemination during copulation. An identification key to the species in the genus is produced to accommodate the new species. The new species descriptions were automatically exported at the time of publication to a wiki (www.species-id.net) through a specially designed software tool, the Pensoft Wiki Convertor (PWC), implemented here for the first time together with a newly proposed citation mechanism for simultaneous journal/wiki publications. © P. Stoev, H. Enghoff. 0 0
A scourge to the pillar of neutrality: A WikiProject fighting systemic bias Livingstone R.M. Bias
WikiSym 2011 Conference Proceedings - 7th Annual International Symposium on Wikis and Open Collaboration English WikiProject Countering Systemic Bias consists of a small group of English-language Wikipedia editors attempting to counterbalance Western-leaning content on the site. A population survey of members of this WikiProject is currently underway and will be followed by online interviews with select editors. This poster will present preliminary findings from the survey and interviews in order to understand how this group perceives bias on Wikipedia and how they work together to fight it. 0 0
A self organizing document map algorithm for large scale hyperlinked data inspired by neuronal migration Kotaro Nakayama
Yutaka Matsuo
Link analysis
Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 English Web document clustering is one of the research topics that is being pursued continuously due to the large variety of applications. Since Web documents usually have variety and diversity in terms of domains, content and quality, one of the technical difficulties is to find a reasonable number and size of clusters. In this research, we pay attention to SOMs (Self Organizing Maps) because of their capability of visualized clustering that helps users to investigate characteristics of data in detail. The SOM is widely known as a "scalable" algorithm because of its capability to handle large numbers of records. However, it is effective only when the vectors are small and dense. Although several research efforts on making the SOM scalable have been conducted, technical issues on scalability and performance for sparse high-dimensional data such as hyperlinked documents still remain. In this paper, we introduce MIGSOM, an SOM algorithm inspired by a recent discovery on neuronal migration. The two major advantages of MIGSOM are its scalability for sparse high-dimensional data and its clustering visualization functionality. In this paper, we describe the algorithm and implementation, and show the practicality of the algorithm by applying MIGSOM to a huge scale real data set: Wikipedia's hyperlink data. 0 0
A semantic wiki based on spatial hypertext Journal of Universal Computer Science English 0 0
A semantic wiki for user training in ePrescribing processes D. Papakonstantinou
F. Malamateniou
G. Vassilacopoulos
Cloud computing
Semantic wiki
User training
PETRA English 0 0
A simultaneous journal / wiki publication and dissemination of a new species description: Neobidessodes darwiniensis sp. n. from northern Australia (Coleoptera, Dytiscidae, Bidessini) Lars Hendrich
Michael Balke
Species ID
Online species pages
Sequence data
DNA barcoding
Molecular biodiversity assessment
ZooKeys English Here, we describe a new Australian species in journal format and simultaneously open the description in a wiki format on the www.species-id.net. The wiki format will always link to the fixed original journal description of the taxon, however it permits future edits and additions to species' taxonomy and biology. The diving beetle Neobidessodes darwiniensis sp. n. (Coleoptera: Dytiscidae, Bidessini) is described based on a single female, collected in a rest pool of the Harriet Creek in the Darwin Area, Northern Territory. Within Neobidessodes the new species is well characterized by its elongate oval body with rounded sides, short and stout segments of antennae, length of body and dorsal surface coloration. In addition to external morphology, we used mitochondrial cox1 sequence data to support generic assignment and to delineate the new species from other Australian Bidessini including all other known Neobidessodes. Illustrations based on digital images are provided here and as online resources. A modified key is provided. Altogether ten species of the genus are now known worldwide, nine from Australia and one from New Guinea. 0 1
A slang open book: An exploration of Wiki for ESL learners Jing-Woei Li
Ma W.
Qing C.
Secret M.
Social Constructivism
Ubiquitous Learning English This paper describes the Online Slang Open Book project (www.wikislang.org) we developed based on MediaWiki (www.MediaWiki.org). The project was designed to help ESL learners study English slang in an authentic context and appreciate the cultural differences between English and their mother languages. This study's results suggest that a wiki is an effective tool to improve the ESL learning process and that learners tend to collaborate well in wikis. The implications of the project and future work are discussed at the end. 0 0
A statistical approach for automatic keyphrase extraction Abulaish M.
Dey L.
Information extraction
Keyphrase extraction
Natural Language Processing
Text mining
Proceedings of the 5th Indian International Conference on Artificial Intelligence, IICAI 2011 English Due to the availability of voluminous textual data either on the World Wide Web or in textual databases, automatic keyphrase extraction has gained increasing popularity in the recent past as a way to summarize and characterize text documents. Consequently, a number of machine learning techniques, mostly supervised, have been proposed to mine keyphrases in an automatic way. However, the non-availability of annotated corpora for training such systems is the main hindrance to their success. In this paper, we propose the design of an automatic keyphrase extraction system which uses NLP and a statistical approach to mine keyphrases from unstructured text documents. The efficacy of the proposed system is established over texts crawled from the Wikipedia server. On evaluation we found that the proposed method outperforms KEA, which uses a naïve Bayes classification technique for keyphrase extraction. 0 0
A study of category expansion for related entity finding Jinghua Zhang
Qu Y.
Entity ranking
Related entity finding
Type filtering
Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011 English Entities are important information carriers in Web pages. Searchers often want a ranked list of relevant entities directly rather than a list of documents, so research on related entity finding (REF) is very meaningful. In this paper we investigate the most important task of REF: entity ranking. We address the issue of wrong entity types in entity ranking, where some retrieved entities do not belong to the target entity type, by making use of category expansion. We use Wikipedia and DBpedia as data sources in the experiment. The experiment shows that category expansion based on the original type achieves better recall and precision. 0 0
A support tool for deriving domain taxonomies from Wikipedia Kotlerman L.
Avital Z.
Ido Dagan
Lotan A.
Weintraub O.
International Conference Recent Advances in Natural Language Processing, RANLP English Organizing data into category hierarchies (taxonomies) is useful for content discovery, search, exploration and analysis. In industrial settings targeted taxonomies for specific domains are mostly created manually, typically by domain experts, which is time consuming and requires a high level of expertise. This paper presents an algorithm and an implemented interactive system for automatically generating target-domain taxonomies based on the Wikipedia Category Hierarchy. The system also enables human post-editing, facilitated by intelligent assistance. 0 0
A survey on web archiving initiatives Gomes D.
Miranda J.
Costa M.
Lecture Notes in Computer Science English Web archiving has been gaining interest and recognized importance for modern societies around the world. However, for web archivists it is frequently difficult to demonstrate this fact, for instance, to funders. This study provides an updated and global overview of web archiving. The obtained results showed that the number of web archiving initiatives grew significantly after 2003 and that they are concentrated in developed countries. We statistically analyzed metrics such as the volume of archived data, archive file formats or number of people engaged. Web archives all together must process more data than any web search engine. Considering the complexity and large amounts of data involved in web archiving, the results showed that the assigned resources are scarce. A Wikipedia page was created to complement the presented work and be collaboratively kept up-to-date by the community. 3 0
A web 2.0 approach for organizing search results using Wikipedia Darvish Morshedi Hosseini M.
Shakery A.
Moshiri B.
Search result Organization
Lecture Notes in Computer Science English Most current search engines return a ranked list of results in response to the user's query. This simple approach may require the user to go through a long list of results to find the documents related to his information need. A common alternative is to cluster the search results and allow the user to browse the clusters, but this also imposes two challenges: 'how to define the clusters' and 'how to label the clusters in an informative way'. In this study, we propose an approach which uses Wikipedia as the source of information to organize the search results and addresses these two challenges. In response to a query, our method extracts a hierarchy of categories from Wikipedia pages and trains classifiers using web pages related to these categories. The search results are organized in the extracted hierarchy using the learned classifiers. Experiment results confirm the effectiveness of the proposed approach. 0 0
A wikipedia-based framework for collaborative semantic annotation Fernandez N.
Fisteus J.A.
Fuentes D.
Sanchez L.
Luque V.
Semantic annotation
Semantic web
International Journal on Artificial Intelligence Tools English The semantic web aims at automating web data processing tasks that nowadays only humans are able to do. To make this vision a reality, the information on web resources should be described in a computer-meaningful way, in a process known as semantic annotation. In this paper, a manual, collaborative semantic annotation framework is described. It is designed to take advantage of the benefits of manual annotation systems (like the possibility of annotating formats difficult to annotate in an automatic manner) addressing at the same time some of their limitations (reduce the burden for non-expert annotators). The framework is inspired by two principles: use Wikipedia as a facade for a formal ontology and integrate the semantic annotation task with common user actions like web search. The tools in the framework have been implemented, and empirical results obtained in experiences carried out with these tools are reported. 0 0
A-R-E: The author-review-execute environment Muller W.
Rojas I.
Eberhart A.
Peter Haase
Schmidt M.
Extended links
Linked data
Semantic wiki
Procedia Computer Science English The Author-Review-Execute (A-R-E) is an innovative concept to offer under a single principle and platform an environment to support the life cycle of an (executable) paper; namely the authoring of the paper, its submission, the reviewing process, the author's revisions, its publication, and finally the study (reading/interaction) of the paper as well as extensions (follow ups) of the paper. It combines Semantic Wiki technology, a resolver that solves links both between parts of documents to executable code or to data, an anonymizing component to support the authoring and reviewing tasks, and web services providing link perennity. 0 0
AVBOT: Detecting and fixing vandalism in Wikipedia Emilio J. Rodríguez-Posada AVBOT
Libre software
UPGRADE English Wikipedia is a project which aims to build a free encyclopaedia to spread the sum of all knowledge to every single human being. Today it can be said to be on the road to achieving that goal, having reached the 15 million articles milestone in 270 languages. Furthermore, if we include its sister projects (Wiktionary, Wikibooks, Wikisource,...), it has received more than 1 billion edits in 10 years and now has more than 10 billion page views every month. Compiling an encyclopaedia in a collaborative way has been possible thanks to MediaWiki software. It allows everybody to modify the content available on the site easily. But a problem emerges regarding this model: not all edits are made in good faith. AVBOT is a bot for protecting the Spanish Wikipedia against some undesired modifications known as vandalism. Although AVBOT was developed for Wikipedia, it can be used on any MediaWiki website. It is developed in Python and is free software. In the 2 years it has been in operation it has reverted more than 200,000 vandalism edits, while several clones have been executed, adding thousands of reverts to this count. 0 0
Ability climates in europe as socially represented notability Persson R.S. Ability climate
Cultural clusters
Cultural patterns
Gifted education
Social representations
High Ability Studies English The objective of this research was to study whether ability climate was a useful construct in exploring the possible pattern by which abilities were valued in the countries and cultures of Europe. Based on Moscovici's theory of social representations, lists of famous and notable individuals published by the Wikipedia Encyclopedia were analyzed. In all, lists from 29 European countries representing 20,516 individuals perceived to be notable were subjected to a content analysis and a subsequent frequency analysis. The reliability of data derived from a database such as Wikipedia will be discussed at length. Results suggested that, based on dominant ability clusters, there appeared to exist three types of European ability climates: a uniform ability climate, a divergent ability climate, and a diverse ability climate, each of which was characterized by clusters of abilities that seemed to be particularly valued in a given European country. The possible implications of the result will be discussed. 0 0
Accessing dynamic web page in users language Sharma M.K.
Saha P.K.
Sarcar S.
Ghosh S.
Samanta D.
Human Computer interaction
Information and communication technology
Information retrieval
Ubiquitous computing
TechSym 2011 - Proceedings of the 2011 IEEE Students' Technology Symposium English In recent years, there has been a rapid advancement in Information and Communication Technology (ICT). However, the explosive growth of ICT and its many applications in education, health, agriculture etc. are confined to a limited number of privileged people who have both language and digital literacy. At present the repositories on the Internet are mainly in English; as a consequence, users unfamiliar with English are not able to benefit from the Internet. Although many enterprises like Google have addressed this problem by providing translation engines, these have their own limitations. One major limitation is that translation engines fail to translate the dynamic content of web pages that is written in English in the web server database. We address the problem in this work and propose a user-friendly interface mechanism through which a user can interact with any web service on the Internet. As two case studies, we illustrate access to the Indian Railway Passenger Reservation System and interaction with the English Wikipedia website, demonstrating the efficacy of the proposed mechanism. 0 0
Accessing information sources using ontologies Sun D.
Hanmin Jung
Hwang C.
Hyeoncheol Kim
Park S.
International Journal of Computers, Communications and Control English In this paper, we present a system that helps users access various types of information sources using ontologies. An ontology consists of a set of concepts and their relationships in a domain of interests. The system analyzes an ontology provided by a user so that the user can search and browse Wikipedia [1], DBpedia [4], PubMed [5], and the Web by utilizing the information in the ontology. In particular, terms defined in the ontology are mapped to Wikipedia pages and the navigation history of a user is saved so that it can serve as a personalized ontology. In addition, users can create and edit ontologies using the proposed system. We show that the proposed system can be used in an educational environment. 0 0
Accuracy and completeness of drug information in Wikipedia: an assessment Natalie Kupferberg
Bridget McCrate Protus
Journal of the Medical Library Association English 8 2
Acquiring ethical communicative tools for an online ethics training Serra M.
Baneres D.
Santamaria E.
Basart J.M.
Applied ethics
Communicative tools
Online environment
Technological resources
SIGDOC'11 - Proceedings of the 29th ACM International Conference on Design of Communication English Nowadays, one of the most dynamic areas in applied ethics is professional ethics (e.g. in the areas of medicine, business and engineering). In this paper, we will focus on the online teaching of professional ethics to undergraduate engineering students, mainly in the specialty of Information and Communication Technologies (ICT). More specifically, the pedagogical structure for the ethical training of future engineers will be designed, within an online context, based on a set of ethical communicative tools. Besides, our proposal will analyze which technological resources are suitable to develop the appropriate ethical communicative competences required by the engineering students, when confronting moral conflicts in their daily professional exercise. 0 0
Acquiring the gist of Social Network Service Threads via comparison with wikipedia Akiyo Nadamoto
Eiji Aramaki
Takeshi Abekawa
Yohei Murakami
Coverage Degree
Link Graph
SNS Thread
Social Network Services (SNSs)
International Journal of Business Data Communications and Networking English Internet-based social network services (SNSs) have grown increasingly popular and are producing a great amount of content. Multiple users freely post their comments in SNS threads, and extracting the gist of these comments can be difficult due to their complicated dialog. In this paper, the authors propose a system that explores this concept of the gist of an SNS thread by comparing it with Wikipedia. The granularity of information in an SNS thread differs from that in Wikipedia articles, which implies that the information in a thread may be related to different articles on Wikipedia. The authors extract target articles on Wikipedia based on its link graph. When an SNS thread is compared with Wikipedia, the focus is on the table of contents (TOC) of the relevant Wikipedia articles. The system uses a proposed coverage degree to compare the comments in a thread with the information in the TOC. If the coverage degree is higher, the Wikipedia paragraph becomes the gist of the thread. 0 0
AdaptableGIMP: Designing a socially-adaptable interface Ben L.
Krynicki F.
Terry M.
Bunt A.
Lount M.
Adaptable interfaces
Search-based interfaces
UIST'11 Adjunct - Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology English We introduce the concept of a socially-adaptable interface, an interface that provides instant access to task-specific interface customizations created, edited, and documented by the application's user community. We demonstrate this concept in AdaptableGIMP, a modified version of the GIMP image editor that we have developed. 0 0
Adding semantic extension to wikis for enhancing cultural heritage applications Leclercq E.
Savonnet M.
Cultural Heritage application
Ontology Engineering
Semantic wiki
Communications in Computer and Information Science English Wikis are appropriate systems for community-authored content. In the past few years, they have proven particularly suitable for collaborative work in cultural heritage. In this paper, we highlight how wikis can be relevant solutions for building cooperative applications in domains characterized by a rapid evolution of knowledge. We point out the capabilities of semantic extensions to provide better quality of content, to improve searching, to support complex queries and finally to cater to different types of users. We describe the CARE project and explain the conceptual modeling approach. We detail the architecture of WikiBridge, a semantic wiki which allows simple, n-ary and recursive annotations as well as consistency checking. A specific section is dedicated to the ontology design, which is the compulsory foundational knowledge for the application. 0 0
Adessowiki - Collaborative platform for writing executable papers Machado R.C.
Rittner L.
Lotufo R.A.
Collaborative system
Executable paper
Web 2.0
Procedia Computer Science English Adessowiki is a collaborative platform for scientific programming and document writing. It is a wiki environment that carries simultaneously documentation, programming code and results of its execution without any software configuration such as compilers, libraries and special tools at the client side. This combination of a collaborative wiki environment, central server and execution of code at rendering time enables the use of Adessowiki as an executable paper platform, since it fulfills the need to disseminate, validate, and archive research data. 0 0
Adoption and use of corporate wikis in German small and medium-sized enterprises Stieglitz S.
Dang-Xuan L.
Enterprise Wiki
Knowledge management
Wiki Adoption
Wiki Diffusion
17th Americas Conference on Information Systems 2011, AMCIS 2011 English In recent years, corporate wikis have been increasingly adopted in enterprises. However, little research is devoted to the adoption and use of wikis in small and medium-sized enterprises (SMEs), which are of high social and economic importance. The purpose of this paper is to examine the usage of enterprise wikis in SMEs and potential concerns that may hinder the diffusion of wikis in SMEs as well as other reasons for their reluctance to adopt wikis by conducting a survey of German SMEs. Findings indicate that a majority of SMEs do not intend to adopt wikis in their organization for various reasons. However, firms that have already introduced wikis seem to clearly benefit despite a number of concerns that might have a negative impact on the use and diffusion of wikis. Based on our results, we derive several implications for SMEs, in particular with respect to how to overcome these obstacles to adoption and diffusion of wikis. 0 0
Aging-kb: A knowledge base for the study of the aging process Becker K.G.
Holmes K.A.
YanChun Zhang
Aging database
Knowledge base
Mechanisms of Ageing and Development English As the science of the aging process moves forward, a recurring challenge is the integration of multiple types of data and information with classical aging theory while disseminating that information to the scientific community. Here we present AGING-kb, a public knowledge base with the goal of conceptualizing and presenting fundamental aspects of the study of the aging process. Aging-kb has two interconnected parts, the Aging-kb tree and the Aging Wiki. The Aging-kb tree is a simple intuitive dynamic tree hierarchy of terms describing the field of aging from the general to the specific. This enables the user to see relationships between areas of aging research in a logical comparative fashion. The second part is a specialized Aging Wiki which allows expert definition, description, supporting information, and documentation of each aging keyword term found in the Aging-kb tree. The Aging Wiki allows community participation in describing and defining concepts and terms in the Wiki format. This aging knowledge base provides a simple intuitive interface to the complexities of aging. 0 0
Agreement: How to reach it? defining language features leading to agreement in dialogue Zidrasco T.
Bobicev V.
Shiramatsu S.
Ozono T.
Shintani T.
International Conference Recent Advances in Natural Language Processing, RANLP English Consensus is the desired result in many argumentative discourses such as negotiations, public debates, and goal-oriented forums. However, because people are usually poor arguers, support for argumentation is necessary. Web 2.0 provides means for online discussions, which have their own characteristic features. In our paper we study the features of discourse which lead to agreement. We use an argumentative corpus of Wikipedia discussions in order to investigate the influence of discourse structure and language on the final agreement. The corpus was annotated with rhetorical relations, and rhetorical structures leading to successful and unsuccessful discussions were analyzed. We also investigated language patterns extracted from the corpus in order to discover which ones are indicators of subsequent agreement. The results of our study can be used in designing systems whose purpose is to assist online interlocutors in consensus building. 0 0
An Introductory Historical Contextualization of Online Creation Communities for the Building of Digital Commons: The Emergence of a Free Culture Movement Mayo Fuster Morell Proceedings of the 6th Open Knowledge Conference English Online Creation Communities (OCCs) are a set of individuals that communicate, interact and collaborate; in several forms and degrees of participation which are eco-systemically integrated; mainly via a platform of participation on the Internet, on which they depend; and aiming at knowledge-making and sharing. The paper will first provide a historical contextualization of OCCs. Then, it will show how the development of OCCs is fuelled by, and contributes to, the rise of a free culture movement defending and advocating the creation of digital commons, and provide an empirically grounded definition of the free culture movement. The empirical analysis is based on content analysis of 80 interviews with free culture practitioners, promoters and activists with an international background or rooted in Europe, the USA and Latin America, and on the content analysis of two seminar discussions. The data collection was conducted from 2008 to 2010. 0 0
An annotation scheme for automated bias detection in Wikipedia Livnat Herzig
Alex Nunes
Batia Snir
LAW V English 0 0
An application for the collaborative development of semantic content Rodriguez-Artacho M.
Lapo P.S.
Roa I.J.
Educational content authoring
Semantic wiki
Social web
Proceedings - Frontiers in Education Conference, FIE English This paper shows an approach to build communities of students working collaboratively to create educational content. We present an evaluation of the usability of a customized semantic environment and show how it interacts with a semantic wiki-based approach. Additionally, a proposal for a manageable hybrid methodology for the creation of ontologies is given. The result is a tool: OntoWikiUTPL and SemanticWikiUTPL, which allows the academic community to create semantic content. Thus, it is an interesting step toward innovation in interactive educational projects and collaborative activities in the Universidad Técnica Particular de Loja (UTPL). 0 0
An evidence-based medicine elective course to improve student performance in advanced pharmacy practice experiences Brandon Bookstaver P.
Rudisill C.N.
Rebecca Bickley A.
McAbee C.
Miller A.D.
Piro C.C.
Schulz R.
Active learning techniques
Advanced pharmacy practice experience
Evidence based medicine
Literature evaluation
American Journal of Pharmaceutical Education English Objective. To implement and evaluate the impact of an elective evidence-based medicine (EBM) course on student performance during advanced pharmacy practice experiences (APPEs). Design. A 2-hour elective course was implemented using active-learning techniques including case studies and problem-based learning, journal club simulations, and student-driven wiki pages. The small class size (15 students) encouraged independent student learning, allowing students to serve as the instructors and guest faculty members from a variety of disciplines to facilitate discussions. Assessment. Pre- and posttests found that students improved on 83% of the core evidence-based medicine concepts evaluated. Fifty-four APPE preceptors were surveyed to compare the performance of students who had completed the EBM course prior to starting their APPEs with students who had not. Of the 38 (70%) who responded, the majority (86.9%) agreed that students who had completed the course had stronger skills in applying evidence-based medicine to patient care than other students. The 14 students who completed the elective also were surveyed after completing their APPEs and the 11 who responded agreed the class had improved their skills and provided confidence in using the medical literature. Conclusions. The skill set acquired from this EBM course improved students' performance in APPEs. Evidence-based medicine and literature search skills should receive more emphasis in the pharmacy curriculum. 0 0
An experience using a Spatial Hypertext Wiki Carlos Solis
Nour Ali
Knowledge management
Spatial hypertext
HT 2011 - Proceedings of the 22nd ACM Conference on Hypertext and Hypermedia English Most wikis do not allow users to collaboratively organize relations among wiki pages, nor do they offer ways to visualize them, because such relations are hard to express using hyperlinks. The Spatial Hypertext Wiki (ShyWiki) is a wiki that uses Spatial Hypertext to represent visual and spatial implicit relations. This paper reports an experience of using ShyWiki features and its spatial hypertext model. Four groups, consisting of 3 members each, were asked to use ShyWiki for creating, sharing and brainstorming knowledge during the design and documentation of a software architecture. We present the evaluation of a questionnaire that users answered about the perceived usefulness and ease of use of the spatial and visual properties of ShyWiki, and several of its features. We also asked the users if they would find the visual and spatial properties useful in a wiki such as Wikipedia. In addition, we analyzed the visual and spatial structures used in the wiki pages, and which features were used. 0 0
An experience using a spatial hypertext Wiki Carlos Solis
Nour Ali
Knowledge management
Spatial hypertext
HT English 0 0
An exploratory study of navigating Wikipedia semantically: Model and application Wu I.-C.
Lin Y.-S.
Liu C.-H.
Normalized Google Distance
SNA-based summary
Lecture Notes in Computer Science English Due to the popularity of link-based applications like Wikipedia, one of the most important issues in online research is how to alleviate information overload on the World Wide Web (WWW) and facilitate effective information-seeking. To address the problem, we propose a semantically-based navigation application that is based on the theories and techniques of link mining, semantic relatedness analysis and text summarization. Our goal is to develop an application that assists users in efficiently finding the related subtopics for a seed query and then quickly checking the content of articles. We establish a topic network by analyzing the internal links of Wikipedia and applying the Normalized Google Distance algorithm in order to quantify the strength of the semantic relationships between articles via key terms. To help users explore and read topic-related articles, we propose a SNA-based summarization approach to summarize articles. To visualize the topic network more efficiently, we develop a semantically-based WikiMap to help users navigate Wikipedia effectively. 0 0
An exploratory study of navigating wikipedia semantically: model and application I-Chin Wu
Yi-Sheng Lin
Che-Hung Liu
SNA-based summary
Normalized google distance
OCSC English 0 0
An iterative clustering method for the XML-mining task of the INEX 2010 Tovar M.
Cruz A.
Vazquez B.
Pinto D.
Vilarino D.
Montes A.
Lecture Notes in Computer Science English In this paper we propose two iterative clustering methods for grouping Wikipedia documents of a given huge collection into clusters. The recursive method iteratively clusters subsets of the complete collection. In each iteration, we select representative items for each group, which are then used for the next stage of clustering. The presented approaches are scalable algorithms which may be used with huge collections that would otherwise (for instance, using the classic clustering methods) be computationally expensive to cluster. The obtained results outperformed the random baseline presented in the INEX 2010 clustering task of the XML-Mining track. 0 0
Analysis of participant behaviour in a twitter based crowdsourcing project Oztaysi B. Burtonstory
Open innovation
41st International Conference on Computers and Industrial Engineering 2011 English Crowd creation is a branch of crowdsourcing that focuses on creation activities such as asking individuals to film TV commercials, perform language translation or solve challenging scientific problems. Innocentive, iStockphoto, TopCoder, Wikipedia and Linux are the best-known examples of crowd creation. Twitter-based projects appear to be a promising tool for open innovation processes and especially for crowd creation activities. BurtonStory is a crowd creation project by Tim Burton run over Twitter between November 20th and December 6th of 2010. The author started a new story of the co-created character StainBoy with a single sentence, and the rest of the story was created by the crowd. The aim of this paper is to analyze the behaviors of the project participants. Date/time-based participation, participation intensity, and participant profiles are analyzed. Based on these analyses, implications for forthcoming applications are discussed. 0 0
Analysis of social learning network for wiki in moodle E-Learning Proceedings - 4th International Conference on Interaction Sciences: IT, Human and Digital Content, ICIS 2011 English 0 0
Analysis on multilingual discussion for Wikipedia translation Linsi Xia
Naomi Yamashita
Toru Ishida
Machine translation
Multilingual communication
Multilingual Liquid Threads
Wikipedia Translation
Proceedings - 2011 2nd International Conference on Culture and Computing, Culture and Computing 2011 English In current Wikipedia translation activities, most translation tasks are performed by bilingual speakers who have high language skills and specialized knowledge of the articles. Unfortunately, compared to the large amount of Wikipedia articles, the number of such qualified translators is very small. Thus the success of Wikipedia translation activities hinges on the contributions from non-bilingual speakers. In this paper, we report on a study investigating the effects of introducing a machine translation mediated BBS that enables monolinguals to collaboratively translate Wikipedia articles using their mother tongues. From our experiment using this system, we found out that users made high use of the system and communicated actively across different languages. Furthermore, most of such multilingual discussions seemed to be successful in transferring knowledge between different languages. Such success appeared to be made possible by a distinctive communication pattern which emerged as the users tried to avoid misunderstandings from machine translation errors. These findings suggest that there is a fair chance of non-bilingual speakers being capable of effectively contributing to Wikipedia translation activities with the assistance of machine translation. 0 0
Analyzing the wikisphere: Methodology and data to support quantitative wiki research Jeffrey Stuckman
James Purtilo
Journal of the American Society for Information Science and Technology English Owing to the inherent difficulty in obtaining experimental data from wikis, past quantitative wiki research has largely focused on Wikipedia, limiting the ability to generalize such research. To facilitate the analysis of wikis other than Wikipedia, we developed WikiCrawler, a tool that automatically gathers research data from public wikis without supervision. We then built a corpus of 151 wikis, which we have made publicly available. Our analysis indicated that these wikis display signs of collaborative authorship, validating them as objects of study. We then performed an initial analysis of the corpus and discovered some similarities with Wikipedia, such as users contributing at unequal rates. We also analyzed distributions of edits across pages and users, resulting in data which can motivate or verify mathematical models of behavior on wikis. By providing data collection tools and a corpus of already-collected data, we have completed an important first step for investigations that analyze user behavior, establish measurement baselines for wiki evaluation, and generalize Wikipedia research by testing hypotheses across many wikis. 0 0
Annotating social acts: authority claims and alignment moves in Wikipedia talk pages Emily M. Bender
Jonathan T. Morgan
Meghan Oxley
Mark Zachry
Brian Hutchinson
Alex Marin
Bin Zhang
Mari Ostendorf
LSM English 0 0
Annotating software documentation in semantic wikis Klaas Andries de Graaf Semantic annotation
Semantic wiki
Software documentation
Software engineering knowledge
ESAIR English 0 0
Annotations on access controls in wikis: a proposal Chikashi Fuchimoto
Masayoshi Aritsugi
Annotations on access controls
IiWAS English 0 0
Application of Bradford's law and Lotka's law to web metrics study on the Wiki website Journal of Educational Media and Library Science English 0 0
Applying and extending semantic wikis for semantic web courses Rutledge L.
Oostenrijk R.
Distance learning
Linked data
Semantic wiki
CEUR Workshop Proceedings English This work describes the application of semantic wikis in distance learning for Semantic Web courses. The resulting system focuses its application of existing and new wiki technology on making a wiki-based interface that demonstrates Semantic Web features. A new layer of wiki technology, called "OWL Wiki Forms", is introduced for this Semantic Web functionality in the wiki interface. This new functionality includes a form-based interface for editing Semantic Web ontologies. The wiki then includes appropriate data from these ontologies to extend existing wiki RDF export. It also includes ontology-driven creation of data entry and browsing interfaces for the wiki itself. As a wiki, the system provides an educational tool that students can use anywhere while still sharing access with the instructor and, optionally, other students. 0 0
Approach of Web2.0 application pattern applied to the information teaching Li G.
Liu M.
Zhe Wang
Chen W.
Information Teaching
Web 2.0
Communications in Computer and Information Science English This paper first focuses on the development and function of Web 2.0 from an educational perspective. Secondly, it introduces the features and theoretical foundation of Web 2.0. Building on this introduction, the application pattern used in information teaching is then elaborated and shown to be an effective way of increasing educational productivity. Lastly, this paper presents related cases and teaching resources for reference. 0 0
Approaching Web accessibility through the browser: Automatically applying ARIA attributes Harrington N.
Pucsek D.
Coady Y.
Web browsers
PLASTIC'11 - Proceedings of the 1st ACM SIGPLAN International Workshop on Programming Language and Systems Technologies for Internet Clients English For people with disabilities who require the use of screen readers it is difficult, if not impossible, to interact with Web applications like Facebook, Wikipedia, and Google due to a lack of information available to the screen reader describing the widgets, structures, and behaviours of the Web application. In this paper we explore the idea of automatically augmenting Web applications with attributes, provided by the Accessible Rich Internet Application specification, to increase usability for screen reader users through both Web browser extensions and the browser itself. We show that this automatic approach is feasible, but in order to process dynamic content and avoid degrading the user experience, it is necessary to implement such functionality in the Web browser. 0 0
Asia Signopedia: An open information system of Asian sign languages Wong K.-H.K.
Tang G.
Chung R.
Sign language
Technology and Disability English To promote deaf awareness and natural sign language in Asia, we created an open platform named "Asia Signopedia". The web page allowed both deaf and hearing people to input and access entries of different Asian sign languages and their dialects in either video or text mode. This paper describes how the data structure and user interface of the web page were designed. The distributive authoring scheme of the web page allowed the database to be input and corrected by those who used it. 0 0
Assessing collaboration in a wiki: The reliability of university students' peer assessment Internet and Higher Education English 0 0
Assessment methods in the course on academic writing Klimova B.F. Assessment
Blended learning
Formal writing
Procedia - Social and Behavioral Sciences English The article focuses on the assessment of the productive skill of writing in the classes of academic writing at the Faculty of Informatics and Management of the University of Hradec Kralove, Czech Republic. Firstly, it briefly describes an optional, one-semester course of academic writing and its specifics. Secondly, the article provides a definition of assessment and its categories for the purpose of understanding different assessment practices. Thirdly, it lists the most common assessment methods used in the course with their benefits and drawbacks. Finally, students' evaluation of the course and reflections on their writing achievements are introduced. 0 0
Assessments in large- and small-scale wiki collaborative learning environments: Recommendations for educators and wiki designers Portia Pusey
Gabriele Meiselwitz
Wiki Learning
Wiki Learning Environment
Lecture Notes in Computer Science English This paper discusses assessment practice when wikis are used as learning environments in higher education. Wikis are simple online information systems which often serve user communities. In higher education, wikis have been used in a supporting function to traditional courses; however, there is little research on wikis taking on a larger role as learning environments and even less research on assessment practice for these learning environments. This paper reports on the assessment techniques for large- and small-scale learning environments. It explores the barriers to assessment described in the studies. The paper concludes with a proposal of five improvements to the wiki engine which could facilitate assessment when wikis are used as learning environments in higher education. 0 0
Assessments in large- and small-scale wiki collaborative learning environments: recommendations for educators and wiki designers Portia Pusey
Gabriele Meiselwitz
Wiki learning
Wiki learning environment
OCSC English 0 0
Automated construction of domain ontology taxonomies from wikipedia Juric D.
Banek M.
Skocir Z.
Lecture Notes in Computer Science English The key step for implementing the idea of the Semantic Web into a feasible system is providing a variety of domain ontologies that are constructed on demand, in an automated manner and in a very short time. In this paper we introduce an unsupervised method for constructing domain ontology taxonomies from Wikipedia. The benefit of using Wikipedia as the source is twofold: first, the Wikipedia articles are concise and have a particularly high "density" of domain knowledge; second, the articles represent a consensus of a large community, thus avoiding term disagreements and misinterpretations. The taxonomy construction algorithm, aimed at finding the subsumption relation, is based on two different techniques, which both apply linguistic parsing: analyzing the first sentence of each Wikipedia article and processing the categories associated with the article. The method has been evaluated against human judgment for two independent domains and the experimental results have proven its robustness and high precision. 0 0
Automatic acquisition of taxonomies in different languages from multiple Wikipedia versions Garcia R.D.
Rensing C.
Steinmetz R.
Hyponymy detection
Taxonomy acquisition
Data mining
ACM International Conference Proceeding Series English In recent years, the vision of the Semantic Web has led to many approaches that aim to automatically derive knowledge bases from Wikipedia. These approaches rely mostly on the English Wikipedia, as it is the largest Wikipedia version, and have led to valuable knowledge bases. However, each Wikipedia version contains socio-cultural knowledge, i.e. knowledge with specific relevance for a culture or language. One difficulty in applying existing approaches to multiple Wikipedia versions is the use of additional corpora. In this paper, we describe the adaptation of existing heuristics that make it possible to extract large sets of hyponymy relations from multiple Wikipedia versions with little information about each language. Further, we evaluate our approach with Wikipedia versions in four different languages and compare results with GermaNet for German and WordNet for English. 0 0
Automatic assessment of document quality in web collaborative digital libraries Dalip D.H.
Goncalves M.A.
Marco Cristo
Pável Calado
Machine learning
Quality assessment
Quality features
Journal of Data and Information Quality English The old dream of a universal repository containing all of human knowledge and culture is becoming possible through the Internet and the Web. Moreover, this is happening with the direct collaborative participation of people. Wikipedia is a great example. It is an enormous repository of information with free access and open edition, created by the community in a collaborative manner. However, this large amount of information, made available democratically and virtually without any control, raises questions about its quality. In this work, we explore a significant number of quality indicators and study their capability to assess the quality of articles from three Web collaborative digital libraries. Furthermore, we explore machine learning techniques to combine these quality indicators into one single assessment. Through experiments, we show that the most important quality indicators are those which are also the easiest to extract, namely, the textual features related to the structure of the article. Moreover, to the best of our knowledge, this work is the first that shows an empirical comparison between Web collaborative digital libraries regarding the task of assessing article quality. 0 0
Automatic document tagging using online knowledge base Choi C.
Myunggwon Hwang
Choi D.
Choi J.
Kim P.
Document tagging
Online knowledge base
Information English Online knowledge bases such as WordNet are utilized for semantic information processing. However, research indicates that the existing knowledge bases cannot cover all concepts used in talking and writing in the real world. It is necessary to use an online knowledge base such as Wikipedia to resolve this limitation. Web document tagging generally chooses core words from a document itself. However, the core words are not standardized taggers. Thus, users should make an effort to grasp the tagged words first in the retrieval. This paper proposes methods to utilize titles (Wiki concept) of Wikipedia documents and to find the best Wiki concept that describes the Web documents (target documents). In addition to these methods, the research tries to classify target documents into a Wikipedia category (Wiki category) for semantic document interconnections. 0 0
Automatic gazetteer generation from wikipedia Alessio Bosca
Luca Dini
NLP4DL'09/AT4DL English 0 0
Automatic identification of the most important elements in an XML collection Krumpholz A.
Studeny N.
Hawking D.
Hadad A.
Gedeon T.
Fuzzy C-Means Clustering
XML Retrieval
ADCS 2011 - Proceedings of the Sixteenth Australasian Document Computing Symposium English An important problem in XML retrieval is determining the most useful element types to retrieve - e.g. book, chapter, section, paragraph or caption. An automated system for doing this could be based on features of element types related to size, depth, frequency of occurrence, etc. We consider a large number of such features and assess their usefulness in predicting the types of elements judged relevant in INEX evaluations for the IEEE and Wikipedia 2006 corpora. For each feature we automatically assign Useful / Not-Useful labels to element types using Fuzzy c-Means Clustering. We then rank the features by the accuracy with which they predict the manual judgments. We find strong overlap between the top-ten most predictive features for the two collections and that seven features achieve high average accuracy (F-measure > 65%) across them. We hypothesize that an XML retrieval system working on an unlabelled corpus could use these features to decide which retrieval units are most appropriate to return to the user. 0 0
Automatic knowledge extraction from manufacturing research publications Boonyasopon P.
Riel A.
Uys W.
Louw L.
Tichkiewitch S.
Du Preez N.
Decision making
Document retrieval technique
CIRP Annals - Manufacturing Technology English Knowledge mining is a young and rapidly growing discipline aiming at automatically identifying valuable knowledge in digital documents. This paper presents the results of a study of the application of document retrieval and text mining techniques to extract knowledge from CIRP research papers. The target is to find out if and how such tools can help researchers to find relevant publications in a cluster of papers and increase the citation indices of their own papers. Two different approaches to automatic topic identification are investigated. One is based on Latent Dirichlet Allocation of a huge document set, the other uses Wikipedia to discover significant words in papers. The study uses a combination of both approaches to propose a new approach to efficient and intelligent knowledge mining. 0 0
Automatic labelling of topic models Lau J.H.
Grieser K.
Newman D.
Baldwin T.
ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies English We propose a method for automatically labelling topics learned via LDA topic models. We generate our label candidate set from the top-ranking topic terms, titles of Wikipedia articles containing the top-ranking topic terms, and sub-phrases extracted from the Wikipedia article titles. We rank the label candidates using a combination of association measures and lexical features, optionally fed into a supervised ranking model. Our method is shown to perform strongly over four independent sets of topics, significantly better than a benchmark method. 0 0
Automatic reputation assessment in Wikipedia Wohner T.
Kohler S.
Ralf Peters
Reputation system
User generated content
Web 2.0
International Conference on Information Systems 2011, ICIS 2011 English The online encyclopedia Wikipedia is predominantly created by anonymous or pseudonymous authors whose knowledge and motivations are unknown. For that reason there is uncertainty about the quality of their contributions. An approach to this problem is provided by automatic reputation systems, which have become a new research branch in recent years. In previous research, different metrics for automatic reputation assessment have been suggested. Nevertheless, the metrics have been evaluated insufficiently and only in isolation. As a result, the significance of these metrics is quite unclear. In this paper, we compare and assess seven metrics, some drawn from the literature and some newly suggested. Additionally, we combine these metrics via a discriminant analysis to deduce a significant reputation function. The analysis reveals that our newly suggested metric, editing efficiency, is particularly effective. We validate our reputation function by means of an analysis of Wikipedia user groups. 0 0
Automatic semantic web annotation of named entities Charton E.
Marie-Pierre Gagnon
Ozell B.
Lecture Notes in Computer Science English This paper describes a method to perform automated semantic annotation of named entities contained in large corpora. The semantic annotation is made in the context of the Semantic Web. The method is based on an algorithm that compares the set of words that appear before and after the named entity with the content of Wikipedia articles, and identifies the most relevant one by means of a similarity measure. It then uses the link that exists between the selected Wikipedia entry and the corresponding RDF description in the Linked Data project to establish a connection between the named entity and some URI in the Semantic Web. We present our system, discuss its architecture, and describe an algorithm dedicated to ontological disambiguation of named entities contained in large-scale corpora. We evaluate the algorithm, and present our results. 0 0
Automatic summarization of Turkish documents using non-negative matrix factorization Guran A.
Bayazit N.G.
Bekar E.
Non-negative Matrix factorization
Turkish document summarization
Turkish wikipedia
INISTA 2011 - 2011 International Symposium on INnovations in Intelligent SysTems and Applications English Automatic document summarization is a process, where a computer summarizes a document. This paper presents the performance analysis of an automatic Turkish document summarization system that applies Non-negative matrix factorization based summarization algorithm with different preprocessing methods. The preprocessing method called "Consecutive Words Detection" is an innovative approach that uses Turkish Wikipedia links to represent related consecutive words as a single term and the result of the evaluation process is promising for document summarization in Turkish. 0 0
Automatically assigning Wikipedia articles to macro-categories Jacopo Farina
Riccardo Tasso
David Laniado
Category graph
Topic coverage
Hypertext English The online encyclopedia Wikipedia offers millions of articles which are organized in a hierarchical category structure, created and updated by users. In this paper we present a technique which leverages this rich and disordered graph to assign each article to one or more topics. We modify an existing approach, based on the shortest paths between categories, in order to account for the direction of the hierarchy. 0 0
Autonomous Link Spam Detection in Purely Collaborative Environments Andrew G. West
Avantika Agrawal
Phillip Baker
Brittney Exline
Insup Lee
Collaborative security
Information security
Spam mitigation
Spatio-temporal features
Machine learning
Intelligent routing
WikiSym English Collaborative models (e.g., wikis) are an increasingly prevalent Web technology. However, the open-access that defines such systems can also be utilized for nefarious purposes. In particular, this paper examines the use of collaborative functionality to add inappropriate hyperlinks to destinations outside the host environment (i.e., link spam). The collaborative encyclopedia, Wikipedia, is the basis for our analysis.

Recent research has exposed vulnerabilities in Wikipedia's link spam mitigation, finding that human editors are latent and dwindling in quantity. To this end, we propose and develop an autonomous classifier for link additions. Such a system presents unique challenges. For example, low barriers-to-entry invite a diversity of spam types, not just those with economic motivations. Moreover, issues can arise with how a link is presented (regardless of the destination).

In this work, a spam corpus is extracted from over 235,000 link additions to English Wikipedia. From this, 40+ features are codified and analyzed. These indicators are computed using "wiki" metadata, landing site analysis, and external data sources. The resulting classifier attains 64% recall at 0.5% false-positives (ROC-AUC=0.97). Such performance could enable egregious link additions to be blocked automatically with low false-positive rates, while prioritizing the remainder for human inspection. Finally, a live Wikipedia implementation of the technique has been developed.
0 0
Autorégulation de rapports sociaux et dispositif dans Wikipedia Jacquemin B. Collaborative device
Wikipedia encyclopaedia
Document Numerique French As a collaborative work, the online encyclopaedia Wikipedia naturally leads contributors to work with each other and to confront their opinions. But no framework is provided to control the collaboration, neither in the five fundamental principles nor in the wiki software. This article studies how the contributing community thinks up original ways to promote collaboration, social exchanges and conflict resolution. The concept of device (dispositif), and especially how governance shows itself in a collaborative device, is used to analyse these ways. Two views of the power conflict in the community: one permits contributors to break the rules to strive toward Wikipedia's goal; the other one makes sure to enforce the rules strictly. Even though the latter seems to prevail, there is some evidence that loyalty may sometimes be illusory. 0 0
Bancos de imágenes para proyectos enciclopédicos: el caso de Wikimedia Commons Tomás Saorín-Pérez
Juan-Antonio Pastor-Sánchez
Wikimedia Commons
Public domain
Image bank
El profesional de la información Spanish This paper presents the characteristics and functionalities of the Wikimedia Commons image databank shared by all Wikipedia projects. The process of finding images and illustrating Wikipedia articles is also explained, along with how to add images to the bank. The role of cultural institutions in promoting free and open cultural heritage content is highlighted. 5 1
Bancos de imágenes para proyectos enciclopédicos: el caso de Wikimedia Commons = Image databanks in encyclopedia context: the case of Wikimedia Commons Saorín T.
Pastor Sánchez
Wikipedia; Wikimedia Commons; Public domain; Encyclopedias; Image banks El profesional de la información, julio-agosto, n.4 (http://www.elprofesionaldelainformacion.com/) Spanish This paper presents the characteristics and functionalities of the Wikimedia Commons image databank shared by all Wikipedia projects. The process of finding images and illustrating Wikipedia articles is also explained, along with how to add images to the bank. The role of cultural institutions in promoting free and open cultural heritage content is highlighted. 0 0
Baudenkmalnetz - Creating a semantically annotated web resource of historical buildings Dumitrache A.
Christoph Lange
CEUR Workshop Proceedings English BauDenkMalNetz ("listed buildings web") deals with creating a semantically annotated website of urban historical landmarks. The annotations cover the most relevant information about the landmarks (e.g. the buildings' architects, architectural style or construction details), for the purpose of extended accessibility and smart querying. BauDenkMalNetz is based on a series of touristic books on architectural landscape. After a thorough analysis on the requirements that our website should provide, we processed these books using automated tools for text mining, which led to an ontology that allows for expressing all relevant architectural and historical information. In preparation of publishing the books on a website powered by this ontology, we analyze how well Semantic MediaWiki and the RDF-aware Drupal 7 content management system satisfy our requirements. 0 0
Beyond notability. Collective deliberation on content inclusion in Wikipedia Dario Taraborelli
Ciampaglia G.L.
Proceedings - 2010 4th IEEE International Conference on Self-Adaptive and Self-Organizing Systems Workshop, SASOW 2010 English In this study we analyse the structure of a particular form of collective decision-making in Wikipedia, i.e. decisions regarding content inclusion and deletion. Wikipedia's official guidelines require that only topics that meet "notability" standards be included with a dedicated article. Decisions as to whether a topic is "notable" are made by groups of self-appointed reviewers, who assess the alleged encyclopaedic nature of a topic via so-called Articles for Deletion discussions. We analyse the structure and dynamics of these discussions in order to identify possible biases affecting their outcome. We show in particular the effects of voter heterogeneity and herding behaviour on the functioning of these collective deliberation processes. 0 1
Beyond the bag-of-words paradigm to enhance information retrieval applications Paolo Ferragina Classification
Conceptual annotation
Search engine
Structured annotation
Proceedings - 4th International Conference on SImilarity Search and APplications, SISAP 2011 English The typical IR-approach to indexing, clustering, classification and retrieval, just to name a few, is the one based on the bag-of-words paradigm. It eventually transforms a text into an array of terms, possibly weighted (with tf-idf scores or derivatives), and then represents that array via points in highly-dimensional space. It is therefore syntactical and unstructured, in the sense that different terms lead to different dimensions. Co-occurrence detection and other processing steps have been thus proposed (see e.g. LSI, Spectral analysis [7]) to identify the existence of those relations, but yet everyone is aware of the limitations of this approach especially in the expanding context of short (and thus poorly composed) texts, such as the snippets of search-engine results, the tweets of a Twitter channel, the items of a news feed, the posts of a blog, or the advertisement messages, etc. A good deal of recent work is attempting to go beyond this paradigm by enriching the input text with additional structured annotations. This general idea has been declined in the literature in two distinct ways. One consists of extending the classic term-based vector-space model with additional dimensions corresponding to features (concepts) extracted from an external knowledge base, such as DMOZ, Wikipedia, or even the whole Web (see e.g. [4, 5, 12]). The pro of this approach is to extend the bag-of-words scheme with more concepts, thus possibly allowing the identification of related texts which are syntactically far apart. The con resides in the contamination of these vectors by un-related (but common) concepts retrieved via the syntactic queries. The second way consists of identifying in the input text short-and-meaningful sequences of terms (aka spots) which are then connected to unambiguous concepts drawn from a catalog.
The catalog can be formed by either a small set of specifically recognized types, most often People and Locations (aka Named Entities, see e.g. [13, 14]), or it can consist of millions of concepts drawn from a large knowledge base, such as Wikipedia. This latter catalog is ever-expanding and currently offers the best trade-off between a catalog with a rigorous structure but with low coverage (like WordNet, CYC, TAP), and a large text collection with wide coverage but unstructured and noisy content (like the whole Web). To understand how this annotation works, let us consider the following short news: "Diego Maradona won against Mexico". The goal of the annotation is to detect "Diego Maradona" and "Mexico" as spots, and then hyper-link them with the Wikipedia pages which deal with Argentina's ex-coach and the football team of Mexico. The annotator uses as spots the anchor texts which occur in Wikipedia pages, and as possible concepts for each spot the (possibly many) pages pointed in Wikipedia by that spot/anchor 0 0
Blognoon: Exploring a topic in the blogosphere Maria Grineva
Maxim Grinev
Dmitry Lizorkin
Boldakov A.
Denis Turdakov
Sysoev A.
Kiyko A.
Concept search
Semantic search
Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 English We demonstrate Blognoon, a semantic blog search engine with the focus on topic exploration and navigation. Blognoon provides concept search instead of traditional keywords search and improves ranking by identifying main topics of posts. It enhances navigation over the Blogosphere with faceted interfaces and recommendations. 0 0
Blurring boundaries: Two groups of girls collaborate on a wiki Journal of Adolescent and Adult Literacy English 0 0
Bootstrapping multilingual relation discovery using English wikipedia and wikimedia-induced entity extraction Schome P.
Tim Allison
Chris Giannella
Craig Pfeifer
Multilingual relation extraction
Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI English Relation extraction has been a subject of significant study over the past decade. Most relation extractors have been developed by combining the training of complex computational systems on large volumes of annotations with extensive rule-writing by language experts. Moreover, many relation extractors are reliant on other non-trivial NLP technologies which themselves are developed through significant human efforts, such as entity tagging, parsing, etc. Due to the high cost of creating and assembling the required resources, relation extractors have typically been developed for only high-resourced languages. In this paper, we describe a near-zero-cost methodology to build relation extractors for significantly distinct non-English languages using only freely available Wikipedia and other web documents, and some knowledge of English. We apply our method to build alma-mater, birthplace, father, occupation, and spouse relation extractors in Greek, Spanish, Russian, and Chinese. We conduct evaluations of induced relations at the file level - the most refined we have seen in the literature. 0 0
Building a geographical ontology by using Wikipedia Quoc Hung-Ngo
Son Doan
Werner Winiwarter
Geographical ontology
Ontology building
IiWAS English 0 0
Building a signed network from interactions in Wikipedia Silviu Maniu
Bogdan Cautis
Talel Abdessalem
Online community
Signed networks
Social applications
Web of trust
DBSocial English 0 1
Building green to attain sustainability Kamana C.P.
Escultura E.
Grand unified theory
Green building
Sustainable development
International Journal of Earth Sciences and Engineering English "Sustainable development is a pattern of resource use that aims to meet human needs while preserving the environment so that these needs can be met not only in the present, but also for future generations." (source: Wikipedia) A sustainable building, or green building, is an outcome of a design which focuses on increasing the efficiency of resource use - energy, water, and materials - while reducing building impacts on human health and the environment during the building's lifecycle, through better location, design, construction, operation, maintenance, and removal. This study is aimed at understanding the green building concepts, rating of green buildings and making a comparison between green buildings and conventional buildings. It also aims at reducing the energy consumption of non-renewable resources, through various mechanical engineering aspects like HVAC, electrical lighting, plumbing etc. This is about how to minimize environmental degradation caused by building practices; in particular, understanding green practices adopted in two green buildings: CII Hyderabad (first platinum rated green building in India) and GRUNDFOS Chennai (gold rated green building). The overall goal of this paper is learning how to deliver Planet Earth to the next generation so that it will be a cleaner and more energizing place than the planet we inherited. © 2011 CAFET-INNOVA TECHNICAL SOCIETY. All rights reserved. 0 0
Building ontology for mashup services based on Wikipedia Xiao K.
Li B.
Lecture Notes in Electrical Engineering English Tagging, as a useful way to organize online resources, has attracted much attention in the last few years, and many ontology building approaches using such tags have been proposed. Tags are usually associated with concepts in some databases, such as WordNet and online ontologies. However, these databases are stable, static and lack consistency. In this paper, we build an ontology for a collection of mashup services using their affiliated tags by referring to the entries of Wikipedia. Core tags are filtered out and mapped to the corresponding Wikipedia entries (i.e., URIs). An experiment is given as an illustration. © 2011 Springer Science+Business Media B.V. 0 0
Building roadmaps: A knowledge sharing perspective Tang A.
De Boer T.
Hans van Vliet
Knowledge sharing
Proceedings - International Conference on Software Engineering English Roadmapping is a process that involves many stakeholders and architects. In an industry case, we have found that a major challenge is to exchange timely knowledge between these people. We report a number of knowledge sharing scenarios in the roadmapping process. In order to address these issues, we propose a codification mechanism that makes use of a semantic wiki to facilitate knowledge sharing. 0 0
CATE: Context-aware timeline for entity illustration Tuan T.A.
Elbassuoni S.
Preda N.
Gerhard Weikum
Knowledge ranking
Visualization tools
Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 English Wikipedia has become one of the most authoritative information sources on the Web. Each article in Wikipedia provides a portrait of a certain entity. However, such a portrait is far from complete. An informative portrait of an entity should also reveal the context the entity belongs to. For example, for a person, major historical, political and cultural events that coincide with her life are important and should be included in that person's portrait. Similarly, the person's interactions with other people are also important. All this information should be summarized and presented in an appealing and interactive visual interface that enables users to quickly scan the entity's portrait. We demonstrate CATE which is a system that utilizes Wikipedia to create a portrait of a given entity of interest. We provide a visualization tool that summarizes the important events related to the entity. The novelty of our approach lies in seeing the portrait of an entity in a broader context, synchronous with its time. 0 0
COBS: Realizing decentralized infrastructure for collaborative browsing and search Von Der Weth C.
Anwitaman Datta
Proceedings - International Conference on Advanced Information Networking and Applications, AINA English Finding relevant and reliable information on the web is a non-trivial task. While internet search engines do find correct web pages with respect to a set of keywords, they often cannot ensure the relevance or reliability of their content. An emerging trend is to harness internet users in the spirit of Web 2.0, to discern and personalize relevant and reliable information. Users collaboratively search or browse for information, either directly by communicating or indirectly by adding meta information (e.g., tags) to web pages. While gaining much popularity, such approaches are bound to specific service providers, or the Web 2.0 sites providing the necessary features, and the knowledge so generated is also confined to, and subject to the whims and censorship of such providers. To overcome these limitations we introduce COBS, a browser-centric knowledge repository which enjoys the inherent openness (similar to WIKIPEDIA) while aiming to provide end-users the freedom of personalization and privacy by adopting an eventually hybrid/p2p back-end. In this paper we first present the COBS front-end, a browser add-on that enables users to tag, rate or comment arbitrary web pages and to socialize with others in both a synchronous and asynchronous manner. We then discuss how a decentralized back-end can be realized. While Distributed Hash Tables (DHTs) are the most natural choice, and despite a decade of research on DHT designs, we encounter several, some small, while others more fundamental shortcomings that need to be surmounted in order to realize an efficient, scalable and reliable decentralized back-end for COBS. To that end, we outline various design alternatives and discuss qualitatively (and quantitatively, when possible) their (dis-)advantages. 
We believe that the objectives of COBS are ambitious, posing significant challenges for distributed systems, middleware and distributed data-analytics research, even while building on the existing momentum. Based on experiences from our ongoing work on COBS, we outline these systems research issues in this position paper. 0 0
COLT: A proposed center for open teaching and learning Forsyth P.
Cummings R.E.
Centers for Teaching and Learning
Higher education
Open education
Open educational practices
Open educational resources
WikiSym 2011 Conference Proceedings - 7th Annual International Symposium on Wikis and Open Collaboration English The Center for Open Learning and Teaching (COLT) is a proposed interdisciplinary research consortium and network with a physical center at the University of Mississippi supporting the integration of effective Internet-based learning practices into education. 0 0
Calculating Wikipedia article similarity using machine translation evaluation metrics Maike Erdmann
Andrew Finch
Kotaro Nakayama
Eiichiro Sumita
Takahiro Hara
Shojiro Nishio
Bilingual dictionary
Cross-language Document Similarity
Data mining
Proceedings - 25th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2011 English Calculating the similarity of Wikipedia articles in different languages is helpful for bilingual dictionary construction and various other research areas. However, standard methods for document similarity calculation are usually very simple. Therefore, we describe an approach of translating one Wikipedia article into the language of the other article, and then calculating article similarity with standard machine translation evaluation metrics. An experiment revealed that our approach is effective for identifying Wikipedia articles in different languages that are covering the same concept. 0 0
Capability modeling of knowledge-based agents for commonsense knowledge integration Kuo Y.-L.
Hsu J.Y.-J.
Agent description
Capability model
Common sense
Commonsense knowledge integration
Multi-agent system
Lecture Notes in Computer Science English Robust intelligent systems require commonsense knowledge. While significant progress has been made in building large commonsense knowledge bases, they are intrinsically incomplete. It is difficult to combine multiple knowledge bases due to their different choices of representation and inference mechanisms, thereby limiting users to one knowledge base and its reasonable methods for any specific task. This paper presents a multi-agent framework for commonsense knowledge integration, and proposes an approach to capability modeling of knowledge bases without a common ontology. The proposed capability model provides a general description of large heterogeneous knowledge bases, such that contents accessible by the knowledge-based agents may be matched up against specific requests. The concept correlation matrix of a knowledge base is transformed into a k-dimensional vector space using low-rank approximation for dimensionality reduction. Experiments are performed with the matchmaking mechanism for commonsense knowledge integration framework using the capability models of ConceptNet, WordNet, and Wikipedia. In the user study, the matchmaking results are compared with the ranked lists produced by online users to show that over 85% of them are accurate and have positive correlation with the user-produced ranked lists. 0 0
Capability reconfiguration of incumbent firms: Nintendo in the video game industry Subramanian A.M.
Chai K.-H.
Mu S.
Dynamic capability
Game console
Resource-based view of the firm
Technovation English The importance of incumbent firms' ability to transform themselves according to the changing technological environment has been underlined by several scholars and practitioners. Yet, how incumbents leverage commercial capabilities in order to develop such technological reconfiguration abilities in the midst of fierce competition from new entrants has not gained enough attention. To address the above research issue, our study investigated the case of Nintendo, an incumbent firm in the video game industry, using the dynamic capability perspective. Our study relied on primary and secondary data collected from diverse sources such as interviews, web contents, magazines, the US Patent and Trademark Office and Wikipedia. Three component factors that reflect the common features of dynamic capabilities across past studies emerged as the basis of Nintendo's reconfiguration ability. Underlining the significance of these commercial capabilities in the technological reconfiguration of an incumbent, our paper helps to synthesize this stream of literature and extends guidelines for future empirical studies to develop the dynamic capability construct. In addition, the findings also help managers devise strategies for an adaptive organization. © 2011 Elsevier Ltd. All rights reserved. 0 0
Care for patients with ultra-rare disorders Hennekam R.C.M. Centres of expertise
E-mail consulting
European community
Rare disorders
Support groups
Total exome sequencing
Ultra-rare disorders
Virtual centres of expertise
European Journal of Medical Genetics English There is increasing attention by policy makers and health authorities for rare disorders (by definition prevalence <1:2000). The attention for ultra-rare disorders (suggested prevalence one-thousandth of rare disorders, so <1:2,000,000) is very limited however. Here some aspects of organizing adequate care for individuals with ultra-rare disorders in a European setting are discussed. Individual ultra-rare disorders are by definition very uncommon but it can be calculated that as a group they form a considerable part of the total group of persons with rare disorders in the European Community (EC). Diagnostics and regular care for individuals with rare disorders is being arranged in national centres of expertise, but due to small individual numbers this is not possible for ultra-rare disorders. A secure database on the internet to which patients with unknown diagnoses from all countries within the EC can be uploaded using standardized terminology and including clinical pictures will be needed to allow for recognition of comparable phenotypes in patients and, thus, establishing rare diagnoses. Due to the large distances between the places where patients live and their large numbers, regular care has to be provided locally and centres of excellence will have to function virtually through e-mail consulting. The use of wikis that are accessible to patients and families to upload data will help to disseminate knowledge and experience. It will be extremely difficult to obtain sufficient funds for research in ultra-rare disorders. It is suggested that the many very small support groups for ultra-rare disorders organize themselves in umbrella organisations of such size that policy makers and grant providing bodies will consult them for their strategies. The role of individuals with ultra-rare disorders themselves, or their families, in obtaining access to all advantages modern medicine can provide will therefore be large. 0 0
Casting a web of trust over Wikipedia: An interaction-based approach Silviu Maniu
Talel Abdessalem
Bogdan Cautis
Online community
Signed networks
Social applications
Web of trust
Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 English We report in this short paper results on inferring a signed network (a "web of trust") from user interactions. On the Wikipedia network of contributors, from a collection of articles in the politics domain and their revision history, we investigate mechanisms by which relationships between contributors - in the form of signed directed links - can be inferred from their interactions. Our preliminary study provides valuable insight into principles underlying a signed network of Wikipedia contributors that is captured by social interaction. We look into whether this network (called hereafter WikiSigned) represents indeed a plausible configuration of link signs. We assess connections to social theories such as structural balance and status, which have already been considered in online communities. We also evaluate on this network the accuracy of a learned predictor for edge signs. Equipped with learning techniques that have been applied before on three explicit signed networks, we obtain good accuracy over the WikiSigned edges. Moreover, by cross training-testing we obtain strong evidence that our network does reveal an implicit signed configuration and that it has similar characteristics to the explicit ones, even though it is inferred from interactions. We also report on an application of the resulting signed network that impacts Wikipedia readers, namely the classification of Wikipedia articles by importance and quality. 0 0
Categorising social tags to improve folksonomy-based recommendations Ivan Cantador
Ioannis Konstas
Jose J.M.
Recommender system
Semantic web
Social tagging
W3C Linking Open Data
Journal of Web Semantics English In social tagging systems, users have different purposes when they annotate items. Tags not only depict the content of the annotated items, for example by listing the objects that appear in a photo, or express contextual information about the items, for example by providing the location or the time in which a photo was taken, but also describe subjective qualities and opinions about the items, or can be related to organisational aspects, such as self-references and personal tasks. Current folksonomy-based search and recommendation models exploit the social tag space as a whole to retrieve those items relevant to a tag-based query or user profile, and do not take into consideration the purposes of tags. We hypothesise that a significant percentage of tags are noisy for content retrieval, and believe that the distinction of the personal intentions underlying the tags may be beneficial to improve the accuracy of search and recommendation processes. We present a mechanism to automatically filter and classify raw tags in a set of purpose-oriented categories. Our approach finds the underlying meanings (concepts) of the tags, mapping them to semantic entities belonging to external knowledge bases, namely WordNet and Wikipedia, through the exploitation of ontologies created within the W3C Linking Open Data initiative. The obtained concepts are then transformed into semantic classes that can be uniquely assigned to content- and context-based categories. The identification of subjective and organisational tags is based on natural language processing heuristics. We collected a representative dataset from Flickr social tagging system, and conducted an empirical study to categorise real tagging data, and evaluate whether the resultant tags categories really benefit a recommendation model using the Random Walk with Restarts method. 
The results show that content- and context-based tags are considered superior to subjective and organisational tags, achieving equivalent performance to using the whole tag space. © 2010 Elsevier B.V. All rights reserved. 0 0
Categorization of wikipedia articles with spectral clustering Szymanski J. Lecture Notes in Computer Science English The article reports the application of clustering algorithms for creating hierarchical groups within Wikipedia articles. We evaluate three spectral clustering algorithms based on datasets constructed using Wikipedia categories. The selected algorithm has been implemented in a system that categorizes Wikipedia search results on the fly. 0 0
Characterization and prediction of Wikipedia edit wars Róbert Sumi
Taha Yasseri
András Rung
András Kornai
János Kertész
WebSci Conference English We present a new, efficient method for automatically detecting conflict cases and test it on five different language Wikipedias. We discuss how the number of edits, reverts, the length of discussions deviate in such pages from those following the general workflow. 4 2
Characterizing Wikipedia pages using edit network motif profiles Guangyu Wu
Martin Harrigan
Pádraig Cunningham
Network motifs
SMUC English Good Wikipedia articles are authoritative sources due to the collaboration of a number of knowledgeable contributors. This is the many eyes idea. The edit network associated with a Wikipedia article can tell us something about its quality or authoritativeness. In this paper we explore the hypothesis that the characteristics of this edit network are predictive of the quality of the corresponding article's content. We characterize the edit network using a profile of network motifs and we show that this network motif profile is predictive of the Wikipedia quality classes assigned to articles by Wikipedia editors. We further show that the network motif profile can identify outlier articles particularly in the 'Featured Article' class, the highest Wikipedia quality class. 0 0
Citizens as database: Conscious ubiquity in data collection Richter K.-F.
Winter S.
Lecture Notes in Computer Science English Crowd sourcing [1], citizens as sensors [2], user-generated content [3,4], or volunteered geographic information [5] describe a relatively recent phenomenon that points to dramatic changes in our information economy. Users of a system, who often are not trained in the matter at hand, contribute data that they collected without a central authority managing or supervising the data collection process. The individual approaches vary and cover a spectrum from conscious user actions ('volunteered') to passive modes ('citizens as sensors'). Volunteered user-generated content is often used to replace existing commercial or authoritative datasets, for example, Wikipedia as an open encyclopaedia, or OpenStreetMap as an open topographic dataset of the world. Other volunteered content exploits the rapid update cycles of such mechanisms to provide improved services. For example, fixmystreet.com reports damages related to streets; Google, TomTom and other dataset providers encourage their users to report updates of their spatial data. In some cases, the database itself is the service; for example, Flickr allows users to upload and share photos. At the passive end of the spectrum, data mining methods can be used to further elicit hidden information out of the data. Researchers identified, for example, landmarks defining a town from Flickr photo collections [6], and commercial services track anonymized mobile phone locations to estimate traffic flow and enable real-time route planning. 0 0
City model enrichment Smart P.D.
Quinn J.A.
Jones C.B.
ISPRS Journal of Photogrammetry and Remote Sensing English The combination of mobile communication technology with location and orientation aware digital cameras has introduced increasing interest in the exploitation of 3D city models for applications such as augmented reality and automated image captioning. The effectiveness of such applications is, at present, severely limited by the often poor quality of semantic annotation of the 3D models. In this paper, we show how freely available sources of georeferenced Web 2.0 information can be used for automated enrichment of 3D city models. Point referenced names of prominent buildings and landmarks mined from Wikipedia articles and from the OpenStreetMaps digital map and Geonames gazetteer have been matched to the 2D ground plan geometry of a 3D city model. In order to address the ambiguities that arise in the associations between these sources and the city model, we present procedures to merge potentially related buildings and implement fuzzy matching between reference points and building polygons. An experimental evaluation demonstrates the effectiveness of the presented methods. © 2011 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). 0 0
Clasificación de textos en lenguaje natural usando la wikipedia Quinteiro-Gonzalez J.M.
Martel-Jordan E.
Hernandez-Morera P.
Ligero-Fleitas J.A.
Lopez-Rodriguez A.
Machine learning
Natural Language Processing
Text categorization
RISTI - Revista Iberica de Sistemas e Tecnologias de Informacao Spanish Automatic Text Classifiers are needed in environments where the amount of data to handle is so high that human classification would be ineffective. In our study, the proposed classifier takes advantage of the Wikipedia to generate the corpus defining each category. The text is then analyzed syntactically using Natural Language Processing software. The proposed classifier is highly accurate and outperforms Machine Learning trained classifiers. 0 0
Classification Techniques for Assessing Student Collaboration in Shared Wiki Spaces Chitrabharathi Ganapathy
Jeon-Hyung Kang
Erin Shaw
Jihie Kim
English This paper presents the case study of collaboration analysis in the context of an undergraduate student engineering project. Shared Wiki spaces used by students in collaborative project teams were analyzed and the paper presents new techniques, based on descriptive statistics and the Labeled Latent Dirichlet Allocation (LLDA) model for multi-label document classification, to assess quality of student work in shared wiki spaces. A link is shown between processes of collaboration, performance and work pace. 0 0
Classification of Recommender Expertise in the Wikipedia Recommender System Christian D. Jensen
Povilas Pilkauskas
Thomas Lefévre
Journal of Information Processing English 0 0
Classifying Wikipedia entities into fine-grained classes Maksim Tkatchenko
Alexander Ulanov
Andrey Simanovsky
ICDEW English 0 0
ClassroomWiki: a collaborative Wiki for institutional use Rupali Sawant
Apoorv Singhal
Priyank Nigam
Utkarsh Shah
ICWET English 0 0
Click log based evaluation of link discovery David Alexander
Andrew Trotman
Knott A.
Information retrieval
ADCS 2011 - Proceedings of the Sixteenth Australasian Document Computing Symposium English We introduce a set of new metrics for hyperlink quality. These metrics are based on users' interactions with hyperlinks as recorded in click logs. Using a year-long click log, we assess the INEX 2008 link discovery (Link-the-Wiki) runs and find that our metrics rank them differently from the existing metrics (INEX automatic and manual assessment), and that runs tend to perform well according to either our metrics or the existing ones, but not both. We conclude that user behaviour is influenced by more factors than are assessed in automatic and manual assessment, and that future link discovery strategies should take this into account. We also suggest ways in which our assessment method may someday replace automatic and manual assessment, and explain how this would benefit the quality of large-scale hypertext collections such as Wikipedia. 0 0
Clustering blogs using document context similarity and spectral graph partitioning Ayyasamy R.K.
Alhashmi S.M.
Eu-Gene S.
Tahayna B.
Bipartite graph
Blog Clustering
Similarity measure
Advances in Intelligent and Soft Computing English Semantic-based document clustering has been a challenging problem over the past few years and its execution depends on modeling the underlying content and its similarity metrics. Existing metrics evaluate pairwise text similarity based on text content, which is referred to as content similarity. These measures are based on co-occurrences and ignore the semantics among words. Although several research works have been carried out to solve this problem, we propose a novel similarity measure that exploits an external knowledge base, Wikipedia, to enhance the document clustering task. Wikipedia articles and the main categories were used to predict and affiliate them to their semantic concepts. In this measure, we incorporate context similarity by constructing a vector with each dimension representing content similarity between a document and other documents in the collection. Experimental results on the TREC blog dataset confirm that the use of the context similarity measure can improve the precision of document clustering significantly. 0 0
ClusteringWiki: Personalized and collaborative clustering of search results Anastasiu D.C.
Gao B.J.
Buttler D.
Mass collaboration
Personalized clustering
Search result clustering
Web 2.0
SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval English How to organize and present search results plays a critical role in the utility of search engines. Due to the unprecedented scale of the Web and diversity of search results, the common strategy of ranked lists has become increasingly inadequate, and clustering has been considered as a promising alternative. Clustering divides a long list of disparate search results into a few topic-coherent clusters, allowing the user to quickly locate relevant results by topic navigation. While many clustering algorithms have been proposed that innovate on the automatic clustering procedure, we introduce ClusteringWiki, the first prototype and framework for personalized clustering that allows direct user editing of clustering results. Through a Wiki interface, the user can edit and annotate the membership, structure and labels of clusters for a personalized presentation. In addition, the edits and annotations can be shared among users as a mass collaborative way of improving search result organization and search engine utility. 0 0
Co-authorship 2.0: Patterns of collaboration in Wikipedia Tasso David Laniado Collaboration network
Proceedings of the 22nd ACM Conference on Hypertext and Hypermedia The study of collaboration patterns in wikis can help shed light on the process of content creation by online communities. To turn a wiki's revision history into a collaboration network, we propose an algorithm that identifies as authors of a page the users who provided most of its relevant content, measured in terms of quantity and of acceptance by the community. The scalability of this approach allows us to study the English Wikipedia community as a co-authorship network. We find evidence of the presence of a nucleus of very active contributors, who seem to spread over the whole wiki, and to interact preferentially with inexperienced users. The fundamental role played by this elite is witnessed by the growing centrality of sociometric stars in the network. Isolating the community active around a category, it is possible to study its specific dynamics and most influential authors. 0 3
Co-authorship 2.0: patterns of collaboration in Wikipedia David Laniado
Riccardo Tasso
Collaboration network
Online production
Social network analysis
Hypertext English The study of collaboration patterns in wikis can help shed light on the process of content creation by online communities. To turn a wiki's revision history into a collaboration network, we propose an algorithm that identifies as authors of a page the users who provided most of its relevant content, measured in terms of quantity and of acceptance by the community. The scalability of this approach allows us to study the English Wikipedia community as a co-authorship network. We find evidence of the presence of a nucleus of very active contributors, who seem to spread over the whole wiki, and to interact preferentially with inexperienced users. The fundamental role played by this elite is witnessed by the growing centrality of sociometric stars in the network. Isolating the community active around a category, it is possible to study its specific dynamics and most influential authors. 0 3
CoSi: Context-sensitive keyword query interpretation on RDF databases Fu H.
Gao S.
Anyanwu K.
Keyword query interpretation
Query history
Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 English The demo will present CoSi, a system that enables context-sensitive interpretation of keyword queries on RDF databases. The techniques for representing, managing and exploiting query history are central to achieving this objective. The demonstration will show the effectiveness of our approach for capturing a user's querying context from their query history. Further, it will show how context is utilized to influence the interpretation of a new query. The demonstration is based on DBPedia, the RDF representation of Wikipedia. 0 0
CoSyne: A framework for multilingual content synchronization of wikis Christof Monz
Vivi Nastase
Matteo Negri
Angela Fahrni
Yashar Mehdad
Michael Strube
Recognizing textual entailment
WikiSym 2011 Conference Proceedings - 7th Annual International Symposium on Wikis and Open Collaboration English Wikis allow a large base of contributors easy access to shared content, and freedom in editing it. One of the side-effects of this freedom was the emergence of parallel and independently evolving versions in a variety of languages, reflecting the multilingual background of the pool of contributors. For the Wiki to properly represent the user-added content, this should be fully available in all its languages. Working on parallel Wikis in several European languages, we investigate the possibility to "synchronize" different language versions of the same document, by: i) pinpointing topically related pieces of information in the different languages, ii) identifying information that is missing or less detailed in one of the two versions, iii) translating this in the appropriate language, iv) inserting it in the appropriate place. Progress along such directions will allow users to share content more easily across language boundaries. 0 0
Coerência entre princípios e práticas na Wikipédia Lusófona: uma análise semiótica Paulo Henrique Souto Maior Serrano Wikipedia
Communities of Practice
Tensive Semiotics
Portuguese This paper presents the method, the analysis and the results of a study that examined the operation dynamics and consistency between the guidelines of conduct and practice of editing at the Lusophone version of Wikipedia, the free encyclopedia. This work uses information and content published under the Creative Commons / Share alike 3.0 that indicates the need to distribute the resulting work under the same license. The online encyclopedia can be freely changed by users that browse its contents. Discussions on the permanence or alteration of information published are held in a special discussion page where people can argue about the differences of opinion and reach consensus. This process occurs from cognitive and pragmatic sanctions given to themes and figures that make up the thematic isotopy of users enunciation. The identification of these elements in this dissertation, was carried out by Greimas' semiotics. Sanctions should pragmatically represent the guidelines of the collaborative process on Wikipedia, but there are institutionalized rules that are presented to users as the five pillars of Wikipedia. The five pillars are about the encyclopedism, neutral point of view, free license, community conviviality and liberality in the rules. The statute assigns values to the practice of encyclopedias and information that are published by them. These values were defined by tensive semiotics and compared with the cognitive and pragmatic sanctions of the isotopies enunciated by users, to check the consistency between what is being requested by Wikipedia and what is being done by their contributors. The results of this comparison show some similarities and differences between discourse and practice, indicating ownership of Wikipedia by its users and the need for more accuracy and criteria in conflicting issues or controversies for the permanence of information on the page entry. 
The verifiability of the information was presented as a greatly appreciated theme by users, indicating the importance of the veracity of reference sources and the verification of information. The freedoms and distribution of powers introduced by the principles are denied on the practice of editing. Wikipedia presented itself as a very liberal and tolerant encyclopedia, giving substance to the collaboration, but, in practice, very restrictive and careful when it comes to the permanence of a content in the article page. 4 1
Coherence progress: A measure of interestingness based on fixed compressors Schaul T.
Pape L.
Glasmachers T.
Graziano V.
Schmidhuber J.
Lecture Notes in Computer Science English The ability to identify novel patterns in observations is an essential aspect of intelligence. In a computational framework, the notion of a pattern can be formalized as a program that uses regularities in observations to store them in a compact form, called a compressor. The search for interesting patterns can then be stated as a search to better compress the history of observations. This paper introduces coherence progress, a novel, general measure of interestingness that is independent of its use in a particular agent and the ability of the compressor to learn from observations. Coherence progress considers the increase in coherence obtained by any compressor when adding an observation to the history of observations thus far. Because of its applicability to any type of compressor, the measure allows for an easy, quick, and domain-specific implementation. We demonstrate the capability of coherence progress to satisfy the requirements for qualitatively measuring interestingness on a Wikipedia dataset. 0 0
Cohort shepherd: Discovering cohort traits from hospital visits Goodwin T.
Rink B.
Roberts K.
Harabagiu S.M.
NIST Special Publication English This paper describes the system created by the University of Texas at Dallas for content-based medical record retrieval submitted to the TREC 2011 Medical Records Track. Our system builds a query by extracting keywords from a given topic using a Wikipedia-based approach; we use regular expressions to extract age, gender, and negation requirements. Each query is then expanded by relying on UMLS, SNOMED, Wikipedia, and PubMed Co-occurrence data for retrieval. Four runs were submitted: two based on Lucene with varying scoring methods, and two based on a hybrid approach with varying negation detection techniques. Our highest scoring submission achieved a MAP score of 40.8. 0 0
Collaboration by choice: Youth online creative collabs in Scratch Kafai Y.
Roque R.
Fields D.
Monroy-Hernandez A.
Online community
Proceedings of the 19th International Conference on Computers in Education, ICCE 2011 English Online creative production has received considerable attention for its success in creating Wikipedia and Free and Open Source Software yet few youth participate in such voluntary online collaborations, in particular in programming contexts. In this paper, we describe how youth programmers organized collaborative groups or collabs in response to a design challenge in the Scratch Online Community. We report on participation in the "Collab Challenge" in the Scratch community at large and with particular groups, designers' efforts in recruiting and organizing collab groups, and the role of community feedback. In the discussion, we address what we learned about youth's informal collaborative skills, fostering community participation, and the design of online communities supportive of creative collaboration, and open issues for further research. 0 0
Collaborative Wikipedia Hosting Wikipedia
Collaborative web hosting
0 0
Collaborative learning using wiki web sites for computer science undergraduate education: A case study IEEE Transactions on Education English 0 0
Collaborative learning with a wiki: Differences in perceived usefulness in two contexts of use Journal of Computer Assisted Learning English 0 0
Collaborative management of business metadata Huner K.M.
Boris Otto
Osterle H.
Data quality management
Design science research
Metadata management
Semantic wiki
International Journal of Information Management English Legal provisions, cross-company data exchange and intra-company reporting or planning procedures require comprehensively, timely, unambiguously and understandably specified business objects (e.g. materials, customers, and suppliers). On the one hand, this business metadata has to cover miscellaneous regional peculiarities in order to enable business activities anywhere in the world. On the other hand, data structures need to be standardized throughout the entire company in order to be able to perform global spend analysis, for example. In addition, business objects should adapt to new market conditions or regulatory requirements as quickly and consistently as possible. Centrally organized corporate metadata managers (e.g. within a central IT department) are hardly able to meet all these demands. They should be supported by key users from several business divisions and regions, who contribute expert knowledge. However, despite the advantages regarding high metadata quality on a corporate level, a collaborative metadata management approach of this kind has to ensure low effort for knowledge contributors as in most cases these regional or divisional experts do not benefit from metadata quality themselves. Therefore, the paper at hand identifies requirements to be met by a business metadata repository, which is a tool that can effectively support collaborative management of business metadata. In addition, the paper presents the results of an evaluation of these requirements with business experts from various companies and of scenario tests with a wiki-based prototype at the company Bayer CropScience AG. The evaluation shows two things: First, collaboration is a success factor when it comes to establishing effective business metadata management and integrating metadata with enterprise systems, and second, semantic wikis are well suited to realizing business metadata repositories. 0 0
Collaborative sensemaking during admin permission granting in Wikipedia Katie Derthick
Patrick Tsao
Travis Kriplean
Alan Borning
Mark Zachry
David W. McDonald
Collaboration software
Contributor systems
System administration
Lecture Notes in Computer Science English A self-governed, open contributor system such as Wikipedia depends upon those who are invested in the system to participate as administrators. Processes for selecting which system contributors will be allowed to assume administrative roles in such communities have developed in the last few years as these systems mature. However, little is yet known about such processes, which are becoming increasingly important for the health and maintenance of contributor systems that are becoming increasingly important in the knowledge economy. This paper reports the results of an exploratory study of how members of the Wikipedia community engage in collaborative sensemaking when deciding which members to advance to admin status. 0 0
Collaborative sensemaking during admin permission granting in wikipedia Katie Derthick
Patrick Tsao
Travis Kriplean
Alan Borning
Mark Zachry
David W. McDonald
Collaboration software
Contributor systems
System administration
OCSC English 0 0
Collaborative video editing for Wikipedia Michael Dale WikiSym English 0 0
Collaborative writing with wikis: Evaluating students' contributions Hadjerrouit S. Collaboration
Collaborative learning
Collaborative authoring
IADIS International Conference on Cognition and Exploratory Learning in Digital Age, CELDA 2011 English Wikis are widely considered as collaborative writing tools that foster collaborative writing. Most studies that report on wiki capabilities to support collaborative writing are based on students' subjective perceptions, which are not reliable enough to measure the degree and level of contribution to the wiki. A more reliable method is the data log generated by wikis, which tracks all actions and activities performed by each student and stores all previous versions of the wiki. This data log is of considerable interest and inherently more reliable to collect and analyze collaborative writing activities than students' subjective perceptions. Using MediaWiki as a platform, this work reports on the evaluation of the data logs created by wikis as students perform collaborative writing projects. The results are critically discussed and some pedagogical implications are drawn for collaborative writing in teacher education. 0 0
Collective memory building in Wikipedia: The case of North African uprisings Michela Ferron
Paolo Massa
Web 2.0
Collective memory
Traumatic event
North Africa
WikiSym English Since December 2010, a series of protests and uprisings have shocked North African countries such as Tunisia, Egypt, Libya, Syria, Yemen and more. In this paper, focusing mainly on the Egyptian revolution, we provide evidence of the intense edit activity that occurred during these uprisings on the related Wikipedia pages. Thousands of people provided their contribution on the content pages and discussed improvements and disagreements on the associated talk pages as the traumatic events unfolded. We propose to interpret this phenomenon as a process of collective memory building and argue how on Wikipedia this can be studied empirically and quantitatively in real time. We explore and suggest possible directions for future research on collective memory formation of traumatic and controversial events in Wikipedia. 14 0
Combining heterogeneous knowledge resources for improved distributional semantic models Szarvas G.
Torsten Zesch
Iryna Gurevych
Lecture Notes in Computer Science English The Explicit Semantic Analysis (ESA) model based on term cooccurrences in Wikipedia has been regarded as a state-of-the-art semantic relatedness measure in recent years. We provide an analysis of the important parameters of ESA using datasets in five different languages. Additionally, we propose the use of ESA with multiple lexical semantic resources, thus exploiting multiple evidence of term cooccurrence to improve over the Wikipedia-based measure. Exploiting the improved robustness and coverage of the proposed combination, we report improved performance over single resources in word semantic relatedness, solving word choice problems, classification of semantic relations between nominals, and text similarity. 0 0
Combining multiple disambiguation methods for gene mention normalization Xia N.
Hong Lin
Zhenglu Yang
Yanyan Li
BioCreative II
Gene mention normalization
Gene symbol disambiguation
Web-based kernel
Expert Systems with Applications English The rapid growth of biomedical literature prompts pervasive concentrations of biomedical text mining community to explore methodology for accessing and managing this ever-increasing knowledge. One important task of text mining in biomedical literature is gene mention normalization which recognizes the biomedical entities in biomedical texts and maps each gene mention discussed in the text to unique organic database identifiers. In this work, we employ an information retrieval based method which extracts gene mention's semantic profile from PubMed abstracts for gene mention disambiguation. This disambiguation method focuses on generating a more comprehensive representation of gene mention rather than the organic clues such as gene ontology which has fewer co-occurrences with the gene mention. Furthermore, we use an existing biomedical resource as another disambiguation method. Then we extract features from gene mention detection system's outcome to build a false positive filter according to Wikipedia's retrieved documents. Our system achieved F-measure of 83.1% on BioCreative II GN test data. © 2011 Elsevier Ltd. All rights reserved. 0 0
Comparing methods for single paragraph similarity analysis Stone B.
Dennis S.
Kwantes P.J.
Corpus construction
Corpus preprocessing
Paragraph similarity
Semantic models
Wikipedia corpora
Topics in Cognitive Science English The focus of this paper is two-fold. First, similarities generated from six semantic models were compared to human ratings of paragraph similarity on two datasets: 23 World Entertainment News Network paragraphs and 50 ABC newswire paragraphs. Contrary to findings on smaller textual units such as word associations (Griffiths, Tenenbaum, & Steyvers, 2007), our results suggest that when single paragraphs are compared, simple nonreductive models (word overlap and vector space) can provide better similarity estimates than more complex models (LSA, Topic Model, SpNMF, and CSM). Second, various methods of corpus creation were explored to facilitate the semantic models' similarity estimates. Removing numeric and single characters, and also truncating document length improved performance. Automated construction of smaller Wikipedia-based corpora proved to be very effective, even improving upon the performance of corpora that had been chosen for the domain. Model performance was further improved by augmenting corpora with dataset paragraphs. 0 0
Comparison of different ontology-based query expansion algorithms for effective image retrieval Leung C.H.C.
Yanyan Li
Concept distance
Image retrieval
Query expansion
Communications in Computer and Information Science English We study several semantic concept-based query expansion and re-ranking schemes and compare different ontology-based expansion methods in image search and retrieval. In particular, we exploit the two concept similarities of different concept expansion ontologies: WordNet Similarity and Wikipedia Similarity. Furthermore, we compare the keywords' semantic distance with the precision of image search results with query expansion according to different concept expansion algorithms. We also compare the image retrieval precision of searching with the expanded query and the original plain query. Preliminary experiments demonstrate that the two proposed retrieval mechanisms have the potential to outperform unaided approaches. 0 0
Comparison of wiki-based process modeling systems ACM International Conference Proceeding Series English 0 0
Computer literacy as life skills for a web 2.0 world Turk J. Class wiki
Computer literacy
Critical thinking
General education
Information literacy
SIGCSE'11 - Proceedings of the 42nd ACM Technical Symposium on Computer Science Education English This paper presents an innovative computer literacy course that focuses solely on developing skills needed for life in a networked world in which one must protect oneself from identity theft, be careful posting on social networks, and use credit and debit cards wisely. The course emphasizes ethical responsibility and information literacy. Its target audience, first-year, non-computer science majors, learn what they need to know to use technology safely, effectively, efficiently, and ethically. The course is grounded in active learning, such as posting in a class wiki, and critical thinking. It is a radical alternative to a traditional software packages approach. The paper documents the need for this course, The Digital Person, and its blend of content and pedagogy. Data from three years of offering the course provide an assessment of its effectiveness and value to those who have taken it. A review of SIGCSE literature in the last ten years finds no representation of this creative approach. 0 0
Concept based modeling approach for blog classification using fuzzy similarity Ayyasamy R.K.
Tahayna B.
Alhashmi S.M.
Eu-Gene S.
Blog classification
Fuzzy similarity
Proceedings - 2011 8th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2011 English As information technology is developing at a faster pace, there is a steep increase in social networking, where users can share their knowledge, views, and criticism through various channels such as blogging, Facebook, microblogging, news, forums, etc. Among these channels, blogs play a different role, as a blog is a personal site for each user and a blogger writes lengthy posts on various topics. Several research works have been carried out to classify blogs based on machine learning techniques. In this paper, we describe a method for classifying blog posts automatically using fuzzy similarity. We perform experiments using the TREC dataset and apply our approach to six different fuzzy similarity measures. Experimental results show that the Einstein fuzzy similarity measure performs better than the other measures. 0 0
Concept disambiguation exploiting semantic databases Hossucu A.G.
Ayyildiz H.
Gokturk Z.O.
Concept disambiguation
Linked data
Semantic databases
Proceedings of the International Workshop on Semantic Web Information Management, SWIM 2011 English This paper presents a novel approach for resolving ambiguities in concepts that already reside in semantic databases such as Freebase and DBpedia. Different from standard dictionaries and lexical databases, semantic databases provide a rich hierarchy of semantic relations in ontological structures. Our disambiguation approach decides on the implied sense by computing concept similarity measures as a function of semantic relations defined in ontological graph representation of concepts. Our similarity measures also utilize Wikipedia descriptions of concepts. We performed a preliminary experimental evaluation, measuring disambiguation success rate and its correlation with input text content. The results show that our method outperforms well-known disambiguation methods. 0 0
Concept-based document classification using Wikipedia and value function Pekka Malo
Ankur Sinha
Jyrki Wallenius
Pekka Korhonen
Journal of the American Society for Information Science and Technology English In this article, we propose a new concept-based method for document classification. The conceptual knowledge associated with the words is drawn from Wikipedia. The purpose is to utilize the abundant semantic relatedness information available in Wikipedia in an efficient value function-based query learning algorithm. The procedure learns the value function by solving a simple linear programming problem formulated using the training documents. The learning involves a step-wise iterative process that helps in generating a value function with an appropriate set of concepts (dimensions) chosen from a collection of concepts. Once the value function is formulated, it is utilized to make a decision between relevance and irrelevance. The value assigned to a particular document from the value function can be further used to rank the documents according to their relevance. Reuters newswire documents have been used to evaluate the efficacy of the procedure. An extensive comparison with other frameworks has been performed. The results are promising. 0 0
Concept-based information retrieval using explicit semantic analysis Egozi O.
Shaul Markovitch
Evgeniy Gabrilovich
Concept-based retrieval
Explicit semantic analysis
Feature selection
Semantic search
ACM Transactions on Information Systems English Information retrieval systems traditionally rely on textual keywords to index and retrieve documents. Keyword-based retrieval may return inaccurate and incomplete results when different keywords are used to describe the same concept in the documents and in the queries. Furthermore, the relationship between these related keywords may be semantic rather than syntactic, and capturing it thus requires access to comprehensive human world knowledge. Concept-based retrieval methods have attempted to tackle these difficulties by using manually built thesauri, by relying on term cooccurrence data, or by extracting latent word relationships and concepts from a corpus. In this article we introduce a new concept-based retrieval approach based on Explicit Semantic Analysis (ESA), a recently proposed method that augments keyword-based text representation with concept-based features, automatically extracted from massive human knowledge repositories such as Wikipedia. Our approach generates new text features automatically, and we have found that high-quality feature selection becomes crucial in this setting to make the retrieval more focused. However, due to the lack of labeled data, traditional feature selection methods cannot be used, hence we propose new methods that use self-generated labeled training data. The resulting system is evaluated on several TREC datasets, showing superior performance over previous state-of-the-art results. 0 0
ConceptMapWiki - A collaborative framework for agglomerating pedagogical knowledge Lauri Lahti Concept map
Intelligent tutoring systems
Knowledge maturing
Proceedings of the 2011 11th IEEE International Conference on Advanced Learning Technologies, ICALT 2011 English We propose a new educational framework, ConceptMapWiki, that is a wiki representing pedagogical knowledge with a collection of concept maps which is collaboratively created, edited and browsed. The learners and educators provide complementing contribution to evolving shared knowledge structures that are stored in a relational database forming together inter-connected overlapping ontologies. Every contribution is stored supplied with time stamps and a user profile enabling to analyze maturing of knowledge according to various learner-driven criteria. Pedagogically motivated learning paths can be collaboratively defined and evaluated, and educational games can be incorporated based on browsing and editing concept maps. The proposed framework is believed to be the first wiki architecture of its kind, designed for personalized learning with an evolving knowledge repository relying on adaptive visual representations and sound pedagogical motivation. Initial experiments with a functional online prototype indicate promising educational gain and suggest further research. 0 0
Conceptual indexing of documents using Wikipedia Carlo Abi Chahine
Nathalie Chaignaud
Kotowicz J.-P.
Pecuchet J.-P.
Directed acyclic graph
Document indexing
Keyword and topic extraction
Proceedings - 2011 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2011 English This paper presents an indexing support system that suggests for librarians a set of topics and keywords relevant to a pedagogical document. Our method of document indexing uses the Wikipedia category network as a conceptual taxonomy. A directed acyclic graph is built for each document by mapping terms (one or more words) to a concept in the Wikipedia category network. Properties of the graph are used to weight these concepts. This allows the system to extract so-called important concepts from the graph and to disambiguate terms of the document. According to these concepts, topics and keywords are proposed. This method has been evaluated by the librarians on a corpus of French pedagogical documents. 0 0
Conceptual query expansion and visual search results exploration for Web image retrieval Hoque E.
Strong G.
Hoeber O.
Gong M.
Conceptual query expansion
Image search results organization
Interactive exploration
Web image retrieval
Advances in Intelligent and Soft Computing English Most approaches to image retrieval on the Web have their basis in document search techniques. Images are indexed based on the text that is related to the images. Queries are matched to this text to produce a set of search results, which are organized in paged grids that are reminiscent of lists of documents. Due to ambiguity both with the user-supplied query and with the text used to describe the images within the search index, most image searches contain many irrelevant images distributed throughout the search results. In this paper we present a method for addressing this problem. We perform conceptual query expansion using Wikipedia in order to generate a diverse range of images for each query, and then use a multi-resolution self organizing map to group visually similar images. The resulting interface acts as an intelligent search assistant, automatically diversifying the search results and then allowing the searcher to interactively highlight and filter images based on the concepts, and zoom into an area within the image space to show additional images that are visually similar. 0 0
Constraint optimization approach to context based word selection Matsuno J.
Toru Ishida
IJCAI International Joint Conference on Artificial Intelligence English Consistent word selection in machine translation is currently realized by resolving word sense ambiguity through the context of a single sentence or neighboring sentences. However, consistent word selection over the whole article has yet to be achieved. Consistency over the whole article is extremely important when applying machine translation to collectively developed documents like Wikipedia. In this paper, we propose to consider constraints between words in the whole article based on their semantic relatedness and contextual distance. The proposed method is successfully implemented in both statistical and rule-based translators. We evaluate those systems by translating 100 articles in the English Wikipedia into Japanese. The results show that the ratio of appropriate word selection for common nouns increased to around 75% with our method, while it was around 55% without our method. 0 0
Constructing a semantic wiki-based teamwork platform for collaborative e-research 2011 International Conference on Electrical and Control Engineering, ICECE 2011 - Proceedings English 0 0
Content disputes in Wikipedia reflect geopolitical instability Gordana Apic
Matthew J. Betts
Robert B. Russell
English Indicators that rank countries according to socioeconomic measurements are important tools for regional development and political reform. Those currently in widespread use are sometimes criticized for a lack of reproducibility or the inability to compare values over time, necessitating simple, fast and systematic measures. Here, we applied the 'guilt by association' principle often used in biological networks to the information network within the online encyclopedia Wikipedia to create an indicator quantifying the degree to which pages linked to a country are disputed by contributors. The indicator correlates with metrics of governance, political or economic stability about as well as they correlate with each other, and though faster and simpler, it is remarkably stable over time despite constant changes in the underlying disputes. For some countries, changes over a four year period appear to correlate with world events related to conflicts or economic problems. 0 1
Content neutrality for Wiki systems: From neutral point of view (NPOV) to every point of view (EPOV) Cap C.H. Content neutrality
Distributed Wiki
Neutral point of view
Wiki systems
Proceedings of the 4th International Conference on Internet Technologies and Applications, ITA 11 English The neutral point of view (NPOV) cornerstone of Wikipedia (WP) is challenged for next generation knowledge bases. An empirical test is made with two WP articles. A case is built for content neutrality as a new, every point of view (EPOV) guiding principle. The architectural implications of content neutrality are discussed and translated into novel concepts of Wiki architectures. Guidelines for implementing this architecture are presented. Although NPOV is heavily criticized, the contribution avoids ideological controversy but rather focuses on the benefits and characteristics of the novel approach. 0 0
Content-based recommendation algorithms on the Hadoop mapreduce framework De Pessemier T.
Vanhecke K.
Dooms S.
Martens L.
Cloud computing
Content-based recommendations
Recommender system
WEBIST 2011 - Proceedings of the 7th International Conference on Web Information Systems and Technologies English Content-based recommender systems are widely used to generate personal suggestions for content items based on their metadata description. However, due to the required (text) processing of these metadata, the computational complexity of the recommendation algorithms is high, which hampers their application in large-scale. This computational load reinforces the necessity of a reliable, scalable and distributed processing platform for calculating recommendations. Hadoop is such a platform that supports data-intensive distributed applications based on map and reduce tasks. Therefore, we investigated how Hadoop can be utilized as a cloud computing platform to solve the scalability problem of content-based recommendation algorithms. The various MapReduce operations, necessary for keyword extraction and generating content-based suggestions for the end-user, are elucidated in this paper. Experimental results on Wikipedia articles prove the appropriateness of Hadoop as an efficient and scalable platform for computing content-based recommendations. 0 0
Contrasts in student engagement, meaning-making, dislikes, and challenges in a discovery-based program of game design learning Rebecca Reynolds
Caperton I.H.
Cognitive load
Community of practice
Design-based research
Digital divide
Digital literacy
Discovery-based learning
Game design
Information literacy
Productive failure
Project-based learning
Self-determination theory
Social learning system
Social media
West Virginia
Educational Technology Research and Development English This implementation study explores middle school, high school and community college student experiences in Globaloria, an educational pilot program of game design offered in schools within the U.S. state of West Virginia, supported by a non-profit organization based in New York City called the World Wide Workshop Foundation. This study reports on student engagement, meaning making and critique of the program, in their own words. The study's data source was a mid-program student feedback survey implemented in Pilot Year 2 (2008/2009) of the 5 year design-based research initiative, in which the researchers posed a set of open-ended questions in an online survey questionnaire answered by 199 students. Responses were analyzed using inductive textual analysis. While the initial purpose for data collection was to elicit actionable program improvements as part of a design-based research process, several themes emergent in the data tie into recent debates in the education literature around discovery-based learning. In this paper, we draw linkages from the categories of findings that emerged in student feedback to this literature, and identify new scholarly research questions that can be addressed in the ongoing pilot, the investigation of which might contribute new empirical insights related to recent critiques of discovery based learning, self-determination theory, and the productive failure phenomenon. 0 0
Cooperative WordNet editor for lexical semantic acquisition Szymanski J. Acquisition
Collaborative editing
Lexical semantic
Semantic dictionaries
Communications in Computer and Information Science English The article describes an approach for building a WordNet semantic dictionary in a collaborative paradigm. The presented system enables functionality for gathering lexical data in a Wikipedia-like style. The core of the system is a user-friendly interface based on a component for interactive graph navigation. The component has been used for presenting the WordNet semantic network on a web page, and it lets a distributed group of people modify its content. 0 0
Cooperative or collaborative literacy practices: Mapping metadiscourse in a business students' Wiki group project Australasian Journal of Educational Technology English 0 0
Coping with the dynamics of open, social media on mobile devices with mobile facets Kleinen A.
Scherp A.
Staab S.
Dynamics of social media
Mobile computing
Proceedings of the 4th International Workshop on Semantic Ambient Media Experience, SAME 2011, in Conjunction with the 5th International Convergence on Communities and Technologies English When traveling to a foreign city or wanting to know what is happening in one's home area, users today often search and explore different social media platforms. In order to provide different social media sources in an integrated manner on a mobile device, we have developed Mobile Facets. Mobile Facets allows for the faceted, interactive search and exploration of social media on a touchscreen mobile phone. The social media is queried live from different data sources and professional content sources like DBpedia, a Semantic Web version of Wikipedia, the event directories Eventful and Upcoming, geo-located Flickr photos, and GeoNames. Mobile Facets provides an integrated retrieval and interactive exploration of resources from these social media sources such as places, persons, organizations, and events. One does not know in advance how many facets the application will receive from such sources in a specific contextual situation and how many data items for the facets will be provided. Thus, the user interface of Mobile Facets is to be designed to cope with these dynamics of social media. 0 0
Creating and Exploiting a Hybrid Knowledge Base for Linked Data Zareen Syed
Tim Finin
Information extraction
Knowledge base
Linked data
Semantic web
Communications in Computer and Information Science English Twenty years ago Tim Berners-Lee proposed a distributed hypertext system based on standard Internet protocols. The Web that resulted fundamentally changed the ways we share information and services, both on the public Internet and within organizations. That original proposal contained the seeds of another effort that has not yet fully blossomed: a Semantic Web designed to enable computer programs to share and understand structured and semi-structured information easily. We will review the evolution of the idea and technologies to realize a Web of Data and describe how we are exploiting them to enhance information retrieval and information extraction. A key resource in our work is Wikitology, a hybrid knowledge base of structured and unstructured information extracted from Wikipedia. 0 0
Creating and using a Wiki textbook to teach management information systems Olson T. Collaboration
On-line textbook
Web 2.0
Proceedings of the International Conference on e-Learning, ICEL English The Management Information System (MIS) Department has developed and utilizes a wiki textbook for our undergraduate introductory course. The web based wiki contains content ranging from classics discussing databases to recent articles describing global enterprise security. The wiki textbook was originally developed to address the rapid changes in the information technology field and the challenges of keeping a traditional textbook current. In addition to providing timely content for students, there are a number of other benefits in using a wiki textbook: students only need to have access to an internet browser; students are able to download and save posted readings; the total saving for students has exceeded $350,000; there is no textbook for students to carry; and the content is always current, including RSS feeds from multiple news sources. In discussion with several book publishers and scholars on using wiki technology in the classroom, our university feels this topic has tremendous potential both for students and educators in the field of MIS. 0 0
Creating categories for Wikipedia articles using self-organizing maps Szymanski J. Documents clustering
Principal Component Analysis
Self Organizing Maps
Text processing
Text representation
2011 International Conference on Communications, Computing and Control Applications, CCCA 2011 English The article presents the results of experiments performed on a selected subset of Wikipedia, which we categorized automatically. We analyze two methods of text representation: based on references and word content. Using them we introduced a joint representation that has been used to build groups of similar articles based on Kohonen Self-Organizing Maps. To make the data processing efficient, we performed dimensionality reduction of the raw data using Principal Component Analysis performed on the similarity matrix. Changing the granularity of the SOM network makes it possible to build hierarchical categories and find significant relations between articles in a document repository. 0 0
Creating online collaborative environments for museums: A case study of a museum wiki Alison Hsiang-Yi Liu
Jonathan P. Bowen
Collaborative learning
Community of practice
Knowledge management
Online community
Int. J. Web Based Communities English Museums have been increasingly adopting Web 2.0 technology to reach and interact with their visitors. Some have experimented with wikis to allow both curators and visitors to provide complementary information about objects in the museum. An example of this is the Object Wiki from the Science Museum in London. Little has been done to study these interactions in an academic framework. In the field of knowledge management, the concept of 'communities of practice' has been posited as a suitable structure in which to study how knowledge is developed within a community with a common interest in a particular domain, using a sociological approach. Previously this has been used in investigating the management of knowledge within business organisations, teachers' professional development, and online e-learning communities. The authors apply this approach to a museum-based wiki to assess its applicability for such an endeavour. 0 0
Creative Commons licenses and the non-commercial condition Daniel Mietchen
Robert A. Morris
Donat Agosti
Lyubomir Penev
Walter G. Berendsohn
Donald Hobern
ZooKeys English The Creative Commons (CC) licenses are a suite of copyright-based licenses defining terms for the distribution and re-use of creative works. CC provides licenses for different use cases and includes open content licenses such as the Attribution license (CC BY, used by many Open Access scientific publishers) and the Attribution Share Alike license (CC BY-SA, used by Wikipedia, for example). However, the license suite also contains non-free and non-open licenses like those containing a “non-commercial” (NC) condition. Although many people identify “non-commercial” with “non-profit”, detailed analysis reveals that significant differences exist and that the license may impose some unexpected re-use limitations on works thus licensed. After providing background information on the concepts of Creative Commons licenses in general, this contribution focuses on the NC condition, its advantages, disadvantages and appropriate scope. Specifically, it contributes material towards a risk analysis for potential re-users of NC-licensed works. 0 0
Credibility Assessment Using Wikipedia for Messages on Social Network Services Yu Suzuki
Akiyo Nadamoto
Social Network Service
DASC English 0 0
Credibility judgment and verification behavior of college students concerning Wikipedia Lim
S. and Simon
Wikipedia; credibility; theory of bounded rationality; verification; college students First Monday This study examines credibility judgments in relation to peripheral cues and genre of Wikipedia articles, and attempts to understand user information verification behavior based on the theory of bounded rationality. Data were collected employing both an experiment and a survey at a large public university in the midwestern United States in Spring 2010. This study shows some interesting patterns. It appears that the effect of peripheral cues on credibility judgments differed according to genre. Those who did not verify information displayed a higher level of satisficing than those who did. Students used a variety of peripheral cues of Wikipedia. The exploratory data show that peer endorsement may be more important than formal authorities for user generated information sources, such as Wikipedia, which calls for further research. 0 0
Critical Point of View: A Wikipedia Reader Amila Akdag Salah
Nicholas Carr
Shun-ling Chen
Florian Cramer
Morgan Currie
Edgar Enyedy
Andrew Famiglietti
Heather Ford
Mayo Fuster Morell
Cheng Gao
R. Stuart Geiger
Mark Graham
Gautam John
Dror Kamir
Peter B. Kaufman
Scott Kildall
Lawrence Liang
Patrick Lichty
Geert Lovink
Hans Varghese Mathews
Johanna Niesyto
Mathieu O’Neil
Dan O’Sullivan
Joseph M. Reagle
Andrea Scharnhorst
Alan Shapiro
Christian Stegbauer
Nathaniel Stern
Krzysztof Suchecki
Nathaniel Tkacz
Maja van der Velden
Institute of Network Cultures English For millions of internet users around the globe, the search for new knowledge begins with Wikipedia. The encyclopedia’s rapid rise, novel organization, and freely offered content have been marveled at and denounced by a host of commentators. Critical Point of View moves beyond unflagging praise, well-worn facts, and questions about its reliability and accuracy, to unveil the complex, messy, and controversial realities of a distributed knowledge platform. 0 4
Cross lingual text classification by mining multilingual topics from Wikipedia Xiaochuan Ni
Sun J.-T.
Jian Hu
Zheng Chen
Cross lingual text classification
Topic modeling
Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011 English This paper investigates how to effectively do cross lingual text classification by leveraging a large scale and multilingual knowledge base, Wikipedia. Based on the observation that each Wikipedia concept is described by documents of different languages, we adapt existing topic modeling algorithms for mining multilingual topics from this knowledge base. The extracted topics have multiple types of representations, with each type corresponding to one language. In this work, we regard such topics extracted from Wikipedia documents as universal-topics, since each topic corresponds with same semantic information of different languages. Thus new documents of different languages can be represented in a space using a group of universal-topics. We use these universal-topics to do cross lingual text classification. Given the training data labeled for one language, we can train a text classifier to classify the documents of another language by mapping all documents of both languages into the universal-topic space. This approach does not require any additional linguistic resources, like bilingual dictionaries, machine translation tools, or labeling data for the target language. The evaluation results indicate that our topic modeling approach is effective for building cross lingual text classifier. Copyright 2011 ACM. 0 0
Cross-Language Information Retrieval Using Meta-language Index Construction and Structural Queries Fariborz Mahmoudi Amir Hossein Jadidinejad Wikipedia-Mining
Indri Structural Query Language
Proceeding of the Multilingual Information Access Evaluation I. Text Retrieval Experiments, Lecture Notes in Computer Science, Volume 6241/2011, pp. 70-77 Structural Query Language allows expert users to richly represent their information needs, but unfortunately the complexity of SQLs makes them impractical in Web search engines. Automatically detecting the concepts in an unstructured user’s information need and generating a richly structured, multilingual equivalent query is an ideal solution. We utilize Wikipedia as a great concept repository and also some state of the art algorithms for extracting Wikipedia’s concepts from the user’s information need. This process is called “Query Wikification”. Our experiments on the TEL corpus at CLEF2009 achieve +23% and +17% improvement in Mean Average Precision and Recall against the baseline. Our approach is unique in that it improves both precision and recall, two measures for which improving one often hurts the other. 0 0
Cross-domain Dutch coreference resolution De Clercq O.
Hoste V.
Hendrickx I.
International Conference Recent Advances in Natural Language Processing, RANLP English This article explores the portability of a coreference resolver across a variety of eight text genres. Besides newspaper text, we also include administrative texts, autocues, texts used for external communication, instructive texts, Wikipedia texts, medical texts and unedited new media texts. Three sets of experiments were conducted. First, we investigated each text genre individually, and studied the effect of larger training set sizes and including genre-specific training material. Then, we explored the predictive power of each genre for the other genres conducting cross-domain experiments. In a final step, we investigated whether excluding genres with less predictive power increases overall performance. For all experiments we use an existing Dutch mention-pair resolver and report on our experimental results using four metrics: MUC, B-cubed, CEAF and BLANC. We show that resolving out-of-domain genres works best when enough training data is included. This effect is further intensified by including a small amount of genre-specific text. As far as the cross-domain performance is concerned we see that especially genres of a very specific nature tend to have less generalization power. 0 0
Cross-language information retrieval with latent topic models trained on a comparable corpus Vulic I.
De Smet W.
Moens M.-F.
Comparable corpora
Cross-language retrieval
Document models
Multilingual retrieval
Topic models
Lecture Notes in Computer Science English In this paper we study cross-language information retrieval using a bilingual topic model trained on comparable corpora such as Wikipedia articles. The bilingual Latent Dirichlet Allocation model (BiLDA) creates an interlingual representation, which can be used as a translation resource in many different multilingual settings as comparable corpora are available for many language pairs. The probabilistic interlingual representation is incorporated in a statistical language model for information retrieval. Experiments performed on the English and Dutch test datasets of the CLEF 2001-2003 CLIR campaigns show the competitive performance of our approach compared to cross-language retrieval methods that rely on pre-existing translation dictionaries that are hand-built or constructed based on parallel corpora. 0 0
Cross-lingual recommendations in a resource-based learning scenario Schmidt S.
Scholl P.
Rensing C.
Steinmetz R.
Cross-Language Semantic Relatedness
Explicit Semantic Analysis
Reference Corpus
Lecture Notes in Computer Science English CROKODIL is a platform supporting resource-based learning scenarios for self-directed, on-task learning with web resources. As CROKODIL enables the forming of possibly large learning communities, the stored data is growing in a large scale. Thus, an appropriate recommendation of tags and learning resources becomes increasingly important for supporting learners. We propose semantic relatedness between tags and resources as a basis of recommendation and identify Explicit Semantic Analysis (ESA) using Wikipedia as reference corpus as a viable option. However, data from CROKODIL shows that tags and resources are often composed in different languages. Thus, a monolingual approach to provide recommendations is not applicable in CROKODIL. Thus, we examine strategies for providing mappings between different languages, extending ESA to provide cross-lingual capabilities. Specifically, we present mapping strategies that utilize additional semantic information contained in Wikipedia. Based on CROKODIL's application scenario, we present an evaluation design and show results of cross-lingual ESA. 0 0
Crowd-based data sourcing (Abstract) Milo T. Lecture Notes in Computer Science English Harnessing a crowd of Web users for the collection of mass data has recently become a wide-spread phenomenon [9]. Wikipedia [20] is probably the earliest and best known example of crowd-sourced data and an illustration of what can be achieved with a crowd-based data sourcing model. Other examples include social tagging systems for images, which harness millions of Web users to build searchable databases of tagged images; traffic information aggregators like Waze [17]; and hotel and movie ratings like TripAdvisor [19] and IMDb [18]. 0 0
Cultural Configuration of Wikipedia: measuring Autoreferentiality in Different Languages Marc Miquel
Horacio Rodríguez
Natural Language Processing
Topic Coverage
Proceedings of Recent Advances in Natural Language Processing, 2011, pg. 316--322 Among the motivations to write in Wikipedia given by the current literature there is often coincidence, but none of the studies presents the hypothesis of contributing for the visibility of the own national or language related content. Similar to topical coverage studies, we outline a method which allows collecting the articles of this content, to later analyse them in several dimensions. To prove its universality, the tests are repeated for up to twenty language editions of Wikipedia. Finally, through the best indicators from each dimension we obtain an index which represents the degree of autoreferentiality of the encyclopedia. Last, we point out the impact of this fact and the risk of not considering its existence in the design of applications based on user generated content. 0 0
Cultural bias in Wikipedia content on famous persons Ewa S. Callahan
Susan C. Herring
Journal of the American Society for Information Science and Technology English Wikipedia advocates a strict "neutral point of view" (NPOV) policy. However, although originally a U.S.-based, English-language phenomenon, the online, user-created encyclopedia now has versions in many languages. This study examines the extent to which content and perspectives vary across cultures by comparing articles about famous persons in the Polish and English editions of Wikipedia. The results of quantitative and qualitative content analyses reveal systematic differences related to the different cultures, histories, and values of Poland and the United States; at the same time, a U.S./English-language advantage is evident throughout. In conclusion, the implications of these findings for the quality and objectivity of Wikipedia as a global repository of knowledge are discussed, and recommendations are advanced for Wikipedia end users and content developers. 22 2
Cultural configuration of Wikipedia: Measuring autoreferentiality in different languages Ribe M.M.
Rodriguez H.
International Conference Recent Advances in Natural Language Processing, RANLP English Among the motivations to write in Wikipedia given by the current literature there is often coincidence, but none of the studies presents the hypothesis of contributing for the visibility of the own national or language related content. Similar to topical coverage studies, we outline a method which allows collecting the articles of this content, to later analyse them in several dimensions. To prove its universality, the tests are repeated for up to twenty language editions of Wikipedia. Finally, through the best indicators from each dimension we obtain an index which represents the degree of autoreferentiality of the encyclopedia. Last, we point out the impact of this fact and the risk of not considering its existence in the design of applications based on user generated content. 0 0
Curriculum-guided crowd-sourcing of assessments for primary schools in a developing country Zualkernan I.A.
Raza A.
Karim A.
Developing world
Primary education
Proceedings of the 2011 11th IEEE International Conference on Advanced Learning Technologies, ICALT 2011 English Success of Wikipedia has opened a number of possibilities for crowd sourcing knowledge. However, not all crowd sourcing initiatives are successful. This paper presents a preliminary study to determine if teachers in a developing country are able to create quality multiple-choice questions for primary school students. In addition, an adoption model is developed and evaluated to ascertain if the teachers would actually contribute to such a Wiki. Results are that a reasonable number of teachers are able to formulate quality questions in Science and English and that there is a strong intention to use such a system. However, there is no obvious relationship between the intention to use and an ability to pose good assessments. 0 0
D-cores: Measuring collaboration of directed graphs based on degeneracy Giatsidis C.
Thilikos D.M.
Vazirgiannis M.
Community evaluation metrics
Graph mining
Proceedings - IEEE International Conference on Data Mining, ICDM English Community detection and evaluation is an important task in graph mining. In many cases, a community is defined as a subgraph characterized by dense connections or interactions among its nodes. A large variety of measures have been proposed to evaluate the quality of such communities - in most cases ignoring the directed nature of edges. In this paper, we introduce novel metrics for evaluating the collaborative nature of directed graphs - a property not captured by the single node metrics or by other established community evaluation metrics. In order to accomplish this objective, we capitalize on the concept of graph degeneracy and define a novel D-core framework, extending the classic graph-theoretic notion of k-cores for undirected graphs to directed ones. Based on the D-core, which essentially can be seen as a measure of the robustness of a community under degeneracy, we devise a wealth of novel metrics used to evaluate graph collaboration features of directed graphs. We applied the D-core approach on large real-world graphs such as Wikipedia and DBLP and report interesting results at the graph level as well as at the node level. 0 0
DART3: DHS assistant for R&D tracking and technology transfer Burns L.
Selby C.
Longstaff T.
Matching algorithm
Technology transition
ACM International Conference Proceeding Series English Department of Homeland Security (DHS) Assistant for Research and development Tracking and Technology Transition (DART3) is a web-based implementation designed to capture US Federally funded research and development (R&D) project descriptions and the DHS Cyber Security and Communications (CS&C) R&D requirements and use this information to plan a set of transition activities that accelerate the deployment of relevant R&D results to CS&C operational systems. 0 0
DBWiki: A structured wiki for curated data and collaborative data management Peter Buneman
James Cheney
Sam Lindley
Muller H.
Proceedings of the ACM SIGMOD International Conference on Management of Data English Wikis have proved enormously successful as a means to collaborate in the creation and publication of textual information. At the same time, a large number of curated databases have been developed through collaboration for the dissemination of structured data in specific domains, particularly bioinformatics. We demonstrate a general-purpose platform for collaborative data management, DBWiki, designed to achieve the best of both worlds. Our system not only facilitates the collaborative creation of a database; it also provides features not usually provided by database technology such as versioning, provenance tracking, citability, and annotation. In our demonstration we will show how DBWiki makes it easy to create, correct, discuss and query structured data, placing more power in the hands of users while managing tedious details of data curation automatically. 0 0
DBpedia Spotlight: Shedding Light on the Web of Documents Pablo N. Mendes
Max Jakob
Andrés García-Silva
Christian Bizer
Text Annotation
Linked data
Named Entity Disambiguation
International Conference on Semantic Systems English 0 0
Deconstructing Wikipedia: Collaborative Content Creation in an Open Process Platform Andrew Feldstein Procedia - Social and Behavioral Sciences English Collaboration in Wikipedia articles has widely been touted as a great leap forward and an example of how technology can be leveraged to improve collaborative processes. If we focus on the creation of individual articles, what does that creation process look like? Information was collected from the Revision History Statistics page of thirty Wikipedia featured articles to examine variables such as number of edits, number of editors and total edits by the largest contributors to a given article. This small pilot study suggests that the article creation process may more closely mirror the traditional writer/editor process than it does the “crowd as writer-editor”. It also raises questions about potential changes in how people view the content creation process. 0 0
Defining ontology by using users collaboration on social media Kamran S.
Crestani F.
Information retrieval
Semantic networks
Semantic relatedness
Social media
English A novel method is proposed for building a reliable ontology around specific concepts, by using the immense potential of active volunteering collaboration of detected knowledgeable users on social media. Copyright 2011 ACM. 0 0
Deployment of a low interaction honeypot in an organizational private network Chamotra S.
Bhatia J.S.
Kamal R.
Ramani A.K.
Proceedings of 2011 International Conference on Emerging Trends in Networks and Computer Communications, ETNCC2011 English This paper describes a case study of honeypot deployment in an organizational network. According to Wikipedia, a honeypot is a trap that is set to detect, deflect, or in some manner counteract attempts at unauthorized use of information systems [05]. These traps could be any digital resource, ranging from a single computer to a network of such computers or a network application, that appears to be part of the organizational network resources but is actually a fake resource with no production traffic. Further, these resources are closely monitored, and the traffic to and from them is well under the control of the administrator. In the experiment performed in this paper, such a trap is laid in the form of a low-interaction honeypot, honeyd [01], in the perimeter security of an organizational network. The results of the deployment are presented, and the various pros and cons of such deployments are discussed. 0 0
Design Mechanisms for MediaWiki to Support Collaborative Writing in a Mandatory Context Sumonta Kasemvilas Design
Information technology
Educational technology
English Because MediaWiki is not appropriate for use in the classroom setting due to its decentralization, arbitrariness, and sharing, its flexible characteristics complicate concepts of practical design when applying MediaWiki in a mandatory writing context. This dissertation identifies a need to add extensions to facilitate increased accountability, project management, discussion, and awareness based on a theoretical framework, proposes MediaWiki with some modifications as an innovative way to optimize the strengths associated with constructivist learning and social presence, and examines the results of those changes. Relevant theoretical perspectives are used to contextualize the potential significance of additional extensions of MediaWiki. Three categories of mechanisms in MediaWiki—role, awareness, and project management—were newly developed in this research. They are designed to increase project control and accountability. Discussion, chat, text editor, and online notification extensions were also installed and customized to meet the needs of the students. Two case studies were conducted in two separate graduate classes to test the value of the extensions. Quantitative and qualitative data were collected and analyzed. Use of qualitative methods helps add texture to quantitative findings. The findings illustrate some potential impact for classroom use. Delineation of the results in Case Study 1 and Case Study 2 provides well-grounded rationale for why the proposed new MediaWiki mechanisms positively impact collaborative writing. By applying a set of extended features to MediaWiki, some problems were solved and others were mitigated, but other problems were not resolved and new problems emerged. Thus, this study articulates the benefits and the additional problems using MediaWiki and extensions and suggests ways to improve the group writing process. Using MediaWiki in academia needs appropriate governance and proper technology. 
The results potentially offer new teaching mechanisms for graduate students involved with collaborative writing. The study holds promise in improving collaborative efforts in mandatory group writing projects and discusses a way to facilitate collaborative writing in this context. Implications of this study can assist researchers and developers in understanding what effects the extensions have on users. 26 0
Design and implementation of the sweble wikitext parser: Unlocking the structured data of Wikipedia Hannes Dohrn
Dirk Riehle
Abstract syntax tree
Parsing expression grammar
WikiSym 2011 Conference Proceedings - 7th Annual International Symposium on Wikis and Open Collaboration English The heart of each wiki, including Wikipedia, is its content. Most machine processing starts and ends with this content. At present, such processing is limited, because most wiki engines today cannot provide a complete and precise representation of the wiki's content. They can only generate HTML. The main reason is the lack of well-defined parsers that can handle the complexity of modern wiki markup. This applies to MediaWiki, the software running Wikipedia, and most other wiki engines. This paper shows why it has been so difficult to develop comprehensive parsers for wiki markup. It presents the design and implementation of a parser for Wikitext, the wiki markup language of MediaWiki. We use parsing expression grammars where most parsers used no grammars or grammars poorly suited to the task. Using this parser, it is possible to directly and precisely query the structured data within wikis, including Wikipedia. The parser is available as open source from http://sweble.org. 0 0
Design guidelines for software processes knowledge repository development Garcia J.
Amescua A.
Sanchez M.-I.
Bermon L.
Agile development
Knowledge management
Software engineering
Software process technology
Web 2.0
Information and Software Technology English Context: Staff turnover in organizations is an important issue that should be taken into account mainly for two reasons: (1) employees carry an organization's knowledge in their heads and take it with them wherever they go; (2) knowledge accessibility is limited to the amount of knowledge employees want to share. Objective: The aim of this work is to provide a set of guidelines to develop knowledge-based Process Asset Libraries (PAL) to store software engineering best practices, implemented as a wiki. Method: Fieldwork was carried out in a 2-year training course in agile development. This was validated in two phases (with and without PAL), which were subdivided into two stages: Training and Project. Results: The study demonstrates that, on the one hand, the learning process can be facilitated using PAL to transfer software process knowledge, and on the other hand, products were developed by junior software engineers with a greater degree of independence. Conclusion: PAL, as a knowledge repository, helps software engineers to learn about development processes and improves the use of agile processes. © 2011 Elsevier B.V. All rights reserved. 0 0
Designing collaborative e-learning environments based upon semantic Wiki: From design models to application scenarios Yanyan Li
Mingkai Dong
Ronghuai Huang
Collaborative knowledge construction
E-learning 2.0
Interactive query
Semantic wiki
Educational Technology and Society English The knowledge society requires life-long learning and flexible learning environments that enable fast, just-in-time and relevant learning, aiding the development of communities of knowledge and linking learners and practitioners with experts. Based upon semantic wiki, a combination of wiki and Semantic Web technology, this paper designs and develops flexible e-learning environments for different application scenarios, aiming to facilitate collaborative knowledge construction and maximize resource sharing and utilization. One application scenario is to support hybrid learning by deploying an online course platform; the first round of use has shown that the course platform can effectively help students to complete task-driven learning in a more flexible and friendly collaborative manner. The other application scenario is to build a teamwork platform for supporting collaborative e-research. After several months' trial, team members agree that the platform meets the demands of their collaborative research work well, with the advantage of quick, easy and convenient operation. The kernel idea of the collaborative e-learning environments is to enable structural organization of resources with semantic association while providing diverse customized facilities. 0 0
Detecting community kernels in large social networks Lei Wang
Lou T.
Tang J.
Hopcroft J.E.
Auxiliary communities
Community kernel detection
Community kernels
Social network
Proceedings - IEEE International Conference on Data Mining, ICDM English In many social networks, there exist two types of users that exhibit different influence and different behavior. For instance, statistics have shown that less than 1% of the Twitter users (e.g. entertainers, politicians, writers) produce 50% of its content [1], while the others (e.g. fans, followers, readers) have much less influence and completely different social behavior. In this paper, we define and explore a novel problem called community kernel detection in order to uncover the hidden community structure in large social networks. We discover that influential users pay closer attention to those who are more similar to them, which leads to a natural partition into different community kernels. We propose GREEDY and WEBA, two efficient algorithms for finding community kernels in large social networks. GREEDY is based on maximum cardinality search, while WEBA formalizes the problem in an optimization framework. We conduct experiments on three large social networks: Twitter, Wikipedia, and Coauthor, which show that WEBA achieves an average 15%-50% performance improvement over the other state-of-the-art algorithms, and WEBA is on average 6-2,000 times faster in detecting community kernels. 0 0
Detecting the long-tail of points of interest in tagged photo collections Zigkolis C.
Papadopoulos S.
Kompatsiaris Y.
Athena Vakali
Proceedings - International Workshop on Content-Based Multimedia Indexing English The paper tackles the problem of matching the photos of a tagged photo collection to a list of "long-tail" Points Of Interest (PoIs), that is, PoIs that are not very popular and thus not well represented in the photo collection. Despite the significance of improving "long-tail" PoI photo retrieval for travel applications, most landmark detection methods to date have been tested on very popular landmarks. In this paper, we conduct a thorough empirical analysis comparing four baseline matching methods that rely on photo metadata, three variants of an approach that uses cluster analysis in order to discover PoI-related photo clusters, and a real-world retrieval mechanism (Flickr search) on a set of less popular PoIs. A user-based evaluation of the aforementioned methods is conducted on a Flickr photo collection of over 100,000 photos from 10 well-known touristic destinations in Greece. A set of 104 "long-tail" PoIs is collected for these destinations from Wikipedia, Wikimapia and OpenStreetMap. The results demonstrate that two of the baseline methods outperform Flickr search in terms of precision and F-measure, whereas two of the cluster-based methods outperform it in terms of recall and PoI coverage. We consider the results of this study valuable for enhancing the indexing of pictorial content in social media sites. 0 0
Detection of Text Quality Flaws as a One-class Classification Problem Maik Anderka
Benno Stein
Nedim Lipka
Information quality
Quality Flaw Prediction
One-class Classification
20th ACM Conference on Information and Knowledge Management (CIKM 11) English For Web applications that are based on user-generated content, the detection of text quality flaws is a key concern. Our research contributes to automatic quality flaw detection. In particular, we propose to cast the detection of text quality flaws as a one-class classification problem: we are given only positive examples (= texts containing a particular quality flaw) and decide whether or not an unseen text suffers from this flaw. We argue that common binary or multiclass classification approaches are ineffective here, and we underpin our approach with a real-world application: we employ a dedicated one-class learning approach to determine whether a given Wikipedia article suffers from certain quality flaws. Since in the Wikipedia setting the acquisition of sensible test data is quite intricate, we analyze the effects of a biased sample selection. In addition, we illustrate the classifier effectiveness as a function of the flaw distribution in order to cope with the unknown (real-world) flaw-specific class imbalances. Altogether, provided test data with little noise, four out of ten important quality flaws in Wikipedia can be detected with a precision close to 1. 0 0
Developing product innovation using Web 2.0: A field study Wallace S.
Ganesh J.
Kolhatkar A.
Singh R.
Case study
Innovation process
Knowledge management
Team wiki
Web 2.0
17th Americas Conference on Information Systems 2011, AMCIS 2011 English Firms rely on effective innovation processes to develop innovative products essential to their competitive strategy. Systems that support the innovation processes have strategic relevance and are critical to the firm's success and growth. More research is needed to explain how we can effectively coordinate the KM activities required for effective innovation processes. This paper will answer the question: How does Web 2.0 support effective innovation processes in product innovation? We need to better understand how a Web 2.0 platform can facilitate coordination, cooperation, and organizational learning and lead to improved innovation through more effective innovation processes. This paper develops an understanding of how Web 2.0 applications integrate and support the needs of the innovation processes for product innovation. We provide a detailed case study where Web 2.0 is used in the innovation process to show how it can be used to support KM for effective innovation processes. 0 0
Digital libraries and social web: Insights from wikipedia users' activities Zelenkauskaite A.
Paolo Massa
Digital libraries
Social web
User-centric design
Proceedings of the IADIS International Conferences - Web Based Communities and Social Media 2011, Social Media 2011, Internet Applications and Research 2011, Part of the IADIS, MCCSIS 2011 English The growing importance of social aspects within large-scale knowledge depositories such as digital libraries has become apparent over the last decade, given the ever-increasing number of digital depositories and users. Although this digital trend has influenced many users, little is known about how users navigate these online platforms. In this study, Wikipedia is used as a lens to analyze user activities within a large-scale online environment, in order to better understand user needs in online knowledge depositories. This study analyzed user activities in a real setting: the editing activities of 686,332 active contributors to the English Wikipedia were studied over a period of ten years. Their editing behaviors were compared based on different periods of permanence (longevity) within Wikipedia's content-oriented versus social-oriented namespaces. The results show that users with less than 21 days of longevity were more likely to interact in namespaces designated for social purposes, whereas users who remained from two to ten years were more likely to exploit functionalities related to content discussion. The implications of these findings are positioned within the collaborative learning framework, which postulates that users with different expertise levels have different exigencies. Since social functionalities were more frequently used by users who stayed for short periods of time, including such functionalities in online platforms can support this segment of users. 
This study aims to contribute to the design of online collaborative environments such as digital libraries, where social-oriented design would allow the creation of more sustainable environments built around the specific needs of diverse users. 0 0
Disambiguation and filtering methods in using web knowledge for coreference resolution Uryupina O.
Poesio M.
Claudio Giuliano
Kateryna Tymoshenko
Proceedings of the 24th International Florida Artificial Intelligence Research Society, FLAIRS - 24 English We investigate two publicly available web knowledge bases, Wikipedia and Yago, in an attempt to leverage semantic information and increase the performance level of a state-of-the-art coreference resolution (CR) engine. We extract semantic compatibility and aliasing information from Wikipedia and Yago, and incorporate it into a CR system. We show that using such knowledge with no disambiguation and filtering does not bring any improvement over the baseline, mirroring the previous findings (Ponzetto and Poesio 2009). We propose, therefore, a number of solutions to reduce the amount of noise coming from web resources: using disambiguation tools for Wikipedia, pruning Yago to eliminate the most generic categories and imposing additional constraints on affected mentions. Our evaluation experiments on the ACE-02 corpus show that the knowledge, extracted from Wikipedia and Yago, improves our system's performance by 2-3 percentage points. Copyright © 2011, Association for the Advancement of Artificial Intelligence. All rights reserved. 0 0
Discovering context: Classifying tweets through a semantic transform based on wikipedia Yegin Genc
Yasuaki Sakamoto
Nickerson J.V.
Latent semantic analysis
Text classification
Lecture Notes in Computer Science English By mapping messages into a large context, we can compute the distances between them, and then classify them. We test this conjecture on Twitter messages: Messages are mapped onto their most similar Wikipedia pages, and the distances between pages are used as a proxy for the distances between messages. This technique yields more accurate classification of a set of Twitter messages than alternative techniques using string edit distance and latent semantic analysis. 0 0
Discovering context: classifying tweets through a semantic transform based on wikipedia Yegin Genc
Yasuaki Sakamoto
Jeffrey V. Nickerson
Latent semantic analysis
Text classification
FAC English 0 0
Discussion about translation in Wikipedia Ari Hautasaari
Toru Ishida
Talk page
Proceedings - 2011 2nd International Conference on Culture and Computing, Culture and Computing 2011 English Discussion pages in individual Wikipedia articles are a channel for communication and collaboration between Wikipedia contributors. Although discussion pages make up a large portion of the online encyclopedia, there have been relatively few in-depth studies of the type of communication and collaboration in the multilingual Wikipedia, especially regarding translation activities. This paper reports the results of an analysis of discussion about translation in the Finnish, French and Japanese Wikipedias. The results highlight the main problems in Wikipedia translation that require interaction with the community. Contrary to what previous work reported, community interaction in Wikipedia translation focuses on solving problems in source referencing, proper nouns and transliteration in articles, rather than mechanical translation of words and sentences. Based on these findings, we propose future directions for supporting translation activities in Wikipedia. 0 0
Dissemination and control model of internet public opinion in the ubiquitous media environments Chen B.
Yu L.
Liu J.-T.
Chu W.-M.
Epidemic model
Internet public opinion
Ubiquitous media environments
Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice Chinese The gradual formation of the ubiquitous media environment has a profound effect on the dissemination and control of internet public opinion. This paper presents a novel propagation model with direct immunity, named SEIR, which generalizes the traditional epidemic model to the ubiquitous media environment. Existing models treat netizens' states and the propagation media of public opinion in rather simple ways; the proposed model overcomes these defects. The equilibrium point and stability of the model are proved, and the evolution rules are analyzed. The proposed control methods start from the internet public opinion environment and intervene early in the formation of public opinion. The paper then constructs an information propagation platform with self-purification capacity using wiki technology. Simulations of the platform's opinion control efficacy verify the effectiveness of the control methods. 0 0
Distributed tuning of machine learning algorithms using MapReduce Clusters Yasser Ganjisaffar
Debeauvais T.
Sara Javanmardi
Caruana R.
Lopes C.V.
Machine learning
Proceedings of the 3rd Workshop on Large Scale Data Mining: Theory and Applications, LDMTA 2011 - Held in Conjunction with ACM SIGKDD 2011 English Obtaining the best accuracy in machine learning usually requires carefully tuning learning algorithm parameters for each problem. Parameter optimization is computationally challenging for learning methods with many hyperparameters. In this paper we show that MapReduce Clusters are particularly well suited for parallel parameter optimization. We use MapReduce to optimize regularization parameters for boosted trees and random forests on several text problems: three retrieval ranking problems and a Wikipedia vandalism problem. We show how model accuracy improves as a function of the percent of parameter space explored, that accuracy can be hurt by exploring parameter space too aggressively, and that there can be significant interaction between parameters that appear to be independent. Our results suggest that MapReduce is a two-edged sword: it makes parameter optimization feasible on a massive scale that would have been unimaginable just a few years ago, but also creates a new opportunity for overfitting that can reduce accuracy and lead to inferior learning parameters. 0 0
Divergent and convergent knowledge processes on wikipedia Iassen Halatchliyski
Joachim Kimmerle
Ulrike Cress
Connecting Computer-Supported Collaborative Learning to Policy and Practice: CSCL 2011 Conf. Proc. - Short Papers and Posters, 9th International Computer-Supported Collaborative Learning Conf. English The paper presents a new theoretical consideration of knowledge processes bridging the individual and the collective level. Building on a differentiation between accommodation and assimilation of knowledge in wikis, we derive divergence and convergence from intelligence and creativity research and reconstruct their impact on the open-ended development of knowledge. The distinction from related CSCL constructs is elaborated. Using examples from Wikipedia, the definition of divergence and convergence is illustrated in the dynamic context of article development. 0 0
Document Topic Extraction Based on Wikipedia Category Jiali Yun
Liping Jing
Jian Yu
Houkuan Huang
Ying Zhang
Topic Extraction
Document Representation
Wikipedia Category
Semantic relatedness
CSO English 0 0
Does collaboration occur when children are learning with the support of a wiki? Turkish Online Journal of Educational Technology English 0 0
Don't bite the newbies: how reverts affect the quantity and quality of Wikipedia work Aaron Halfaker
Aniket Kittur
John Riedl
WikiSym English Reverts are important to maintaining the quality of Wikipedia. They fix mistakes, repair vandalism, and help enforce policy. However, reverts can also be damaging, especially to the aspiring editor whose work they destroy. In this research we analyze 400,000 Wikipedia revisions to understand the effect that reverts had on editors. We seek to understand the extent to which they demotivate users, reducing the workforce of contributors, versus the extent to which they help users improve as encyclopedia editors. Overall we find that reverts are powerfully demotivating, but that their net influence is that more quality work is done in Wikipedia as a result of reverts than is lost by chasing editors away. However, we identify key conditions – most specifically new editors being reverted by much more experienced editors – under which reverts are particularly damaging. We propose that reducing the damage from reverts might be one effective path for Wikipedia to solve the newcomer retention problem. 0 2
Don't leave me alone: Effectiveness of a framed wiki-based learning activity Nikolaos Tselios
Panagiota Altanopoulou
Vassilis Komis
Activity design
Collaborative learning
Learning outcome
Project based learning
Web 2.0
WikiSym 2011 Conference Proceedings - 7th Annual International Symposium on Wikis and Open Collaboration English In this paper, the effectiveness of a framed wiki-based learning activity is examined. A one-group pretest-posttest design was conducted towards this aim. The study involved 146 first year university students of a Greek Education Department using wikis to learn basic aspects and implications of search engines in the context of a first year course entitled "Introduction to ICT". Data analysis showed significant improvement in learning outcomes, in particular for students with low initial performance. The average students' questionnaire score jumped from 38.6% to 55%. In addition, a positive attitude towards using wikis in their project was expressed by the students. The design of the activity, the context of the study and the results obtained are discussed in detail. 0 0
EcoliWiki: a wiki-based community resource for Escherichia coli Brenley K. McIntosh
Daniel P. Renfro
Gwendowlyn S. Knapp
Chanchala R. Lairikyengbam
Nathan M. Liles
Lili Niu
Amanda M. Supak
Anand Venkatraman
Adrienne E. Zweifel
Deborah A. Siegele
James C. Hu
English EcoliWiki is the community annotation component of the PortEco (http://porteco.org; formerly EcoliHub) project, an online data resource that integrates information on laboratory strains of Escherichia coli, its phages, plasmids and mobile genetic elements. As one of the early adopters of the wiki approach to model organism databases, EcoliWiki was designed not only to facilitate community-driven sharing of biological knowledge about E. coli as a model organism, but also to be interoperable with other data resources. EcoliWiki content currently covers genes from five laboratory E. coli strains, 21 bacteriophage genomes, F plasmid and eight transposons. EcoliWiki integrates the MediaWiki wiki platform with other open-source software tools and in-house software development to extend how wikis can be used for model organism databases. EcoliWiki can be accessed online at http://ecoliwiki.net. 0 0
Edit wars in Wikipedia Róbert Sumi
Taha Yasseri
András Rung
András Kornai
János Kertész
IEEE Third International Conference on Social Computing English We present a new, efficient method for automatically detecting severe conflicts ("edit wars") in Wikipedia and evaluate this method on six different language WPs. We discuss how the number of edits, reverts, the length of discussions, and the burstiness of edits and reverts deviate in such pages from those following the general workflow, and argue that earlier work has significantly over-estimated the contentiousness of the Wikipedia editing process. 9 2
Editing knowledge resources: The wiki way Francesco Ronzano
Andrea Marchetti
Maurizio Tesconi
Collaborative editing web applications
Knowledge resources
Web and social knowledge management
Wiki paradigm
International Conference on Information and Knowledge Management, Proceedings English The creation, customization, and maintenance of knowledge resources are essential for fostering the full deployment of Language Technologies. The definition and refinement of knowledge resources are time- and resource-consuming activities. In this paper we explore how the Wiki paradigm for online collaborative content editing can be exploited to gather massive social contributions from common Web users in editing knowledge resources. We discuss the Wikyoto Knowledge Editor, also called Wikyoto. Wikyoto is a collaborative Web environment that enables users with no knowledge engineering background to edit the multilingual network of knowledge resources exploited by KYOTO, a cross-lingual text mining system developed in the context of the KYOTO European Project. 0 0
Editing knowledge resources: the wiki way Francesco Ronzano
Andrea Marchetti
Maurizio Tesconi
Collaborative editing web applications
Knowledge resources
Web and social knowledge management
Wiki paradigm
CIKM English 0 0
Editing the Wikipedia: Its role in science education Mareca P.
Bosch V.A.
Editing Wikipedia to improve learning in Science Education
Educational innovation
Web resources in Higher Education
Proceedings of the 6th Iberian Conference on Information Systems and Technologies, CISTI 2011 Spanish This paper describes and analyzes how the cooperation of Engineering students in a Wikipedia editing project helped to improve their learning and understanding of Physics. The project aims to bring other forms of learning into first-year university courses, specifically including the communication of scientific concepts to other students and to general audiences. Students agreed that the Wikipedia project taught them to work better together and helped them gain insight into the concepts of Physics. 0 0
Educational concept mapping method based on high-frequency words and wikipedia linkage Lauri Lahti Concept map
Intelligent tutoring system
Knowledge acquisition
Proceedings of the 4th International Conference on Internet Technologies and Applications, ITA 11 English We propose a computational method to support the learner's knowledge adoption based on concept mapping relying on three perspectives of learning scenario represented by learning concept networks: learner's knowledge, learning context and learning objective. Each learning concept network is generated based on a set of high-frequency words from a representative text sample that are connected based on the shortest hyperlink chains between corresponding Wikipedia articles. The learner explores ranking-based routings connecting learning concept networks by expanding a concept map in two complementing learning modes: assisted construction and assistive evaluation, with focused and contextualized emphasis. Based on the method we have implemented a prototype of an educational tool and its preliminary testing indicated that the method can well support personalized knowledge adoption. 0 0
Educational semantic wikis in the linked data age: The case of MSc Web Science program at Aristotle University of Thessaloniki Bratsas C.
Dimou A.
Alexiadis G.
Chrysou D.-E.
Kavargyris K.
Parapontis I.
Bamidis P.
Antoniou I.
Linked data
Semantic wiki
Wiki engine
CEUR Workshop Proceedings English Wikis are nowadays a mature technology and further well established as successful eLearning approaches that promote collaboration, fulfill the requirements of new trends in education and follow the theory of constructivism. Semantic Wikis on the other hand, are not yet thoroughly explored, but differentiate by offering an increased overall added value to the educational procedure and the course management. Their recent integration with the Linked Data cloud exhibits a potential to exceed their usual contribution and to render them into powerful eLearning tools as they expand their potentialities to the newly created educational LOD. Web Science Semantic Wiki constitutes a prime attempt to evaluate this potential and the benefits that Semantic Web and linked data bring in the field of education. 0 0
Effectively mining Wikipedia for clustering multilingual documents N. Kiran Kumar
G. S. K. Santosh
Vasudeva Varma
Document representation
Multilingual document clustering
NLDB English 0 0
Effectiveness of a Framed Wiki-Based Learning Activity in the Context of HCI Education Nikolaos Tselios
Panagiota Altanopoulou
Christos Katsanos
Web 2.0
Activity design
HCI education
Project based learning
Collaborative learning
Learning outcome
PCI English 0 0
Efficient and scalable data evolution with column oriented databases Liu Z.
He B.
Hsiao H.-I.
Yirong Chen
Bitmap index
Column oriented database
Data evolution
ACM International Conference Proceeding Series English Database evolution is the process of updating the schema of a database or data warehouse (schema evolution) and evolving the data to the updated schema (data evolution). It is often desired or necessitated when changes occur to the data or the query workload, the initial schema was not carefully designed, or more knowledge of the database is known and a better schema is concluded. The Wikipedia database, for example, has had more than 170 versions in the past 5 years [8]. Unfortunately, although much research has been done on the schema evolution part, data evolution has long been a prohibitively expensive process, which essentially evolves the data by executing SQL queries and re-constructing indexes. This prevents databases from being flexibly and frequently changed based on the need and forces schema designers, who cannot afford mistakes, to be highly cautious. Techniques that enable efficient data evolution will undoubtedly make life much easier. In this paper, we study the efficiency of data evolution, and discuss the techniques for data evolution on column oriented databases, which store each attribute, rather than each tuple, contiguously. We show that column oriented databases have a better potential than traditional row oriented databases for supporting data evolution, and propose a novel data-level data evolution framework on column oriented databases. Our approach, as suggested by experimental evaluations on real and synthetic data, is much more efficient than the query-level data evolution on both row and column oriented databases, which involves unnecessary access of irrelevant data, materializing intermediate results and re-constructing indexes. 0 0
Einstein: Physicist or vegetarian? Summarizing semantic type graphs for knowledge discovery Tylenda T.
Sozio M.
Gerhard Weikum
Knowledge bases
Semantic search
Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 English The Web and, in particular, knowledge-sharing communities such as Wikipedia contain a huge amount of information encompassing disparate and diverse fields. Knowledge bases such as DBpedia or Yago represent the data in a concise and more structured way bearing the potential of bringing database tools to Web Search. The wealth of data, however, poses the challenge of how to retrieve important and valuable information, which is often intertwined with trivial and less important details. This calls for an efficient and automatic summarization method. In this demonstration proposal, we consider the novel problem of summarizing the information related to a given entity, like a person or an organization. To this end, we utilize the rich type graph that knowledge bases provide for each entity, and define the problem of selecting the best cost-restricted subset of types as summary with good coverage of salient properties. We propose a demonstration of our system which allows the user to specify the entity to summarize, an upper bound on the cost of the resulting summary, as well as to browse the knowledge base in a more simple and intuitive manner. 0 0
El potlatch digital. Wikipedia y el triunfo del procomún y el conocimiento compartido Felipe Ortega
Joaquín Rodríguez López
Spanish In 1968, Garrett Hardin published a seminal article in the journal "Science", "The Tragedy of the Commons", in which he reflected on the difficulty of managing common goods and resources and on the danger threatening their survival. Nobel laureate in Economics Elinor Ostrom would spend most of her professional life researching precisely the mechanisms of collective action and the cooperative management of the commons, trying to infer some common structural characteristics from good practices. With the invention of the Internet and the digitization of knowledge, the earlier analog problem re-emerges vigorously in digital form: how can online communities whose purpose is the generation of shared knowledge arise and govern themselves? That is, how can and should the digital commons be managed? Wikipedia offers a prototypical and flourishing example of the construction of a community that reaches consensus on its policies, establishes its internal mechanisms of recognition, and organizes its means of control and oversight, all without any money changing hands. The case of the Canadian "potlatch" helps us understand how, in certain contexts and circumstances, it is necessary to give away the capital one possesses so that the community returns it in the form of recognition and renown; how, in certain cultural contexts, the kind of capital that circulates is not monetary but symbolic, in the form of reputation and popularity, and the logic of its accumulation requires being disinterested in order to generate another form of interest. This is how some of the best-known cases on the Internet work, and this is how Wikipedia has become a case of the triumph of the management of the commons and shared knowledge. 0 0
Electures-wiki: toward engaging students to actively work with lecture recordings Hermann C.
Ottmann T.
Enabling technology
Lecture recordings
IEEE Transactions on Learning Technologies English In this paper, we present the integration of a Wiki with lecture recordings using a tool called aofconvert, enabling the students to visually reference lecture recordings in the Wiki at a precise moment in time of the lecture. This tight integration between a Wiki and lecture materials allows the students to elaborate on the topics they learned in class as well as thoroughly discuss their own aspects of those topics. This technology can enable students to get actively involved in a collaborative learning process. One prerequisite for facilitating this consists in a reliable method for detecting slide transitions in lecture recordings. We describe an improved technique for slide transition detection in video-based/screen-grabbed lecture recordings when the object-based representation is not available. Our experiments demonstrate the accuracy of this new technique. A survey conducted with our students after using the Wiki in class completes this article and demonstrates which technical features are most important for such a Wiki. 0 0
Embedding MindMap as a service for user-driven composition of web applications Guabtni A.
Clarke S.
Benatallah B.
User-driven composition
Web application
Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 English The World Wide Web is evolving towards a very large distributed platform allowing ubiquitous access to a wide range of Web applications with minimal delay and no installation required. Such Web applications range from having users undertake simple tasks, such as filling in a form, to more complex tasks including collaborative work, project management, and more generally, creating, consulting, annotating, and sharing Web content. However, users lack a simple yet powerful mechanism to compose Web applications, similar to what desktop environments have allowed for decades using the file explorer paradigm and the desktop metaphor. Attempts have been made to adapt the desktop metaphor to the Web environment, giving birth to Webtops (Web desktops). These essentially consisted of embedding a desktop environment in a Web browser and providing access to various Web applications within the same user interface. However, those attempts did not take into consideration the radical differences between Web and desktop environments and applications. In this work, we introduce a new approach for Web application composition based on the mindmap metaphor. It allows browsing artifacts (Web resources) and enables user-driven composition of their associated Web applications. Essentially, a mindmap is a graph of widgets that represent artifacts created or used by Web applications and that allow listing and launching all possible Web applications associated with each artifact. A tool has been developed to experiment with the new metaphor and is provided as a service to be embedded in Web applications via a Web browser plug-in. We demonstrate in this paper three case studies regarding the DBLP Web site, Wikipedia and Google Picasa Web applications. 0 0
Embedding the HeaRT rule engine into a semantic wiki Studies in Computational Intelligence English 0 0
Emergent verbal behaviour in human-robot interaction Kristiina Jokinen
Graham Wilcock
2011 2nd International Conference on Cognitive Infocommunications, CogInfoCom 2011 English The paper describes emergent verbal behaviour that arises when speech components are added to a robotics simulator. In the existing simulator the robot performs its activities silently. When speech synthesis is added, the first level of emergent verbal behaviour is that the robot produces spoken monologues giving a stream of simple explanations of its movements. When speech recognition is added, human-robot interaction can be initiated by the human, using voice commands to direct the robot's movements. In addition, cooperative verbal behaviour emerges when the robot modifies its own verbal behaviour in response to being asked by the human to talk less or more. The robotics framework supports different behavioural paradigms, including finite state machines, reinforcement learning and fuzzy decisions. By combining finite state machines with the speech interface, spoken dialogue systems based on state transitions can be implemented. These dialogue systems exemplify emergent verbal behaviour that is robot-initiated: the robot asks appropriate questions in order to achieve the dialogue goal. The paper mentions current work on using Wikipedia as a knowledge base for open-domain dialogues, and suggests promising ideas for topic-tracking and robot-initiated conversational topics. 0 0
Emotion dependent dialogues in the VirCA system Fulop I.M. Cognitive infocommunication
Emotion tracking
Emotion vector
User interaction
Virtual reality
2011 2nd International Conference on Cognitive Infocommunications, CogInfoCom 2011 English In the VirCA system, the Wikipedia cyber device was developed in order to realize dialogues with human users as a case of inter-cognitive sensor sharing communication. [1] These dialogues are based on the scenarios of wiki pages edited on the web. This cyber device was extended with emotion support: the Wikipedia cyber device answers the user with the emotion received from an emotion tracker component. In this way, not only speech but also emotion is transferred in the course of cognitive infocommunication. To realize this behaviour, on the one hand, a universal thesaurus component was developed, which can select the version of a default lingual item that matches the received emotion. On the other hand, a universal emotion tracker component was also developed to recognize the emotion of the user either from the user's voice or from the lingual items the user employs. This paper presents how the different components are connected together in order to realize the desired behaviour. It describes how the universal components operate and which technologies are applied to achieve the required operation. Examples of the usage of the system are presented as well. 0 0
Emphasising assessment 'as' learning by assessing wiki writing assignments collaboratively and publicly online Australasian Journal of Educational Technology English 0 0
Empirical Study on Application of Wiki Based Collaborative Lesson-Preparing Yingjie Ren
Chaohua Gong
Collaborative lesson-preparing
Knowledge management
ICM English 0 0
Enable Wikis for seamless hypervideo integration Niels Seidel CSCL
ECCE English 0 0
Enabling knowledge workers to collaboratively add structure to enterprise wikis Florian Matthes
Christian Neubert
Enterprise 2.0
Knowledge management systems
Semantic web
Social software
Structuring of content
Proceedings of the European Conference on Knowledge Management, ECKM English Varied fields of application, fast access to often needed information, easy collaboration capabilities and low maintenance costs make wikis very attractive for enterprises. For these reasons in many companies wikis have already been firmly established as tools for collaboration and knowledge exchange. Since most of the content in wikis is completely unstructured (plain hypertext, links, etc.) it is difficult for programs to process the information on the particular wiki pages. Therefore individual pages can only be found by means of a full-text search engine, but searching for particular pages with specific attributes and attribute values is not possible. In this paper we present Hybrid Wikis, a lightweight approach for structuring content and management of information structures in enterprise wikis. Hybrid Wikis are realized based on the commercial Enterprise 2.0 software Tricia and supported by our experiences made with classical wikis, semantic wikis and integrated Enterprise 2.0 platforms used for knowledge and information management in enterprises. Inspired by these web technologies Hybrid Wikis extend the wiki provided by Tricia with a few mechanisms for classification, linking, consistency checking, and visualization of wiki pages, which can be combined flexibly. We explain how these mechanisms facilitate the structuring of content in enterprise wikis and how both can benefit from it, knowledge workers and enterprises. Hybrid Wikis create incentives for users to apply structure by giving suggestions of frequently used structured elements, provide lightweight web-interfaces which enable users to manage the structured elements directly as part of the page content, and help to avoid information redundancies by offering structured searches as well as autocompletion mechanisms for structured elements. 
Furthermore, we show how Hybrid Wikis enable knowledge workers to manage and integrate structured and unstructured information uniformly across the enterprise, which is one of the key challenges facing knowledge management systems. 0 0