
This is a list of 6 events held and 771 publications published in 2012.


Name City Country Date
Annual Conference of Malayalam Wikimedians 2012 Kollam India 28 April 2012
PAN 2012 Rome Italy 17 September 2012
RecentChangesCamp 2012 Canberra Canberra Australia 20 January 2012
Wiki Loves Monuments 2012 Worldwide September 2012
WikiSym 2012 Linz Austria 27 August 2012
Wikimania 2012 Washington D.C. United States 12 July 2012


Title Author(s) Keyword(s) Published in Language Abstract R C
"Askwiki": Shallow semantic processing to query Wikipedia Burkhardt F.
Jia Zhou
Wikipedia semantic modeling natural language understanding European Signal Processing Conference English We describe an application to query Wikipedia with a voice interface on a mobile device, i.e., a smartphone or tablet computer. The aim was to develop a so-called app that installs easily on an Android phone and does not need large vocabularies. It can be used either to answer questions directly, if the information is contained in a table or matches some keyword syntax (like birth place), or to get access to an article's sub-chapters. An evaluation based on 25 test users showed the feasibility of the approach. 0 0
A Breakdown of Quality Flaws in Wikipedia Maik Anderka
Benno Stein
Quality Flaws
Information quality
User-generated Content Analysis
2nd Joint WICOW/AIRWeb Workshop on Web Quality (WebQuality 12) English The online encyclopedia Wikipedia is a successful example of the increasing popularity of user generated content on the Web. Despite its success, Wikipedia is often criticized for containing low-quality information, which is mainly attributed to its core policy of being open for editing by everyone. The identification of low-quality information is an important task since Wikipedia has become the primary source of knowledge for a huge number of people around the world. Previous research on quality assessment in Wikipedia either investigates only small samples of articles, or else focuses on single quality aspects, like accuracy or formality. This paper targets the investigation of quality flaws, and presents the first complete breakdown of Wikipedia's quality flaw structure. We conduct an extensive exploratory analysis, which reveals (1) the quality flaws that actually exist, (2) the distribution of flaws in Wikipedia, and (3) the extent of flawed content. An important finding is that more than one in four English Wikipedia articles contains at least one quality flaw, 70% of which concern article verifiability. 0 0
A Casual Network Security Monitoring System using a Portable Sensor Device and Wiki Software Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Information security
SAINT English A casual network security monitoring system is proposed in this paper. The system is easy to deploy without reconfiguring the central network infrastructure, the firewall, and the intrusion detector system (IDS) of an organization. A virus-infected host, which is hidden by the network address translator (NAT) of a sub LAN, can be identified easily by using this monitoring system with the IDS. This monitoring system consists of a portable sensor device and a web site with wiki software. The portable sensor device, which is located on a target LAN that may have virus-infected hosts, is remote-controlled by a network manager's commands. The commands and the results are written on a wiki page. 3 2
A Cross-Lingual Dictionary for English Wikipedia Concepts Valentin I. Spitkovsky
Angel X. Chang
Information retrieval
Entity linking
Proceedings of the Eighth International Conference on Language Resources and Evaluation English We present a resource for automatically associating strings of text with English Wikipedia concepts. Our machinery is bi-directional, in the sense that it uses the same fundamental probabilistic methods to map strings to empirical distributions over Wikipedia articles as it does to map article URLs to distributions over short, language-independent strings of natural language text. For maximal interoperability, we release our resource as a set of flat line-based text files, lexicographically sorted and encoded with UTF-8. These files capture joint probability distributions underlying concepts (we use the terms article, concept and Wikipedia URL interchangeably) and associated snippets of text, as well as other features that can come in handy when working with Wikipedia articles and related information. 5 0
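The flat, line-based files described in this entry suggest a simple loading pattern. The sketch below is a hypothetical illustration only: the field order, separator, and sample lines are assumptions, not the actual layout of the released resource.

```python
# Hypothetical loader for a string->concept dictionary stored as
# tab-separated lines of the form: surface_string <TAB> prob <TAB> url.
from collections import defaultdict

def load_dictionary(lines):
    """Map each surface string to a probability distribution over URLs."""
    dictionary = defaultdict(dict)
    for line in lines:
        surface, prob, url = line.rstrip("\n").split("\t")
        dictionary[surface][url] = float(prob)
    return dictionary

# Illustrative sample lines (invented for this sketch).
sample = [
    "jaguar\t0.7\ten.wikipedia.org/wiki/Jaguar",
    "jaguar\t0.3\ten.wikipedia.org/wiki/Jaguar_Cars",
]
d = load_dictionary(sample)
```

Because the files are lexicographically sorted, a real loader could also binary-search them on disk instead of reading everything into memory.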
A Jester's Promenade: Citations to Wikipedia in Law Reviews, 2002-2008 Daniel J. Baker Wikipedia
Legal citation
Law reviews
Law journals
Legal writing
I/S: A Journal of Law and Policy for the Information Society Due to its perceived omniscience and ease-of-use, reliance on the online encyclopedia Wikipedia as a source for information has become pervasive. As a result, scholars and commentators have begun turning their attentions toward this resource and its uses. The main focus of previous writers, however, has been on the use of Wikipedia in the judicial process, whether by litigants relying on Wikipedia in their pleadings or judges relying on it in their decisions. No one, until now, has examined the use of Wikipedia in the legal scholarship context. This article intends to shine a light on the citation aspect of the Wikipedia-as-authority phenomenon by providing detailed statistics on the scope of its use and critiquing or building on the arguments of other commentators. Part II provides an overview of the debate regarding the citation of Wikipedia, beginning with a general discussion on the purposes of citation. In this Part, this article examines why some authors choose to cite to Wikipedia and explains why such citation is nonetheless problematic despite its perceived advantages. A citation analysis performed on works published by nearly 500 American law reviews between 2002 and 2008 is the focus of Part III, from a description of the methodology to an examination of the results of the analysis and any trends that may be discerned from the statistics. Finally, Part IV examines the propriety of citing to Wikipedia, culminating in a call for tighter editorial standards in law reviews. 0 0
A Linked Data platform for mining software repositories Keivanloo I.
Forbes C.
Hmood A.
Erfani M.
Neal C.
Peristerakis G.
Rilling J.
Fact sharing
Linked data
Software mining
IEEE International Working Conference on Mining Software Repositories English The mining of software repositories involves the extraction of both basic and value-added information from existing software repositories. The repositories will be mined to extract facts by different stakeholders (e.g. researchers, managers) and for various purposes. To avoid unnecessary pre-processing and analysis steps, sharing and integration of both basic and value-added facts are needed. In this research, we introduce SeCold, an open and collaborative platform for sharing software datasets. SeCold provides the first online software ecosystem Linked Data platform that supports data extraction and on-the-fly inter-dataset integration from major version control, issue tracking, and quality evaluation systems. In its first release, the dataset contains about two billion facts, such as source code statements, software licenses, and code clones from 18 000 software projects. In its second release the SeCold project will contain additional facts mined from issue trackers and versioning systems. Our approach is based on the same fundamental principle as Wikipedia: researchers and tool developers share analysis results obtained from their tools by publishing them as part of the SeCold portal and therefore make them an integrated part of the global knowledge domain. The SeCold project is an official member of the Linked Data dataset cloud and is currently the eighth largest online dataset available on the Web. 0 0
A M2M system using Arduino, Android and Wiki Software Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Social network
IIAI ESKM English A Machine-to-Machine (M2M) system, which uses Arduino, Android, and Wiki software, is discussed. This system consists of mobile terminals and web sites with wiki software. A mobile terminal of the system consists of an Android terminal and an Arduino board with sensors and actuators. The mobile terminal reads data from the sensors in the Arduino board and sends the data to a wiki page. The mobile terminal also reads commands on the wiki page and controls the actuators of the Arduino board. In addition, a wiki page can have a program that reads the page and outputs information such as a graph. This system realizes an open communication forum not only for people but also for machines. 4 3
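The wiki-mediated exchange this abstract describes — terminals posting sensor data to a page and reading commands back from it — can be sketched in a few lines. This is an illustrative model, not the authors' implementation: the wiki page is simulated as an in-memory text buffer, and the `DATA`/`CMD` line prefixes are invented for the sketch.

```python
# Minimal model of wiki-mediated M2M: a shared page carries both
# sensor readings (written by terminals) and commands (written by
# a remote manager). All names here are illustrative assumptions.
class WikiPage:
    def __init__(self):
        self.lines = []

    def append(self, line):
        self.lines.append(line)

    def read(self):
        return list(self.lines)

def terminal_cycle(page, sensor_value):
    """One terminal cycle: post a reading, then collect pending commands."""
    page.append(f"DATA temperature={sensor_value}")
    return [l for l in page.read() if l.startswith("CMD ")]

page = WikiPage()
page.append("CMD led=on")          # written earlier by a remote manager
cmds = terminal_cycle(page, 23.5)  # posts data, then reads commands
```

In the real system the buffer would be a wiki page reached over HTTP, but the protocol shape — append readings, poll for commands — is the same.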
A M2M system using arduino, android and wiki software Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Sensor network
Social network
Proceedings of the 2012 IIAI International Conference on Advanced Applied Informatics, IIAIAAI 2012 English A Machine-to-Machine (M2M) system, which uses Arduino, Android, and Wiki software, is discussed. This system consists of mobile terminals and web sites with wiki software. A mobile terminal of the system consists of an Android terminal and an Arduino board with sensors and actuators. The mobile terminal reads data from the sensors in the Arduino board and sends the data to a wiki page. The mobile terminal also reads commands on the wiki page and controls the actuators of the Arduino board. In addition, a wiki page can have a program that reads the page and outputs information such as a graph. This system realizes an open communication forum not only for people but also for machines. 0 3
A Simple Application Program Interface for Saving Java Program Data on a Wiki Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Advances in Software Engineering English A simple application program interface (API) for Java programs running on a wiki is implemented experimentally. A Java program with the API can run on a wiki, and the Java program can save its data on the wiki. The wiki system consists of PukiWiki, which is a popular wiki in Japan, and a plug-in which starts up Java programs and classes of Java. A Java applet with default access privilege cannot save its data at a local host. We have constructed an API of applets for easy and unified data input and output at a remote host. We also combined the proposed API and the wiki system by introducing a wiki tag for starting Java applets. It is easy to introduce new types of applications using the proposed API. We have embedded programs such as a simple text editor, a simple music editor, a simple drawing program, and programming environments in a PukiWiki system using this API. 10 7
A Wikipedia-based corpus reference tool Jason Ginsburg Corpus
Language teaching
HCCE English This paper describes a dictionary-like reference tool that is designed to help users find information that is similar to what one would find in a dictionary when looking up a word, except that this information is extracted automatically from large corpora. For a particular vocabulary item, a user can view frequency information, part-of-speech distribution, word-forms, definitions, example paragraphs and collocations. All of this information is extracted automatically from corpora and most of this information is extracted from Wikipedia. Since Wikipedia is a massive corpus covering a diverse range of general topics, this information is probably very representative of how target words are used in general. This project has applications for English language teachers and learners, as well as for language researchers. 0 0
A case study on scaffolding design for wiki-based collaborative knowledge building Li S.
Shi P.
Tang Q.
Collaborative knowledge building
Lecture Notes in Computer Science English Social software, particularly Wiki, is providing new opportunities for computer-based collaborative learning by supporting more flexible sharing, communication, co-writing, collaborative knowledge building and learning community building. This paper presents a case study on how to scaffold wiki-based collaborative knowledge building in a tertiary education environment, which is expected to be a useful exploration of pedagogy with wikis. The paper proposes a theoretical scaffolding framework for wiki-based collaborative knowledge building, in which cognitive process, motivation and skills are concerned as the backbone of scaffolding design. Then, implementation results of this framework in a bachelor degree course are reported in the paper. Results of the implementation were positive. As predicted, both participation rate and quality of social construction were improved. The paper concludes with a discussion and reflection on issues relevant to implementation of scaffolding framework, including designing scaffolding strategies, the role of instructors, improvement of wiki systems and further researches. 0 0
A casual network security monitoring system using a portable sensor device and wiki software Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Network security
Security monitor
Proceedings - 2012 IEEE/IPSJ 12th International Symposium on Applications and the Internet, SAINT 2012 English A casual network security monitoring system is proposed in this paper. The system is easy to deploy without reconfiguring the central network infrastructure, the firewall, and the intrusion detector system (IDS) of an organization. A virus-infected host, which is hidden by the network address translator (NAT) of a sub LAN, can be identified easily by using this monitoring system with the IDS. This monitoring system consists of a portable sensor device and a web site with wiki software. The portable sensor device, which is located on a target LAN that may have virus-infected hosts, is remote-controlled by a network manager's commands. The commands and the results are written on a wiki page. 0 1
A collaborative Wiki-based tool for semantic management of medical interventions Kontotasiou D.
Zarpalas D.
Bratsas C.
Bamidis P.D.
Medical interventions
Semantic management
IEEE 12th International Conference on BioInformatics and BioEngineering, BIBE 2012 English Semantic wikis have been widely adopted to support a variety of collaborative activities within the health domain [6], [7], [9]. In this paper, relevant existing tools that may be taken into account for the development of a Wiki-based tool are revisited. The paper then proposes a collaborative Wiki-based tool to be used for the semantic management and classification of unstructured and semi-structured medical interventions [12] spread across the Web. The architecture of the tool and its functionality are described in the light of some evidence, together with a discussion of how this tool may become useful in the semantic Web description of elderly care interventions in the ageing society. 0 0
A collaborative environment to learn programming Bizzarri G.
Forlizzi L.
Ricci F.
Collaborative programming
Teaching tool
CSEDU 2012 - Proceedings of the 4th International Conference on Computer Supported Education English Students taking their first steps in the programming world need to find worked examples, compare their solutions to well-known problems, and understand the errors returned by a compiler. We have planned to create a wiki for source code and to give students an e-learning platform that allows them to write code collaboratively, integrated with a technology that compiles source code written in different programming languages, interprets the errors returned by the compiler, and presents them through a virtual tutor speaking the students' national language, using the natural language of everyday life. This helps students understand the errors, where they were committed, and how to fix them. 0 0
A common body of knowledge for engineering secure software and services Schwittek W.
Schmidt H.
Beckers K.
Eicker S.
Fassbender S.
Heisel M.
Common body of knowledge
Knowledge management
Security engineering
Services computing
Software engineering
Proceedings - 2012 7th International Conference on Availability, Reliability and Security, ARES 2012 English The discipline of engineering secure software and services brings together researchers and practitioners from software, services, and security engineering. This interdisciplinary community is fairly new and still not well integrated, and it is therefore confronted with differing perspectives, processes, methods, tools, vocabularies, and standards. We present a Common Body of Knowledge (CBK) to overcome the aforementioned problems. We capture use cases from research and practice to derive requirements for the CBK. Our CBK collects, integrates, and structures knowledge from the different disciplines based on an ontology that allows one to semantically enrich content to be able to query the CBK. The CBK heavily relies on user participation, making use of the Semantic MediaWiki as a platform to support collaborative writing. The ontology is complemented by a conceptual framework, consisting of concepts to structure the knowledge and to provide access to it, and a means to build a common terminology. We also present organizational factors covering dissemination and quality assurance. 0 0
A comprehensive concept of optogenetics Dugue G.P.
Akemann W.
Knopfel T.
Fluorescent proteins
Optical control
Optical imaging
Progress in Brain Research English Fundamental questions that neuroscientists have previously approached with classical biochemical and electrophysiological techniques can now be addressed using optogenetics. The term optogenetics reflects the key program of this emerging field, namely, combining optical and genetic techniques. With the already impressively successful application of light-driven actuator proteins such as microbial opsins to interact with intact neural circuits, optogenetics rose to a key technology over the past few years. While spearheaded by tools to control membrane voltage, the more general concept of optogenetics includes the use of a variety of genetically encoded probes for physiological parameters ranging from membrane voltage and calcium concentration to metabolism. Here, we provide a comprehensive overview of the state of the art in this rapidly growing discipline and attempt to sketch some of its future prospects and challenges. © 2012 Elsevier B.V. 0 0
A conceptual framework and experimental workbench for architectures Konersmann M.
Goedicke M.
Lecture Notes in Computer Science English When developing the architecture of a software system, inconsistent architecture representations and missing specifications or documentations are often a problem. We present a conceptual framework for software architecture that can help to avoid inconsistencies between the specification and the implementation, and thus helps during the maintenance and evolution of software systems. For experimenting with the framework, we present an experimental workbench. Within this workbench, architecture information is described in an intermediate language in a semantic wiki. The semantic information is used as an experimental representation of the architecture and provides a basis for bidirectional transformations between implemented and specified architecture. A systematic integration of model information in the source code of component models allows for maintaining only one representation of the architecture: the source code. The workbench can be easily extended to experiment with other Architecture Description Languages, Component Models, and analysis languages. 0 0
A corpus-based study of edit categories in featured and non-featured wikipedia articles Daxenberger J.
Iryna Gurevych
Collaborative authoring
Quality assessment
Revision history
24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers English In this paper, we present a study of the collaborative writing process in Wikipedia. Our work is based on a corpus of 1,995 edits obtained from 891 article revisions in the English Wikipedia. We propose a 21-category classification scheme for edits based on Faigley and Witte's (1981) model. Example edit categories include spelling error corrections and vandalism. In a manual multi-label annotation study with 3 annotators, we obtain an inter-annotator agreement of α = 0.67. We further analyze the distribution of edit categories for distinct stages in the revision history of 10 featured and 10 non-featured articles. Our results show that the information content in featured articles tends to become more stable after their promotion. In contrast, this is not true for non-featured articles. We make the resulting corpus and the annotation guidelines freely available. 0 0
A crowdsourcing model for public consultations on draft laws Burov V.
Patarakin E.
Yarmakhov B.
ACM International Conference Proceeding Series English The paper discusses an innovative approach to lawmaking. In the proposed model, a draft law is split into segments and improved by a network community whose members can vote on the segments and suggest their own versions. Several cases of public consultations on Russian laws based on the Wikivote approach are presented and analyzed. Copyright 2012 ACM. 0 0
A data-driven sketch of Wikipedia editors Robert West
Ingmar Weber
Carlos Castillo
Web usage
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion English Who edits Wikipedia? We attempt to shed light on this question by using aggregated log data from Yahoo!'s browser toolbar in order to analyze Wikipedians' editing behavior in the context of their online lives beyond Wikipedia. We broadly characterize editors by investigating how their online behavior differs from that of other users; e.g., we find that Wikipedia editors search more, read more news, play more games, and, perhaps surprisingly, are more immersed in pop culture. Then we inspect how editors' general interests relate to the articles to which they contribute; e.g., we confirm the intuition that editors show more expertise in their active domains than average users. Our results are relevant as they illuminate novel aspects of what has become many Web users' prevalent source of information and can help in recruiting new editors. Copyright is held by the author/owner(s). 0 0
A domain ontology building process based on principles of social web Guergour H.-E.
Boufaida Z.
Semantic MediaWiki
Social Web
2012 International Conference on Information Technology and e-Services, ICITeS 2012 English In this paper, we present a domain ontology building process based on principles of the social web. The proposed process takes into account the deficiencies that have marked the development of ontologies, mainly those related to the neglect of the user, the sustainability of knowledge, and the problem of consensus among different users. We focus on involving users in the development process in order to address these problems. We model this initiative on the principles of the social web, exploiting in particular wikis and content sharing to ensure the sustainability, availability, and sharing of knowledge between different collaborators, and using tags and folksonomies to converge on a consensus vocabulary. 0 0
A dual hashtables algorithm for durable top-k search Ming H.
YanChun Zhang
Chunxiao Xing
Yin H.
Wang M.
Document Archives
Durable top-k
Proceedings - 9th Web Information Systems and Applications Conference, WISA 2012 English We propose a dual hash tables algorithm which can realize the durable top-k search. Two hash tables are constructed to keep the core information, such as score and time in the inverted lists. We use the key-value relationships between the two hash tables to calculate the scores which measure the correlations between a keyword and documents, and search the versioned objects that are consistent in the top-k results throughout a given query interval. Finally, we use data from Wikipedia to demonstrate the efficiency and performance of our algorithm. 0 0
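The durable top-k notion in the entry above — a document qualifies only if it is in the top-k result at every time point of the query interval — can be sketched directly. This is an illustrative simplification under assumed data shapes, not the authors' algorithm: here one hash table maps each time point to per-document scores, standing in for the score and time information the paper keeps in its two hash tables.

```python
# Durable top-k sketch: intersect the per-time top-k sets over the
# query interval. Data layout and names are assumptions for illustration.
def topk_at(scores, k):
    """Top-k document ids for one snapshot (doc -> score hash table)."""
    return {d for d, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]}

def durable_topk(snapshots, t_start, t_end, k):
    """snapshots: hash table time -> (hash table doc -> score)."""
    result = None
    for t, scores in snapshots.items():
        if t_start <= t <= t_end:
            tk = topk_at(scores, k)
            result = tk if result is None else result & tk
    return result or set()

snaps = {
    1: {"a": 0.9, "b": 0.8, "c": 0.1},
    2: {"a": 0.7, "c": 0.9, "b": 0.6},
}
durable = durable_topk(snaps, 1, 2, 2)  # docs in the top-2 at both times
```

Document "a" is the only one ranked in the top 2 at both time points, so it alone is durable over the interval [1, 2].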
A framework to represent and mine knowledge evolution from Wikipedia revisions Wu X.
Wei Fan
Sheng M.
Lei Zhang
Shi X.
Su Z.
Yiqin Yu
Expired data detection
Knowledge evolution extraction
Wikipedia revision
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion English State-of-the-art knowledge representation in the semantic web employs a triple format (subject-relation-object). The limitation is that it can only represent static information, and cannot easily encode revisions of the semantic web and knowledge evolution. In reality, knowledge does not stay still but evolves over time. In this paper, we first introduce the concept of "quintuple representation" by adding two new fields, state and time, where state has two values, either in or out, to denote that the referred knowledge takes effect or expires at the given time. We then discuss a two-step statistical framework to mine knowledge evolution into the proposed quintuple representation. Utilized properly, the extracted quintuples can not only reveal the history of knowledge changes but also detect expired information. We evaluate the proposed framework on Wikipedia revisions, as well as common web pages currently not in semantic web format. Copyright is held by the author/owner(s). 0 0
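The quintuple representation described in this entry — a triple extended with state (in/out) and time — is easy to make concrete. The sketch below is an illustration of the data model only; the query helper and the Pluto example are assumptions added here, not from the paper.

```python
# A quintuple: (subject, relation, object) plus state ("in"/"out")
# and the time at which that state takes effect.
from collections import namedtuple

Quintuple = namedtuple("Quintuple", "subject relation object state time")

def valid_at(quintuples, t):
    """Facts whose latest state change at or before time t is 'in'."""
    latest = {}
    for q in sorted(quintuples, key=lambda q: q.time):
        if q.time <= t:
            latest[(q.subject, q.relation, q.object)] = q.state
    return {fact for fact, state in latest.items() if state == "in"}

facts = [
    Quintuple("Pluto", "instanceOf", "Planet", "in", 1930),
    Quintuple("Pluto", "instanceOf", "Planet", "out", 2006),
]
```

Querying `valid_at(facts, 2000)` yields the Pluto fact, while `valid_at(facts, 2010)` does not: the "out" quintuple marks the knowledge as expired, which a plain triple store cannot express.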
A graph-based approach for ontology population with named entities Shen W.
Wang J.
Luo P.
Wang M.
Entity linking
Label propagation
Named entity classification
Ontology population
ACM International Conference Proceeding Series English Automatically populating ontology with named entities extracted from the unstructured text has become a key issue for Semantic Web and knowledge management techniques. This issue naturally consists of two subtasks: (1) for the entity mention whose mapping entity does not exist in the ontology, attach it to the right category in the ontology (i.e., fine-grained named entity classification), and (2) for the entity mention whose mapping entity is contained in the ontology, link it with its mapping real world entity in the ontology (i.e., entity linking). Previous studies only focus on one of the two subtasks and cannot solve this task of populating ontology with named entities integrally. This paper proposes APOLLO, a grAph-based aPproach for pOpuLating ontoLOgy with named entities. APOLLO leverages the rich semantic knowledge embedded in the Wikipedia to resolve this task via random walks on graphs. Meanwhile, APOLLO can be directly applied to either of the two subtasks with minimal revision. We have conducted a thorough experimental study to evaluate the performance of APOLLO. The experimental results show that APOLLO achieves significant accuracy improvement for the task of ontology population with named entities, and outperforms the baseline methods for both subtasks. 0 0
A graph-based summarization system at QA@INEX track 2011 Laureano-Cruces A.L.
Ramirez-Rodriguez J.
Automatic summarization system
Question-answering system
Lecture Notes in Computer Science English In this paper we use REG, a graph-based system, to study a fundamental problem of Natural Language Processing: the automatic summarization of documents. The algorithm models a document as a graph to obtain weighted sentences. We applied this approach to the INEX@QA 2011 task (question-answering). Two people extracted the title and some key or related words from the queries, in order to retrieve 50 documents from the English Wikipedia. Using this strategy, REG obtained good results with the automatic evaluation system FRESA. 0 0
A hybrid QA system with focused IR and automatic summarization for INEX 2011 Bhaskar P.
Somnath Banerjee
Neogi S.
Bandyopadhyay S.
Automatic summarization
INEX 2011
Information extraction
Information retrieval
Question answering
Lecture Notes in Computer Science English The article presents the experiments carried out as part of the participation in the QA track of INEX 2011. We have submitted two runs. The INEX QA task has two main subtasks, Focused IR and Automatic Summarization. In the Focused IR system, we first preprocess the Wikipedia documents and then index them using Nutch. Stop words are removed from each query tweet and all the remaining tweet words are stemmed using the Porter stemmer. The stemmed tweet words form the query for retrieving the most relevant document using the index. The automatic summarization system takes as input the query tweet along with the tweet's text and the title from the most relevant text document. The most relevant sentences are retrieved from the associated document based on the TF-IDF of the matching query tweet, tweet text, and title words. Each retrieved sentence is assigned a ranking score in the Automatic Summarization system. The answer passage includes the top-ranked retrieved sentences with a limit of 500 words. The two runs differ in the way in which the relevant sentences are retrieved from the associated document. Our first run got the highest score of 432.2 in the Relaxed metric of the Readability evaluation among all the participants. 0 0
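The TF-IDF sentence-ranking step this entry describes — score each document sentence against the query words, keep the top-ranked ones for the answer passage — can be sketched as follows. This is a simplified illustration, not the authors' exact scoring: the smoothing in the IDF formula and the toy sentences are assumptions.

```python
# Rank sentences by summed IDF of the query words they contain;
# a simplified stand-in for TF-IDF sentence scoring.
import math

def rank_sentences(sentences, query_words, top_n=2):
    docs = [set(s.lower().split()) for s in sentences]
    n = len(docs)

    def idf(w):
        # Smoothed inverse document frequency over the sentence set.
        df = sum(1 for d in docs if w in d)
        return math.log((n + 1) / (df + 1)) + 1

    def score(words):
        return sum(idf(w) for w in query_words if w in words)

    ranked = sorted(zip(sentences, docs), key=lambda sd: -score(sd[1]))
    return [s for s, _ in ranked[:top_n]]

top = rank_sentences(
    ["wikipedia is an encyclopedia", "cats sleep a lot", "the encyclopedia is free"],
    {"encyclopedia", "free"},
)
```

The sentence matching both query words outranks the one matching only "encyclopedia", and the unrelated sentence drops out, which is the behavior the 500-word answer-passage cutoff relies on.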
A hybrid method based on WordNet and Wikipedia for computing semantic relatedness between texts Malekzadeh R.
Bagherzadeh J.
Noroozi A.
Information retrieval
Lexical semantic knowledge
Semantic relatedness
Semantic similarity
AISP 2012 - 16th CSI International Symposium on Artificial Intelligence and Signal Processing English In this article we present a new method for computing semantic relatedness between texts. For this purpose we use a two-phase approach. The first phase involves modeling document sentences as a matrix to compute semantic relatedness between sentences. In the second phase, we compare text relatedness by using the relations between their sentences. Since semantic relations between words must be looked up in a lexical semantic knowledge source, selecting a suitable source is very important for producing accurate results. In this work, we attempt to capture the semantic relatedness between texts with greater accuracy. For this purpose, we use two well-known knowledge bases, namely WordNet and Wikipedia, which provide a more complete data source for computing semantic relatedness more accurately. We evaluate our approach by comparison with other existing techniques (on the Lee datasets). 0 0
A knowledge-extraction approach to identify and present verbatim quotes in free text Paass G.
Bergholz A.
Pilz A.
Information extraction application
Relation Extraction
ACM International Conference Proceeding Series English In news stories verbatim quotes of persons play a very important role, as they carry reliable information about the opinion of that person concerning specific aspects. As thousands of new quotes are published every hour it is very difficult to keep track of them. In this paper we describe a set of algorithms to solve the knowledge management problem of identifying, storing and accessing verbatim quotes. We handle the verbatim quote task as a relation extraction problem from unstructured text. Using a workflow of knowledge extraction algorithms we provide the required features for the relation extraction algorithm. The central relation extraction procedure is trained using manually annotated documents. It turns out that structural grammatical information is able to improve the F-value for verbatim quote detection to 84.1%, which is sufficient for many exploratory applications. We present the results in a smartphone app connected to a web server, which employs a number of algorithms such as linkage to Wikipedia, topic extraction, and search engine indices to provide flexible access to the extracted verbatim quotes. 0 0
A learning-based framework to utilize E-HowNet ontology and Wikipedia sources to generate multiple-choice factual questions Chu M.-H.
Chen W.-Y.
Lin S.-D.
Multiple-choice questions
Proceedings - 2012 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2012 English This paper proposes a framework that automatically generates multiple-choice questions. Unlike most other similar works that focus on generating questions for English proficiency tests, this paper provides a framework to generate factual questions in Chinese. We have decomposed this problem into several sub-tasks: a) the identification of sentences that contain factual knowledge, b) the identification of the query term from each factual sentence, and c) the generation of distractors. Learning-based approaches are applied to address the first two problems. We then propose a way to generate distractors by using E-HowNet ontology database and Wikipedia sources. The system was evaluated through user study and test theory, and achieved a satisfaction rate of up to 70.6%. 0 0
A method for automatically extracting domain semantic networks from Wikipedia Xavier C.C.
De Lima V.L.S.
Knowledge acquisition
Semantic networks
Lecture Notes in Computer Science English This paper describes a method for automatically extracting domain semantic networks of concepts connected by non-specific relations from Wikipedia. We propose an approach based on category and link structure analysis. The method consists of two main tasks: concept extraction and relation acquisition. For each task we developed two different implementation strategies. Aiming to identify which strategies perform best, we conducted extractions for two domains and analyzed their results. From this evaluation we discuss the best approach to implementing the extraction method. 0 0
A method for cooperation support between discussion space and activity space in collaborative learning and its experimental evaluation Tilwaldi D.
Kaneko S.
Hosomura T.
Dasai T.
Mitsui H.
Koizumi H.
Collaborative learning
Electronics and Communications in Japan English This paper describes a prototype and the experimental evaluation of a chat system that offers cooperation support between a discussion space and an activity space in collaborative learning. In collaborative learning in the proposed system, the students are divided into groups, carry out discussions on a study topic through chats, and create online reports in a cooperative manner. The proposed cooperation support method aims at improving the level of cooperation among students and the effectiveness of learning by making group members aware of other members' learning circumstances through cooperation support in group member utterances and report creation. We use a Wiki as a tool for collaborative work in this research. Cooperation support displays the Wiki's update time and contents on the chat system with activity cooperation support, offering a space for distance collaborative learning and allowing each student to become aware of the other students' situations. In addition, the number of chat utterances is displayed, and the other students' situations are easily grasped. © 2012 Wiley Periodicals, Inc. Electron Comm Jpn, 95(2): 25-38, 2012; published online in Wiley Online Library. DOI 10.1002/ecj.10366 0 0
A model for information growth in collective wisdom processes Sanmay Das
Malik Magdon-Ismail
Collective intelligence
Dynamical systems
Social network
ACM Transactions on Knowledge Discovery from Data English Collaborative media such as wikis have become enormously successful venues for information creation. Articles accrue information through the asynchronous editing of users who arrive both seeking information and possibly able to contribute information. Most articles stabilize to high-quality, trusted sources of information representing the collective wisdom of all the users who edited the article. We propose a model for information growth which relies on two main observations: (i) as an article's quality improves, it attracts visitors at a faster rate (a rich-get-richer phenomenon); and, simultaneously, (ii) the chances that a new visitor will improve the article drops (there is only so much that can be said about a particular topic). Our model is able to reproduce many features of the edit dynamics observed on Wikipedia; in particular, it captures the observed rise in the edit rate, followed by 1/t decay. Despite differences in the media, we also document similar features in the comment rates for a segment of the LiveJournal blogosphere. 0 0
A multi-layer text classification framework based on two-level representation model Jiali Yun
Liping Jing
Jian Yu
Houkuan Huang
Multi-layer classification
Text classification
Text representation
Expert Systems with Applications English Text categorization is one of the most common themes in data mining and machine learning fields. Unlike structured data, unstructured text data is more difficult to analyze because it contains complicated syntactic and semantic information. In this paper, we propose a two-level representation model (2RM) to represent text data, one level for representing syntactic information and the other for semantic information. Each document, at the syntactic level, is represented as a term vector where the value of each component is the term frequency and inverse document frequency. The Wikipedia concepts related to terms at the syntactic level are used to represent the document at the semantic level. Meanwhile, we designed a multi-layer classification framework (MLCLA) to make use of the semantic and syntactic information represented in the 2RM model. The MLCLA framework contains three classifiers. Among them, two classifiers are applied at the syntactic level and semantic level in parallel. The outputs of these two classifiers are combined and input to the third classifier, so that the final results can be obtained. Experimental results on benchmark data sets (20Newsgroups, Reuters-21578 and Classic3) have shown that the proposed 2RM model plus MLCLA framework improves text classification performance compared with the existing flat text representation models (Term-based VSM, Term Semantic Kernel Model, Concept-based VSM, Concept Semantic Kernel Model and Term + Concept VSM) plus existing classification methods. © 2011 Elsevier Ltd. All rights reserved. 0 0
A new preprocessing phase for LSA-based Turkish text summarization Guran A.
Bayazit N.G.
Latent Semantic Analysis
Turkish Text Summarization
Turkish Wikipedia
Lecture Notes in Electrical Engineering English Text Summarization is a process of identifying the most salient information in a document or a set of related documents. This paper presents the performance analysis of a Turkish text summarization system that applies two Latent Semantic Analysis based algorithms with different preprocessing phases. The preprocessing method called "Consecutive Words Detection" is a new method that uses Turkish Wikipedia links to represent related consecutive words as a single term and improves the performance of text summarization in Turkish. 0 0
A novel Framenet-based resource for the semantic web Bryl V.
Tonelli S.
Claudio Giuliano
Luciano Serafini
Semantic web
Word sense disambiguation
Proceedings of the ACM Symposium on Applied Computing English FrameNet is a large-scale lexical resource encoding information about semantic frames (situations) and semantic roles. The aim of the paper is to enrich FrameNet by mapping the lexical fillers of semantic roles to WordNet using a Wikipedia-based detour. The applied methodology relies on a word sense disambiguation step, in which a Wikipedia page is assigned to a role filler, and then BabelNet and YAGO are used to acquire WordNet synsets for a filler. We show how to represent the acquired resource in OWL, linking it to the existing RDF/OWL representations of FrameNet and WordNet. Part of the resource is evaluated by matching it with the WordNet synsets manually assigned by FrameNet lexicographers to a subset of semantic roles. 0 0
A novel model of bursts in event sequences Sun J.
Yin J.
Wang T.
Yanyan Li
Burst detection
Bursty interval
Event sequence
2012 2nd International Conference on Consumer Electronics, Communications and Networks, CECNet 2012 - Proceedings English When we focus on event analysis, such as studying event burstiness and monitoring event trends, we have to face a great number of similar events that happened in the past, i.e. event sequences, usually spanning more than 10 years. Burst detection is a popular technique of sequence analysis. Today there are several burst models and detection algorithms based on different burst definitions, producing very different results. However, almost all of these definitions make it difficult to clearly define the beginning, end, and intensity of a burst. In this paper, we reconsider the burst definition and propose an efficient detection approach for event sequences, which can be further applied to other temporal sequences or data streams. As a sample application, we present the burst model for event sequences since the end of WWII (1946∼2010) collected from the event lists in Wikipedia. Finally, we compare the results of our model and three popular ones in terms of rationality, and present two case studies. 0 0
A practical approach to language complexity: a Wikipedia case study Taha Yasseri
András Kornai
János Kertész
Submitted to PLoS ONE English In this paper we present statistical analysis of English texts from Wikipedia (WP). We try to address the issue of language complexity empirically by comparing samples of the main English WP (Main) and the simple English WP (Simple). Simple is supposed to use a more simplified language with a limited vocabulary, and editors are explicitly requested to follow this guideline, yet in practice the vocabulary richness of both samples is at the same level. However, detailed analysis of longer units (n-grams rather than words alone) shows that the language of Simple is indeed less complex than that of Main. Comparing the two language varieties by the Gunning readability index supports this conclusion. We also report on the topical dependence of language complexity, e.g. that the language is more advanced in conceptual articles compared to person-based (biographical) and object-based articles. Finally, we investigate the relation between conflict and language complexity by analysing the content of the talk pages associated with controversial and peacefully developing articles, concluding that controversy has the effect of reducing language complexity. 0 0
A recommender system for wiki pages: Usage based rating approach Agasta Adline A.L.
Mahalakshmi G.S.
Recommender system
Wiki pages
International Conference on Recent Trends in Information Technology, ICRTIT 2012 English Online educational resources are in abundance, which has created the need for recommender systems to assist learners in identifying learning resources that suit their needs. Wikipedia plays a major role in the online delivery of educational resources. The open access and editing features of wikis allow anonymous users to include resources, which necessitates rating wiki resources. Recommender systems suggest to users the items that suit them best. In this paper we propose a recommender system for wiki pages, which uses certain measures and metrics to rate the quality of wiki resources. The proposed model categorizes wiki educational resources based on the purpose of usage. 0 0
A semantic approach to recommending text advertisements for images Weinan Zhang
Tian L.
Xiaohua Sun
Haofen Wang
Yiqin Yu
Crossmedia mining
Semantic matching
Visual contextual advertising
RecSys'12 - Proceedings of the 6th ACM Conference on Recommender Systems English In recent years, more and more images have been uploaded and published on the Web. Along with text Web pages, images have been becoming important media to place relevant advertisements. Visual contextual advertising, a young research area, refers to finding relevant text advertisements for a target image without any textual information (e.g., tags). There are two existing approaches, advertisement search based on image annotation, and more recently, advertisement matching based on feature translation between images and texts. However, the state of the art fails to achieve satisfactory results due to the fact that recommended advertisements are syntactically matched but semantically mismatched. In this paper, we propose a semantic approach to improving the performance of visual contextual advertising. More specifically, we exploit a large high-quality image knowledge base (ImageNet) and a widely-used text knowledge base (Wikipedia) to build a bridge between target images and advertisements. The image-advertisement match is built by mapping images and advertisements into the respective knowledge bases and then finding semantic matches between the two knowledge bases. The experimental results show that semantic match outperforms syntactic match significantly using test images from Flickr. We also show that our approach gives a large improvement of 16.4% on the precision of the top 10 matches over previous work, with more semantically relevant advertisements recommended. Copyright © 2012 by the Association for Computing Machinery, Inc. (ACM). 0 0
A semantic wiki for editing and sharing decision guidelines in oncology Meilender T.
Lieber J.
Palomares F.
Herengt G.
Jay N.
Oncology decision guidelines
Semantic wiki
Social semantic web
Web 2.0
Studies in Health Technology and Informatics English The Internet has totally changed the way information is published and shared in medicine. With web 2.0 and semantic web technologies, web applications now allow collaborative information editing in a way that can be reused by machines. These new tools could be used in local health networks to promote the editing and sharing of medical knowledge between practitioners. Oncolor, a French oncology network, has edited 144 decision guidelines. These local guidelines rely upon national French guidelines and are built and updated collaboratively by medical experts. To improve working conditions, the need for an online collaborative tool has been expressed. This paper presents ONCOLOGIK, a semantic wiki approach for local oncology guideline editing. Semantic wikis allow online collaborative work and manage semantic annotations which can be reused automatically to bring new services. Applied to oncology guidelines, semantic technologies improve guideline management and provide additional services such as targeted queries to external bibliographical resources. © 2012 European Federation for Medical Informatics and IOS Press. All rights reserved. 0 0
A semantic-based social network of academic researchers Davoodi E.
Kianmehr K.
Clustering Analysis
Information retrieval
Semantic-based Similarity
Social Network Analysis
Lecture Notes in Computer Science English We propose a framework to construct a semantic-based social network of academic researchers to discover hidden social relationships among the researchers in a particular domain. The challenging task in the process is to detect accurate relationships that exist among researchers according to their expertise and academic experience. In this paper, we first construct content-based profiles of researchers by crawling online resources. Then background knowledge derived from Wikipedia, represented in a semantic kernel, is employed to enrich the researchers' profiles. The researchers' social network is then constructed based on the similarities among semantic-based profiles. Social communities are then detected by applying social network analysis and using factors such as experience, background, knowledge level, and personal preferences. Representative members of a community are identified using the eigenvector centrality measure. An interesting application of the constructed social network in academic conferences, where papers need to be assigned to relevant researchers for the review process, is investigated. 0 0
A social network for video annotation and discovery based on semantic profiling Bertini M.
Del Bimbo A.
Ferracani A.
Pezzatini D.
Internet videos
Social video retrieval
Social video tagging
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion English This paper presents a system for the social annotation and discovery of videos based on social networks and social knowledge. The system, developed as a web application, allows users to comment and annotate, manually and automatically, video frames and scenes enriching their content with tags, references to Facebook users and pages and Wikipedia resources. These annotations are used to semantically model the interests and the folksonomy of each user and resource in the network, and to suggest to users new resources, Facebook friends and videos whose content is related to their interests. A screencast showing an example of these functionalities is publicly available at: http://vimeo.com/miccuni-/facetube. Copyright is held by the International World Wide Web Conference. 0 0
A study in language identification Milne R.M.
O'keefe R.A.
Andrew Trotman
Language identification Proceedings of the 17th Australasian Document Computing Symposium, ADCS 2012 English Language identification is automatically determining the language that a previously unseen document was written in. We compared several prior methods on samples from the Wikipedia and EuroParl collections. Most of these methods work well. But we identify that these (and presumably other) document collections are heterogeneous in size, and short documents are systematically different from long ones: techniques that work well on long documents differ from those that work well on short ones. We believe that algorithms will improve if length is taken into account. 0 0
A study of social behavior in collaborative user generated services Yao P.
Hu Z.
Zhao Z.
Crespi N.
Data mining
Social network analysis
User generated services
Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication, ICUIMC'12 English User-generated content has become more and more popular. The success of collaborative content creation such as Wikipedia shows the level of users' accomplishments in knowledge sharing and socialization. In this paper we extend this research to the service domain, to explore users' social behavior in Collaborative User-Generated Services (Co-UGS). We create a model which is derived from a real social network with behavior similar to that of Co-UGS. The centrality approach of social network analysis is used to analyze the Co-UGS simulation on this model. Three Co-UGS network actors are identified to distinguish users according to their reactions to a service, i.e. ignoring users, sharing users and co-creating users. Moreover, six hypotheses are proposed to sustain the Co-UGS simulation. The results show that the Co-UGS network constructed by the sharing and co-creating users is a connected group superimposed on the underlying social network of users. In addition, the feasibility of this simulation method is demonstrated along with the validity of applying social network analysis to the study of users' social behavior in Co-UGS. 0 0
A supervised method for lexical annotation of schema labels based on wikipedia Sorrentino S.
Bergamaschi S.
Parmiggiani E.
Lecture Notes in Computer Science English Lexical annotation is the process of explicit assignment of one or more meanings to a term w.r.t. a sense inventory (e.g., a thesaurus or an ontology). We propose an automatic supervised lexical annotation method, called ALA TK (Automatic Lexical Annotation - Topic Kernel), based on the Topic Kernel function for the annotation of schema labels extracted from structured and semi-structured data sources. It exploits Wikipedia as a sense inventory and as a source of training data. 0 0
A survey of RE-specific wikis for distributed requirements engineering Lai H.
Peng R.
Sun D.
Shao F.
Yuanyuan Liu
Collaborative requirements activity
RE-specific wikis
Proceedings - 2012 8th International Conference on Semantics, Knowledge and Grids, SKG 2012 English The wiki, as one of the Web 2.0 technologies, has received considerable interest due to its capability to support collaborative online content creation in a flexible and simple manner. Many researchers and practitioners have committed themselves to enhancing the wiki's capability to support Requirements Engineering (RE). The main goals of this study are to discover all the available tools that use the wiki way or extend wiki technology to support RE (called RE-specific wikis) and how these RE-specific wikis have been applied, and to identify future research directions. We performed a survey through a thorough search for literature and tools that answer our research questions. After data synthesis, we found 12 available RE-specific wikis. We then drew out their features and evaluated their RE adaptability. Based on these findings, we discuss future research directions on how to promote RE-specific wikis to support collaborative requirements activities along the representation, agreement and specification dimensions. 0 0
A technique for suggesting related Wikipedia articles using link analysis Markson C.
Song M.
Link analysis
Recommendation system
Social media mining
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries English With more than 3.7 million articles, Wikipedia has become an important social medium for sharing knowledge. However, with this enormous repository of information, it can often be difficult to locate fundamental topics that support lower-level articles. By exploiting the information stored in the links between articles, we propose that related companion articles can be automatically generated to help further the reader's understanding of a given topic. This approach to a recommendation system uses tested link analysis techniques to present users with a clear path to related high-level articles, furthering the understanding of low-level topics. 0 0
A theoretical framework for healthcare innovation management using wiki based Digital Ecosystem Fasihuddin H.
Skinner G.
Athauda R.
Digital Ecosystem
Healthcare system
Knowledge sharing
Advanced Materials Research English Information Technology innovations have strongly affected today's businesses and the way we work. This effect spans different industries, and the healthcare industry is one of them. Various healthcare information systems have been introduced to manage and share patient records and information. However, based on the reviewed literature, healthcare knowledge management systems have not received the same focus and attention. It is found that there is no system able to manage tacit healthcare knowledge and innovation. As a result, this paper aims to introduce a theoretical framework that enables healthcare tacit knowledge management and global sharing. A Digital Ecosystem is found to be the most suitable technology to achieve this aim, specifically with the wiki environment, as it best fits the healthcare industry's requirements. 0 0
A video recording and viewing protocol for student group presentations: Assisting self-assessment through a Wiki environment Barry S. Group experience
Group feedback
Group presentations
Computers and Education English The purpose of this research was firstly to develop a protocol for video recording student group oral presentations, for later viewing and self-assessment by student group members. Secondly, evaluations of students' experiences of this process were undertaken to determine if this self-assessment method was a positive experience for them in gaining insights into the quality of their group's presentation. Participants were students undertaking a first year course in a bachelor of business degree within an Australian university. Students were surveyed twice, once prior to group formation to determine their previous oral group presentation experiences and then after viewing their presentations. Data from survey items assessing students' perspectives on the utility of viewing their video presentations, within their group Wikis, revealed that watching the video of their group presentation was an effective method of feedback and could improve both group and individual performance in the future. Further, content analysis of open ended survey questions and focus groups identified that students were highly engaged in the activity and, after reviewing and reflecting on their video recording, had deeper insights into and a raised awareness of making group presentations. Students identified that this experience would benefit any future group oral presentations they made. © 2012 Elsevier Ltd. All rights reserved. 0 0
A wiki application for artificial neural network course in engineering education Cetin G.
Karakis R.
2012 15th International Conference on Interactive Collaborative Learning, ICL 2012 English Using a wiki is one of the most popular and simplest ways to collaborate in education. A survey conducted among electronics and computer education students at Gazi University, Faculty of Technical Education, shows that wikis containing lecture materials are the most preferred source for the course after the instructor's handouts. In this paper, the combination of lecture notes, presentations and interactive animations on a wiki for an artificial neural network course is presented. The opportunities and enhancements of wikis for collaborative engineering education are discussed. 0 0
A wiki as a common framework for promoting autonomous learning among university students Viciana-Abad R.
Munoz-Exposito J.E.
Perez-Lorenzo J.M.
Garcia-Galan S.
Parra-Rodriguez F.
Autonomous learning
Telematics engineering
Web 2.0 technologies
International Journal of Innovation and Learning English The process of adapting methodologically to the European Credit Transfer System suffers from a lack of practical evaluations within the engineering field. One of the main competencies within the studies of telematics engineering is the development of skills related to acting as technical consultants. This competency has traditionally been developed by publishing additional material through learning management systems; however, the approach followed in this study promotes its development through the creation of practical guides within a wiki. The evaluation of this activity with students of different courses is presented herein, providing guidelines about its use as a support system for autonomous learning. 0 0
APOLLO: A general Framework for POpuLating ontoLOgy with named entities via random walks on graphs Shen W.
Wang J.
Luo P.
Wang M.
Entity linking
Label propagation
Named entity classification
Ontology population
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion English Automatically populating ontology with named entities extracted from the unstructured text has become a key issue for Semantic Web. This issue naturally consists of two subtasks: (1) for the entity mention whose mapping entity does not exist in the ontology, attach it to the right category in the ontology (i.e., fine-grained named entity classification), and (2) for the entity mention whose mapping entity is contained in the ontology, link it with its mapping real world entity in the ontology (i.e., entity linking). Previous studies only focus on one of the two subtasks. This paper proposes APOLLO, a general weakly supervised frAmework for POpuLating ontoLOgy with named entities. APOLLO leverages the rich semantic knowledge embedded in the Wikipedia to resolve this task via random walks on graphs. An experimental study has been conducted to show the effectiveness of APOLLO. Copyright is held by the author/owner(s). 0 0
Academic research into Wikipedia Eduard Aibar
Mayo Fuster Morell
Digithum English
2 0
Adding semantics to microblog posts Edgar Meij
Weerkamp W.
Maarten de Rijke
WSDM 2012 - Proceedings of the 5th ACM International Conference on Web Search and Data Mining English Microblogs have become an important source of information for the purpose of marketing, intelligence, and reputation management. Streams of microblogs are of great value because of their direct and real-time nature. Determining what an individual microblog post is about, however, can be non-trivial because of creative language usage, the highly contextualized and informal nature of microblog posts, and the limited length of this form of communication. We propose a solution to the problem of determining what a microblog post is about through semantic linking: we add semantics to posts by automatically identifying concepts that are semantically related to it and generating links to the corresponding Wikipedia articles. The identified concepts can subsequently be used for, e.g., social media mining, thereby reducing the need for manual inspection and selection. Using a purpose-built test collection of tweets, we show that recently proposed approaches for semantic linking do not perform well, mainly due to the idiosyncratic nature of microblog posts. We propose a novel method based on machine learning with a set of innovative features and show that it is able to achieve significant improvements over all other methods, especially in terms of precision. Copyright 2012 ACM. 0 0
Adoption of a wiki within a large internal medicine residency program: A 3-year experience Crotty B.H.
Mostaghimi A.
Reynolds E.E.
Journal of the American Medical Informatics Association English Objective To describe the creation and evaluate the use of a wiki by medical residents, and to determine if a wiki would be a useful tool for improving the experience, efficiency, and education of housestaff. Materials and methods In 2008, a team of medical residents built a wiki containing institutional knowledge and reference information using Microsoft SharePoint. We tracked visit data for 3 years, and performed an audit of page views and updates in the second year. We evaluated the attitudes of medical residents toward the wiki using a survey. Results Users accessed the wiki 23 218, 35 094, and 40 545 times in each of three successive academic years from 2008 to 2011. In the year two audit, 85 users made a total of 1082 updates to 176 pages and of these, 91 were new page creations by 17 users. Forty-eight percent of residents edited a page. All housestaff felt the wiki improved their ability to complete tasks, and 90%, 89%, and 57% reported that the wiki improved their experience, efficiency, and education, respectively, when surveyed in academic year 2009-2010. Discussion A wiki is a useful and popular tool for organizing administrative and educational content for residents. Housestaff felt strongly that the wiki improved their workflow, but a smaller educational impact was observed. Nearly half of the housestaff edited the wiki, suggesting broad buy-in among the residents. Conclusion A wiki is a feasible and useful tool for improving information retrieval for house officers. 0 0
Advertising Keywords Recommendation for Short-Text Web Pages Using Wikipedia Weinan Zhang
Dingquan Wang
Gui-Rong Xue
Hongyuan Zha
Contextual advertising
Advertising keywords recommendation
Topic-sensitive PageRank
ACM Trans. Intell. Syst. Technol. English 0 0
Agent for mining of significant concepts in DBpedia Boo V.K.
Anthony P.
Concept ranking
Communications in Computer and Information Science English DBpedia.org is a community effort that tries to extract structured information from Wikipedia so that the extracted information can be queried just like a database. This information is open to the public in the form of RDF triples, which are compatible with the semantic web standard. Various applications have been developed to utilize the structured data in DBpedia. This paper makes an attempt to apply PageRank analysis to the link structure of DBpedia using a mining agent to mine significant concepts in DBpedia. Based on the results, popular concepts tend to be ranked higher than less popular ones. This paper also proposes an alternative view on how PageRank analysis can be applied to the DBpedia link structure based on special characteristics of Wikipedia. The results show that even concepts with a low PageRank value can be a valuable resource for recommending pages in Wikipedia. 0 0
Alternative interfaces for deletion discussions in Wikipedia: Some proposals using decision factors Jodi Schneider
Samp K.
Articles for deletion
WikiSym 2012 English Content deletion is an important mechanism for maintaining quality in online communities. In Wikipedia, deletion is guided by complex procedures. Controversial cases (~12% [4]) are sent to special community discussions called "Articles for Deletion" (AfD). Deciding the outcome of these deletion debates can be difficult. Further, deletion seems to be a point of friction, which demotivates new editors without sufficiently informing them about Wikipedia's values and standards. 0 0
An Aesthetic for Deliberating Online: Thinking Through "Universal Pragmatics" and "Dialogism" with Reference to Wikipedia Nicholas Cimini
Burr J.
Stem cell
Information Society English In this article we examine contributions to Wikipedia through the prism of two divergent critical theorists: Jürgen Habermas and Mikhail Bakhtin. We show that, in slightly dissimilar ways, these theorists came to consider an "aesthetic for democracy" (Hirschkop 1999) or template for deliberative relationships that privileges relatively free and unconstrained dialogue to which every speaker has equal access and without authoritative closure. We employ Habermas's theory of "universal pragmatics" and Bakhtin's "dialogism" for analyses of contributions on Wikipedia for its entry on stem cells and transhumanism and show that the decision to embrace either unified or pluralistic forms of deliberation is an empirical matter to be judged in sociohistorical context, as opposed to what normative theories insist on. We conclude by stressing the need to be attuned to the complexity and ambiguity of deliberative relations online. 0 0
An Improved Contextual Advertising Matching Approach based on Wikipedia Knowledge ZongDa Wu
GuanDong Xu
YanChun Zhang
Peter Dolog
ChengLang Lu
Comput. J. English 0 0
An activity-theory analysis of corporate wikis Helen Hasan
Pfaff C.C.
Activity theory
Case study
Knowledge management
Knowledge management systems
Knowledge work
Information Technology and People English Purpose: Wiki technologies, which are popular in social settings, are beginning to contribute to more flexible and participatory approaches to the exploitation of knowledge in corporate settings. Through the lens of activity theory, this paper aims to investigate contentious challenges to organizational activities that may be associated with the introduction of corporate wikis, in particular the potential democratization of knowledge work. Design/methodology/approach: From a study of several cases of corporate wiki adoption, this paper presents and interprets two representative cases sampled to provide more generalized results. Qualitative data were collected through semi-structured interviews and observation. The analysis followed a systematic process of data reduction, display, and rich interpretation using the concepts of activity theory. Findings: This research provides new understandings of the undervalued activities of knowledge workers, their challenges as wiki users and resulting implications for organizational transformation and improved organizational performance. Research limitations/implications: There is potential bias and limited scope as the choice of cases was determined through organizations known to the researchers and involved some action research. However, the authors justify this approach for a dynamic, emergent topic worthy of immediate investigation and direct applicability of findings to corporate practice. Social implications: This paper addresses the implications of new Web 2.0 technologies for the democratization of knowledge management in the workplace. Originality/value: The novelty of this work lies in using activity theory to explore reasons why some organizations are more successful than others in implementing wikis. This work contributes to research on how social and technological interventions may lead to improved exploitation of knowledge as a corporate resource. 0 0
An adaptive semantic Wiki for CoPs of teachers - Case of higher education context Berkani L.
Azouaou F.
Higher education
Knowledge resource
Semantic annotation
Semantic wiki
International Conference on Information Society, i-Society 2012 English This paper presents an adaptive semantic wiki dedicated to CoPs made up of actors from the higher education context (faculty members, lecturers, teaching assistants, lab assistants). The wiki, called ASWiki-CoPs (Adaptive Semantic Wiki for CoPs), is based on semantic web technologies in order to enhance knowledge sharing and reuse, offering the functionalities of a wiki together with some knowledge management features. ASWiki-CoPs is based on an ontology used to describe knowledge resources through objective annotations and to express members' feedback through subjective annotations. Furthermore, we describe the member profile in order to allow adaptive access to the semantic wiki. 0 0
An analysis of systematic judging errors in information retrieval Gabriella Kazai
Nick Craswell
Yilmaz E.
Tahaghoghi S.M.M.
ACM International Conference Proceeding Series English Test collections are powerful mechanisms for the evaluation and optimization of information retrieval systems. However, there is reported evidence that experiment outcomes can be affected by changes to the judging guidelines or changes in the judge population. This paper examines such effects in a web search setting, comparing the judgments of four groups of judges: NIST Web Track judges, untrained crowd workers and two groups of trained judges of a commercial search engine. Our goal is to identify systematic judging errors by comparing the labels contributed by the different groups, working under the same or different judging guidelines. In particular, we focus on detecting systematic differences in judging depending on specific characteristics of the queries and URLs. For example, we ask whether a given population of judges, working under a given set of judging guidelines, are more likely to consistently overrate Wikipedia pages than another group judging under the same instructions. Our approach is to identify judging errors with respect to a consensus set, a judged gold set and a set of user clicks. We further demonstrate how such biases can affect the training of retrieval systems. 0 0
An automatic approach for generating tables in semantic wikis Al-Husain L.
El-Masri S.
Resemblance Metric Wiki
Semantic wiki
Journal of Theoretical and Applied Information Technology English Wikis are well-known content management systems. Semantic wikis extend classical wikis with semantic annotations that make their contents more structured. Tabular representations of information have considerable value, especially in wikis, which are rich in content and contain a large amount of information. For this reason, we propose an approach for automatically generating tables to represent the semantic data contained in wiki articles. The proposed approach is composed of three steps: (1) extract the semantic data of Typed Links and Attributes from the wiki articles, calling them Article Properties; (2) cluster the collection of wiki articles based on the properties extracted in the first step; and (3) construct a table that aggregates the shared properties between articles and presents them in two dimensions. The proposed approach is based on a simple heuristic: the number of properties shared between wiki articles. 0 0
An automatic method of managing resources based on wikipedia Yu X.
Zhang Z.
Huang Z.
Community Detection
Resource Space Model
Topic Model
Journal of Computational Information Systems English This paper presents an unsupervised method to automatically build a resource space, utilizing the rich content and structure of Wikipedia as background knowledge to automatically interpret and label documents and construct the resource space. It combines the methods of sub-tree decomposition, community detection and statistical topic modeling. The results on three datasets demonstrate the efficiency of the proposed method. 0 0
An efficient voice enabled web content retrieval system for limited vocabulary Bharath Ram G.R.
Jayakumaur R.
Narayan R.
Shahina A.
Khan A.N.
Content Retrieval
Regular Expressions
Speech to Text
Sphinx 4
Communications in Computer and Information Science English Retrieval of relevant information is becoming increasingly difficult owing to the ocean of information in the World Wide Web. Users in need of quick access to specific information are subjected to a series of web redirections before finally arriving at the page that contains the required information. In this paper, an optimal voice-based web content retrieval system is proposed that makes use of an open source speech recognition engine to deal with voice inputs. The proposed system performs a quicker retrieval of relevant content from Wikipedia and instantly presents the textual information along with the related image to the user. This search is faster than conventional web content retrieval. The current system is built with a limited vocabulary but can be extended to support a larger vocabulary. Additionally, the system is scalable to retrieve content from a few other sources of information apart from Wikipedia. 0 0
An english-translated parallel corpus for the CJK wikipedia collections Tang L.-X.
Shlomo Geva
Andrew Trotman
Cross-lingual information retrieval
Cross-lingual link discovery
Machine learning
Proceedings of the 17th Australasian Document Computing Symposium, ADCS 2012 English In this paper, we describe a machine-translated parallel English corpus for the NTCIR Chinese, Japanese and Korean (CJK) Wikipedia collections. This document collection is named the CJK2E Wikipedia XML corpus. The corpus could be used by the information retrieval research community and for knowledge sharing in Wikipedia in many ways; for example, it could be used for experimentation in cross-lingual information retrieval, cross-lingual link discovery, or omni-lingual information retrieval research. Furthermore, the translated CJK articles could be used to further expand the current coverage of the English Wikipedia. 0 0
An innovative approach to collaborative document improvement Burov V.
Patarakin E.
Yarmakhov B.
Proceedings of the IADIS International Conference Web Based Communities and Social Media 2012, IADIS International Conference Collaborative Technologies 2012 English The paper describes an innovative approach to lawmaking. A draft of a law is split into segments and is improved by a network community whose members can vote on the segments and suggest their own versions. The case of the crowdsourced improvement of the Russian Law on Education is discussed. The Wikivote technology can be used in e-government, e-work, e-society and e-learning. 0 0
An ontology evolution-based framework for semantic information retrieval Rodriguez-Garcia M.A.
Valencia-Garcia R.
Garcia-Sanchez F.
Lecture Notes in Computer Science English Ontologies evolve continuously during their life cycle to adapt to new requirements and necessities. Ontology-based information retrieval systems use semantic annotations that are also regularly updated to reflect new points of view. In order to provide a general solution and to minimize the users' effort in the ontology enrichment process, a methodology for extracting terms and evolving the domain ontology from Wikipedia is proposed in this work. The framework presented here combines an ontology-based information retrieval system with an ontology evolution approach in such a way that it simplifies the tasks of updating concepts and relations in domain ontologies. This framework has been validated in a scenario where ICT-related cloud services matching the user needs are to be found. 0 0
An overview of a spatial hypertext wiki and its applications Carlos Solis SIGWEB Newsl. English 0 0
Analysis and enhancement of wikification for microblogs with context expansion Cassidy T.
Ji H.
Lev Ratinov
Zubiaga A.
Houkuan Huang
Disambiguation context
Disambiguation to wikipedia (D2W)
24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers English Disambiguation to Wikipedia (D2W) is the task of linking mentions of concepts in text to their corresponding Wikipedia entries. Most previous work has focused on linking terms in formal texts (e.g. newswire) to Wikipedia. Linking terms in short informal texts (e.g. tweets) is difficult for systems and humans alike as they lack a rich disambiguation context. We first evaluate an existing Twitter dataset as well as the D2W task in general. We then test the effects of two tweet context expansion methods, based on tweet authorship and topic-based clustering, on a state-of-the-art D2W system and evaluate the results. 0 0
Analysis of discussion page in Wikipedia based on user's discussion capability Joo S.
Hideaki Takeda
Discussion capability
Discussion evaluation
Proceedings - 2012 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2012 English Wikipedia is a user-contributed encyclopedia edited collaboratively by a wide range of people. Wikipedia usually determines the contents of articles and editorial policies through discussion among participants, which often requires a lot of effort to reach a conclusion due to protracted discussion. We need some measure of how valuable a discussion is for reaching a conclusion; we call this measure discussion validity and define it with a model based on the discussion capability of participants. Discussion capability consists of three features, each of which represents a characteristic aspect of users' behavior in discussion, approximated from the corresponding three features of their utterances. We conducted experiments with human subjects to verify the model and found that our model performed better than the conventional model; we also propose automatic prediction of discussion validity using text analysis. We then estimated discussion validity through the model with the estimated values, and the estimates fitted well with the values given by the subjects. 0 0
Analysis on construction of information commons of Wiki-based Olympic Library Ma Q. Consultation
Information Commons
Olympic Library
Proceedings - 2012 International Conference on Computer Science and Information Processing, CSIP 2012 English As one of the Web 2.0 technologies, wiki technology emerged in the early 21st century, after the Beijing Olympics. This study explores how to put such technology into application by effectively using a legacy of the Beijing Olympic Games - the Olympic Library. It uses literature review methods combined with practical work experience. Firstly, it introduces an overview of the Library of the Capital Institute of Physical Education in the Olympic Library Project and the development of the Olympic Library in the post-Olympic period; then, it collates the basic concepts of the information commons (IC) within the industry, including the concept of the IC, its composing elements and IC construction profiles in national libraries; finally, based on the existing conditions and the Olympic Library's advantages, and combined with the rapid development of the digital environment, it discusses the application of wiki technology and the principles and ideas to achieve the innovative development of the Olympic Library through the construction of an information commons. 0 0
Analyzing design tradeoffs in large-scale socio-technical systems through simulation of dynamic collaboration patterns Dorn C.
Edwards G.
Medvidovic N.
Collaboration Patterns
Design Tools and Techniques
Large-scale Socio-Technical Systems
System Simulation
Lecture Notes in Computer Science English Emerging online collaboration platforms such as Wikipedia, Twitter, or Facebook provide the foundation for socio-technical systems where humans have become both content consumer and provider. Existing software engineering tools and techniques support the system engineer in designing and assessing the technical infrastructure. Little research, however, addresses the engineer's need for understanding the overall socio-technical system behavior. The effect of fundamental design decisions becomes quickly unpredictable as multiple collaboration patterns become integrated into a single system. We propose the simulation of human and software elements at the collaboration level. We aim for detecting and evaluating undesirable system behavior such as users experiencing repeated update conflicts or software components becoming overloaded. To this end, this paper contributes (i) a language and (ii) methodology for specifying and simulating large-scale collaboration structures, (iii) example individual and aggregated pattern simulations, and (iv) evaluation of the overall approach. 0 0
Analyzing the effect of OpenStreetMap during crises: The Great East Japan Earthquake Imi Y.
Hayakawa T.
Ito T.
OpenStreetMap (OSM) crisis mapping
The Great East Japan Earthquake (3.11)
Proceedings of the 2012 IEEE 14th International Conference on Commerce and Enterprise Computing, CEC 2012 English This paper shows that OpenStreetMap (OSM) played a useful role during the Great East Japan Earthquake of 2011. OSM, which provides free geographical information, is sometimes referred to as a map version of Wikipedia. Its data additions, updates, and corrections are made available by its participants. We compare the data before and after the Great East Japan Earthquake and analyze the transitions. An increase in the amount of OSM editing was observed just after the earthquake. For example, the distance of roads was about six times greater and the area of buildings was approximately five times larger. The number of edited satellite photos nearly doubled. Data collection was carried out in OSM as soon as possible after the earthquake. 0 0
Analyzing user click paths in a Wikipedia navigation game Denis Helic MIPRO 2012 - 35th International Convention on Information and Communication Technology, Electronics and Microelectronics - Proceedings English Due to the enormous success of Web search technology navigation became only a second-class information seeking strategy on the Web. However, numerous studies highlight the importance of navigation as an alternative information retrieval technique to search. These studies provide evidences that the most efficient information finding occurs in the settings where search and navigation seamlessly integrate and complement each other. Recently, the research community has also recognized the importance of understanding the human navigation behavior since the knowledge on how users navigate helps in designing optimal navigation structures. In this paper we try to gain more insight in how users navigate towards a known target page in Wikipedia. To that end, we conduct an initial analysis of user click paths from a Wikipedia navigation game. In addition, we compare the structure of Wikipedia navigational paths with the structure of search paths in social networks and routing paths in general complex networks. 0 0
Annotating words using wordnet semantic glosses Szymanski J.
Duch W.
Natural Language Processing
Word Sense Disambiguation
Lecture Notes in Computer Science English An approach to word sense disambiguation (WSD) relying on WordNet synsets is proposed. The method uses semantically tagged glosses to perform a process similar to spreading activation in a semantic network, creating a ranking of the most probable meanings for word annotation. Preliminary evaluation shows quite promising results. Comparison with state-of-the-art WSD methods indicates that the use of WordNet relations and semantically tagged glosses should enhance the accuracy of word disambiguation methods. 0 0
Annotation of adversarial and collegial social actions in discourse Bracewell D.B.
Tomlinson M.T.
Brunson M.
Plymale J.
Bracewell J.
Boerger D.
LAW 2012 - 6th Linguistic Annotation Workshop, In Conjunction with ACL 2012 - Proceedings English We posit that determining the social goals and intentions of dialogue participants is crucial for understanding discourse taking place on social media. In particular, we examine the social goals of being collegial and being adversarial. Through our early experimentation, we found that speech and dialogue acts are not able to capture the complexities and nuances of the social intentions of discourse participants. Therefore, we introduce a set of 9 social acts specifically designed to capture intentions related to being collegial and being adversarial. Social acts are pragmatic speech acts that signal a dialogue participant's social intentions. We annotate social acts in discourses communicated in English and Chinese taken from Wikipedia talk pages, public forums, and chat transcripts. Our results show that social acts can be reliably understood by annotators with a good level of inter-rater agreement. 0 0
Análisis de enlaces hacia Bibliotecas y Archivos Digitales de Patrimonio Cultural desde Wikipedia en español y catalán Tomás Saorín-Pérez
Emilio J. Rodríguez-Posada
BiD: textos universitaris de biblioteconomia i documentació Spanish Objective. To describe and evaluate the use in Wikipedia of links to digitized collections in libraries, archives and other cultural institutions. Methodology. The study covers all the articles in the Spanish and Catalan editions of Wikipedia, using a wiki analysis tool. A broad selection of 81 Spanish digital collections of varying scope was made, and data from other digitization projects were also gathered for comparison. Results. Links from Wikipedia still have a weak presence, except for the Biblioteca Virtual Miguel de Cervantes, whose figures are noticeably different. Some specialized collections are used more, but in general there is a lack of attention to these collections among Wikipedia editors, which should be taken into account in the development of Europeana-type digitization projects.
5 0
Applying conflict management process to wiki communities De Melo Bezerra J.
Hirata C.M.
Conflict analysis
Conflict management process
Conflict response mechanisms
Wiki community
Lecture Notes in Business Information Processing English Conflicts are disagreements among members and imply incompatible goals, wishes and interests. Unhandled conflicts can negatively impact group performance and members' satisfaction. In virtual communities, members discuss while performing collaborative online tasks, so conflicts can arise. Wiki communities are popular virtual communities that involve a considerable number of members in the online production of articles. Conflicts in the wiki context are thus critical, being responsible for damaging article quality and even wiki credibility. We propose a management process that includes activities for the identification, analysis, response, and monitoring and control of conflicts in wiki communities. In order to explain the activities and evaluate the process, we use Wikipedia. 0 0
Approach for building ontology automatically based on Wikipedia Wu T.
Xiao K.
Tan X.
Knowledge Extraction
ICIC Express Letters, Part B: Applications English Building ontologies is the groundwork of many Web 2.0 applications. As one of the most important public knowledge bases, Wikipedia has many comparative advantages in this research field. In this paper, we propose a new method for extracting domain-oriented semantic knowledge from Wikipedia. During the process, every category in the domain is assigned a weight so that we can calculate the score of articles. As a result, a light ontology of the software domain is built automatically from the semantic knowledge. In addition, the semantic knowledge is evaluated manually. 0 0
Approximate semantic matching of heterogeneous events Hasan S.
O'Riain S.
Curry E.
Approximate event matching
Semantic decoupling
Semantic event matching
Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems, DEBS'12 English Event-based systems have loose coupling within space, time and synchronization, providing a scalable infrastructure for information exchange and distributed workflows. However, event-based systems are tightly coupled, via event subscriptions and patterns, to the semantics of the underlying event schema and values. The high degree of semantic heterogeneity of events in large and open deployments such as smart cities and the sensor web makes it difficult to develop and maintain event-based systems. In order to address semantic coupling within event-based systems, we propose vocabulary-free subscriptions together with the use of approximate semantic matching of events. This paper examines the requirement of event semantic decoupling and discusses approximate semantic event matching and the consequences it implies for event processing systems. We introduce a semantic event matcher and evaluate the suitability of an approximate hybrid matcher based on both thesauri-based and distributional-semantics-based similarity and relatedness measures. The matcher is evaluated over a structured representation of Wikipedia and Freebase events. Initial evaluations show that the approach matches events with a maximal combined precision-recall F1 score of 75.89% on average across all experiments with a set of 7 subscriptions. The evaluation shows how a hybrid approach to semantic event matching outperforms a single similarity measure approach. 0 0
Arabic retrieval revisited: Morphological hole filling Kareem Darwish
Ali A.M.
50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference English Due to Arabic's morphological complexity, Arabic retrieval benefits greatly from morphological analysis - particularly stemming. However, the best known stemming does not handle linguistic phenomena such as broken plurals and malformed stems. In this paper we propose a model of character-level morphological transformation that is trained using Wikipedia hypertext to page title links. The use of our model yields statistically significant improvements in Arabic retrieval over the use of the best statistical stemming technique. The technique can potentially be applied to other languages. 0 0
ArchaeoApp Rome Edition (AARE): Making invisible sites visible: e-business aspects of historic knowledge discovery via mobile devices Holzinger K.
Koiner G.
Kosec P.
Fassold M.
Andreas Holzinger
Information retrieval on mobile devices
Knowledge management
DCNET 2012, ICE-B 2012, OPTICS 2012 - Proceedings of the International Conference on Data Communication Networking, e-Business and Optical Communication Systems, ICETE English Rome is visited by 7 to 10 million tourists per year, many of them interested in historical sites. Most sites that are described in tourist guides (printed or online) are archaeological sites; we can call them visible archaeological sites. Unfortunately, even visible archaeological sites in Rome are barely marked - and invisible sites are completely ignored. In this paper, we present the ArchaeoApp Rome Edition (AARE). The novelty is not just marking the important, visible, barely known sites, but marking the invisible sites, consequently introducing a completely novel type of site to tourist guidance: historical invisible sites. One challenge is to obtain reliable historical information on demand. A possible approach is to retrieve the information from Wikipedia directly. The second challenge is that most end users have no Web access due to high roaming costs. The third challenge is to strike a balance between the best platform available and the most used platform. For e-business purposes, it is of course necessary to support the largest possible number of mobile platforms (Android, iOS and Windows Phone). The advantages of AARE include: no roaming costs, data updates on demand (when connected to Wi-Fi, e.g. at a hotel or a public hotspot, for free), and automatic nearby notification of invisible sites (markers) with a visual-auditory-tactile technique to make invisible sites visible. 0 0
Architecture-driven modeling of adaptive collaboration structures in large-scale social web applications Dorn C.
Taylor R.N.
Adaptation Flexibility
Collaboration Patterns
Design Tools and Techniques
Lecture Notes in Computer Science English Internet-based, large-scale systems provide the technical foundation for massive online collaboration forms such as social networks, crowdsourcing, content sharing, or source code generation. Such systems are typically designed to adapt at the software level to achieve availability and scalability. They, however, remain mostly unaware of the changing requirements of the various ongoing collaborations. As a consequence, cooperative efforts cannot grow and evolve as easily or efficiently as they need to. An adaptation mechanism needs to become aware of a collaboration's structure and flexibility in order to consider changing collaboration requirements during system reconfiguration. To this end, this paper presents the human Architecture Description Language (hADL) for describing the envisioned collaboration dynamics. Inspired by software architecture concepts, hADL introduces human components and collaboration connectors for describing the underlying human coordination dependencies. We further outline a methodology for designing collaboration patterns based on a set of fundamental principles that facilitate runtime adaptation. An exemplary model transformation demonstrates hADL's feasibility: it produces the group permission configuration for MediaWiki in reaction to changing collaboration conditions. 0 0
Are buildings only instances? Exploration in architectural style categories Goel A.
Juneja M.
Jawahar C.V.
ACM International Conference Proceeding Series English Instance retrieval has emerged as a promising research area with buildings as the popular test subject. Given a query image or region, the objective is to find images in the database containing the same object or scene. There has been a recent surge in efforts in finding instances of the same building in challenging datasets such as the Oxford 5k dataset [19], Oxford 100k dataset and the Paris dataset [20]. We ascend one level higher and pose the question: Are Buildings Only Instances? Buildings located in the same geographical region or constructed in a certain time period in history often follow a specific method of construction. These architectural styles are characterized by certain features which distinguish them from other styles of architecture. We explore, beyond the idea of buildings as instances, the possibility that buildings can be categorized based on the architectural style. Certain characteristic features distinguish an architectural style from others. We perform experiments to evaluate how characteristic information obtained from low-level feature configurations can help in classification of buildings into architectural style categories. Encouraged by our observations, we mine characteristic features with semantic utility for different architectural styles from our dataset of European monuments. These mined features are of various scales, and provide an insight into what makes a particular architectural style category distinct. The utility of the mined characteristics is verified from Wikipedia. 0 0
Are human-input seeds good enough for entity set expansion? Seeds rewriting by leveraging Wikipedia semantic knowledge Qi Z.
Kang Liu
Jun Zhao
Information extraction
Seed rewrite
Semantic knowledge
Lecture Notes in Computer Science English Entity set expansion is an important task for open information extraction; it refers to expanding a given partial seed set to a more complete set that belongs to the same semantic class. Much previous research has shown that the quality of seeds can greatly influence expansion performance, since human-input seeds may be ambiguous, sparse, etc. In this paper, we propose a novel method that can generate new, high-quality seeds to replace original, poor-quality ones. In our method, we leverage Wikipedia as a semantic knowledge base to measure the semantic relatedness and ambiguity of each seed. Moreover, to avoid seed sparseness, we use web resources to measure each seed's population. New seeds are then generated to replace the original, poor-quality seeds. Experimental results show that new seed sets generated by our method can improve entity expansion performance by up to an average of 9.1% over the original seed sets. 0 0
Assessing quality values of Wikipedia articles using implicit positive and negative ratings Yu Suzuki Edit history
Lecture Notes in Computer Science English In this paper, we propose a method to identify high-quality Wikipedia articles by mutually evaluating editors and text using implicit positive and negative ratings. A major approach to assessing Wikipedia articles is based on the text survival ratio. However, this approach misjudges many low-quality articles as high quality, because not every editor reads a whole article. If there is low-quality text at the bottom of a long article and that text has not been seen by other editors, it survives many edits, so its survival ratio is high. To solve this problem, we use a section or a paragraph, instead of a whole page, as the unit of survival. That is, when an editor edits an article, the system treats this as the editor giving positive ratings to the sections or paragraphs edited. We believe that even if editors do not read the whole page, they do read the whole sections or paragraphs they edit, and delete low-quality text there. Experimental evaluation confirmed that the proposed method improves the accuracy of quality values for articles. 0 0
Assessing the accuracy and quality of Wikipedia entries compared to popular online encyclopaedias Imogen Casebourne
Chris Davies
Michelle Fernandes
Naomi Norman
English 8 0
Assessing the relationship between context, user preferences, and content in search behaviour Knaeusl H.
Ludwig B.
Eye tracking
Preference elicitation
Reading behaviour
International Conference on Information and Knowledge Management, Proceedings English Searching for information using search engines and browsers is a tedious task for users. Navigational and informational search tasks are complicated by the fact that web servers always deliver complete web pages and do not tailor their content to the user's current information need. In this paper, we present a proposal for applying context-aware recommendation techniques to simulate human decision making when selecting elements of content to be included in an answer to an information need. As a first step towards live generation of content, we present results from our experimental study capturing the decision criteria that web users apply in choosing content for this selection problem. These preferences could later be formalized in a knowledge-based, context-aware and personalized model for recommending content during information search. 0 0
Assessment of collaborative learning experiences by graphical analysis of wiki contributions Manuel Palomo-Duarte
Juan Manuel Dodero-Beardo
Inmaculada Medina-Bulo
Emilio J. Rodríguez-Posada
Iván Ruiz-Rube
Computer-supported collaborative learning
E-Learning assessment
Data visualization
Graphical analysis tool
Interactive Learning Environments English The widespread adoption of computers and the Internet in our lives has reached the classroom, where Computer-Supported Collaborative Learning based on wikis offers new ways of collaboration and encourages student participation. When the number of contributions from students increases, traditional assessment procedures in e-learning settings suffer from scalability problems. In a wiki-based learning experience, automatic tools are required to support the assessment of such large amounts of data. In this work we present StatMediaWiki, a tool that collects and aggregates information to help analyze a MediaWiki installation. It generates charts, tables and different statistics, enabling easy analysis of wiki evolution. We have used StatMediaWiki in a Higher Education course and present the results obtained in this case study. 14 0
Author disambiguation using wikipedia-based explicit semantic analysis Kang I.-S. Author Disambiguation
Explicit Semantic Analysis
Topical Representation
Lecture Notes in Computer Science English Author disambiguation suffers from the shortage of topical terms to identify authors. This study attempts to augment term-based topical representation of authors with the concept-based one obtained from Wikipedia-based explicit semantic analysis (ESA). Experiments showed that the use of additional ESA concepts improves author-resolving performance by 13.5%. 0 0
Automatic Document Topic Identification using Wikipedia Hierarchical Ontology Hassan M.M.
Fakhri Karray
Kamel M.S.
2012 11th International Conference on Information Science, Signal Processing and their Applications, ISSPA 2012 English The rapid growth in the number of documents available to end users from around the world has led to a greatly-increased need for machine understanding of their topics, as well as for automatic grouping of related documents. This constitutes one of the main current challenges in text mining. In this work, a novel technique is proposed, to automatically construct a background knowledge structure in the form of a hierarchical ontology, using one of the largest online knowledge repositories: Wikipedia. Then, a novel approach is presented to automatically identify the documents' topics based on the proposed Wikipedia Hierarchical Ontology (WHO). Results show that the proposed model is efficient in identifying documents' topics, and promising, as it outperforms the accuracy of the other conventional algorithms for document clustering. 0 0
Automatic classification and relationship extraction for multi-lingual and multi-granular events from Wikipedia Hienert D.
Wegener D.
Paulheim H.
Historical events
CEUR Workshop Proceedings English Wikipedia is a rich source of knowledge from all domains. As part of this knowledge, historical and daily events (news) are collected for different languages on special pages and in event portals. As only a small number of events are available in structured form in DBpedia, we extract these events from Wikipedia pages with a rule-based approach. In this paper we focus on three aspects: (1) extending our prior method to extract events at a daily granularity, (2) the automatic classification of events and (3) finding relationships between events. As a result, we have extracted a data set of about 170,000 events covering different languages and granularities. On the basis of one language set, we have automatically built categories for about 70% of the events of another language set. For nearly every event, we have been able to find related events. 0 0
Automatic extraction of semantic concept-relation triple pattern from wikipedia articles Choi J.
Choi C.
Choi D.
Jihie Kim
Kim P.
Link grammar
Ontology population
Term extraction
Triple extraction
Information English In existing methods for building and extending ontologies, the work is either done manually by professionals, or a semi-automated method uses the probability distribution of statistics obtained through analysis of a universal dictionary or thesaurus. If produced manually, the accuracy of concept extraction and relation production is excellent, but it requires a lot of time and money. The semi-automated methods, in turn, suffer from differences in the interpretation of the words tagged when analyzing the text, and rely on universal dictionaries or learning articles for concept and relation extraction. This has the disadvantage that ontology building and extension are limited until the referenced articles are modified. To address this, this paper analyzes the link patterns of Link Grammar after extracting the terminology within Wikipedia articles, which represent collective intelligence, and proposes a domain ontology extension method that extracts triple patterns describing the relations between concepts. An ordering is determined by assigning a weight to each relation and concept extracted through the proposed method. To test this, concept-relation triples were extracted from 5,100 key sentences taken from Wikipedia articles using Link Grammar, and the results were evaluated. 0 0
Automatic query expansion based on tag recommendation Oliveira V.
Gomes G.
Belem F.
Brandao W.
Jussara Almeida
Ziviani N.
Goncalves M.
Query expansion
Tag recommendation
ACM International Conference Proceeding Series English We propose a new method for expanding entity-related queries that automatically filters, weights and ranks candidate expansion terms extracted from Wikipedia articles related to the original query. Our method is based on state-of-the-art tag recommendation methods that exploit heuristic metrics to estimate the descriptive capacity of a given term. Originally proposed in the context of tags, these recommendation methods are applied here to weight and rank terms extracted from multiple fields of Wikipedia articles according to their relevance to the article. We evaluate our method against three state-of-the-art baselines in three collections. Our results indicate that our method outperforms all baselines in all collections, with relative gains in MAP of up to 14% over the best of them. 0 0
Automatic subject metadata generation for scientific documents using wikipedia and genetic algorithms Joorabchi A.
Mahdi A.E.
Genetic algorithms
Keyphrase annotation
Keyphrase indexing
Scientific digital libraries
Subject metadata
Text mining
Lecture Notes in Computer Science English Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which utilizes Wikipedia as a thesaurus for candidate selection from documents' content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of our method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods. 0 0
Automatic taxonomy extraction in different languages using wikipedia and minimal language-specific information Dominguez Garcia R.
Schmidt S.
Rensing C.
Steinmetz R.
Hyponymy Detection
Multilingual large-scale taxonomies
Natural Language Processing
Data mining
Lecture Notes in Computer Science English Knowledge bases extracted from Wikipedia are particularly useful for various NLP and Semantic Web applications due to their coverage, actuality and multilingualism. This has led to many approaches for automatic knowledge base extraction from Wikipedia. Most of these approaches rely on the English Wikipedia, as it is the largest Wikipedia version. However, each Wikipedia version contains socio-cultural knowledge, i.e. knowledge relevant to a specific culture or language. In this work, we describe a method for extracting a large set of hyponymy relations from the Wikipedia category system that can be used to acquire taxonomies in multiple languages. More specifically, we describe a set of 20 features that can be used for hyponymy detection without using additional language-specific corpora. Finally, we evaluate our approach on Wikipedia in five different languages and compare the results with the WordNet taxonomy and a multilingual approach based on the interwiki links of Wikipedia. 0 0
Automatic typing of DBpedia entities Aldo Gangemi
Nuzzolese A.G.
Valentina Presutti
Draicchio F.
Alberto Musetti
Paolo Ciancarini
Lecture Notes in Computer Science English We present Tìpalo, an algorithm and tool for automatically typing DBpedia entities. Tìpalo identifies the most appropriate types for an entity by interpreting its natural language definition, which is extracted from its corresponding Wikipedia page abstract. Types are identified by means of a set of heuristics based on graph patterns, disambiguated to WordNet, and aligned to two top-level ontologies: WordNet supersenses and a subset of DOLCE+DnS Ultra Lite classes. The algorithm has been tuned against a golden standard that has been built online by a group of selected users, and further evaluated in a user study. 0 0
Automatic vandalism detection in Wikipedia with active associative classification Maria Sumbana
Goncalves M.A.
Rodrigo Silva
Jussara Almeida
Adriano Veloso
Lecture Notes in Computer Science English Wikipedia and other free editing services for collaboratively generated content have quickly grown in popularity. However, the lack of editing control has made these services vulnerable to various types of malicious actions, such as vandalism. State-of-the-art vandalism detection methods are based on supervised techniques, thus relying on the availability of large and representative training collections. Building such collections, often with the help of crowdsourcing, is very costly due to a natural skew towards very few vandalism examples in the available data, as well as dynamic patterns. Aiming at reducing the cost of building such collections, we present a new active sampling technique coupled with an on-demand associative classification algorithm for Wikipedia vandalism detection. We show that our classifier, enhanced with a simple undersampling technique for building the training set, outperforms state-of-the-art classifiers such as SVMs and kNNs. Furthermore, by applying active sampling, we are able to reduce the need for training by almost 96% with only a small impact on detection results. 0 0
Avoimen suomenkielisen morfologian liittäminen Wikimedian hakujärjestelmään Niklas Laxström University of Helsinki Finnish In my thesis I investigated the feasibility of using a Finnish morphology implementation with the Lucene search system. With the same Lucene-search package that is used by the Wikimedia Foundation I built two search indexes: one with the existing Porter stemming algorithm and the other one with morphological analysis. The corpus I used was the current text dump of Finnish Wikipedia. [...] See http://laxstrom.name/blag/2012/02/13/exploring-the-states-of-open-source-search-stack-supporting-finnish/ 9 0
BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network Roberto Navigli
Ponzetto S.P.
Graph algorithms
Knowledge acquisition
Semantic networks
Word sense disambiguation
Artificial Intelligence English We present an automatic approach to the construction of BabelNet, a very large, wide-coverage multilingual semantic network. Key to our approach is the integration of lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition, Machine Translation is applied to enrich the resource with lexical information for all languages. We first conduct in vitro experiments on new and existing gold-standard datasets to show the high quality and coverage of BabelNet. We then show that our lexical resource can be used successfully to perform both monolingual and cross-lingual Word Sense Disambiguation: thanks to its wide lexical coverage and novel semantic relations, we are able to achieve state-of-the-art results on three different SemEval evaluation tasks. © 2012 Elsevier B.V. 0 0
Behind the Article: Recognizing Dialog Acts in Wikipedia Talk Pages Oliver Ferschke
Iryna Gurevych
Yevgen Chebotar
Talk Pages
Discourse Analysis
Work Coordination
Information quality
Proceedings of the 13th Conference of the European Chapter of the ACL (EACL 2012) English In this paper, we propose an annotation schema for the discourse analysis of Wikipedia Talk pages, aimed at the coordination efforts for article improvement. We apply the annotation schema to a corpus of 100 Talk pages from the Simple English Wikipedia and make the resulting dataset freely available for download. Furthermore, we perform automatic dialog act classification on Wikipedia discussions and achieve an average F1-score of 0.82 with our classification pipeline. 0 0
Being Where Our Faculty Are: Emerging Technology Use and Faculty Information-Seeking Workflows Bauder J.
Emanuel J.
Internet Reference Services Quarterly English Academic libraries frequently consider students' high usage of certain technologies when deciding what services to offer electronically, but less research has been done on whether faculty also use any of these technologies. This study surveyed faculty at two very different institutions, a liberal arts college and a Research I university, about the technologies they use or might consider using, with a particular focus on information and/or communication-related technologies. The results show that faculty do use many of the technologies that previous studies have associated with students. 0 0
BiCWS: Mining cognitive differences from bilingual web search results Xiangji Huang
Wan X.
Jie Xiao
Comparative Text Mining
Cross Lingual Text Mining
Information retrieval
Lecture Notes in Computer Science English In this paper we propose a novel comparative web search system - BiCWS, which can mine cognitive differences from web search results in a multi-language setting. Given a topic represented by two queries (they are the translations of each other) in two languages, the corresponding web search results for the two queries are firstly retrieved by using a general web search engine, and then the bilingual facets for the topic are mined by using a bilingual search results clustering algorithm. The semantics in Wikipedia are leveraged to improve the bilingual clustering performance. After that, the semantic distributions of the search results over the mined facets are visually presented, which can reflect the cognitive differences in the bilingual communities. Experimental results show the effectiveness of our proposed system. 0 0
Bieber no more: First Story Detection using Twitter and Wikipedia Miles Osborne
Saša Petrović
Richard McCreadie
Craig Macdonald
Iadh Ounis
Event Detection
English Twitter is a well known source of information regarding breaking news stories. This aspect of Twitter makes it ideal for identifying events as they happen. However, a key problem with Twitter-driven event detection approaches is that they produce many spurious events, i.e., events that are wrongly detected or simply are of no interest to anyone. In this paper, we examine whether Wikipedia (when viewed as a stream of page views) can be used to improve the quality of discovered events in Twitter. Our results suggest that Wikipedia is a powerful filtering mechanism, allowing for easy blocking of large numbers of spurious events. Our results also indicate that events within Wikipedia tend to lag behind Twitter. 0 0
Bill gates is not a parking meter: Philosophical quality control in automated ontology-building Cathy Legg
Samuel Sarjant
AISB/IACAP World Congress 2012: Computational Philosophy, Part of Alan Turing Year 2012 English The somewhat old-fashioned concept of philosophical categories is revived and put to work in automated ontology building. We describe a project harvesting knowledge from Wikipedia's category network in which the principled ontological structure of Cyc was leveraged to furnish an extra layer of accuracy-checking over and above more usual corrections which draw on automated measures of semantic relatedness. 0 0
Biographical Social Networks on Wikipedia: A cross-cultural study of links that made history Pablo Aragón
Andreas Kaltenbrunner
David Laniado
Yana Volkovich
Social network analysis
Cross language studies
WikiSym English It is arguable whether history is made by great men and women or vice versa, but undoubtably social connections shape history. Analysing Wikipedia, a global collective memory place, we aim to understand how social links are recorded across cultures. Starting with the set of biographies in the English Wikipedia we focus on the networks of links between these biographical articles on the 15 largest language Wikipedias. We detect the most central characters in these networks and point out culture-related peculiarities. Furthermore, we reveal remarkable similarities between distinct groups of language Wikipedias and highlight the shared knowledge about connections between persons across cultures. 0 0
Bioqueries: A social community sharing experiences while querying Biological Linked Data Garcia-Godoy M.J.
Navas-Delgado I.
Aldana-Montes J.
Life sciences
Linked Open Data
Semantic web
ACM International Conference Proceeding Series English Life Sciences have emerged as a key domain in the Linked Data community because of the diversity of data semantics and formats available through a great variety of databases and web technologies. Thus, it has been used as the perfect domain for applications in the Web of Data. Unfortunately, on the one hand, bioinformaticians are not exploiting the full potential of this already available technology and, on the other hand, experts in Life Sciences have real problems discovering, understanding and devising how to take advantage of these interlinked (integrated) data. In this paper, we present Bioqueries, a wiki-based portal that is aimed at community building around Biological Linked Data. This public space offers several services and a collaborative infrastructure with the objective of stimulating activity in the consumption of Biological Linked Data, and therefore contributing to the deployment of the benefits of the Web of Data in this domain. This tool is designed not only to aid bioinformaticians in designing SPARQL queries to access biological databases exposed as Linked Data, but also to help biologists gain a deeper insight into the potential use of this technology. The queries published in the portal are also described and commented on in natural language, to enable their use by experts in the domain who have less expertise in semantic technologies. 0 0
Bootstrapping wikis: developing critical mass in a fledgling community by seeding content Jacob Solomon
Rick Wash
Critical mass
Online contribution
Computer-Supported Cooperative Work English 0 0
Bots and cyborgs: Wikipedia's immune system Aaron Halfaker
John Riedl
Social computing
Computer English Bots and cyborgs are more than tools to better manage content quality on Wikipedia; through their interaction with humans, they are fundamentally changing its culture. 0 0
Breaking news on Wikipedia: Dynamics, structures, and roles in high-tempo collaboration Brian C. Keegan Breaking news
Current events
Network analysis
Social network
Social role
English The goal of my research is to evaluate how distributed virtual teams are able to use socio-technical systems like Wikipedia to self-organize and respond to complex tasks. I examine the roles Wikipedians adopt to synthesize content about breaking news events out of a noisy and complex information space. Using data from Wikipedia's revision histories as well as from other sources like IRC logs, I employ methods in content analysis, statistical network analysis, and trace ethnography to illuminate the multilevel processes which sustain these temporary collaborations as well as the dynamics of how they emerge and dissolve. 0 0
Bricking Semantic Wikipedia by relation population and predicate suggestion Haofen Wang
Linyun Fu
Yiqin Yu
Predicate suggestion
Relation classification
Relation population
Semantic Wikipedia
Web Intelligence and Agent Systems English Semantic Wikipedia aims to enhance Wikipedia by adding explicit semantics to links between Wikipedia entities. However, we have observed that it currently suffers from the following limitations: a lack of semantic annotations and a lack of semantic annotators. In this paper, we resort to relation population to automatically extract relations between any entity pair to enrich semantic data, and predicate suggestion to recommend proper relation labels to facilitate semantic annotating. Both tasks leverage relation classification, which tries to classify extracted relation instances into predefined relations. However, due to the lack of labeled data and the excessive noise in Semantic Wikipedia, existing approaches cannot be directly applied to these tasks to obtain high-quality annotations. To tackle these problems, we use a label propagation algorithm and exploit semantic features such as domain and range constraints on categories, as well as linguistic features such as dependency trees of context sentences in Wikipedia articles. The experimental results on 7 typical relation types show the effectiveness and efficiency of our approach in dealing with both tasks. © 2012-IOS Press and the authors. All rights reserved. 0 0
Building a large scale knowledge base from Chinese Wiki Encyclopedia Zhe Wang
Jing-Woei Li
Pan J.Z.
Knowledge base
Linked data
Semantic web
Lecture Notes in Computer Science English DBpedia has proved to be a successful structured knowledge base, and large-scale Semantic Web data has been built using DBpedia as the central interlinking hub of the Web of Data in English. In Chinese, however, due to the heavy imbalance in size (no more than one tenth) between the English and Chinese Wikipedias, little Chinese linked data has been published and linked to DBpedia, which hinders structured knowledge sharing both within Chinese resources and across languages. This paper aims at building a large-scale Chinese structured knowledge base from Hudong, one of the largest Chinese wiki encyclopedia websites. First, an upper-level ontology schema in Chinese is learned based on the category system and infobox information in Hudong. In total, 19,542 concepts are inferred, organized in a hierarchy with a maximum depth of 20 levels. 2,381 properties with domain and range information are learned from the attributes in the Hudong infoboxes. Then, 802,593 instances are extracted and described using the concepts and properties in the learned ontology. These extracted instances cover a wide range of things, including persons, organizations, places and so on. Among all the instances, 62,679 are linked to identical instances in DBpedia. Moreover, the paper provides an RDF dump and SPARQL access to the established Chinese knowledge base. The general upper-level ontology and wide coverage make the knowledge base a valuable Chinese semantic resource. It can be used not only in building Chinese linked data, the fundamental work for constructing a multilingual knowledge base across heterogeneous resources of different languages, but can also largely facilitate many useful applications of large-scale knowledge bases, such as question answering and semantic search. 0 0
Building a standpoints web to support decision-making in Wikipedia Jodi Schneider Collaboration
Decision rationale
Online argumentation
English Although the Web enables large-scale collaboration, its potential to support group decision-making has not been fully exploited. My research aims to analyze, extract, and represent disagreement in purposeful social web conversations. This supports decision-making in distributed groups by representing individuals' claims and their justifications in a "Standpoints Web", a hypertext web interlinking the claims and justifications made throughout the social web. The two main contributions of my dissertation are an architecture for the Standpoints Web and a case study implementing the Standpoints Web for Wikipedia's deletion discussions. 0 0
Building disease and target knowledge with Semantic MediaWiki Harland L.
Marshall C.
Gardner B.
Chang M.
Head R.
Verdemato P.
Disease maps
Drug target
Knowledge management
Semantic MediaWiki
Open Source Software in Life Science Research: Practical Solutions to Common Challenges in the Pharmaceutical Industry and Beyond English The efficient flow of both formal and tacit knowledge is critical in the new era of information-powered pharmaceutical discovery. Yet, one of the major inhibitors of this is the constant flux within the industry, driven by rapidly evolving business models, mergers and collaborations. A continued stream of new employees and external partners brings a need to not only manage the new information they generate, but to find and exploit existing company results and reports. The ability to synthesise this vast information 'substrate' into actionable intelligence is crucial to industry productivity. In parallel, the new 'digital biology' era provides yet more and more data to find, analyse and exploit. In this chapter we look at the contribution that Semantic MediaWiki (SMW) technology has made to meeting the information challenges faced by Pfizer. We describe two use-cases that highlight the flexibility of this software and the ultimate benefit to the user. © 2012 Woodhead Publishing Limited. All rights reserved. 0 0
Building for social translucence: A domain analysis and prototype system David W. McDonald
Stephanie Gokhman
Mark Zachry
Social translucence
System architecture
English The relationships and work that facilitate content creation in large online contributor systems are not always visible. Social translucence is a stance toward the design of systems that allows users to better understand participation in collaborative systems through awareness of contributions and interactions. Like many socio-technical constructs, social translucence is not something that can simply be added after a system is built; it should be at the core of system design. In this paper, we conduct a domain analysis to understand the space of architectural support required to facilitate social translucence in systems. We describe an instantiation of those requirements as a system architecture that relies on data from Wikipedia, and illustrate how translucence can be propagated to some basic visualizations which we have created for Wikipedia users. We close with some reflections on the state of social translucence research and some openings for this important design perspective. 0 0
Building up a class hierarchy with properties from Japanese Wikipedia Takeshi Morita
Sekimoto Y.
Susumu Tamagawa
Takahira Yamaguchi
Ontology learning
Proceedings - 2012 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2012 English Japanese Wikipedia Ontology, which we have constructed semi-automatically from Japanese Wikipedia, has the problems of lacking upper classes and appropriate definitions of properties. The purpose of our research is to complement the upper classes in Japanese Wikipedia Ontology and build up a class hierarchy with properties by integrating Japanese Wikipedia Ontology and Japanese WordNet. In this paper, we propose a method to build up the class hierarchy with properties by lifting common properties that are defined in sibling classes up to upper classes in Japanese Wikipedia Ontology. We also introduce an attempt to integrate Japanese Wikipedia Ontology and Japanese WordNet. 0 0
CARPO: Correlation-aware power optimization in data center networks Xiaolong Wang
Yao Y.
Lu K.
Cao Q.
Proceedings - IEEE INFOCOM English Power optimization has become a key challenge in the design of large-scale enterprise data centers. Existing research efforts focus mainly on computer servers to lower their energy consumption, while only a few studies have tried to address the energy consumption of data center networks (DCNs), which can account for 20% of the total energy consumption of a data center. In this paper, we propose CARPO, a correlation-aware power optimization algorithm that dynamically consolidates traffic flows onto a small set of links and switches in a DCN and then shuts down unused network devices for energy savings. In sharp contrast to existing work, CARPO is designed based on a key observation from the analysis of real DCN traces: the bandwidth demands of different flows do not peak at exactly the same time. As a result, if the correlations among flows are considered in consolidation, more energy savings can be achieved. In addition, CARPO integrates traffic consolidation with link rate adaptation for maximized energy savings. We implement CARPO on a hardware testbed composed of 10 virtual switches configured with a production 48-port OpenFlow switch and 8 servers. Our empirical results with Wikipedia traces demonstrate that CARPO can save up to 46% of network energy for a DCN, while introducing only negligible delay increases. CARPO also outperforms two state-of-the-art baselines by 19.6% and 95% on energy savings, respectively. Our simulation results with 61 flows also show the superior energy efficiency of CARPO over the baselines. 0 0
Capturing malicious bots using a beneficial bot and wiki Takashi Yamanoue
Kentaro Oda
Koichi Shimozono
Information security
Vandal bot
SIGUCCS English Locating malicious bots in a large network is problematic because internal firewalls and NAT routers unintentionally help to hide the bots' host addresses and malicious packets. However, eliminating firewalls and NAT routers merely to locate bots is generally not acceptable. In this paper, we propose an easy-to-deploy, easy-to-manage network security control system for locating a malicious host behind internal secure gateways. This network security control system consists of remote security devices and a command server. Each remote security device is installed as a transparent link (implemented as an L2 switch) between a subnet and its gateway, to detect a host compromised by a malicious bot in the target subnet while minimizing the impact of deployment. The security devices are remote-controlled by 'polling' the command server, in order to eliminate the NAT traversal problem and to be firewall friendly. Since the remote security device is transparent, remotely controlled, and robust to security gateways, we regard it as a beneficial bot. We adopt a web server with wiki software as the command server in order to take advantage of its power of customization, ease of use, and ease of deployment. 4 1
Catching the drift - Indexing implicit knowledge in chemical digital libraries Kohncke B.
Tonnies S.
Balke W.-T.
Chemical digital collections
Document ranking
Lecture Notes in Computer Science English In the domain of chemistry, the information-gathering process is highly focused on chemical entities. But due to synonyms and different entity representations, the indexing of chemical documents is a challenging process. Considering the field of drug design, the task is even more complex. Domain experts from this field are usually not interested in any chemical entity itself, but in representatives of some chemical class showing a specific reaction behavior. For describing such a reaction behavior of chemical entities, the most interesting parts are their functional groups. The delimitation of each chemical class is also related to the entities' reaction behavior, but further relies on the chemist's implicit knowledge. In this paper we present an approach dealing with this implicit knowledge by clustering chemical entities based on their functional groups. However, since such clusters are generally too unspecific, containing chemical entities from different chemical classes, we further divide them into sub-clusters using fingerprint-based similarity measures. We analyze several uncorrelated fingerprint/similarity-measure combinations and show that the entities most similar to a query entity can be found in the respective sub-cluster. Furthermore, we use our approach for document retrieval, introducing a new similarity measure based on Wikipedia categories. Our evaluation shows that the sub-clustering leads to suitable results enabling sophisticated document retrieval in chemical digital libraries. 0 0
Categorizing search results using WordNet and Wikipedia Hemayati R.T.
Meng W.
Yu C.
Search Engine
Search Result Clustering and Categorization
Lecture Notes in Computer Science English Terms used in search queries often have multiple meanings and usages. Consequently, search results corresponding to different meanings or usages may be retrieved, making the identification of relevant results inconvenient and time-consuming. In this paper, we study the problem of grouping search results based on the different meanings and usages of a query. We build on a previous work that identifies and ranks possible categories of any user query based on the meanings and common usages of the terms and phrases within the query. We use these categories to group search results. In this paper, we study different methods, including several new ones, to assign search result records (SRRs) to the categories. Our SRR grouping framework supports a combination of categorization, clustering and query rewriting techniques. Our experimental results show that some of our grouping methods can achieve high accuracy. 0 0
Characterisation of pre-service teachers' attitude to feedback in a wiki-environment framework Peled Y.
Bar-Shalom O.
Sharon R.
Higher education
Pre-service teachers
Interactive Learning Environments English This is a two-phase research process in a wiki environment. The first phase explored (1) the significance of peer-feedback on students’ academic performance on a specific task in a wiki environment, (2) three types of peer-feedback and (3) students’ constraints on offering meaningful feedback to their peers. The objective of phase two was to determine the reasons why students refrain from peer-feedback. Significant correlation exists between the willingness to give feedback and the willingness to receive feedback. Significant correlation also exists between the difficulty of giving feedback and the difficulty of receiving feedback. Female students tend to consider feedback as the teacher's responsibility more than male students. Religious beliefs did not have any significant effect on any of the parameters tested. The results indicate that traditional students tend to be more conservative regarding feedback. There are worrying implications to pre-service teachers’ refraining from giving feedback. 0 0
Chinese named entity recognition and disambiguation based on wikipedia Yajie Miao
Yajuan L.
Qun L.
Jinsong S.
Hao X.
Named Entity Disambiguation
Named entity recognition
Communications in Computer and Information Science English This paper presents a method for named entity recognition and disambiguation based on Wikipedia. First, we build a Wikipedia database using the open-source tool JWPL. Second, we extract the definition term from the first sentence of a Wikipedia page and use it as external knowledge in named entity recognition. Finally, we achieve named entity disambiguation using Wikipedia disambiguation pages and contextual information. The experiments show that the use of Wikipedia features can improve the accuracy of named entity recognition. 0 0
Chinese relation extraction using web features and HNC theory Wang J.
Cheng X.
Gu X.
Hierarchical network of concepts
Relation extraction
Web features
Journal of Information and Computational Science English Chinese named-entity relation extraction is a key step in the task of Chinese information extraction. Feature-based methods are among the main methods of Chinese relation extraction. In this new method, a Web co-occurrence feature and a bag-of-words (BoW) correlation feature are introduced, and word similarity is defined based on HNC theory. Experimental results showed that the F-score was improved by this method, and that both features are effective for Chinese relation extraction. 0 0
Chinese word similarity computing Li L.
Zhe Wang
3D model
Specific corpus
Proceedings - 2012 3rd IEEE International Conference on Network Infrastructure and Digital Content, IC-NIDC 2012 English This paper studies Chinese word similarity computing. A 3D model is proposed for representing word meaning from different points of view. The first is the view of primitives from HowNet, the second is the view of words' occurrence in sentences from a specific corpus, and the third is the view of well-known background knowledge from online resources. A Chinese content word is represented in this 3D model, and the similarity of two words is computed from it. Experiments on Chinese news show that this method can perform better than existing ones based on only one point of view. 0 0
Choosing better seeds for entity set expansion by leveraging wikipedia semantic knowledge Qi Z.
Kang Liu
Jun Zhao
Information extraction
Seed set refinement
Semantic knowledge
Communications in Computer and Information Science English Entity Set Expansion, which refers to expanding a human-input seed set to a more complete set belonging to the same semantic category, is an important task for open information extraction. Because human-input seeds may be ambiguous, sparse, etc., the quality of the seeds has a great influence on expansion performance, as many previous studies have shown. To improve seed quality, this paper proposes a novel method which chooses better seeds from the original input ones. In our method, we leverage Wikipedia semantic knowledge to measure the semantic relatedness and ambiguity of each seed. Moreover, to avoid seed sparseness, we use a web corpus to measure its population. Lastly, we use a linear model to combine these factors to determine the final selection. Experimental results show that new seed sets chosen by our method can improve expansion performance by up to 13.4% on average over randomly selected seed sets. 0 0
Circadian patterns of Wikipedia editorial activity: A demographic analysis Taha Yasseri
Róbert Sumi
János Kertész
Editorial activity
Editors demography
Circadian patterns
PLoS ONE English Wikipedia (WP), as a collaborative, dynamical system of humans, is an appropriate subject of social studies. Each single action of the members of this society, i.e. editors, is well recorded and accessible. Using the cumulative data of 34 Wikipedias in different languages, we try to characterize and find the universalities and differences in the temporal activity patterns of editors. Based on this data, we estimate the geographical distribution of editors for each WP across the globe. Furthermore, we clarify the differences among different groups of WPs, which originate in the variance of cultural and social features of the communities of editors. 10 2
Citation needed: The dynamics of referencing in Wikipedia Chih-Chun Chen
Camille Roth
Collaborative system
WikiSym English The extent to which a Wikipedia article refers to external sources to substantiate its content can be seen as a measure of its externally invoked authority. We introduce a protocol for characterising the referencing process in the context of general article editing. With a sample of relatively mature articles, we show that referencing does not occur regularly through an article’s lifetime but is associated with periods of more substantial editing, when the article has reached a certain level of maturity (in terms of the number of times it has been revised and its length). References also tend to be contributed by editors who have contributed more frequently and more substantially to an article, suggesting that a subset of more qualified or committed editors may exist for each article. 13 1
Classification of short texts by deploying topical annotations Vitale D.
Paolo Ferragina
Ugo Scaiella
Lecture Notes in Computer Science English We propose a novel approach to the classification of short texts based on two factors: the use of Wikipedia-based annotators that have been recently introduced to detect the main topics present in an input text, represented via Wikipedia pages, and the design of a novel classification algorithm that measures the similarity between the input text and each output category by deploying only their annotated topics and the Wikipedia link structure. Our approach waives the common practice of expanding the feature space with new dimensions derived either from explicit or from latent semantic analysis. As a consequence it is simple and maintains a compact, intelligible representation of the output categories. Our experiments show that it is efficient in construction and query time, as accurate as state-of-the-art classifiers (see e.g. Phan et al., WWW '08), and robust with respect to concept drift and input sources. 0 0
Classifying Wikipedia Articles Using Network Motif Counts and Ratios Guangyu Wu
Martin Harrigan
Pádraig Cunningham
Edit Networks
WikiSym English Because the production of Wikipedia articles is a collaborative process, the edit network around an article can tell us something about the quality of that article. Articles that have received little attention will have sparse networks; at the other end of the spectrum, articles that are Wikipedia battlegrounds will have very crowded networks. In this paper we evaluate the idea of characterizing edit networks as a vector of motif counts that can be used in clustering and classification. Our objective is not immediately to develop a powerful classifier but to assess the signal present in network motifs. We show that this motif count vector representation is effective for classifying articles on the Wikipedia quality scale. We further show that ratios of motif counts can effectively overcome normalization problems when comparing networks of radically different sizes. 0 0
Classifying image galleries into a taxonomy using metadata and wikipedia Kramer G.
Gosse Bouma
Hendriksen D.
Homminga M.
Hierarchical classification
Image gallery
Lecture Notes in Computer Science English This paper presents a method for the hierarchical classification of image galleries into a taxonomy. The proposed method links textual gallery metadata to Wikipedia pages and categories. Entity extraction from metadata, entity ranking, and selection of categories is based on Wikipedia and does not require labeled training data. The resulting system performs well above a random baseline, and achieves a (micro-averaged) F-score of 0.59 on the 9 top categories of the taxonomy and 0.40 when using all 57 categories. 0 0
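The micro-averaged F-score reported above pools true and false positives across all categories before computing precision and recall, which matters when a gallery can receive several taxonomy labels at once. A minimal sketch of the measure (the label sets below are invented for illustration, not data from the paper):

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over parallel lists of label sets: counts are
    pooled across all categories before precision/recall are computed."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but wrong
        fn += len(g - p)   # labels missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gold = [{"art"}, {"sport", "news"}, {"tech"}]
pred = [{"art"}, {"sport"}, {"news"}]
print(round(micro_f1(gold, pred), 3))  # 4/7, i.e. 0.571
```

Because frequent categories dominate the pooled counts, micro-averaging weights each assignment equally rather than each category, unlike macro-averaging.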
Classifying the wikipedia articles into the open cyc taxonomy Pohl A. CEUR Workshop Proceedings English This article presents a method for classifying Wikipedia articles into the taxonomy of OpenCyc. The method utilises several sources of classification information, namely the Wikipedia category system, the infoboxes attached to the articles, the first sentences of the articles (treated as their definitions), and the direct mapping between the articles and the Cyc symbols. The classification decisions made by these methods are reconciled using Cyc's built-in inconsistency detection mechanism. The combination of the best classification methods yields 1.47 million classified articles with a manually verified precision above 97%, while the combination of all of them yields 2.2 million articles with an estimated precision of 93%. 0 0
Classifying trust/distrust relationships in online social networks Bachi G.
Coscia M.
Monreale A.
Giannotti F.
Graph mining
Social network
Proceedings - 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust and 2012 ASE/IEEE International Conference on Social Computing, SocialCom/PASSAT 2012 English Online social networks are increasingly being used as places where communities gather to exchange information, form opinions, and collaborate in response to events. An aspect of this information exchange is how to determine whether a source of social information can be trusted or not. The data mining literature addresses this problem; however, it usually employs social balance theories, looking at small structures in complex networks known as triangles. This has proven effective in some cases, but it underperforms in the absence of context information about the relation and in more complex interactive structures. In this paper we address the problem of creating a framework for trust inference, able to infer the trust/distrust relationships in those relational environments that cannot be described using classical social balance theory. We do so by decomposing a trust network into its ego network components and mining the trust relationships on this ego network set, extending a well-known graph mining algorithm. We test our framework on three public datasets describing trust relationships in the real world (from the social media Epinions, Slashdot and Wikipedia) and compare our results with the trust inference state of the art, showing better performance where social balance theory fails. 0 0
Classroom Wikipedia participation effects on future intentions to contribute Cliff Lampe
Jonathan Obar
Elif Ozkaya
Paul Zube
Alcides Velasquez
Computer-Supported Cooperative Work English One of the biggest challenges faced by social media sites like Wikipedia is how to motivate users to contribute content. Research continues to demonstrate that only a small percentage of users contribute to user-generated content sites. In this study we assess the results of a Wikimedia Foundation initiative, which had graduate and undergraduate students from 22 U.S. universities contribute content to Wikipedia articles as part of their coursework. 185 students were asked about their participation in the initiative and their intention to participate on Wikipedia in the future. Results suggest that intentions to continue contributing are influenced by the initial attitude towards the class, and the degree to which students perceived they were writing for a global audience. 7 2
Clustering Wikipedia infoboxes to discover their types Nguyen T.H.
Nguyen H.D.
Viviane Moreira
Juliana Freire
Wikipedia infobox
ACM International Conference Proceeding Series English Wikipedia has emerged as an important source of structured information on the Web. But while the success of Wikipedia can be attributed in part to the simplicity of adding and modifying content, this has also created challenges when it comes to using, querying, and integrating the information. Even though authors are encouraged to select appropriate categories and provide infoboxes that follow pre-defined templates, many do not follow the guidelines or follow them loosely. This leads to undesirable effects, such as template duplication, heterogeneity, and schema drift. As a step towards addressing this problem, we propose a new unsupervised approach for clustering Wikipedia infoboxes. Instead of relying on manually assigned categories and template labels, we use the structured information available in infoboxes to group them and infer their entity types. Experiments using over 48,000 infoboxes indicate that our clustering approach is effective and produces high quality clusters. 0 0
CoSyne: Synchronizing multilingual wiki content Bronner A.
Matteo Negri
Yashar Mehdad
Angela Fahrni
Christof Monz
Context-sensitive machine translation
Cross-lingual textual entailment
Cross-lingual topical alignment
Multilingual content synchronization
RESTful web services
User edits classification
User generated content
WikiSym 2012 English CoSyne is a content synchronization system for assisting users and organizations involved in the maintenance of multilingual wikis. The system allows users to explore the diversity of multilingual content using a monolingual view. It provides suggestions for content modification based on additional or more specific information found in other language versions, and enables seamless integration of automatically translated sentences while giving users the flexibility to edit, correct and control eventual changes to the wiki page. To support these tasks, CoSyne employs state-of-the-art machine translation and natural language processing techniques. 0 0
Coarse lexical semantic annotation with supersenses: An Arabic case study Schneider N.
Mohit B.
Oflazer K.
Smith N.A.
50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference English "Lightweight" semantic annotation of text calls for a simple representation, ideally without requiring a semantic lexicon to achieve good coverage in the language and domain. In this paper, we repurpose WordNet's supersense tags for annotation, developing specific guidelines for nominal expressions and applying them to Arabic Wikipedia articles in four topical domains. The resulting corpus has high coverage and was completed quickly with reasonable inter-annotator agreement. 0 0
Codification and collaboration: Information quality in social media Kane G.C.
Ransbotham S.
Knowledge management
International Conference on Information Systems, ICIS 2012 English This paper argues that social media combines the codification and collaboration features of earlier generations of knowledge management systems. This combination potentially changes the way knowledge is created, potentially requiring new theories and methods for understanding these processes. We forward the specialized social network method of two-mode networks as one such approach. We examine the information quality of 16,244 articles built through 2,677,397 revisions by 147,362 distinct contributors to Wikipedia's Medicine Wikiproject. We find that the structure of the contributor-artifact network is associated with information quality in these networks. Our findings have implications for managers seeking to cultivate effective knowledge creation environments using social media and to identify valuable knowledge created external to the firm. 0 0
Coercion or empowerment? Moderation of content in Wikipedia as 'essentially contested' bureaucratic rules De Laat P.B. Bureaucracy
Ethics and Information Technology English In communities of user-generated content, systems for the management of content and/or their contributors are usually accepted without much protest. Not so, however, in the case of Wikipedia, in which the proposal to introduce a system of review for new edits (in order to counter vandalism) led to heated discussions. This debate is analysed, and arguments of both supporters and opponents (of English, German and French tongue) are extracted from Wikipedian archives. In order to better understand this division of the minds, an analogy is drawn with theories of bureaucracy as developed for real-life organizations. From these it transpires that bureaucratic rules may be perceived as springing from either a control logic or an enabling logic. In Wikipedia, then, both perceptions were at work, depending on the underlying views of participants. Wikipedians either rejected the proposed scheme (because it is antithetical to their conception of Wikipedia as a community) or endorsed it (because it is consonant with their conception of Wikipedia as an organization with clearly defined boundaries). Are other open-content communities susceptible to the same kind of 'essential contestation'? 0 0
Cognitive biases - Improving project team's work Kraus W.E. AACE International Transactions English Wow. I learned a little bit about cognitive biases and thought they must play a role in estimators not using the best judgment in their work. I performed a search in the AACE virtual library for papers containing the word bias and found I'm not the only one who thinks so, and also that biases affect much more in project controls than just estimates. In doing more reading, I found that it is a much bigger factor in our lives than I realized. That is truer in estimating than I want to think about. Wikipedia, the free online encyclopedia, in the article List of Cognitive Biases, lists a total of 113 variations on biases. How do these cognitive biases affect our estimates and how can we negate the effects? Understanding these points will hopefully allow estimators to produce estimates better meeting the needs of their projects. 0 0
Cognitive linguistics as the underlying framework for semantic annotation Pipitone A.
Pirrone R.
Cognitive Linguistics
Construction Grammar
Semantic Annotator
Proceedings - IEEE 6th International Conference on Semantic Computing, ICSC 2012 English In recent years many attempts have been made to design suitable sets of rules aimed at extracting the semantic meaning from plain text and achieving annotation, but very few approaches make extensive use of grammars. Current systems are mainly focused on extracting the semantic role of the entities described in the text. This approach has limitations: in such applications the semantic role is conceived merely as the meaning of the involved entities, without considering their context. As an example, current semantic annotators often tag a date entity without any annotation regarding the kind of date itself, i.e. a birth date, a book publication date, and so on. Moreover, these systems use ontologies that have been developed specifically for the system's purposes and have reduced portability. Extensive use of both linguistic resources and semantic representations of the domain is needed in this scenario: the semantic representation of the domain addresses the semantic interpretation of the context, while NLP tools can help to solve some linguistic problems related to semantic annotation, such as synonymy, ambiguity, and co-reference. A novel framework inspired by Cognitive Linguistics theories is proposed in this work, aimed at facing the problem outlined above. In particular, our work is based on Construction Grammar (CxG). CxG defines a "construction" as a form-meaning pair. We use RDF triples in the domain ontology as the "semantic seeds" to build constructions. A suitable set of rules based on linguistic typology has been designed to infer semantics and syntax from the semantic seed while combining them as the poles of constructions. A hierarchy of rules to infer syntactic patterns for either single words or sentences using WordNet and FrameNet has been designed to overcome the limitations of expressing the syntactic poles using solely the terms stated in the ontology. As a consequence, semantic annotation of plain text is achieved by computing all possible syntactic forms for the same meaning during the analysis of document corpora. The proposed framework has been finalized for semantic annotation of Wikipedia pages; the result is a system for automatic generation of Semantic Web wiki contents from standard Wikipedia pages, leading to a possible solution of the big challenge of making existing wiki sources semantic wikis. 0 0
Coherence and responsiveness Harris J.
Henderson A.
Interactions English Systems are always engaged in an interplay of responsiveness and coherence to local circumstances. Researchers believe that there are many ways to simultaneously increase the coherence, responsiveness, and scalability of systems, and this quest has enormous potential to improve life. Allowing language to evolve in use is a powerful means of managing the trade-off curve for a growing space of products. Larry Page and Sergey Brin developed the PageRank algorithm, which aggregates the local knowledge implicit in the network of references between pages. PageRank can adapt to almost unlimited changes in the content and uses of the Web without needing to change the core algorithm at all. Yelp is an online service that aggregates, curates, and helps users search reviews of local businesses. The Internet has catalyzed the emergence of very large open working groups, such as the Linux development community, the Wikipedia authoring community, and various fan and support groups. 0 0
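The aggregation of local link knowledge attributed to PageRank above can be sketched as a power iteration over a link graph; the three-page graph below is a toy assumption for illustration, not content from the article:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a link graph given as
    {page: [pages it links to]}; an illustrative sketch, not Google's code."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Each page keeps a (1 - damping) base share...
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, targets in links.items():
            if targets:
                # ...and passes the rest of its rank along its outlinks.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for p in pages:
                    new[p] += damping * rank[page] / len(pages)
        rank = new
    return rank

ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
# "c" gathers the most references, so it ranks highest
```

Only local information (each page's outlinks) feeds the update, yet the iteration converges to a global ranking, which is the adaptability the abstract highlights.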
Collaboration amidst disagreement and moral judgment: The dynamics of Jewish and Arab students' collaborative inquiry of their joint past Pollack S.
Kolikant Y.B.-D.
Collaborative learning
Historical thinking
International Journal of Computer-Supported Collaborative Learning English We present an instructional model involving a computer-supported collaborative learning environment, in which students from two conflicting groups collaboratively investigate an event relevant to their past using historical texts. We traced one enactment of the model by a group comprised of two Israeli Jewish and two Israeli Arab students. Our data sources included the texts participants wrote (pre-, post-, and during the activity, jointly and individually), the transcripts of the e-discussion, and reflections written after the activity. The setting enabled us to further our understanding of what collaboration means when students' voices do not converge. We examined whether the activity was productive in terms of learning, and the dynamics of collaboration within the milieu, especially the intersubjective meaning making. The e-discussion that was co-constructed by participants was a chain of disagreements. However, participants' reflections reveal that the group structure and the e-communication method were perceived as affording sensitive collaboration. Furthermore, a comparison between the individual texts, pre- and post- the group discussion, revealed that the activity was productive, since students moved from a one-sided presentation of the event to a more multi-sided representation. Based on the analysis of the e-discussion, we conclude that the setting provided students with opportunities to examine their voices in light of alternatives. We propose the term fission to articulate certain moments of intersubjectivity, where a crack is formed in one's voice as the Other's voice impacts it, and one's voice becomes more polyphonic. 0 0
Collaborative hypervideo editing using MediaWiki Niels Seidel Hypervideo
Multimedia wiki
Video authoring
WikiSym 2012 English Current wikis cannot be used to host or author rich dynamic hypervideos along with hypertext elements. In this article, vi-wiki is presented as an approach for the seamless and collaborative integration of interactive hypervideos into MediaWiki. Vi-wiki combines the wiki metaphor with a direct-manipulation user interface for hypervideo authoring and particular markup conventions. The research makes a contribution to collaborative work and learning with wikis. It enables users to annotate spatio-temporal hyperlinks as well as composite sequential video clips through both a graphical user interface and a generic markup language. 0 0
Collaborative knowledge building with wikis: The impact of redundancy and polarity Johannes Moskaliuk
Joachim Kimmerle
Ulrike Cress
Cooperative/collaborative learning
Interactive learning environments
Teaching/learning strategies
Comput. Educ. English 0 0
Collaborative learning and wiki: Creating a collaborative activity sheet for distance learning via the Wild tool Ibrahimi F.
Essaaidi M.
Collaborative learning
Distance learning
Pedagogical scenario
Wiki tool
Proceedings of 2012 International Conference on Multimedia Computing and Systems, ICMCS 2012 English This paper aims to provide students and teachers with access to an activity via a wiki collaboration website. To this end, it proposes using XWiki for the implementation of a collaborative learning environment. It allows the involvement of users, collaboration, social interaction and knowledge sharing. This tool fits well with current pedagogical trends, which focus, in a real context, on collaboration, participation and knowledge construction. This paper describes the proposed system and the pedagogical environment of the project, including the concept of a collaborative learning scenario. Finally, it describes the processes involved in achieving our collaborative learning website TENSWiki. 0 0
Collaborative machine tool design environment based on semantic wiki technology Zapp M.
Singh M.
Zendoia J.
Brencsics I.
Knowledge management
Machine design
Semantic MediaWiki
Semantic search
Proceedings of the European Conference on Knowledge Management, ECKM English This paper presents a light-weight collaboration environment for the conceptual design of machine tools. For the design of specialized machine tools and their components, machine designers, customers and suppliers need to gather, retrieve and exchange heterogeneous information like customer requirements, component specifications, design drawings and life-cycle performance data. This knowledge management process can be supported by collaboration tools. Since the European machine tool industry is dominated by SMEs and machine tools are mostly manufactured in small series, light-weight and flexible solutions are required. The collaboration environment proposed in this work is built on the Semantic MediaWiki+ (SMW+) solution, which enhances a regular MediaWiki system with the capabilities of semantic annotations and semantic queries. To facilitate the semantic annotation, the design environment is equipped with ontologies, which represent relevant concepts, attributes, relations and rules in the machine tool design domain. In addition, a rich web application as an extension to SMW+ is developed, which leads the designer through the steps of a machine design project. The environment supports the retrieval and re-use of information from previous design projects, the use of lifecycle performance data of machines, the knowledge exchange among designers and the data exchange to commercial-off-the-shelf assessment and simulation tools. 0 0
Collaborative online writing assignments to foster active learning Olivo R.F. Class notes
Google docs
Learning strategies
Neurophysiology course
Online work
Journal of Undergraduate Neuroscience Education English To help students master the content of a neurophysiology course, they were asked to participate in collaborative writing projects. In the first two years, students contributed to a class wiki by summarizing one lecture and editing summaries of several others. In the second two years, students worked in teams of three or four to write a series of illustrated chapters spanning the entire semester. The second assignment kept students more engaged than the wiki project, and although they found it a significant amount of work, they also believed that it helped them learn the subject matter. Working in teams, however, was not always a happy experience. 0 0
Collaborative trend analysis using web 2.0 technologies: A case study Kaiser I. Collaboration
Innovation Management
Trend Scouting
Web 2.0
International Journal of Distributed Systems and Technologies English Through early trend recognition in the business environment and specific processing within innovation management, companies can achieve long-term market success. A particular challenge is the systematic identification, gathering, structuring and evaluation of trends. Web 2.0 technologies, and especially wikis, which allow several people to maintain and use content simultaneously, are eminently suitable for an efficient process of continuous collection and analysis of relevant market trends. In this paper, trend management processes are introduced, and it is demonstrated how trends can be collected, structured and communicated within the enterprise using a customized wiki. The trend assessment draws, inter alia, on crowdsourcing methods, resulting in an extensive evaluation basis. In addition, the presented approach includes a visualization of the trends and their assessment for decision support. A case study of global polymer solutions supplier REHAU AG demonstrates the use of the methodology in practice. 0 0
Collaboratively constructed knowledge repositories as a resource for domain independent concept extraction Kerschbaumer J.
Reichhold M.
Winkler C.
Fliedl G.
Concept mining
Information extraction
Knowledge acquisition
Text mining
Data mining
Proceedings of the 10th Terminology and Knowledge Engineering Conference: New Frontiers in the Constructive Symbiosis of Terminology and Knowledge Engineering, TKE 2012 English To achieve domain-independent text management, a flexible and adaptive knowledge repository is indispensable and represents the key resource for solving many challenges in natural language processing. Especially for real-world applications, the needed resources are not available for technical disciplines such as engineering in the energy or automotive domains. In this paper, we therefore propose a new approach for knowledge (concept) acquisition based on collaboratively constructed knowledge repositories like Wikipedia and enterprise wikis. 0 0
Collective context-aware topic models for entity disambiguation Sen P. Entity disambiguation
Topic models
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web English A crucial step in adding structure to unstructured data is to identify references to entities and disambiguate them. Such disambiguated references can help enhance readability and draw similarities across different pieces of running text in an automated fashion. Previous research has tackled this problem by first forming a catalog of entities from a knowledge base, such as Wikipedia, and then using this catalog to disambiguate references in unseen text. However, most of the previously proposed models either do not use all text in the knowledge base, potentially missing out on discriminative features, or do not exploit word-entity proximity to learn high-quality catalogs. In this work, we propose topic models that keep track of the context of every word in the knowledge base; so that words appearing within the same context as an entity are more likely to be associated with that entity. Thus, our topic models utilize all text present in the knowledge base and help learn high-quality catalogs. Our models also learn groups of co-occurring entities thus enabling collective disambiguation. Unlike most previous topic models, our models are non-parametric and do not require the user to specify the exact number of groups present in the knowledge base. In experiments performed on an extract of Wikipedia containing almost 60,000 references, our models outperform SVM-based baselines by as much as 18% in terms of disambiguation accuracy translating to an increment of almost 11,000 correctly disambiguated references. 0 0
Collective intelligence model: How to describe collective intelligence Georgi S.
Jung R.
Collective intelligence
Advances in Intelligent and Soft Computing English A large body of scientific research exists describing forms of collective intelligence (e.g. Wikipedia). But there are only a few publications that describe how different forms of collective intelligence can be characterised in general. In this paper, we therefore describe an approach to characterising different forms of collective intelligence. We draw from existing research to build a comprehensive model and identify further characteristics to describe collective intelligence in a fine-grained manner. We propose a model with different characteristics, such as form of cooperation, organisational pattern, and decision-making process, which distinctively describe forms of collective intelligence, and suggest possible attribute values. 0 0
Combining AceWiki with a CAPTCHA system for collaborative knowledge acquisition Nalepa G.J.
Adrian W.T.
Szymon Bobek
Maslanka P.
Collaborative knowledge engineering
Knowledge acquisition
Semantic wiki
Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI English Formalized knowledge representation methods make it possible to build useful and semantically enriched knowledge bases which can be shared and reasoned upon. Unfortunately, knowledge acquisition for such formalized systems is often a time-consuming and tedious task. The process requires a domain expert to provide terminological knowledge, a knowledge engineer capable of modeling knowledge in a given formalism, and also a great amount of instance data to populate the knowledge base. We propose a CAPTCHA-like system called AceCAPTCHA in which users are asked questions in a controlled natural language. The questions are generated automatically based on a terminology stored in the system's knowledge base, and the answers provided by users serve as instance data to populate it. The implementation uses the AceWiki semantic wiki and a reasoning engine written in Prolog. 0 0
Community optimization: Function optimization by a simulated web community Veenhuis C.B. Behavioral Model
Collective intelligence
Community Optimization
Knowledge Base
Web Community
International Conference on Intelligent Systems Design and Applications, ISDA English In recent years a number of web-technology supported communities of humans have been developed. Such a web community is able to give rise to a collective intelligence that solves problems better than the single members of the community. Based on the successes of collective intelligence systems like Wikipedia, the web encyclopedia, the question arises whether such a collaborative web community could also be capable of function optimization. This paper introduces an optimization algorithm called Community Optimization (CO), which optimizes a function by simulating a collaborative web community that edits or improves an article base or, more generally, a knowledge base. In order to realize this, CO implements a behavioral model derived from the human behavior that can be observed within certain types of web communities (e.g., Wikipedia or open source communities). The introduced CO method is applied to four well-known benchmark problems. CO significantly outperformed the Fully Informed Particle Swarm Optimization as well as two Differential Evolution approaches in all four cases, especially in higher dimensions. 0 0
Comparing taxonomies for organising collections of documents Fernando S.
Mary Hall
Eneko Agirre
Aitor Soroa
Clough P.
Stevenson M.
Semantic network
24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers English There is a demand for taxonomies to organise large collections of documents into categories for browsing and exploration. This paper examines four existing taxonomies that have been manually created, along with two methods for deriving taxonomies automatically from data items. We use these taxonomies to organise items from a large online cultural heritage collection. We then present two human evaluations of the taxonomies. The first measures the cohesion of the taxonomies to determine how well they group together similar items under the same concept node. The second analyses the concept relations in the taxonomies. The results show that the manual taxonomies have high-quality, well-defined relations. However, the novel automatic method is found to generate very high cohesion. 0 0
Components of a Wiki-based software development environment Gruhn V.
Hannebauer C.
Reference Architecture
Software Engineering
2012 IEEE Symposium on E-Learning, E-Management and E-Services, IS3e 2012 English Software developers who want to join an existing software development project must first overcome a contribution barrier. The contribution barrier can prevent prospective software developers from joining the project. This contribution barrier comprises technical as well as social hurdles. This paper describes the components of a Wiki Development Environment (WikiDE): a wiki system with which software developers can edit, compile, and debug applications using a standard web browser. Such a WikiDE minimizes the technical hurdles of the contribution barrier. With a WikiDE, software developers can join software development projects more quickly, and fewer software developers are completely prevented from joining. 0 0
Compressed data structures for annotated web search Soumen Chakrabarti
Kasturi S.
Balakrishnan B.
Ganesh Ramakrishnan
Saraf R.
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web English Entity relationship search at Web scale depends on adding dozens of entity annotations to each of billions of crawled pages and indexing the annotations at rates comparable to regular text indexing. Even small entity search benchmarks from TREC and INEX suggest that the entity catalog support thousands of entity types and tens to hundreds of millions of entities. The above targets raise many challenges, major ones being the design of highly compressed data structures in RAM for spotting and disambiguating entity mentions, and highly compressed disk-based annotation indices. These data structures cannot be readily built upon standard inverted indices. Here we present a Web scale entity annotator and annotation index. Using a new workload-sensitive compressed multilevel map, we fit statistical disambiguation models for millions of entities within 1.15GB of RAM, and spend about 0.6 core-milliseconds per disambiguation. In contrast, DBPedia Spotlight spends 158 milliseconds, Wikipedia Miner spends 21 milliseconds, and Zemanta spends 9.5 milliseconds. Our annotation indices use ideas from vertical databases to reduce storage by 30%. On 40×8 cores with 40×3 disk spindles, we can annotate and index, in about a day, a billion Web pages with two million entities and 200,000 types from Wikipedia. Index decompression and scan speed are comparable to MG4J. 0 0
Computational reputation model based on selecting consensus choices: An empirical study on semantic wiki platform Jason J. Jung Computational reputation model
Conflict resolution
Consensus choice selection
Semantic wiki
Social media
Expert Syst. Appl. English 0 0
Computing text-to-text semantic relatedness based on building and analyzing enriched concept graph Jahanbakhsh Nagadeh Z.
Mahmoudi F.
Jadidinejad A.H.
Enriched concept graph
Key concept extraction
Semantic relatedness
Lecture Notes in Electrical Engineering English This paper discusses the effective use of key concepts in computing the semantic relatedness of texts. We present a novel method for computing text semantic relatedness using key concepts. The selection of an appropriate semantic resource is very important in semantic relatedness algorithms. For this purpose, we propose to use two semantic resources, namely WordNet and Wikipedia, which together provide a more complete data source and greater accuracy for calculating semantic relatedness. As a result, semantic relatedness can be computed between almost any pair of concepts. In the proposed method, a text is modeled as a graph of semantic relatedness between the concepts of the text, which are extracted from WordNet and Wikipedia. This graph is named the Enriched Concepts Graph (ECG). Key concepts are then extracted by analyzing the ECG. Finally, the semantic relatedness of texts is obtained by comparing their key concepts. We evaluated our approach and obtained a high correlation coefficient of 0.782, which outperformed all other existing state-of-the-art approaches. © 2012 Springer Science+Business Media B.V. 0 0
Conceptualizing documents with wikipedia Nomoto T.
Kando N.
Cluster Labeling
Relevance Feedback
Topic Detection
International Conference on Information and Knowledge Management, Proceedings English In this work, we will discuss how to improve Wikilabel, an approach which makes use of titles in Wikipedia pages to generate labels for documents, by retooling ideas from story link detection (SLD). A comparison of our approach against Elastic Net, a powerful machine learner, on real-world data shows that our approach is clearly superior. 0 0
Conflict, confidence, or criticism: An empirical examination of the gender gap in wikipedia Benjamin Collier
Julia Bear
English A recent survey of contributors to Wikipedia found that less than 15% of contributors are women. This gender contribution gap has received significant attention from both researchers and the media. A panel of researchers and practitioners has offered several insights and opinions as to why a gender gap exists in contributions despite gender anonymity online. The gender research literature suggests that the difference in contribution rates could be due to three factors: (1) the high levels of conflict in discussions, (2) dislike of critical environments, and (3) lack of confidence in editing other contributors' work. This paper examines these hypotheses regarding the existence of the gender gap in contribution by using data from an international survey of 176,192 readers, contributors, and former contributors to Wikipedia, including measures of demographics, education, motivation, and participation. Implications for improving the design and culture of online communities to be more gender inclusive are discussed. 0 0
Content analysis of wiki discussions for knowledge construction: Opportunities and challenges Buraphadeja V.
Sujai Kumar
Content analysis
Critical thinking
Higher education
Knowledge construction
Web 2.0
International Journal of Web-Based Learning and Teaching Technologies English Research on several aspects of asynchronous online discussions in online and hybrid courses has been successfully conducted using content analysis in the past. With the increase in Web 2.0 and social media use in education, research on knowledge construction within newer virtual environments like blogs or wikis is just emerging. This study applies a well-known model of content analysis for knowledge construction to an educational wiki environment. Twelve graduate students' contributions to a wiki in a 14-week on-campus course on Web 2.0 technologies in education are analyzed. Results indicate that the wiki platform fosters collaborative knowledge construction and that it is necessary to develop new frameworks to analyze content in new learning environments. Wiki environments provide opportunities for researchers to capture the process of collaboration, knowledge construction, and meta-cognition. 0 0
Context-aware in-page search Lin Y.-H.
Liu Y.-L.
Yen T.-X.
Chang J.S.
Entity linking
Search engine
Support vector machine
Word sense disambiguation
Proceedings of the 24th Conference on Computational Linguistics and Speech Processing, ROCLING 2012 English In this paper we introduce a method for searching appropriate articles from knowledge bases (e.g. Wikipedia) for a given query and its context. In our approach, this problem is transformed into a multi-class classification of candidate articles. The method involves automatically augmenting smaller knowledge bases using larger ones and learning to choose adequate articles based on hyperlink similarity between article and context. At run-time, keyphrases in the given context are extracted and the sense ambiguity of the query term is resolved by computing the similarity of keyphrases between the context and candidate articles. Evaluation shows that the method significantly outperforms the strong baseline of assigning the most frequent articles to the query terms. Our method effectively determines adequate articles for given query-context pairs, suggesting the possibility of using our methods in context-aware search engines. 0 0
Contributing to wikipedia: Through content or social interaction? Zelenkauskaite A.
Paolo Massa
Novice Users
Social Web
Sociotechnical Systems
International Journal of Distributed Systems and Technologies English While the overall amount of user contributions in various namespaces has been discussed in previous research, the question of how and where users contribute, depending on their time spent in Wikipedia, is still open. This study analyzed contribution patterns in three namespaces of 685,897 active users of English Wikipedia since its inception. User editing behaviors were analyzed according to the amount of time spent within Wikipedia, where contributions in content-oriented spaces were compared with social-oriented namespaces. 0 0
Cooperating or collaborating: Design considerations of employing wikis to engage college-level students Strickland J.
Xie Y.
Active learning
Case study
Learner interaction
Cutting-Edge Technologies in Higher Education English This chapter provides researchers and practitioners with guidelines for employing wikis to foster collaboration and active learning within and between student teams in higher educational settings. The core function of a wiki is to facilitate learner interaction with content. Such engagement is critical whether the course's instructional delivery environment is primarily face-to-face or web-based. Instructors encourage shared understanding through a spirit of investigation that embraces greater collaboration in the process. Collaboratively building knowledge about one content area by dialoguing with peers and negotiating importance in order to present the information in a meaningful way to the public is the strongest aspect of a wiki. To illustrate this, five case studies are detailed, ranging from individual wikis to group consensus wikis in undergraduate and graduate-level courses, delivered in blended (i.e., hybrid combinations of face-to-face and online) and online asynchronous environments. As a whole, these studies support that wikis are not the single answer to all problems associated with collaboration and shared knowledge in any learning situation, but they are a powerful lens for greater clarity in issues of student engagement and may lead to improved performance for diverse learners. Various experts add their views to those of the authors of this chapter: to be effective, instructors must design purposeful engagement that embraces communication, cooperation and collaboration, active learning, feedback, and respect for differences. Likewise, students must be informed of the value of such engagement and have positive wiki models presented early in their online experiences. 0 0
Coordination and beyond: Social functions of groups in open content production Andrea Forte
Kittur N.
Vanesa Larco
Haiping Zhu
Amy Bruckman
Kraut R.E.
Open content
Peer production
English We report on a study of the English edition of Wikipedia in which we used a mixed methods approach to understand how nested organizational structures called WikiProjects support collaboration. We first conducted two rounds of interviews with a total of 20 Wikipedians to understand how WikiProjects function and what it's like to participate in them from the perspective of Wikipedia editors. We then used a quantitative approach to further explore interpretations that arose from the qualitative data. Our analysis of these data together demonstrates how WikiProjects not only help Wikipedians coordinate tasks and produce articles, but also support community members and small groups of editors in important ways such as: providing a place to find collaborators, socialize and network; protecting editors' work; and structuring opportunities to contribute. 0 0
Corporate wikis: The effects of owners' motivation and behavior on group members' engagement Ofer Arazy
Gellatly I.
Knowledge management (KM)
Knowledge management systems (KMS)
Knowledge sharing
Regulatory focus theory
Social cognitive theory
Journal of Management Information Systems English Originally designed as a tool to alleviate bottlenecks associated with knowledge management, the suitability of wikis for corporate settings has been questioned given the inherent tensions between wiki affordances and the realities of organizational life. Drawing on regulatory focus theory and social cognitive theory, we developed and tested a model of the motivational dynamics underlying corporate wikis. We examined leaders (owners) and users of 187 wiki-based projects within a large multinational firm. Our findings revealed two countervailing motivational forces, one oriented toward accomplishment and achievement (promotion focus) and one oriented toward safety and security (prevention focus), that not only predicted owners' participation but also the overall level of engagement within the wiki groups. Our primary contribution is in showing that, notwithstanding the potential benefits to users, wikis can trigger risk-avoidance motives that potentially impede engagement. Practically, our findings call for an alignment between organizational procedures surrounding wiki deployment and the technology's affordances. © 2013 M.E. Sharpe, Inc. All rights reserved. 0 0
Creating an extended named entity dictionary from wikipedia Ryuichiro Higashinaka
Tsu K.S.
Saito K.
Makino T.
Yutaka Matsuo
Extended named entity
24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers English Automatic methods to create entity dictionaries or gazetteers have used only a small number of entity types (18 at maximum), which could pose a limitation for fine-grained information extraction. This paper aims to create a dictionary of 200 extended named entity (ENE) types. Using Wikipedia as a basic resource, we classify Wikipedia titles into ENE types to create an ENE dictionary. In our method, we derive a large number of features for Wikipedia titles and train a multiclass classifier by supervised learning. We devise an extensive list of features for the accurate classification into the ENE types, such as those related to the surface string of a title, the content of the article, and the meta data provided with Wikipedia. By experiments, we successfully show that it is possible to classify Wikipedia titles into ENE types with 79.63% accuracy. We applied our classifier to all Wikipedia titles and, by discarding low-confidence classification results, created an ENE dictionary of over one million entities covering 182 ENE types with an estimated accuracy of 89.48%. This is the first large scale ENE dictionary. 0 0
Creation of a topic maps-based wiki with an article similarity measurement Matsuura S.
Naito M.
Toyota H.
Tanimoto similarity
Topic maps
Proceedings of the 20th International Conference on Computers in Education, ICCE 2012 English We created a pilot case of a Topic Maps-based wiki site to exchange ideas about children's behavior in elementary schools. The topic map of this site consisted of the article and subject topic types. The subject topic type consisted of topics classified into areas such as "behavior," "competence," "field," and "school time." Each article was registered as an instance of the article topic type and associated with relevant subject topics. To measure the similarity between two articles and to relate articles on the basis of the similarity, we used the Tanimoto Similarity. To improve similarity-based retrieval, it was suggested that more specific subject topics characterize the articles. 0 0
Cross domain search by exploiting Wikipedia Che-Hung Liu
Wu S.
Jiang S.
Tung A.K.H.
Proceedings - International Conference on Data Engineering English The abundance of Web 2.0 resources in various media formats calls for better resource integration to enrich user experience. This naturally leads to a new cross-modal resource search requirement, in which a query is a resource in one modality and the results are closely related resources in other modalities. With cross-modal search, we can better exploit existing resources. Tags associated with Web 2.0 resources are an intuitive medium to link resources of different modalities together. However, tagging is by nature an ad hoc activity. Tags often contain noise and are affected by the subjective inclination of the tagger. Consequently, linking resources simply by tags will not be reliable. In this paper, we propose an approach for linking tagged resources to concepts extracted from Wikipedia, which has become a fairly reliable reference over the last few years. Compared to the tags, the concepts are therefore of higher quality. We develop effective methods for cross-modal search based on the concepts associated with resources. Extensive experiments were conducted, and the results show that our solution achieves good performance. 0 0
Cross-lingual knowledge discovery: Chinese-to-English article linking in wikipedia Tang L.-X.
Andrew Trotman
Shlomo Geva
Xu Y.
Anchor identification
Chinese segmentation
Cross-lingual link discovery
Link mining
Link recommendation
Lecture Notes in Computer Science English In this paper we examine automated Chinese to English link discovery in Wikipedia and the effects of Chinese segmentation and Chinese to English translation on the hyperlink recommendation. Our experimental results show that the implemented link discovery framework can effectively recommend Chinese-to-English cross-lingual links. The techniques described here can assist bi-lingual users where a particular topic is not covered in Chinese, is not equally covered in both languages, or is biased in one language; as well as for language learning. 0 0
Cross-lingual knowledge linking across wiki knowledge bases Zhe Wang
Jing-Woei Li
Tang J.
Knowledge linking
Knowledge sharing
Wiki knowledge base
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web English Wikipedia has become one of the largest knowledge bases on the Web. It attracted 513 million page views per day in January 2012. However, one critical issue for Wikipedia is that articles in different languages are very unbalanced. For example, the number of articles on Wikipedia in English has reached 3.8 million, while the number of Chinese articles is still less than half a million, and there are only 217 thousand cross-lingual links between articles of the two languages. On the other hand, there are more than 3.9 million Chinese Wiki articles on Baidu Baike and Hudong.com, two popular encyclopedias in Chinese. One important question is how to link the knowledge entries distributed in different knowledge bases. This will immensely enrich the information in the online knowledge bases and benefit many applications. In this paper, we study the problem of cross-lingual knowledge linking and present a linkage factor graph model. Features are defined according to some interesting observations. Experiments on the Wikipedia data set show that our approach can achieve a high precision of 85.8% with a recall of 88.1%. The approach found 202,141 new cross-lingual links between English Wikipedia and Baidu Baike. 0 0
Cross-modal information retrieval - A case study on Chinese wikipedia Cong Y.
Qin Z.
Jian Yu
Wan T.
Character-based topics
Cross-modal information retrieval
Topic correlation model (TCM)
Word-based topics
Lecture Notes in Computer Science English Probability models have recently been used in cross-modal multimedia information retrieval by building conjunctive models bridging the text and image components. Previous studies have shown that a cross-modal information retrieval system using the topic correlation model (TCM) outperforms state-of-the-art models on English corpora. In this paper, we will focus on the Chinese language, which differs from Western languages composed of alphabets. Words and characters will be chosen as the basic structural units of Chinese, respectively. We also set up a test database, named Ch-Wikipedia, in which documents with paired image and text are extracted from the Chinese website of Wikipedia. We investigate the problems of retrieving texts (ranked by semantic closeness) given an image query, and vice versa. The capabilities of the TCM model are verified by experiments on the Ch-Wikipedia dataset. 0 0
Cross-modal topic correlations for multimedia retrieval Jian Yu
Cong Y.
Qin Z.
Wan T.
Proceedings - International Conference on Pattern Recognition English In this paper, we propose a novel approach for cross-modal multimedia retrieval by jointly modeling the text and image components of multimedia documents. In this model, the image component is represented by local SIFT descriptors based on the bag-of-feature model. The text component is represented by a topic distribution learned from latent topic models such as latent Dirichlet allocation (LDA). The latent semantic relations between texts and images can be reflected by correlations between the word topics and topics of image features. A statistical correlation model conditioned on category information is investigated. Experimental results on a benchmark Wikipedia dataset show that the newly proposed approach outperforms state-of-the-art cross-modal multimedia retrieval systems. 0 0
Cross-modality correlation propagation for cross-media retrieval Zhai X.
Peng Y.
Jie Xiao
Correlation propagation
Crossmedia retrieval
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings English We consider the problem of cross-media retrieval, where the query and the retrieved results can be of different modalities. In this paper, we propose a novel cross-modality correlation propagation approach to simultaneously deal with positive correlation and negative correlation between media objects of different modalities, while existing works focus solely on the positive correlation. Negative correlation is very important because it provides effective exclusive information. The correlations are modeled as must-link constraints and cannot-link constraints, respectively. Furthermore, our approach is able to propagate the correlation between heterogeneous modalities. Experiments on the Wikipedia dataset show the effectiveness of our cross-modality correlation propagation approach, compared with state-of-the-art methods. 0 0
Cross-platform video segmentation system based on color histogram Jian Y.
Wei W.
Suhuan W.
Lecture Notes in Electrical Engineering English Content-based video retrieval (CBVR) is, to some extent, an expansion of content-based image retrieval. Early image retrieval techniques used in image database management were based on textual annotation of images. However, text-based techniques have many limitations, including their reliance on manual annotation, which can be difficult and tedious for large image sets. The shortcomings of Text-Based Image Retrieval (TBIR) become more obvious as there is an enormous increase in the size of digital image databases. To overcome these difficulties, researchers began to turn their attention to the attributes of images themselves in the early 90's, and Content-Based Image Retrieval (CBIR) was proposed. Detailed information can be found in the paper. There are many results from research on CBIR, and Wikipedia maintains a list of CBIR engines. 0 0
CrowdTiles: Presenting crowd-based information for event-driven information needs Whiting S.
Zhou K.
Jose J.
Alonso O.
Leelanupab T.
ACM International Conference Proceeding Series English Time plays a central role in many web search information needs relating to recent events. For recency queries where fresh information is most desirable, there is likely to be a great deal of highly-relevant information created very recently by crowds of people across the world, particularly on platforms such as Wikipedia and Twitter. With so many users, mainstream events are often very quickly reflected in these sources. The English Wikipedia encyclopedia consists of a vast collection of user-edited articles covering a range of topics. During events, users collaboratively create and edit existing articles in near real-time. Simultaneously, users on Twitter disseminate and discuss event details, with a small number of users becoming influential for the topic. In this demo, we propose a novel approach to presenting a summary of new information and users related to recent or ongoing events associated with the user's search topic, thereby aiding discovery of the most recent information. We outline methods to detect search topics which are driven by events, identify and extract changing Wikipedia article passages and find influential Twitter users. Using these, we provide a system which displays familiar tiles in search results to present recent changes in the event-related Wikipedia articles, as well as Twitter users who have tweeted recent relevant information about the event topics. 0 0
Curating digital content in teaching and learning using wiki technology Verhaart M. Content curation
Digital curation
Wiki Teacher
Proceedings of the 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012 English A significant issue for educators is managing the avalanche of content, and with this content volume the educator's role is changing from one of knowledge collection and dissemination, to one of curation and dissemination, in particular with digital objects. This paper builds on previous research into developing a personal integrated framework to manage an individual educator's content that allows enhancement by others, and considers whether wiki technology can be used to manage the changing emphasis from collection and dissemination to digital curation. 0 0
Curriculum-guided crowd sourcing of assessments in a developing country Zualkernan I.A.
Raza A.
Karim A.
Crowd sourcing
Developing world
Online assessments
Primary education
Educational Technology and Society English Success of Wikipedia has opened a number of possibilities for crowd sourcing learning resources. However, not all crowd sourcing initiatives are successful. For developing countries, adoption factors like lack of infrastructure and poor teacher training can have an impact on success of such systems. This paper presents an exploratory study to determine if teachers in a developing country are able to create quality multiple-choice questions for primary school students. An adoption model is developed and evaluated to ascertain if the teachers would actually contribute to such a Wiki. Results are that, given student learning outcomes, content constraints, and a Bloom's assessment level, a reasonable number of teachers were able to formulate quality questions, and that there is a strong intention to use such a system. Teachers with high intention to adopt also had a better attitude, enjoyed making questions and found the process easy to use. However, there is no obvious relationship between the intention to use and an ability to pose good assessments. In addition, there is no obvious predictor of where the good question contributors came from. 0 0
Cyclic entropy of collaborative complex networks Safar M.
Mahdi K.
Jammal L.
IET Communications English Recent models of complex networks rely on degree-based evaluation of properties; a new approach is proposed based on other microstructures existing in networks, namely cycles (loops). Degree-based entropy measures the uncertainties in relationships, whereas cycle-based (cyclic) entropy measures the uncertainties associated with information feedback in a collaboration network, namely WikipediaTM. On the basis of the values of cyclic and degree entropies measured in three different experiments on Wikipedia, the authors conclude that the citation activity level in Wikipedia is low, the specialisation level is high, and there is a low tendency toward contributing to topics with different authors. 0 0
DAnIEL: Language independent character-based news surveillance Lejeune G.
Brixtel R.
Antoine Doucet
Lucas N.
Lecture Notes in Computer Science English This study aims at developing a news surveillance system able to address multilingual web corpora. As an example of a domain where multilingual capacity is crucial, we focus on Epidemic Surveillance. This task necessitates worldwide coverage of news in order to detect new events as quickly as possible, anywhere, whatever the language it is first reported in. In this study, text-genre is used rather than sentence analysis. The news-genre properties allow us to assess the thematic relevance of news, filtered with the help of a specialised lexicon that is automatically collected on Wikipedia. Afterwards, a more detailed analysis of text specific properties is applied to relevant documents to better characterize the epidemic event (i.e., which disease spreads where?). Results from 400 documents in each language demonstrate the interest of this multilingual approach with light resources. DAnIEL achieves an F1-measure score around 85%. Two issues are addressed: the first is morphologically rich languages, e.g. Greek, Polish and Russian as compared to English. The second is event location detection as related to disease detection. This system provides a reliable alternative to the generic IE architecture that is constrained by the lack of numerous components in many languages. 0 0
DBpedia and the live extraction of structured data from Wikipedia Morsey M.
Janette Lehmann
Sören Auer
Claus Stadler
Sebastian Hellmann
Data management
Knowledge Extraction
Knowledge management
Program English Purpose: DBpedia extracts structured information from Wikipedia, interlinks it with other knowledge bases and freely publishes the results on the web using Linked Data and SPARQL. However, the DBpedia release process is heavyweight and releases are sometimes based on several months old data. DBpedia-Live solves this problem by providing a live synchronization method based on the update stream of Wikipedia. This paper seeks to address these issues. Design/methodology/approach: Wikipedia provides DBpedia with a continuous stream of updates, i.e. a stream of articles, which were recently updated. DBpedia-Live processes that stream on the fly to obtain RDF data and stores the extracted data back to DBpedia. DBpedia-Live publishes the newly added/deleted triples in files, in order to enable synchronization between the DBpedia endpoint and other DBpedia mirrors. Findings: During the realization of DBpedia-Live the authors learned that it is crucial to process Wikipedia updates in a priority queue. Recently-updated Wikipedia articles should have the highest priority, over mapping-changes and unmodified pages. An overall finding is that there are plenty of opportunities arising from the emerging Web of Data for librarians. Practical implications: DBpedia had and has a great effect on the Web of Data and became a crystallization point for it. Many companies and researchers use DBpedia and its public services to improve their applications and research approaches. The DBpedia-Live framework improves DBpedia further by timely synchronizing it with Wikipedia, which is relevant for many use cases requiring up-to-date information. Originality/value: The new DBpedia-Live framework adds new features to the old DBpedia-Live framework, e.g. abstract extraction, ontology changes, and changesets publication. 0 0
DBpedia for NLP: A Multilingual Cross-domain Knowledge Base Pablo N. Mendes
Max Jakob
Christian Bizer
International Conference on Language Resources and Evaluation English 0 0
DBpedia ontology enrichment for inconsistency detection Topper G.
Knuth M.
Sack H.
Data cleansing
Linked data
Ontology enrichment
ACM International Conference Proceeding Series English In recent years the Web of Data experiences an extraordinary development: an increasing amount of Linked Data is available on the World Wide Web (WWW) and new use cases are emerging continually. However, the provided data is only valuable if it is accurate and without contradictions. One essential part of the Web of Data is DBpedia, which covers the structured data of Wikipedia. Due to its automatic extraction based on Wikipedia resources that have been created by various contributors, DBpedia data often is error-prone. In order to enable the detection of inconsistencies this work focuses on the enrichment of the DBpedia ontology by statistical methods. Taken the enriched ontology as a basis the process of the extraction of Wikipedia data is adapted, in a way that inconsistencies are detected during the extraction. The creation of suitable correction suggestions should encourage users to solve existing errors and thus create a knowledge base of higher quality. Copyright 2012 ACM. 0 0
DRE-specific wikis for distributed requirements engineering: A review Peng R.
Lai H.
Distributed requirement engineering
DRE-specific wikis
Systematic literature review
Proceedings - Asia-Pacific Software Engineering Conference, APSEC English Wikis, as typical well-known knowledge management tools that support collaborative work, are adopted by more and more practitioners and researchers as the basis to develop Distributed Requirements Engineering (DRE) tools. Thus, many wikis which are enhanced specially for supporting various activities in Distributed Requirements Engineering (namely DRE-specific wikis) are developed. The main goal of this study is to discover all the available DRE-specific wikis, gain an insight into how and to what degree current DRE-specific wikis can support the DRE activities, and identify the future research directions. We adopt the methodology of systematic literature review to find DRE-specific wikis, analyze the features embodied in them, identify DRE activities supported by them, and cluster users' feedback from the literature. The results show that 1) distributed requirements elicitation is the most popular DRE activity supported by current DRE-specific wikis, 2) enhanced features are mainly designed for this phase, 3) the well recognized advantage of using DRE-specific wikis is that they can facilitate collaborative work, and the disadvantages mainly lie in the organization of the content and the usability. Based on the findings of this review, possible future research directions for DRE-specific wikis have been pointed out, especially in distributed requirements elicitation, negotiation, validation, and management. The importance of cross-over studies and empirical research is emphasized at the end of the paper. 0 0
DartWiki: A semantic wiki for ontology-based knowledge integration in the biomedical domain Yu T.
Hejie Chen
Mi J.
Gu P.
Wu T.
Pan J.Z.
Domain ontology
Integrated medicine
Knowledge management
Semantic web
Semantic wiki
Traditional chinese medicine
Current Bioinformatics English Semantic Web languages and technologies can be used for the annotation, classification, and organization of knowledge assets and digital artifacts based on biomedical ontologies. In this paper, we present a semantic wiki, named DartWiki, to build an ontology-based digital encyclopedia for the biomedicine domain. DartWiki provides a Web-based interface for accessing knowledge artifacts in both per-artifact and per-concept mode. In the per-artifact mode, users can access these artifacts, and annotate them in both short texts and logical statements in terms of domain ontologies. In the per-concept mode, users can navigate a graph of concepts, and review and edit the synthesized page about a selected concept, which contains meaningful information about the concept, and also its related concepts and artifacts. Smooth transitions between the two modes are achieved through semantic links. As a use case of DartWiki, we provide an open platform for the management and maintenance of digital artifacts in Integrated Medicine. This system provides medical practitioners with relevant and trustworthy knowledge artifacts, and also means to input artifacts, to clarify their meaning, and to check and improve their quality, which encourages the inclusion and participation of users, and effectively creates an online community around knowledge sharing. 0 0
Decentralized collaboration with a peer-to-peer wiki Alan Davoust
Alexander Craig
Babak Esfandiari
Kazmierski V.
Proceedings of the 2012 International Conference on Collaboration Technologies and Systems, CTS 2012 English We report our experience using a peer-to-peer (P2P) wiki system for academic writing tutorials. Our wiki system supports a non-traditional collaboration model, where each participant maintains their own version of the documents. The users share their contributions in the P2P network, which allows them to be exposed to multiple viewpoints, and to reuse each other's work. We collected and analyzed the contributions of the participants to these tutorials, and the results demonstrate the value of this collaboration model. In particular, we found the popularity of a document in the system is correlated with its quality, and the similarity between contributions of peers is a good predictor of future similarities. These properties provide helpful criteria for users to identify valuable material for reuse. 0 0
Definition and multi-dimensional comparative analysis of ad hoc communities in Twitter Macskassy S.A. ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media English We here present an early-stage prototype tool for defining and analyzing communities in Twitter. The tool takes a set of Twitter users and profiles them based on their tweets. This profiling is based on earlier work, where we map entities mentioned in tweets to Wikipedia entries, which in turn lets us profile a user based on the Wikipedia categories that are related to his or her tweets. From here, we can define ad hoc topic-based communities (e.g., all users who discuss Wikipedia topic K). The tool is focused on contrast analysis, where we have baseline behavior or another community to compare against. Copyright © 2012, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. 0 0
Deletion discussions in Wikipedia: Decision factors and outcomes Jodi Schneider
Alexandre Passant
Stefan Decker
Articles for deletion
Collaboration and conflict
Factors analysis
Online argumentation
WikiSym 2012 English Deletion of articles is a common process in Wikipedia, in order to ensure the overall quality of the encyclopedia. Yet, there is a need to better understand the procedures in order to promote the best decisions without unnecessary community work. In this paper, we study deletion in Wikipedia, drawing from factor analysis, and taking an in-depth, content-analysis-based approach. We address three research questions: First, what factors contribute to the decision about whether to delete a given article? Second, when multiple factors are given, what is the relative importance of those factors? Third, what are the outcomes of deletion discussions, both for articles and for the community? We find that multiple factors contribute to the assessment of an article, and we discuss their relative frequency. Further, we show how the assessment timeline focuses attention on improving borderline articles that have the potential to meet Wikipedia's content inclusion policies, and we highlight the role of novice contributors in this improvement process. 0 0
Deliberation in Wikipedia: Rationales in article deletion discussions Xiao L.
Askin N.
Quality control
Proceedings of the ASIST Annual Meeting English In this paper, we describe a study-in-progress aimed at evaluating and improving the quality of online deliberation. We analyze the use of rationales in article deletion discussions on Wikipedia. Our preliminary findings suggest that the majority of participants in these discussions were concerned with the notability and credibility of the topics, and for the most part presented cogent rationales based on established site policies. 0 0
Delving into knowledge modeling software supporting collaborative ontology development Kalbasi R.
Alesheikh A.
Amanzadeh S.
Collaborative ontology development
Non-wiki-based ontology editors
Software engineering
Wiki-based ontology editors
International Review on Computers and Software English The proficiency of ontologies as the formal explication of knowledge in different domains has motivated numerous software engineers to develop miscellaneous ontology editors. The recent propensity of the ontology engineering community, collaborative ontology development, necessitates dedicated ontology platforms supporting collaborative aspects. Although a few studies have strived to survey collaborative ontology editors, they have not determined a complete list of requirements and benchmarks for their investigations. Therefore, collaborative ontology development requirements are proposed in this research, comprehending collaborative methodologies and platforms. With regard to these criteria, two existing categories of collaborative ontology editors are explored here. The most discussed and applied instances of these platforms, wiki-based and non-wiki-based, are also enumerated in this paper. The itemization, description, and evaluation of platforms can be fruitful for ontology-based applications in the future. The results of this research (in the context of an ongoing EU project) demonstrate that workflow support, reliability of software, and integration/representation/storing of changes are the prominent gaps in current collaborative ontology development platforms. © 2012 Praise Worthy Prize S.r.l. - All rights reserved. 0 0
Design Tracker: An easy to use and flexible hypothesis tracking system to aid project team working Bruce C.
Harrison M.
High availability
Open Source Software in Life Science Research: Practical Solutions to Common Challenges in the Pharmaceutical Industry and Beyond English Design Tracker is a hypothesis tracking system used across all sites and research areas in AstraZeneca by the global chemistry community. It is built on the LAMP (Linux, Apache, MySQL, PHP/Python) software stack, which started as a single server and has now progressed to a six-server cluster running cutting-edge high availability software and hardware. This chapter describes how a local tool was developed into a global production system. © 2012 Woodhead Publishing Limited. All rights reserved. 0 0
Design and Evaluation of an IR-Benchmark for SPARQL Queries with Fulltext Conditions Mishra A.
Gurajada S.
Martin Theobald
Linked data
SPARQL with Fulltext Search
International Conference on Information and Knowledge Management, Proceedings English In this paper, we describe our goals in introducing a new, annotated benchmark collection, with which we aim to bridge the gap between the fundamentally different aspects that are involved in querying both structured and unstructured data. This semantically rich collection, captured in a unified XML format, combines components (unstructured text, semistructured infoboxes, and category structure) from 3.1 Million Wikipedia articles with highly structured RDF properties from both DBpedia and YAGO2. The new collection serves as the basis of the INEX 2012 Ad-hoc, Faceted Search, and Jeopardy retrieval tasks. With a focus on the new Jeopardy task, we particularly motivate the usage of the collection for question-answering (QA) style retrieval settings, which we also exemplify by introducing a set of 90 QA-style benchmark queries which come shipped in a SPARQL-based query format that has been extended by fulltext filter conditions. 0 0
Design for Free Learning - a Case Study on Supporting a Service Design Course Teresa Consiglio
Gerrit C. van der Veer
Experience report
Open source
Cultural diversity
E-learning
Service design
Learner centered design
WikiSym In this experience report, we provide a case study on the use of information and communication technology (ICT) in higher education, developing an open source interactive learning environment to support a blended course. Our aim is to improve the quality of adult distance learning, ultimately involving peers worldwide, by developing learning environments that are as flexible as possible regardless of the culture and context of use, and of the individual learning style and age of the learners.

Our example concerns a course of Service Design where the teacher was physically present only intermittently for part of the course while in the remaining time students worked in teams using our online learning environment.

We developed a structure where students are guided through discovery learning and mutual teaching. We will show how we started from the students’ authentic goals and how we supported them by a simple structure of pacing the discovery process and merging theoretical understanding with practice in real life.

Based on these first empirical results practical guidelines have been developed regarding improvements on the structure provided for the learning material and on the interaction facilities for students, teachers and instructional designers.
0 0
Designing a pedagogically grounded e-learning activity Chryso P. Collaboration
E-learning environment
E-learning activity
Turkish Language Learning
Procedia - Social and Behavioral Sciences English Referring to significant learning theories and teaching methodologies, as an insider researcher, my aim is to present my pedagogically grounded e-learning activity, which I designed last semester using PBworks (a collaborative learning environment). A rationale and a description of how this e-learning activity was integrated into a Turkish language course at university level will be discussed, together with a description of the aims and objectives of this activity. A summary of the activity, a description of the materials and their pedagogical significance will also be discussed. It will be indicated and explained briefly how the teaching materials were used, such as for individual or group study, and how they achieved the stated aims and objectives. The role of the tutor in the activity and the problems which arose from the beginning of the course until the end of the activity will be identified and discussed. Concluding, suggestions and possible solutions to these problems will also be mentioned. 0 0
Designing the online collaborative learning using the wikispaces Sulisworo D. Collaborative
Jig saw
Online learning
International Journal of Emerging Technologies in Learning English Collaboration has become an essential skill necessary for effective functioning in society in the new era. As a consequence, learning strategies in higher education should consider this shift. Web 2.0 technology is a new trend in communication technology that has become a basis of the new generation internet, making it a more mature and distinctive medium of communication. The problem is how to bring offline, classroom-based cooperative learning to online learning using a wiki. The learning design on this topic will give wide opportunity to access learning that is more suited to the new skills of the new era. Wikispaces is a wiki facility operated on the web. This wiki is simple but suitable for collaborative learning. The learning scenario is a Jigsaw approach modified to fit the online collaborative learning environment. This technique includes two different treatments with different small groups in order to support learning and improve cooperation between students. Using this structure, students are responsible for sharing their skills or knowledge of each other's material. 0 0
Detecting Korean hedge sentences in Wikipedia documents Kang S.-J.
Jeong J.-S.
Kang I.-S.
Korean Hedge Detection
Machine learning
Uncertainty Detection
Lecture Notes in Computer Science English In this paper we propose automatic hedge detection methods for Korean. We select sentential contextual features adjusted for Korean, and used supervised machine-learning algorithms to train models to detect hedges in Wikipedia documents. Our SVM-based model achieved an F1-score of 90.8% for Korean. 0 0
Detecting Wikipedia vandalism with a contributing efficiency-based approach Tang X.
Guangyou Zhou
Fu Y.
Gan L.
Yu W.
Li S.
Vandalism detection
Lecture Notes in Computer Science English The collaborative nature of wikis has distinguished Wikipedia as an online encyclopedia but also makes its open content vulnerable to vandalism. Current vandalism detection methods relying on basic statistical language features work well for explicitly offensive edits that perform massive changes. However, these techniques can be evaded by elusive vandal edits which make only a few unproductive or dishonest modifications. In this paper we propose a contributing efficiency-based approach to detect vandalism in Wikipedia and implement it with machine-learning-based classifiers that incorporate contributing efficiency along with other language features. The results of extensive experiments show that contributing efficiency can improve the recall of machine-learning-based vandalism detection algorithms significantly. 0 0
Detecting malapropisms using measures of contextual fitness Torsten Zesch Contextual fitness
Wikipedia revision history
TAL Traitement Automatique des Langues English While detecting simple language errors (e.g. misspellings, number agreement, etc.) is nowadays standard functionality in all but the simplest text-editors, other more complicated language errors might go unnoticed. A difficult case are errors that come in the disguise of a valid word that fits syntactically into the sentence. We use the Wikipedia revision history to extract a dataset with such errors in their context. We show that the new dataset provides a more realistic picture of the performance of contextual fitness measures. The achieved error detection quality is generally sufficient for competent language users who are willing to accept a certain level of false alarms, but might be problematic for non-native writers who accept all suggestions made by the systems. We make the full experimental framework publicly available which will allow other scientists to reproduce our experiments and to conduct follow-up experiments. 0 0
Determinants of wiki diffusion in the greek education system Cotsakis S.
Vassili Loumos
Eleftherios Kayafas
Multiple linear regression
Professor motivation
Relative advantage
Student personal interests
Wiki diffusion
Technics Technologies Education Management English The present study investigates the determinants of wiki diffusion in education. The model presented in this work examines four facilitators and one inhibitor of wiki diffusion among Greek high school students. The sample population of the research was chosen randomly from three schools of the Greek prefecture of Eastern Macedonia. Factor analysis and multiple linear regression analysis were conducted to examine whether, and to what extent, wiki diffusion is affected by factors such as relative advantage, compatibility, complexity, student personal interests and professor motivation. The analysis revealed that the most significant determinant of wiki diffusion is the personal interest of the student in information search. On the other hand, as supported by the literature, complexity was proved to be an inhibitor of wiki expansion in education. Professor motivation, relative advantage and compatibility can also encourage wiki diffusion, though on a smaller scale. 0 0
Developing 21st century communicators Cotler J.
Yoder R.
Breimer E.
Delbelso D.
21st century communication skills
Collaborative learning
Computer-mediated communication
Elevator speech
Web conferencing
Proceedings of the Information Systems Education Conference, ISECON English What are the characteristics of an effective communicator in the 21st century business world? How can we equip our students with the skills necessary to successfully navigate the computer-mediated communication landscape during this time of globalization and rapid technology growth? In this paper, we examine these questions and discuss several methods for addressing the increasing demand for the diverse, complex and often non-routine communications skills required of today's business and information systems students. Drawing on practitioners from education and industry, along with our own research and observations, we discuss several teaching approaches that include developing professional collaboration skills using shared workspaces, delivering presentations using web conferencing, becoming comfortable in front of a video camera, using ePortfolios to articulate and reflect on learning, and professionally leveraging a social networking presence. When introducing new methods of communication there will inevitably be lessons learned and improvements that can be made in future iterations. This paper discusses students' perceptions of their experiences using computer-mediated communication and reflections on how we can improve the way we teach these concepts. 0 0
Developing Dialogic Learning Space: The Case of Online Undergraduate Research Journals Walkington H. Dialogue
List of publications
Undergraduate research
Journal of Geography in Higher Education English This paper explores the learning spaces associated with two geography undergraduate research journals. Wikis provide dedicated spaces for postgraduate reviewers to collaboratively develop constructive feedback to authors creating a supportive online learning environment. In becoming published authors, undergraduates reported that they gained not only academic recognition and curriculum vitae (CV) material but an ability to apply constructive criticism, a desire for more dialogue about their research and the motivation to publish further work in the future. This paper concludes that scaffolding the research writing process can be greatly enhanced by the strategic design of dialogic online learning space. 0 0
Developing WikiBOK: A Wiki-based BOK formulation-aid system Yoshifumi Masunaga
Chiba M.
Fukuda N.
Ishida H.
Kazunari Ito
Masahiro Ito
Masamura T.
Nagata H.
Shimizu Y.
Yoshiyuki Shoji
Tomokazu Takahashi
Yabuki T.
Lecture Notes in Electrical Engineering English The design and implementation of WikiBOK, a Wiki-based body of knowledge (BOK) formulation-aid system, is investigated in this paper. In contrast to formulating a BOK for a matured discipline such as computer science, BOK formulation for a new discipline such as social informatics needs a "bottom-up" approach because academics in a new discipline cannot draw its entire figure par avance. Therefore, an open collaboration approach based on the collective intelligence concept seems promising. WikiBOK is under development as part of our project based on BOK+, which is a novel BOK formulation principle for new disciplines. It uses Semantic MediaWiki (SMW) to facilitate its fundamental functions. To support a rich graphical user interface for WikiBOKers, the graph visualization software Graphviz is adopted. SMW is enhanced to work in conjunction with Graphviz. Because edit conflicts occur when WikiBOKers collaborate, a resolution principle is investigated to resolve BOK tree edit conflicts. 0 0
Developing a university Wikipedia Douglas Edmonson PHP
QR code
English 0 0
Development of IR tool for tree-structured MathML-based mathematical descriptions Watanabe T.
Miyazaki Y.
Fuzzy search
IR system for math
Tree structure
Joint Proceedings of the Work-in-Progress Poster and Invited Young Researcher Symposium at the 18th International Conference on Computers in Education, ICCE 2010 English The quantity of Web content including math, such as Wikipedia articles and BBSs focusing on math, has been skyrocketing in recent years. Some previous research has dealt with the development of IR systems targeting MathML-based math expressions. These systems are, however, still immature, lacking fuzzy search functions or suffering from low hit rates. One of the authors proposed at ICCE 2008 an IR tool featuring a fuzzy search function that adopts the regular expressions used in MySQL. In this study, our objective is to additionally propose a "tree structure" algorithm that gives the fuzzy search function better precision. 0 0
Didactical patterns for the usage of wikis in educational and learning activities Forment M.A.
Galanis N.
Casan M.J.
Mayol E.
Poch J.P.
Penalvo F.G.
Conde M.A.
Learning design
International Journal of Engineering Education English Wikis have been established as primary online tools for collaborating and for gathering, sharing and organizing knowledge. Wikipedia is an overwhelming proof of this. Introducing a wiki as a learning tool in a classroom is therefore a very promising idea. This paper introduces a collection of didactical patterns for the usage of wikis in educational and learning activities, intended as guidelines for learning designs that successfully incorporate wikis in a virtual classroom. 0 0
Die Entwicklung von Qualitätsmängeln in der Wikipedia anhand von Wartungsbausteinen Matthias Busse Bauhaus-University Weimar German 0 0
Digital engineers: Results of a survey study documenting digital media and device use among freshmen engineering students Johri A.
Teo H.J.
Lo J.L.
Schram A.B.
Dufour M.S.
ASEE Annual Conference and Exposition, Conference Proceedings English The current generation of college students has been dubbed Digital Natives, Generation Y and/or the Net Generation and seemingly possesses distinctive habits as well as perceptions about the use of digital media and device that set them apart from their predecessors. Despite the claim that these college students are avid consumers and users of media content and devices, there is limited understanding about the media use habits of engineering students belonging to this generation. To better understand the digital media and device habits of incoming engineering students, we conducted a survey-based study at a large university in the United States. We obtained 204 valid responses. Two surveys were designed to obtain self-reported information on the frequency of media device usage, participation in social networking, academic activities and information seeking tendencies. Through this survey study, not only are we able to document the significant use of digital media and devices among engineering students, we are able to provide specific findings related to the role of digital media and devices on socializing and learning activities of freshmen engineering students. Findings also indicated that female freshmen engineering students are more likely to use the cellphone for talking, texting as well as participation in Wikipedia platform when compared to the males. 0 0
Discipline but not punish. The governance of Wikipedia Dominique Cardon Normative Experience in Internet Politics English The ways in which the Internet is managed and controlled – often labeled as Internet Governance – are usually considered as standing on four main pillars: Technology, Market Laws, State Regulation and Uses. Nevertheless, its specific features, the consequences of the plurality of norms it involves and of the decision-making processes it entails are rarely addressed in a comprehensive analysis.

This book explores the Internet's functioning both as a practical-intellectual experience and a political challenge. By means of several case studies, it proposes a substantial and reflexive treatment of multileveled, formal or informal Internet Politics. The book's overall endeavor is to outline an understanding of what is – or may be – a "digital common good".

The authors are members of a European academic team gathered by the Vox Internet research program's meetings. They adopt a multi-disciplinary approach, embedding technological innovation in the field of social sciences (communication studies, sociology, law, political science and philosophy).
0 1
Discovery of novel term associations in a document collection Hynonen T.
Mahler S.
Toivonen H.
Lecture Notes in Computer Science English We propose a method to mine novel, document-specific associations between terms in a collection of unstructured documents. We believe that documents are often best described by the relationships they establish. This is also evidenced by the popularity of conceptual maps, mind maps, and other similar methodologies to organize and summarize information. Our goal is to discover term relationships that can be used to construct conceptual maps or so-called BisoNets. The model we propose, tpf-idf-tpu, looks for pairs of terms that are associated in an individual document. It considers three aspects, two of which have been generalized from tf-idf to term pairs: term pair frequency (tpf; importance for the document), inverse document frequency (idf; uniqueness in the collection), and term pair uncorrelation (tpu; independence of the terms). The last component is needed to filter out statistically dependent pairs that are not likely to be considered novel or interesting by the user. We present experimental results on two collections of documents: one extracted from Wikipedia, and one containing text mining articles with manually assigned term associations. The results indicate that the tpf-idf-tpu method can discover novel associations, that they are different from just taking pairs of tf-idf keywords, and that they better match the subjective associations of a reader. 0 0
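As a concrete illustration of the tpf and idf components the abstract above generalizes from tf-idf to term pairs, here is a minimal sketch. The formulas are illustrative assumptions, not the paper's exact definitions, and the tpu independence filter is omitted:

```python
import math
from collections import Counter
from itertools import combinations

def pair_scores(docs, doc_index):
    """Score the term pairs of one document by tpf * idf.

    tpf: importance of the pair within the document (here: the
    smaller of the two term counts, normalized by document length).
    idf: rarity of the pair across the collection.
    Both formulas are illustrative guesses at the generalization
    from tf-idf to term pairs; the tpu component is left out.
    """
    n = len(docs)
    pair_df = Counter()  # document frequency of each unordered term pair
    for doc in docs:
        pair_df.update(combinations(sorted(set(doc)), 2))
    doc = docs[doc_index]
    counts = Counter(doc)
    scores = {}
    for a, b in combinations(sorted(set(doc)), 2):
        tpf = min(counts[a], counts[b]) / len(doc)  # pair weight in the doc
        idf = math.log(n / pair_df[(a, b)])         # pair rarity overall
        scores[(a, b)] = tpf * idf
    return scores
```

A pair that occurs often in one document but co-occurs in few others scores highest, mirroring the intuition behind tf-idf keywords.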
Distributed and collaborative requirements elicitation based on social intelligence Wen B.
Luo Z.
Peng Liang
Requirements elicitation
Semantic wiki
Social intelligence
Web 2.0
Proceedings - 9th Web Information Systems and Applications Conference, WISA 2012 English Requirements are the formal expression of users' needs, and requirements elicitation is the activity of collecting them. Traditional acquisition methods, such as interviews, observation and prototyping, are unsuited to service-oriented software development, which features distributed stakeholders, collective intelligence and behavioral emergence. In this paper, a collaborative requirements elicitation approach based on social intelligence for networked software is put forward, and a requirements-semantics concept is defined as the formal requirements description generated by collective participation. Furthermore, semantic wiki technology is chosen as the requirements authoring platform to suit the distributed and collaborative setting. Facing the wide-area distributed Internet, the approach combines Web 2.0 and the semantic web to revise the experts' requirements-semantics model through social classification. At the same time, instantiation of the requirements model is accomplished with semantic tagging and validation. Apart from the traditional documentary specification, requirements-semantics artifacts are exported from the elicitation process to the subsequent software production process, i.e. services aggregation and services resource customization. Experiments and a prototype have demonstrated the feasibility and effectiveness of the proposed approach. 0 0
Diversionary comments under political blog posts Wang J.
Yu C.T.
Yu P.S.
Ben Liu
Meng W.
Coreference resolution
Diversionary comments
Extraction from wikipedia
Topic model
ACM International Conference Proceeding Series English An important issue that has been neglected so far is the identification of diversionary comments. Diversionary comments under political blog posts are defined as comments that deliberately twist the blogger's intention and divert the topic to another one. The purpose is to distract readers from the original topic and draw attention to a new topic. Given that political blogs have significant impact on society, we believe it is imperative to identify such comments. We categorize diversionary comments into 5 types, and propose an effective technique to rank comments in descending order of being diversionary. To the best of our knowledge, the problem of detecting diversionary comments has not been studied so far. Our evaluation on 2,109 comments under 20 different blog posts from Digg.com shows that the proposed method achieves a high mean average precision (MAP) of 92.6%. Sensitivity analysis indicates that the effectiveness of the method is stable under different parameter settings. 0 0
Do editors or articles drive collaboration? Multilevel statistical network analysis of wikipedia coauthorship Brian Keegan
Darren Gergle
Noshir Contractor
Exponential random graph model
Network analysis
English Prior scholarship on Wikipedia's collaboration processes has examined the properties of either editors or articles, but not the interactions between both. We analyze the coauthorship network of Wikipedia articles about breaking news demanding intense coordination and compare the properties of these articles and the editors who contribute to them to articles about contemporary and historical events. Using p*/ERGM methods to test a multi-level, multi-theoretical model, we identify how editors' attributes and editing patterns interact with articles' attributes and authorship history. Editors' attributes like prior experience have a stronger influence on collaboration patterns, but article attributes also play significant roles. Finally, we discuss the implications our findings and methods have for understanding the socio-material duality of collective intelligence systems beyond Wikipedia. 0 1
Do videowikis on the web support better (constructivist) learning in the basics of information systems science? Makkonen P. Connectivism
Constructivist learning
Learning of information systems
Problem-based learning
Screen capture video
Proceedings of the 9th International Conference on Information Technology, ITNG 2012 English This paper describes the combination of a wiki and screen capture videos as a complementary addition to conventional lectures in an information management and information systems development course. Our basis was collaborative problem-based learning with the problems defined by students. The idea was that students were expected to find concepts or issues from four lecture themes which were not well defined or clarified for them. The students worked in small groups of two or three students or completed the coursework individually. First, the students selected the theme which was most unclear to them. Second, the students selected the problematic topics from this area and created presentations on these issues. Our intention was that in this way we could run collaborative learning under the principles of the Jigsaw method. In our variant of this technique the students create presentations on different themes and teach each other by using these video presentations. The study found that videowiki-based coursework affects both external and internal motivation equally in most cases. This reflects that, from the perspective of constructivism, the videowiki-based assignment is equally effective compared to learning without this setting. However, the development of knowledge concerning different course themes was positive in the group of students who completed the videowiki assignment, and female students benefited slightly more from this coursework. 0 0
DoSO: A document self-organizer Gerasimos Spanakis
Georgios Siolas
Andreas Stafylopatis
Document clustering
Document representation
Journal of Intelligent Information Systems English In this paper, we propose a Document Self Organizer (DoSO), an extension of the classic Self Organizing Map (SOM) model, in order to deal more efficiently with a document clustering task. Starting from a document representation model, based on important "concepts" exploiting Wikipedia knowledge, that we have previously developed in order to overcome some of the shortcomings of the Bag-of-Words (BOW) model, we demonstrate how SOM's performance can be boosted by using the most important concepts of the document collection to explicitly initialize the neurons. We also show how a hierarchical approach can be utilized in the SOM model and how this can lead to a more comprehensive final clustering result with hierarchical descriptive labels attached to neurons and clusters. Experiments show that the proposed model (DoSO) yields promising results both in terms of extrinsic and SOM evaluation measures. 0 0
Doctors use, but don’t rely totally on, Wikipedia English 0 0
Document classification by computing an echo in a very simple neural network Brouard C. Classification
Neural network
Relevance models
Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI English In this paper we present a new classification system called ECHO. This system is based on a principle of echo and applied to document classification. It computes the score of a document for a class by combining a bottom-up and a top-down propagation of activation in a very simple neural network. This system bridges a gap between Machine Learning methods and Information Retrieval since the bottom-up and the top-down propagations can be seen as the measures of the specificity and exhaustivity which underlie the models of relevance used in Information Retrieval. The system has been tested on the Reuters 21578 collection and in the context of an international challenge on large scale hierarchical text classification with corpus extracted from Dmoz and Wikipedia. Its comparison with other classification systems has shown its efficiency. 0 0
Document graph model based on semantic information Yu X.
Peng L.
Liao J.
Huang Z.
Graph model
Topic model
Journal of Information and Computational Science English Graph models are widely used in a variety of fields, and constructing a graph model for documents is a challenge. The semantic information inside a document must be captured in order to build such a model. In this paper, a novel document graph model based on semantic information is proposed. The semantic information is extracted from Wikipedia and, combined with other statistical information from the contents, used to construct an initial graph for documents. Based on the PageRank algorithm, the initial graph is processed to mine the main information, yielding an accurate graph model. Finally, our model is evaluated using the LDA and GN algorithms. The results demonstrate the efficiency of the proposed method. 0 0
Domain-oriented semantic knowledge extraction Xiao K.
Li B.
Tan X.
Knowledge Extraction
Journal of Computational Information Systems English The semantic knowledge extraction task is the groundwork of ontology building. As one of the most important public knowledge bases, Wikipedia has many comparative advantages in this research field. In this paper, we propose a new method for extracting domain-oriented semantic knowledge from Wikipedia. During the process, every category in the domain is assigned a weight, so that we can calculate the score of articles. Besides, practical experience in storing and utilizing the big data of Wikipedia is also detailed in this paper. 0 0
Drawing a Data-Driven Portrait of Wikipedia Editors Robert West
Ingmar Weber
Carlos Castillo
Web usage
WikiSym English While there has been a substantial amount of research into the editorial and organizational processes within Wikipedia, little is known about how Wikipedia editors (Wikipedians) relate to the online world in general. We attempt to shed light on this issue by using aggregated log data from Yahoo!’s browser toolbar in order to analyze Wikipedians’ editing behavior in the context of their online lives beyond Wikipedia. We broadly characterize editors by investigating how their online behavior differs from that of other users; e.g., we find that Wikipedia editors search more, read more news, play more games, and, perhaps surprisingly, are more immersed in popular culture. Then we inspect how editors’ general interests relate to the articles to which they contribute; e.g., we confirm the intuition that editors are more familiar with their active domains than average users. Finally, we analyze the data from a temporal perspective; e.g., we demonstrate that a user’s interest in the edited topic peaks immediately before the edit. Our results are relevant as they illuminate novel aspects of what has become many Web users’ prevalent source of information. 0 0
Dynamic PageRank using evolving teleportation Rossi R.A.
Gleich D.F.
Lecture Notes in Computer Science English The importance of nodes in a network constantly fluctuates based on changes in the network structure as well as changes in external interest. We propose an evolving teleportation adaptation of the PageRank method to capture how changes in external interest influence the importance of a node. This framework seamlessly generalizes PageRank because the importance of a node will converge to the PageRank values if the external influence stops changing. We demonstrate the effectiveness of the evolving teleportation on the Wikipedia graph and the Twitter social network. The external interest is given by the number of hourly visitors to each page and the number of monthly tweets for each user. 0 0
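The evolving-teleportation idea described in the abstract above can be sketched as a PageRank power iteration whose teleportation vector is refreshed at each time step (e.g. from hourly visitor counts). Function and parameter names here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def evolving_pagerank(A, v_seq, alpha=0.85, inner_iters=50):
    """PageRank with a time-varying teleportation vector (sketch).

    A: column-stochastic adjacency matrix (n x n).
    v_seq: one teleportation distribution per time step, e.g.
    normalized hourly page-visit counts (an assumption made for
    illustration). Returns the score vector after each step.
    """
    n = A.shape[0]
    x = np.full(n, 1.0 / n)  # start from the uniform distribution
    history = []
    for v in v_seq:
        v = v / v.sum()  # normalize external-interest counts
        for _ in range(inner_iters):
            x = alpha * (A @ x) + (1 - alpha) * v
        history.append(x.copy())
    return history
```

If the teleportation vector stops changing, the iteration converges to the static PageRank values, which matches the generalization property the abstract mentions.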
Dynamic vocabularies for web-based concept detection by trend discovery Borth D.
Ulges A.
Breuel T.M.
Concept detection
Social media
MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia English We present a novel approach towards automatic vocabulary selection for video concept detection. Our key idea is to expand concept vocabularies with trending topics that we mine automatically on other media like Wikipedia or Twitter. We evaluate several strategies for extending concept detection to auto-detect these topics in new videos, either by linking them to a static concept vocabulary, by a visual learning of trends on the fly, or by an expansion of the vocabulary. Our study on 6,800 YouTube clips and the top 23 target trends (covering a timespan of 6 months) demonstrates that a direct visual classification of trends (by a "live" learning on trend videos) outperforms an inference from static vocabularies. However, further improvements can be achieved by a combination of both approaches. 0 0
Dynamics of Conflicts in Wikipedia Taha Yasseri
Róbert Sumi
András Rung
András Kornai
János Kertész
PLoS ONE English In this work we study the dynamical features of editorial wars in Wikipedia (WP). Based on our previously established algorithm, we build up samples of controversial and peaceful articles and analyze the temporal characteristics of the activity in these samples. On short time scales, we show that there is a clear correspondence between conflict and burstiness of activity patterns, and that memory effects play an important role in controversies. On long time scales, we identify three distinct developmental patterns for the overall behavior of the articles. We are able to distinguish cases eventually leading to consensus from those cases where a compromise is far from achievable. Finally, we analyze discussion networks and conclude that edit wars are mainly fought by few editors only. 44 1
E-learning experience using open source software: Moodle Bansode S.Y.
Kumbhar R.
Technology-based learning
DESIDOC Journal of Library and Information Technology English The present paper highlights the efforts made by the Department of Library and Information Science, University of Pune to use open source software, viz., Moodle, for the promotion of e-learning in the department. Various utilities of Moodle, such as course development, blogs, wikis, question banks and notifications to students, have been used. This article narrates the experience of designing, developing and implementing an e-learning course for the 'Information Technology' paper of the MLISc curriculum. 0 0
E-participation and e-participants: Solving the patent 'crisis' Leith P. E-Gov
Patent system
International Review of Law, Computers and Technology English One of the major planks of some visions for E-Gov is that there is a willing participatory group who are more than happy to be involved in new forms of democracy and will be active and useful suppliers of input to e-consultation or e-participation processes. This group is different from that which goes online to the government website and signs a petition asking the prime minister to resign. It is becoming clear, though, that the commitment to e-participation may well be there in theory, but difficult to access in practice. Further, the participation that is most welcome can frequently require training and expertise that is not widely available, or there may be differences in opinion as to the point of participation. In this paper I will look at the attempts to encourage participation in the patent system. The UK has initiated a trial system utilising New York Law School's Peer-To-Patent project, but has also attempted to involve participants in previous consultation exercises. I will use these as demonstrations of the sorts of problems that e-participation has met, and consider whether this new form of E-Gov is perhaps being oversold. The interesting question is whether participation is a growing tool that can ensure better public services from the State. My conclusion is that consultation and participatory projects can demonstrate involvement and are certainly educative, but e-participatory projects are most likely incapable of achieving the goals set by their more optimistic advocates. The paper emphasises the patents field, but the lessons from it can - I suggest - be viewed as indicators of wider governance relevance. The primary point being made is that the technocratic view is always over-optimistic. 0 0
EachWiki: Facilitating wiki authoring by annotation suggestion Haofen Wang
Linyun Fu
Jin W.
Yiqin Yu
Category suggestion
Link suggestion
Semantic relation suggestion
ACM Transactions on Intelligent Systems and Technology English Wikipedia, one of the best-known wikis and the world's largest free online encyclopedia, has embraced the power of collaborative editing to harness collective intelligence. However, using such a wiki to create high-quality articles is not as easy as people imagine, given for instance the difficulty of reusing knowledge already available in Wikipedia. As a result, the heavy burden of building up and maintaining the ever-growing online encyclopedia still rests on a small group of people. In this article, we aim at facilitating wiki authoring by providing annotation recommendations, thus lightening the burden of both contributors and administrators. We leverage the collective wisdom of the users by exploiting Semantic Web technologies with Wikipedia data and adopt a unified algorithm to support link, category, and semantic relation recommendation. A prototype system named EachWiki is proposed and evaluated. The experimental results show that it has achieved considerable improvements in terms of effectiveness, efficiency and usability. The proposed approach can also be applied to other wiki-based collaborative editing systems. 0 0
Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data Márton Mestyán
Taha Yasseri
János Kertész
Big Data
English Use of socially generated "big data" to access information about collective states of mind in human societies is becoming a new paradigm in the emerging field of computational social science. One natural application of this would be predicting society's reaction to a new product in the sense of popularity and adoption rate. However, bridging between "real-time monitoring" and "early prediction" remains a big challenge. Here, we report on an endeavor to build a minimalistic predictive model for the financial success of movies based on collective activity data of online users. We show that the popularity of a movie can be predicted well in advance by measuring and analyzing the activity level of editors and viewers of the movie's corresponding entry in Wikipedia, the well-known online encyclopedia. 0 0
Edit conflict resolution in wikiBOK: A wiki-based bok formulation-aid system for new disciplines Yoshifumi Masunaga
Kazunari Ito
Yabuki T.
Takeshi Morita
Body of knowledge
Collective intelligence
Edit conflict resolver
Open collaboration
Semantic MediaWiki
Social informatics
Three-way merge
Proceedings - 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust and 2012 ASE/IEEE International Conference on Social Computing, SocialCom/PASSAT 2012 English A body of knowledge (BOK) of an academic field is an indispensable aid not only to people's understanding of the entirety of a targeted academic field, but also to designing a complete curriculum of that field for educational purposes. However, in contrast to a BOK for a mature discipline such as computer science, the formulation of a BOK for a new discipline, such as social informatics, life science, or sustainability science, is difficult because academics in such a new discipline cannot present it in its entirety par avance. Therefore, a bottom-up and open collaborative approach based on collective intelligence seems promising, and contrasts strongly with the traditional style in which a BOK is formulated: by the authorities in the field in a top-down manner. WikiBOK is a wiki-based BOK formulation-aid system for new disciplines. It is developed based on BOK+, a novel BOK formulation principle for new disciplines that enables a BOK to be constructed in a bottom-up manner. As its name indicates, WikiBOK uses Semantic MediaWiki (SMW) to facilitate its fundamental functions. A rich graphical user interface is provided using open source graph visualization software. The main objective of this paper is to illustrate how edit conflicts are resolved in WikiBOK. Needless to say, edit conflicts are unavoidable when WikiBOKers collaborate to formulate a BOK tree. A WikiBOK edit conflict resolution principle is shown, and the WikiBOK Edit Conflict Resolver is implemented based on this principle. The Social Informatics BOK (SIBOK) is under construction using WikiBOK. 0 0
Edição colaborativa na Wikipédia: desafios e possibilidades Carlos Frederico de Brito d’Andréa Educação científica e cidadania: abordagens teóricas e metodológicas para a formação de pesquisadores juvenis Portuguese 14 1
Educational Technology: Web 2.0 Bond M.C.
Cooney R.
Educational technology
Web 2.0
LMS use for running a course online
Student "voice" and removing inhibitions
Recommended EM education blogs
Social networking (Facebook)
Sharing information
Web 1.0 and Web 2.0
Controlling what one sees, when and how
Web 2.0 applications and technology
Facilitating learning
Asynchronous learning opportunities
Wiki use
Educational material transported anywhere
Reinforcing principles using test questions
Practical Teaching in Emergency Medicine, Second Edition English [No abstract available] 0 0
Effective ontology learning: Concepts' hierarchy building using plain text Wikipedia Ahmed K.B.S.
Toumouh A.
Malki M.
Concepts' hierarchy
Domain ontologies
Ontology learning from texts
CEUR Workshop Proceedings English Ontologies stand at the heart of the Semantic Web. Nevertheless, engineering heavyweight or formal ontologies is commonly judged to be a demanding exercise that requires substantial time and cost. Ontology learning is thus a response to this exigency and an approach to the 'knowledge acquisition bottleneck'. Since texts are massively available everywhere and embody experts' knowledge and know-how, it is of great value to capture the knowledge existing within such texts. Our approach addresses the challenge of creating concepts' hierarchies from textual data. The significance of such a solution stems from the idea of taking advantage of the Wikipedia encyclopedia to achieve good-quality results. 0 0
Effective tag recommendation system based on topic ontology using Wikipedia and WordNet Subramaniyaswamy V.
Chenthur Pandian S.
International Journal of Intelligent Systems English In this paper, we proposed a novel approach based on topic ontology for tag recommendation. The proposed approach intelligently generates tag suggestions to blogs. In this approach, we construct topic ontology through enriching the set of categories in existing small ontology called as Open Directory Project. To construct topic ontology, a set of topics and their associated semantic relationships is identified automatically from the corpus-based external knowledge resources such as Wikipedia and WordNet. The construction relies on two folds such as concept acquisition and semantic relation extraction. In the first fold, a topic-mapping algorithm is developed to acquire the concepts from the semantic of Wikipedia. A semantic similarity-clustering algorithm is used to compute the semantic similarity measure to group the set of similar concepts. The second is the semantic relation extraction algorithm, which derives associated semantic relations between the set of extracted topics from the lexical patterns between synsets in WordNet. A suitable software prototype is created to implement the topic ontology construction process. A Jena API framework is used to organize the set of extracted semantic concepts and their corresponding relationship in the form of knowledgeable representation of Web ontology language. Thus, Protégé tool provides the platform to visualize the automatically constructed topic ontology successfully. Using the constructed topic ontology, we can generate and suggest the most suitable tags for the new resource to users. The applicability of topic ontology with a spreading activation algorithm supports efficient recommendation in practice that can recommend the most popular tags for a specific resource. The spreading activation algorithm can assign the interest scores to the existing extracted blog content and tags. 
The weight of the tags is computed based on the activation score determined from the similarity between the topics in the constructed topic ontology and the content of the existing blogs. High-quality tags with the highest activation scores are recommended to the users. Finally, we conducted an experimental evaluation of our tag recommendation approach using a large set of real-world data sets. Our experimental results explore and compare the capabilities of our proposed topic ontology with the spreading activation tag recommendation approach against the existing AutoTag mechanism, and we discuss the improvement in precision and recall of recommended tags on the Delicious and BibSonomy data sets. The experiments show that tag recommendation using topic ontology results in folksonomy enrichment. Thus, we report the results of an experiment meant to improve the performance of the tag recommendation approach and its quality. 0 0
Effectiveness of shared leadership in online communities Haiping Zhu
Kraut R.
Aniket Kittur
Online community
Shared leadership
English Traditional research on leadership in online communities has consistently focused on the small set of people occupying leadership roles. In this paper, we use a model of shared leadership, which posits that leadership behaviors come from members at all levels, not simply from people in high-level leadership positions. Although every member can exhibit some leadership behavior, different types of leadership behavior performed by different types of leaders may not be equally effective. This paper investigates how distinct types of leadership behaviors (transactional, aversive, directive and person-focused) and the legitimacy of the people who deliver them (people in formal leadership positions or not) influence the contributions that other participants make in the context of Wikipedia. After using propensity score matching to control for potential pre-existing differences among those who were and were not targets of leadership behaviors, we found that 1) leadership behaviors performed by members at all levels significantly influenced other members' motivation; 2) transactional leadership and person-focused leadership were effective in motivating others to contribute more, whereas aversive leadership decreased other contributors' motivations; and 3) legitimate leaders were in general more influential than regular peer leaders. We discuss the theoretical and practical implications of our work. 0 1
Efficient updates for web-scale indexes over the cloud Antonopoulos P.
Konstantinou I.
Tsoumakos D.
Koziris N.
Proceedings - 2012 IEEE 28th International Conference on Data Engineering Workshops, ICDEW 2012 English In this paper, we present a distributed system which enables fast and frequent updates on web-scale Inverted Indexes. The proposed update technique allows incremental processing of new or modified data and minimizes the changes required to the index, significantly reducing the update time, which is now independent of the existing index size. By utilizing Hadoop MapReduce, for parallelizing the update operations, and HBase, for distributing the Inverted Index, we create a high-performance, fully distributed index creation and update system. To the best of our knowledge, this is the first open source system that creates, updates and serves large-scale indexes in a distributed fashion. Experiments with over 23 million Wikipedia documents demonstrate the speed and robustness of our implementation: It scales linearly with the size of the updates and the degree of change in the documents and demonstrates a constant update time regardless of the size of the underlying index. Moreover, our approach significantly increases its performance as more computational resources are acquired: It incorporates a 15.4GB update batch into a 64.2GB indexed dataset in about 21 minutes using just 12 commodity nodes, 3.3 times faster compared to using two nodes. 0 0
Emotions and dialogue in a peer-production community: The case of Wikipedia David Laniado
Andreas Kaltenbrunner
Carlos Castillo
Morell M.F.
Talk page
Gender gap
WikiSym 2012 English This paper presents a large-scale analysis of emotions in conversations among Wikipedia editors. Our focus is on the emotions expressed by editors in talk pages, measured by using the Affective Norms for English Words (ANEW). We find evidence that to a large extent women tend to participate in discussions with a more positive tone, and that administrators are more positive than non-administrators. Surprisingly, female non-administrators tend to behave like administrators in many aspects. We observe that replies are on average more positive than the comments they reply to, preventing many discussions from spiralling down into conflict. We also find evidence of emotional homophily: editors having similar emotional styles are more likely to interact with each other. Our findings offer novel insights into the emotional dimension of interactions in peer-production communities, and contribute to debates on issues such as the flattening of editor growth and the gender gap. 0 0
Engaging second/foreign language students through electronic writing tasks: When learning design matters Caws C.G. Blogs
Foreign/second language
Learning design
Cutting-Edge Technologies in Higher Education English Based on the premise that computers have now become cultural and cognitive artifacts with which, and not from which, learners interact on a daily basis, this chapter focuses on best practices in preparing and engaging digital natives to become tomorrow's leaders of a global knowledge economy that is increasingly dependent on electronic modes of communication. Using a study based on online tools in a writing course taught at the University of Victoria (Canada), we take a qualitative interpretative stance to explain the opportunities and challenges of learning and teaching in such environments. We comment on aspects such as the need to properly address learners' functional skills (or lack thereof), the various tools that can be used to engage and motivate learners, and the need to go beyond methods based on delivery in order to better focus on the development of multiliteracies, in particular critical literacy and functional literacy. Our argument, grounded in cognitive and sociocultural theories of learning, favors an interdisciplinary approach while focusing on disciplines that are typically housed in the humanities, in particular second language academic programs. Our discussions and conclusions move from these case studies to a more general reflection on the extent to which electronic environments are reshaping higher education. 0 0
Engineering a controlled natural language into semantic MediaWiki Dantuluri P.
Davis B.
Ludwick P.
Handschuh S.
Lecture Notes in Computer Science English The Semantic Web is yet to gain mainstream recognition. In part this is caused by the relative complexity of the various semantic web formalisms, which act as a major barrier of entry to naive web users. In addition, in order for the Semantic Web to become a reality, we need semantic metadata. While controlled natural language research has sought to address these challenges, in the context of user friendly ontology authoring for domain experts, there has been little focus on how to adapt controlled languages for novice social web users. The paper describes an approach to using controlled languages for fact creation and management as opposed to ontology authoring, focusing on the domain of meeting minutes. For demonstration purposes, we developed a plug-in to the Semantic MediaWiki, which adds a controlled language editor extension. This editor aids the user while authoring or annotating in a controlled language in a user friendly manner. Controlled content is sent to a parsing service which generates semantic metadata from the sentences, which are subsequently displayed and stored in the Semantic MediaWiki. The semantic metadata generated by the parser is grounded against a project documents ontology. The controlled language modeled covers a wide variety of sentences and topics used in the context of meeting minutes. Finally, this paper provides an architectural overview of the annotation system. 0 0
English-to-traditional Chinese cross-lingual link discovery in articles with wikipedia corpus Chen L.-P.
Shih Y.-L.
Chen C.-T.
Ku T.
Hsieh W.-T.
Chiu H.-S.
Yang R.-D.
Cross-lingual link discovery
Link discovery
Linked data
Proceedings of the 24th Conference on Computational Linguistics and Speech Processing, ROCLING 2012 English In this paper, we design a processing flow to produce linked data in articles, providing additional information for anchor-based terms and related terms in different languages (English to Chinese). Wikipedia has become a very important corpus and knowledge bank. Although Wikipedia describes itself as neither a dictionary nor an encyclopedia, it has high potential value in applications and data mining research. Link discovery is a useful IR application, based on data mining and NLP algorithms, and has been used in several fields. According to the results of our experiment, this method does improve the results. 0 0
Enhancing Critical Reflection on Simulation Through Wikis Beyer D.A. Debriefing
Nursing students
Clinical Simulation in Nursing English This article discusses using wikis as a teaching strategy for follow-up debriefing after students participated in human patient simulator activities. Wiki tools assist students in collaborating, sharing, creating, and editing documents that reinforce learning developed during simulation. Students are actively engaged in the learning process by sharing information and experiences obtained in simulation scenarios. The use of a wiki enhances debriefing reflection. Wikis provide students with a written document to answer questions, discuss content, and develop notes and study guides after a classroom simulation activity. 0 0
Enhancing document modeling for information retrieval using wikipedia Jing Luo
Meng B.
Tu X.
Document model
Information retrieval
Language model
International Journal of Advancements in Computing Technology English In this paper, we propose a Wikipedia-based document model within the language modeling framework, to break through the limitations of traditional BOW (bag-of-words) based approaches. By means of concept detection and disambiguation, the original documents are translated into conceptual representations, which are subsequently used to update the document models. The Wikipedia-based document model is evaluated on the TREC Ad Hoc Track (Disks 1, 2, and 3) collections. Experiments show significant improvements with respect to the baseline models. 0 0
Enhancing networking and proactive learning skills in the first year university experience through the use of wikis Morley D.A. Blended
Web 2.0
Nurse Education Today English This paper discusses the introduction of blended learning strategies, a combination of traditional and online techniques, into the first year of a new preregistration nursing advanced diploma and degree programme at Bournemouth University (UK). During a ten-week sociology of health academic unit, in the first term of a three-year nursing course, wikis were introduced as a complementary learning technique to traditional lectures and seminars. Wikis, an online application, provided eleven student seminar groups (each divided into 4 online or e-learning groups of 6-8 students) with the potential to communicate collaboratively "anytime, anywhere" (JISC, 2010) to discuss a sociology preparation activity for the preceding week. The implementation of this e-learning tool was structured through the application of Salmon's five-stage model (Salmon, 2002) and evaluated from 69 students' online contributions to wikis as well as questionnaires completed by a sample of students and academic staff. As well as the many comments made by students, the evaluation indicated that 45% of students' responses valued wikis as a communication tool and 33% believed they promoted or allowed the sharing of group views. The evaluation presents and critiques the initial project management using Salmon's five-stage model and the engagement of students and academic staff. In particular, it begins to show how wikis have the potential to structure academic learning and promote social networking in the crucial first few months of a course. 0 0
Enhancing the scenario: Emerging technologies and experiential learning in second language instructional design DeHaan J.
Johnson N.H.
Digital video
English as a foreign language
Experiential learning
Sociocultural theory
Strategic interaction
International Journal of Learning English The affordances provided by technology for increasing the efficacy of foreign language education have been a major research area within applied linguistics over the past thirty years or so (see Hubbard, 2006 for an overview). In a Japanese context, there are culturally based issues with foreign language education at the tertiary level, such as large class sizes and low student motivation, that present educators with specific challenges where technology may provide effective mediational means to improve practice and learner outcomes. In this article, we describe an eight-week teaching intervention that was designed, through digital and web technologies readily available to teachers, to improve the communication skills of Japanese university students of English. The strategic interaction framework, developed by DiPietro (1987), was enhanced by use of digital video and a freely available wiki site. Performances were digitally video recorded and uploaded to a private wiki, and participants used this to evaluate, transcribe and self-correct their performances. The instructor then used the video and text to focus post-performance group debriefing sessions. The results suggest that a wiki, digital video, and strategic interaction-based experiential learning cycles can be effectively integrated to mediate Japanese university EFL students' oral communication development. Technical and pedagogical recommendations are offered. 0 0
Enhancing the undergraduate experience through a collaborative wiki exercise to teach nursing students discipline specific terminology Doherty I.
Honey M.
Stewart L.
Proceedings of the European Conference on e-Government, ECEG English We present a randomized controlled trial research project that involved undergraduate nursing students working in small groups using a wiki to develop a collaborative glossary of health-specific terminology. The background to the project is explained with reference to the relevant literature, and the research aims and research method are both discussed in detail. We also present and discuss some preliminary results. 0 0
Enrichment of inflection dictionaries: Automatic extraction of semantic labels from encyclopedic definitions Chrzaszcz P. Proceedings of the 9th International Workshop on Natural Language Processing and Cognitive Science, NLPCS 2012, in Conjunction with ICEIS 2012 English Inflection dictionaries are widely used in many natural language processing tasks, especially for inflecting languages. However, they lack semantic information, which could increase the accuracy of such processing. This paper describes a method to extract semantic labels from encyclopedic entries. Adding such labels to an inflection dictionary could eliminate the need to use ontologies and similar complex semantic structures for many typical tasks. A semantic label is either a single word or a sequence of words that describes the meaning of a headword, hence it is similar to a semantic category. However, no taxonomy of such categories is known prior to the extraction. Encyclopedic articles consist of headwords and their definitions, so the definitions are used as sources for semantic labels. The described algorithm has been implemented for extracting data from the Polish Wikipedia. It is based on definition structure analysis, heuristic methods, and word form recognition and processing using the Polish Inflection Dictionary. This paper contains a description of the method and test results, as well as a discussion of possible further development. 0 0