Knowledge representation

From WikiPapers

Knowledge representation is included as a keyword or extra keyword in 0 datasets, 0 tools and 52 publications.


There are no datasets for this keyword.


There are no tools for this keyword.


Title Author(s) Published in Language Date Abstract R C
A user centred approach to represent expert knowledge: A case study at STMicroelectronics Brichni M.
Gzara L.
Dupuy-Chessa S.
Jeannet C.
Proceedings - International Conference on Research Challenges in Information Science English 2014 The rapid growth of companies, the departure of employees, the complexity of the new technology and the rapid proliferation of information, are reasons why companies seek to capitalize their expert knowledge. STMicroelectronics has opted for a Wiki to effectively capitalize and share some of its knowledge. However, to accomplish its objective, the Wiki content must correspond to users' needs. Therefore, we propose a user centred approach for the definition of knowledge characteristics and its integration in the Wiki. Our knowledge representation is based on three facets "What, Why and How". In this paper, the approach is applied to the Reporting activity at STMicroelectronics, which is considered as a knowledge intensive activity. 0 0
Development of a semantic and syntactic model of natural language by means of non-negative matrix and tensor factorization Anisimov A.
Marchenko O.
Taranukha V.
Vozniuk T.
Lecture Notes in Computer Science English 2014 A method for developing a structural model of natural language syntax and semantics is proposed. Syntactic and semantic relations between parts of a sentence are presented in the form of a recursive structure called a control space. Numerical characteristics of these data are stored in multidimensional arrays. After factorization, the arrays serve as the basis for the development of procedures for analyses of natural language semantics and syntax. 0 0
SmartWiki: A reliable and conflict-refrained Wiki model based on reader differentiation and social context analysis Haifeng Zhao
Kallander W.
Johnson H.
Wu S.F.
Knowledge-Based Systems English 2013 Wiki systems, such as Wikipedia, provide a multitude of opportunities for large-scale online knowledge collaboration. Despite Wikipedia's successes with the open editing model, dissenting voices give rise to unreliable content due to conflicts amongst contributors. Frequently modified controversial articles by dissent editors hardly present reliable knowledge. Some overheated controversial articles may be locked by Wikipedia administrators who might leave their own bias in the topic. It could undermine both the neutrality and freedom policies of Wikipedia. As Richard Rorty suggested "Take Care of Freedom and Truth Will Take Care of Itself"[1], we present a new open Wiki model in this paper, called TrustWiki, which bridge readers closer to the reliable information while allowing editors to freely contribute. From our perspective, the conflict issue results from presenting the same knowledge to all readers, without regard for the difference of readers and the revealing of the underlying social context, which both causes the bias of contributors and affects the knowledge perception of readers. TrustWiki differentiates two types of readers, "value adherents" who prefer compatible viewpoints and "truth diggers" who crave for the truth. It provides two different knowledge representation models to cater for both types of readers. Social context, including social background and relationship information, is embedded in both knowledge representations to present readers with personalized and credible knowledge. To our knowledge, this is the first paper on knowledge representation combining both psychological acceptance and truth reveal to meet the needs of different readers. Although this new Wiki model focuses on reducing conflicts and reinforcing the neutrality policy of Wikipedia, it also casts light on the other content reliability problems in Wiki systems, such as vandalism and minority opinion suppression. © 2013 Elsevier B.V. All rights reserved. 0 0
Ukrainian WordNet: Creation and filling Anisimov A.
Marchenko O.
Nikonenko A.
Porkhun E.
Taranukha V.
Lecture Notes in Computer Science English 2013 This paper deals with the process of developing a lexical semantic database for Ukrainian language - UkrWordNet. The architecture of the developed system is described in detail. The data storing structure and mechanisms of access to knowledge are reviewed along with the internal logic of the system and some key software modules. The article is also concerned with the research and development of automated techniques of UkrWordNet Semantic Network replenishment and extension. 0 0
YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia Johannes Hoffart
Suchanek F.M.
Berberich K.
Gerhard Weikum
Artificial Intelligence English 2013 We present YAGO2, an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space. YAGO2 is built automatically from Wikipedia, GeoNames, and WordNet. It contains 447 million facts about 9.8 million entities. Human evaluation confirmed an accuracy of 95% of the facts in YAGO2. In this paper, we present the extraction methodology, the integration of the spatio-temporal dimension, and our knowledge representation SPOTL, an extension of the original SPO-triple model to time and space. © 2012 Elsevier B.V. All rights reserved. 0 0
A framework to represent and mine knowledge evolution from Wikipedia revisions Wu X.
Wei Fan
Sheng M.
Lei Zhang
Shi X.
Su Z.
Yiqin Yu
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion English 2012 State-of-the-art knowledge representation in semantic web employs a triple format (subject-relation-object). The limitation is that it can only represent static information, but cannot easily encode revisions of semantic web and knowledge evolution. In reality, knowledge does not stay still but evolves over time. In this paper, we first introduce the concept of "quintuple representation" by adding two new fields, state and time, where state has two values, either in or out, to denote that the referred knowledge takes effective or becomes expired at the given time. We then discuss a two-step statistical framework to mine knowledge evolution into the proposed quintuple representation. Utilizing extracted quintuple properly, it not only can reveal knowledge changing history but also detect expired information. We evaluate the proposed framework on Wikipedia revisions, as well as, common web pages currently not in semantic web format. Copyright is held by the author/owner(s). 0 0
Combining AceWiki with a CAPTCHA system for collaborative knowledge acquisition Nalepa G.J.
Adrian W.T.
Szymon Bobek
Maslanka P.
Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI English 2012 Formalized knowledge representation methods allow to build useful and semantically enriched knowledge bases which can be shared and reasoned upon. Unfortunately, knowledge acquisition for such formalized systems is often a time-consuming and tedious task. The process requires a domain expert to provide terminological knowledge, a knowledge engineer capable of modeling knowledge in a given formalism, and also a great amount of instance data to populate the knowledge base. We propose a CAPTCHA-like system called AceCAPTCHA in which users are asked questions in a controlled natural language. The questions are generated automatically based on a terminology stored in a knowledge base of the system, and the answers provided by users serve as instance data to populate it. The implementation uses AceWiki semantic wiki and a reasoning engine written in Prolog. 0 0
DoSO: A document self-organizer Gerasimos Spanakis
Georgios Siolas
Andreas Stafylopatis
Journal of Intelligent Information Systems English 2012 In this paper, we propose a Document Self Organizer (DoSO), an extension of the classic Self Organizing Map (SOM) model, in order to deal more efficiently with a document clustering task. Starting from a document representation model, based on important "concepts" exploiting Wikipedia knowledge, that we have previously developed in order to overcome some of the shortcomings of the Bag-of-Words (BOW) model, we demonstrate how SOM's performance can be boosted by using the most important concepts of the document collection to explicitly initialize the neurons. We also show how a hierarchical approach can be utilized in the SOM model and how this can lead to a more comprehensive final clustering result with hierarchical descriptive labels attached to neurons and clusters. Experiments show that the proposed model (DoSO) yields promising results both in terms of extrinsic and SOM evaluation measures. 0 0
Improving cross-document knowledge discovery using explicit semantic analysis Yan P.
Jin W.
Lecture Notes in Computer Science English 2012 Cross-document knowledge discovery is dedicated to exploring meaningful (but maybe unapparent) information from a large volume of textual data. The sparsity and high dimensionality of text data present great challenges for representing the semantics of natural language. Our previously introduced Concept Chain Queries (CCQ) was specifically designed to discover semantic relationships between two concepts across documents where relationships found reveal semantic paths linking two concepts across multiple text units. However, answering such queries only employed the Bag of Words (BOW) representation in our previous solution, and therefore terms not appearing in the text literally are not taken into consideration. Explicit Semantic Analysis (ESA) is a novel method proposed to represent the meaning of texts in a higher dimensional space of concepts which are derived from large-scale human built repositories such as Wikipedia. In this paper, we propose to integrate the ESA technique into our query processing, which is capable of using vast knowledge from Wikipedia to complement existing information from text corpus and alleviate the limitations resulted from the BOW representation. The experiments demonstrate the search quality has been greatly improved when incorporating ESA into answering CCQ, compared with using a BOW-based approach. 0 0
Tasteweights: A visual interactive hybrid recommender system Svetlin Bostandjiev
John O'Donovan
Tobias Hollerer
RecSys'12 - Proceedings of the 6th ACM Conference on Recommender Systems English 2012 This paper presents an interactive hybrid recommendation system that generates item predictions from multiple social and semantic web resources, such as Wikipedia, Facebook, and Twitter. The system employs hybrid techniques from traditional recommender system literature, in addition to a novel interactive interface which serves to explain the recommendation process and elicit preferences from the end user. We present an evaluation that compares different interactive and non-interactive hybrid strategies for computing recommendations across diverse social and semantic web APIs. Results of the study indicate that explanation and interaction with a visual representation of the hybrid system increase user satisfaction and relevance of predicted content. Copyright © 2012 by the Association for Computing Machinery, Inc. (ACM). 0 0
The role of AI in wisdom of the crowds for the social construction of knowledge on sustainability Maher M.L.
Fisher D.H.
AAAI Spring Symposium - Technical Report English 2012 One of the original applications of crowdsourcing the construction of knowledge is Wikipedia, which relies entirely on people to contribute, extend, and modify the representation of knowledge. This paper presents a case for combining AI and wisdom of the crowds for the social construction of knowledge. Our social-computational approach to collective intelligence combines the strengths of human cognitive diversity in producing content and the capabilities of an AI, through methods such as topic modeling, to link and synthesize across these human contributions. In addition to drawing from established domains such as Wikipedia for inspiration and guidance, we present the design of a system that incorporates AI into wisdom of the crowds to develop a knowledge base on sustainability. In this setting the AI plays the role of scholar, as might many of the other participants, drawing connections and synthesizing across contributions. We close with a general discussion, speculating on educational implications and other roles that an AI can play within an otherwise collective human intelligence. Copyright © 2012, Association for the Advancement of Artificial Intelligence. All rights reserved. 0 0
Wikipedia in the age of Siri: Task-based evaluation of Google, Wikipedia, and Wolfram Alpha Jeong H. WikiSym 2012 English 2012 In this paper, we describe a task-based method to evaluate relative effectiveness of Wikipedia. We then use this method to compare Wikipedia against an internet search engine (Google) and an answer engine that uses structured data (Wolfram Alpha). 0 0
Capability modeling of knowledge-based agents for commonsense knowledge integration Kuo Y.-L.
Hsu J.Y.-J.
Lecture Notes in Computer Science English 2011 Robust intelligent systems require commonsense knowledge. While significant progress has been made in building large commonsense knowledge bases, they are intrinsically incomplete. It is difficult to combine multiple knowledge bases due to their different choices of representation and inference mechanisms, thereby limiting users to one knowledge base and its reasonable methods for any specific task. This paper presents a multi-agent framework for commonsense knowledge integration, and proposes an approach to capability modeling of knowledge bases without a common ontology. The proposed capability model provides a general description of large heterogeneous knowledge bases, such that contents accessible by the knowledge-based agents may be matched up against specific requests. The concept correlation matrix of a knowledge base is transformed into a k-dimensional vector space using low-rank approximation for dimensionality reduction. Experiments are performed with the matchmaking mechanism for commonsense knowledge integration framework using the capability models of ConceptNet, WordNet, and Wikipedia. In the user study, the matchmaking results are compared with the ranked lists produced by online users to show that over 85% of them are accurate and have positive correlation with the user-produced ranked lists. 0 0
Concept-based information retrieval using explicit semantic analysis Egozi O.
Shaul Markovitch
Evgeniy Gabrilovich
ACM Transactions on Information Systems English 2011 Information retrieval systems traditionally rely on textual keywords to index and retrieve documents. Keyword-based retrieval may return inaccurate and incomplete results when different keywords are used to describe the same concept in the documents and in the queries. Furthermore, the relationship between these related keywords may be semantic rather than syntactic, and capturing it thus requires access to comprehensive human world knowledge. Concept-based retrieval methods have attempted to tackle these difficulties by using manually built thesauri, by relying on term co-occurrence data, or by extracting latent word relationships and concepts from a corpus. In this article we introduce a new concept-based retrieval approach based on Explicit Semantic Analysis (ESA), a recently proposed method that augments keyword-based text representation with concept-based features, automatically extracted from massive human knowledge repositories such as Wikipedia. Our approach generates new text features automatically, and we have found that high-quality feature selection becomes crucial in this setting to make the retrieval more focused. However, due to the lack of labeled data, traditional feature selection methods cannot be used, hence we propose new methods that use self-generated labeled training data. The resulting system is evaluated on several TREC datasets, showing superior performance over previous state-of-the-art results. 0 0
Engineering intelligent systems on the knowledge formalization continuum Joachim Baumeister
Jochen Reutelshoefer
Frank Puppe
International Journal of Applied Mathematics and Computer Science English 2011 In spite of their industrial success, the development of intelligent systems is still a complex and risky task. When building intelligent systems, we see that domain knowledge is often present at different levels of formalization, ranging from text documents to explicit rules. In this paper, we describe the knowledge formalization continuum as a metaphor to help domain specialists during the knowledge acquisition phase. To make use of the knowledge formalization continuum, the agile use of knowledge representations within a knowledge engineering project is proposed, as well as transitions between the different representations, when required. We show that a semantic wiki is a flexible tool for engineering knowledge on the knowledge formalization continuum. Case studies are taken from one industrial and one academic project, and they illustrate the applicability and benefits of semantic wikis in combination with the knowledge formalization continuum. 0 0
Exploiting arabic wikipedia for automatic ontology generation: A proposed approach Al-Rajebah N.I.
Al-Khalifa H.S.
Al-Salman A.S.
2011 International Conference on Semantic Technology and Information Retrieval, STAIR 2011 English 2011 Ontological models play an important role in the Semantic Web. Despite being widely spread, there are a few known attempts to build ontologies for the Arabic language. As a result, a lack of Arabic Semantic Web applications is encountered. In this paper, we propose an approach to build ontologies automatically for the Arabic language from Wikipedia. Our approach relies on the semantic field theory such that any Wikipedian article is analyzed to extract semantic relations using its infobox and the list of categories. We will also present our system architecture along with an initial evaluation to evaluate the effectiveness and correctness of the resultant ontological model. 0 0
Extracting events from Wikipedia as RDF triples linked to widespread semantic web datasets Carlo Aliprandi
Francesco Ronzano
Andrea Marchetti
Maurizio Tesconi
Salvatore Minutoli
Lecture Notes in Computer Science English 2011 Many attempts have been made to extract structured data from Web resources, exposing them as RDF triples and interlinking them with other RDF datasets: in this way it is possible to create clouds of highly integrated Semantic Web data collections. In this paper we describe an approach to enhance the extraction of semantic contents from unstructured textual documents, in particular considering Wikipedia articles and focusing on event mining. Starting from the deep parsing of a set of English Wikipedia articles, we produce a semantic annotation compliant with the Knowledge Annotation Format (KAF). We extract events from the KAF semantic annotation and then we structure each event as a set of RDF triples linked to both DBpedia and WordNet. We point out examples of automatically mined events, providing some general evaluation of how our approach may discover new events and link them to existing contents. 0 0
Extracting events from wikipedia as RDF triples linked to widespread semantic web datasets Carlo Aliprandi
Francesco Ronzano
Andrea Marchetti
Maurizio Tesconi
Salvatore Minutoli
OCSC English 2011 0 0
How to Reason by HeaRT in a Semantic Knowledge-Based Wiki Weronika T. Adrian
Szymon Bobek
Grzegorz J. Nalepa
Krzysztof Kaczor
Krzysztof Kluza
ICTAI English 2011 0 0
How to reason by HeaRT in a semantic knowledge-based Wiki Adrian W.T.
Szymon Bobek
Nalepa G.J.
Krzysztof Kaczor
Krzysztof Kluza
Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI English 2011 Semantic wikis constitute an increasingly popular class of systems for collaborative knowledge engineering. We developed Loki, a semantic wiki that uses a logic-based knowledge representation. It is compatible with semantic annotations mechanism as well as Semantic Web languages. We integrated the system with a rule engine called HeaRT that supports inference with production rules. Several modes for modularized rule bases, suitable for the distributed rule bases present in a wiki, are considered. Embedding the rule engine enables strong reasoning and allows to run production rules over semantic knowledge bases. In the paper, we demonstrate the system concepts and functionality using an illustrative example. 0 0
Is your ontology a burden or a gem? - towards xtreme ontology engineering Tatarintseva O.
Ermolayev V.
Fensel A.
CEUR Workshop Proceedings English 2011 One of the commonly acknowledged shortcomings of Semantic Technologies that prevents their wide adoption in industry is the lack of the commitment by the intended domain experts and users. This shortcoming becomes even more influential in the domains that change sporadically and require appropriate changes in the respective knowledge representations. This discussion paper argues that a more active involvement of the intended user community, comprising subject experts in the domain, may substantially ease gaining the required commitment of the critical mass of the domain users to the developed domain ontology. As a possible approach for building an instrumental platform for that, the paper suggests the use of the Semantic MediaWiki based collaboration infrastructure for maintaining and discussing ontology descriptions by the community of its intended users and developers. We also report how a prototypical ontology documentation wiki has been used for gaining the commitment of ontology users in the ACTIVE European project. 0 0
Loki-presentation of logic-based semantic wiki Adrian W.T.
Nalepa G.J.
CEUR Workshop Proceedings English 2011 TOOL PRESENTATION: The paper presents a semantic wiki, called Loki, with strong logical knowledge representation using rules. The system uses a coherent logic-based representation for semantic annotations of the content and implementing reasoning procedures. The representation uses the logic programming paradigm and the Prolog programming language. The proposed architecture allows for rule-based reasoning in the wiki. It also provides a compatibility layer with the popular Semantic MediaWiki platform, directly parsing its annotations. 0 0
Toward a semantic vocabulary for systems engineering Di Maio P. ACM International Conference Proceeding Series English 2011 The web can be the most efficient medium for sharing knowledge, provided appropriate technological artifacts such as controlled vocabularies and metadata are adopted. In our research we study the degree of such adoption applied to the systems engineering domain. This paper is a work in progress report discussing issues surrounding knowledge extraction and representation, proposing an integrated approach to tackle various challenges associated with the development of a shared vocabulary for the practice. 0 0
Annotating and searching web tables using entities, types and relationships Limaye G.
Sarawagi S.
Soumen Chakrabarti
Proceedings of the VLDB Endowment English 2010 Tables are a universal idiom to present relational data. Billions of tables on Web pages express entity references, attributes and relationships. This representation of relational world knowledge is usually considerably better than completely unstructured, free-format text. At the same time, unlike manually-created knowledge bases, relational information mined from "organic" Web tables need not be constrained by availability of precious editorial time. Unfortunately, in the absence of any formal, uniform schema imposed on Web tables, Web search cannot take advantage of these high-quality sources of relational information. In this paper we propose new machine learning techniques to annotate table cells with entities that they likely mention, table columns with types from which entities are drawn for cells in the column, and relations that pairs of table columns seek to express. We propose a new graphical model for making all these labeling decisions for each table simultaneously, rather than make separate local decisions for entities, types and relations. Experiments using the YAGO catalog, DBPedia, tables from Wikipedia, and over 25 million HTML tables from a 500 million page Web crawl uniformly show the superiority of our approach. We also evaluate the impact of better annotations on a prototype relational Web search tool. We demonstrate clear benefits of our annotations beyond indexing tables in a purely textual manner. 0 0
Building ontological models from Arabic Wikipedia: a proposed hybrid approach Nora I. Al-Rajebah
Hend S. Al-Khalifa
AbdulMalik S. Al-Salman
IiWAS English 2010 0 0
Centroid-based classification enhanced with Wikipedia Abdullah Bawakid
Mourad Oussalah
Proceedings - 9th International Conference on Machine Learning and Applications, ICMLA 2010 English 2010 Most of the traditional text classification methods employ Bag of Words (BOW) approaches relying on the words frequencies existing within the training corpus and the testing documents. Recently, studies have examined using external knowledge to enrich the text representation of documents. Some have focused on using WordNet which suffers from different limitations including the available number of words, synsets and coverage. Other studies used different aspects of Wikipedia instead. Depending on the features being selected and evaluated and the external knowledge being used, a balance between recall, precision, noise reduction and information loss has to be applied. In this paper, we propose a new Centroid-based classification approach relying on Wikipedia to enrich the representation of documents through the use of Wikipedia's concepts, categories structure, links, and articles text. We extract candidate concepts for each class with the help of Wikipedia and merge them with important features derived directly from the text documents. Different variations of the system were evaluated and the results show improvements in the performance of the system. 0 0
Faceted Wikipedia search Rasmus Hahn
Christian Bizer
Christopher Sahnwaldt
Christian Herta
Scott Robinson
Burgle M.
Duwiger H.
Ulrich Scheel
Lecture Notes in Business Information Processing English 2010 Wikipedia articles contain, besides free text, various types of structured information in the form of wiki markup. The type of wiki content that is most valuable for search are Wikipedia infoboxes, which display an article's most relevant facts as a table of attribute-value pairs on the top right-hand side of the Wikipedia page. Infobox data is not used by Wikipedia's own search engine. Standard Web search engines like Google or Yahoo also do not take advantage of the data. In this paper, we present Faceted Wikipedia Search, an alternative search interface for Wikipedia, which facilitates infobox data in order to enable users to ask complex questions against Wikipedia knowledge. By allowing users to query Wikipedia like a structured database, Faceted Wikipedia Search helps them to truly exploit Wikipedia's collective intelligence. 0 0
Harnessing collective intelligence: Wiki and social network from end-user perspective Behnaz Gholami
Roshanak Safavi
IC4E 2010 - 2010 International Conference on e-Education, e-Business, e-Management and e-Learning English 2010 In the social web in which "people socialize or interact with each other throughout the World Wide Web, social interactions lead to the creation of explicit and meaningfully rich knowledge representations". Emergence of social web shed light on the concept of collective intelligence (CI). Web 2.0 technologies as key part of social semantic web, play an important role to harness the CI. Web 2.0 technologies are divided into the end-user and technical perspectives. In this paper CI and Web 2.0 is assessed with more details and through a theoretical framework regarding the end-user perspective. From all various kinds of Web 2.0 technologies Wikis and Social networks are chosen due to their huge contribution to CI. This paper focuses on end-user perspective of Wiki and Social network; categorizes the end-user perspective of these two technologies into 4 core aspects; and on the basis of findings from a web-based questionnaire, tests the relationship between each component of these 4 aspects and the CI. 0 0
How to Transform Personal Knowledge into Collaborative Knowledge with a Wiki Dedicated to Microlearning Nathalie Bricon-Souf
Emma Przewozny
ICICCI English 2010 0 0
Image interpretation using large corpus: Wikipedia Rahurkar M.
Tsai S.-F.
Dagli C.
Huang T.S.
Proceedings of the IEEE English 2010 Image is a powerful medium for expressing one's ideas and rightly confirms the adage, One picture is worth a thousand words. In this work, we explore the application of world knowledge in the form of Wikipedia to achieve this objective literally. In the first part, we disambiguate and rank semantic concepts associated with ambiguous keywords by exploiting link structure of articles in Wikipedia. In the second part, we explore an image representation in terms of keywords which reflect the semantic content of an image. Our approach is inspired by the desire to augment low-level image representation with massive amounts of world knowledge, to facilitate computer vision tasks like image retrieval based on this information. We represent an image as a weighted mixture of a predetermined set of concrete concepts whose definition has been agreed upon by a wide variety of audience. To achieve this objective, we use concepts defined by Wikipedia articles, e.g., sky, building, or automobile. An important advantage of our approach is availability of vast amounts of highly organized human knowledge in Wikipedia. Wikipedia evolves rapidly steadily increasing its breadth and depth over time. 0 0
Improving human-agent conversations by accessing contextual knowledge from Wikipedia Alexa Breuing Proceedings - 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT 2010 English 2010 In order to talk to each other meaningfully, conversational partners utilize different types of conversational knowledge. Due to the fact that speakers often use grammatically incomplete and incorrect sentences in spontaneous language, knowledge about conversational and terminological context turns out to be as much important in language understanding as traditional linguistic analysis. In the context of the KnowCIT project we want to improve human-agent conversations by connecting the agent to an adequate representation of such contextual knowledge drawn from the online encyclopedia Wikipedia. Thereby we make use of additional components provided by Wikipedia which goes beyond encyclopedical information to identify the current dialog topic and to implement human like look-up abilities. 0 0
Increasing collaborative knowledge management in your organization: Characteristics of Wiki technology and Wiki users Hester A.J. SIGMIS CPR'10 - Proceedings of the 2010 ACM SIGMIS Computer Personnel Research Conference English 2010 This study examines characteristics of Wiki technology and wiki users in an effort to uncover factors facilitating increased adoption and usage of Wiki technology as a collaborative knowledge management tool. The current business environment is characterized by trends in mobility, virtualization and globalization. These trends call for more extensive interaction and collaboration both internal and external to organizations. The resulting changes have been met with an emerging trend of a new generation of Internet-based technologies described as Web 2.0. The umbrella of Web 2.0 technologies support a more collaborative business environment and span across time and distance. Web 2.0 advances have subsequently fostered new approaches to knowledge management with Wiki technology making way as an effective alternative to traditional knowledge management systems. Wiki technology features the unique characteristics of open editing and revision and history capabilities. When these features are combined with knowledge representation and maintenance features and harnessing of collective wisdom, Wiki technology may enable higher levels of collaboration facilitating more effective knowledge processes. Nonetheless, technology and processes are not the only components of a knowledge management system. This study focuses on the users of Wiki technology as another key element in effective utilization of a collaborative knowledge management system. 0 0
Mash-up of lexwiki and web-protégé for distributed authoring of large-scale biomedical terminologies Jiang G.
Solbrig H.R.
Chute C.G.
CEUR Workshop Proceedings English 2010 In this presentation, we propose a framework for distributed authoring of large-scale biomedical terminologies, which comprises three modules: a structured proposal creation module using semantic wiki machinery, a proposal harvesting module using a formal ontology editing platform, and a backend module with a formal terminology model. We developed a prototype of the framework based on a real-world use case through a mash-up of LexWiki and Web-Protégé. 0 0
Merging of topic maps based on corpus Xue Y.
Wei Liu
Feng B.
Cao W.
Proceedings - International Conference on Electrical and Control Engineering, ICECE 2010 English 2010 Distributed topic maps often need to be merged when they are used for knowledge representation, and the similarity calculation between two topics is a critical factor that directly affects the quality of the final topic maps. In this paper, we present a novel approach to calculating the similarity of topics and merging distributed topic maps. The method not only implements a syntactic comparison between topics but also constructs a domain-specific dictionary to address the low precision of topic semantic similarity calculation when only a common dictionary is used. Massive texts are gathered from Wikipedia and Google snippets as a corpus, on which the similarity scores of specific terms are calculated by a semantic text comparison method and stored in the dictionary. The experiment indicates that the new method can resolve, in particular, the problem of the common dictionary lacking many technical terms. 0 0
Semantic MediaWiki in operation: Experiences with building a semantic portal Herzig D.M.
Basil Ell
Lecture Notes in Computer Science English 2010 Wikis allow users to collaboratively create and maintain content. Semantic wikis, which provide the additional means to annotate the content semantically and thereby allow it to be structured, are experiencing an enormous increase in popularity, because structured data is more usable and thus more valuable than unstructured data. As an illustration of leveraging the advantages of semantic wikis for semantic portals, we report on the experience with building the AIFB portal based on Semantic MediaWiki. We discuss the design, in particular how free, wiki-style semantic annotations and guided input along a predefined schema can be combined to create a flexible, extensible, and structured knowledge representation. How this structured data evolved over time and its flexibility regarding changes are subsequently discussed and illustrated by statistics based on actual operational data of the portal. Further, the features exploiting the structured data and the benefits they provide are presented. Since all benefits have their costs, we conducted a performance study of Semantic MediaWiki and compare it to MediaWiki, the non-semantic base platform. Finally we show how existing caching techniques can be applied to increase the performance. 0 0
Semantic enrichment of text representation with wikipedia for text classification Yamakawa H.
Peng J.
Feldman A.
Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics English 2010 Text classification is a widely studied topic in the area of machine learning. A number of techniques have been developed to represent and classify text documents. Most of the techniques try to achieve good classification performance while taking a document only by its words (e.g. statistical analysis on word frequency and distribution patterns). One of the recent trends in text classification research is to incorporate more semantic interpretation in text classification, especially by using Wikipedia. This paper introduces a technique for incorporating the vast amount of human knowledge accumulated in Wikipedia into text representation and classification. The aim is to improve classification performance by transforming general terms into a set of related concepts grouped around semantic themes. In order to achieve this goal, this paper proposes a unique method for breaking the enormous amount of extracted Wikipedia knowledge (concepts) into smaller pieces (subsets of concepts). The subsets of concepts are separately used to represent the same set of documents in a number of different ways, from which an ensemble of classifiers is built. Experimental results show that an ensemble of classifiers individually trained on a different representation of the document set performs better with increased accuracy and stability than that of a classifier trained only on the original document set. 0 0
Semantics for digital engineering archives supporting engineering design education Regli W.
Kopena J.B.
Grauer M.
Simpson T.
Stone R.
Lewis K.
Bohm M.
Wilkie D.
Piecyk M.
Osecki J.
AI Magazine English 2010 This article introduces the challenge of digital preservation in the area of engineering design and manufacturing and presents a methodology to apply knowledge representation and semantic techniques to develop digital engineering archives. This work is part of an ongoing, multiuniversity effort to create cyber infrastructure-based engineering repositories for undergraduates (CIBER-U) to support engineering design education. The technical approach is to use knowledge representation techniques to create formal models of engineering data elements, work flows, and processes. With these techniques formal engineering knowledge and processes can be captured and preserved with some guarantee of long-term interpretability. The article presents examples of how the techniques can be used to encode specific engineering information packages and work flows. These techniques are being integrated into a semantic wiki that supports the CIBER-U engineering education activities across nine universities and involving more than 3500 students since 2006. Copyright © 2010, Association for the Advancement of Artificial Intelligence. All rights reserved. 0 0
The tower of Babel meets web 2.0: User-generated content and its applications in a multilingual context Brent Hecht
Darren Gergle
Conference on Human Factors in Computing Systems - Proceedings English 2010 This study explores language's fragmenting effect on user-generated content by examining the diversity of knowledge representations across 25 different Wikipedia language editions. This diversity is measured at two levels: the concepts that are included in each edition and the ways in which these concepts are described. We demonstrate that the diversity present is greater than has been presumed in the literature and has a significant influence on applications that use Wikipedia as a source of world knowledge. We close by explicating how knowledge diversity can be beneficially leveraged to create "culturally-aware applications" and "hyperlingual applications". 0 2
UNIpedia: A unified ontological knowledge platform for semantic content tagging and search Kalender M.
Dang J.
Uskudarli S.
Proceedings - 2010 IEEE 4th International Conference on Semantic Computing, ICSC 2010 English 2010 The emergence of an ever increasing number of documents makes it more and more difficult to locate them when desired. An approach for improving search results is to make use of user-generated tags. This approach has led to improvements. However, they are limited because tags are (1) free from context and form, (2) user generated, (3) used for purposes other than description, and (4) often ambiguous. As a formal, declarative knowledge representation model, ontologies provide a foundation upon which machine-understandable knowledge can be obtained and tagged, and as a result, they make semantic tagging and search possible. With an ontology, semantic web technologies can be utilized to automatically generate semantic tags. WordNet has been used for this purpose. However, this approach falls short in tagging documents that refer to new concepts and instances. To address this challenge, we present UNIpedia - a platform for unifying different ontological knowledge bases by reconciling their instances as WordNet concepts. Our mapping algorithms use rule-based heuristics extracted from ontological and statistical features of concepts and instances. UNIpedia is used to semantically tag contemporary documents. For this purpose, the Wikipedia and OpenCyc knowledge bases, which are known to contain up-to-date instances and reliable metadata about them, are selected. Experiments show that the accuracy of the mapping between WordNet and Wikipedia is 84% for the most relevant concept name and 90% for the appropriate sense. 0 0
Using encyclopaedic knowledge for query classification Richard Khoury Proceedings of the 2010 International Conference on Artificial Intelligence, ICAI 2010 English 2010 Identifying the intended topic that underlies a user's query can benefit a large range of applications, from search engines to question-answering systems. However, query classification remains a difficult challenge due to the variety of queries a user can ask, the wide range of topics users can ask about, and the limited amount of information that can be mined from the query. In this paper, we develop a new query classification system that accounts for these three challenges. Our system relies on encyclopaedic knowledge to understand the user's query and fill in the gaps of missing information. Specifically, we use the freely-available online encyclopaedia Wikipedia as a natural-language knowledge base, and exploit Wikipedia's structure to infer the correct classification of any user query. 0 0
A semantic layer on semi-structured data sources for intuitive chatbots Augello A.
Vassallo G.
Gaglio S.
Pilato G.
Proceedings of the International Conference on Complex, Intelligent and Software Intensive Systems, CISIS 2009 English 2009 The main limits of chatbot technology are related to the building of their knowledge representation and to their rigid information retrieval and dialogue capabilities, usually based on simple "pattern matching rules". The analysis of distributional properties of words in a text corpus allows the creation of semantic spaces in which to represent and compare natural language elements. This space can be interpreted as a "conceptual" space where the axes represent the latent primitive concepts of the analyzed corpus. The presented work aims at exploiting the properties of a data-driven semantic/conceptual space built using semi-structured data sources freely available on the web, like Wikipedia. This coding is equivalent to adding, into the Wikipedia graph, a conceptual similarity relationship layer. The chatbot can exploit this layer in order to simulate an "intuitive" behavior, attempting to retrieve semantic relations between Wikipedia resources also through associative sub-symbolic paths. 0 0
Engineering expressive knowledge with semantic wikis Joachim Baumeister
Nalepa G.J.
CEUR Workshop Proceedings English 2009 Semantic wikis are successfully used in various application domains. Such systems combine the flexible and agile authoring process with strong semantics of ontologies. The current state-of-the-art of systems, however, is diverse in the sense of having a common ground. Especially, the expressiveness of the knowledge representation of semantic wikis undergoes continuous improvement. In the paper, two semantic wiki implementations are discussed, that are both extending semantic wiki implementations by strong problem-solving knowledge. We compare their approaches and we aim to condense the fundamental characteristics of a strong problem-solving wiki. 0 0
Terabytes of tobler: Evaluating the first law in a massive, domain-neutral representation of world knowledge Brent Hecht
Moxley E.
Lecture Notes in Computer Science English 2009 The First Law of Geography states, "everything is related to everything else, but near things are more related than distant things." Despite the fact that it is to a large degree what makes "spatial special," the law has never been empirically evaluated on a large, domain-neutral representation of world knowledge. We address the gap in the literature about this critical idea by statistically examining the multitude of entities and relations between entities present across 22 different language editions of Wikipedia. We find that, at least according to the myriad authors of Wikipedia, the First Law is true to an overwhelming extent regardless of language-defined cultural domain. 0 0
Property clustering in Semantic MediaWiki: define your own classes and relationships Scholz G. CEUR Workshop Proceedings English 2008 Semantic MediaWiki (SMW) currently has an atomic understanding of properties: they are seen as annotation marks which can be arbitrarily attached to articles. As a next step towards an object-oriented representation of knowledge we introduce a concept of property clustering. This makes it possible to define a formal meta model for a knowledge domain. We support class inheritance and typed relations between objects. As a proof of concept we provide an implementation which is based on a set of templates and a few existing MediaWiki extensions. A graph of the meta model can be generated automatically. We offer different models for entering information based on templates and forms. A demo website is available. 0 0
YAGO: A Large Ontology from Wikipedia and WordNet F. Suchanek
G. Kasneci
G. Weikum
Web Semantics: Science, Services and Agents on the World Wide Web English 2008 This article presents YAGO, a large ontology with high coverage and precision. YAGO has been automatically derived from Wikipedia and WordNet. It comprises entities and relations, and currently contains more than 1.7 million entities and 15 million facts. These include the taxonomic Is-A hierarchy as well as semantic relations between entities. The facts for YAGO have been extracted from the category system and the infoboxes of Wikipedia and have been combined with taxonomic relations from WordNet. Type checking techniques help us keep YAGO’s precision at 95%—as proven by an extensive evaluation study. YAGO is based on a clean logical model with a decidable consistency. Furthermore, it allows representing n-ary relations in a natural way while maintaining compatibility with RDFS. A powerful query model facilitates access to YAGO’s data. 0 1
Building collaborative capacities in learners: The M/cyclopedia project revisited Axel Bruns
Sal Humphreys
Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications, OOPSLA English 2007 In this paper we trace the evolution of a project using a wiki-based learning environment in a tertiary education setting. The project has the pedagogical goal of building learners' capacities to work effectively in the networked, collaborative, creative environments of the knowledge economy. The paper explores the four key characteristics of a 'produsage' environment and identifies four strategic capacities that need to be developed in learners to be effective 'produsers' (user-producers). A case study is presented of our experiences with the subject New Media Technologies, run at Queensland University of Technology, Brisbane, Australia. This progress report updates our observations made at the 2005 WikiSym conference. 0 0
Organizational wiki usage: A conceptual model Hester A.J.
Scott J.E.
ICIS 2007 Proceedings - Twenty Eighth International Conference on Information Systems English 2007 A website based on Wiki technology differs from other websites in that content can be created, modified and updated automatically by any user via a web browser. Wiki technology improves upon previous methods of conversational technologies by providing many-to-many communication with current knowledge and history (Wagner 2004). The addition of the knowledge representation and maintenance features of Wiki technology enables more effective knowledge sharing (Wagner 2006). We introduce a new model for wiki usage positing that wikis and the Wiki Way can foster collaboration and knowledge sharing given the existence of facilitators (Fit of Task and Technology, Effective Motivation, and Effective Training) and the absence of deterrents (Cultural Hurdles of Hierarchy, Reluctance to Share Knowledge and Resistance to Change). Our contribution is a theoretically informed emphasis on the need to consider both human and technological aspects of the wiki experience. This study marks an important step forward in the theoretical understanding of Wiki usage. 0 0
Use of Wikipedia categories in entity ranking Thom J.A.
Jovan Pehcevski
Vercoustre A.-M.
ADCS 2007 - Proceedings of the Twelfth Australasian Document Computing Symposium English 2007 Wikipedia is a useful source of knowledge that has many applications in language processing and knowledge representation. The Wikipedia category graph can be compared with the class hierarchy in an ontology; it has some characteristics in common as well as some differences. In this paper, we present our approach for answering entity ranking queries from the Wikipedia. In particular, we explore how to make use of Wikipedia categories to improve entity ranking effectiveness. Our experiments show that using categories of example entities works significantly better than using loosely defined target categories. 0 0
A semantic wiki for mathematical knowledge management Christoph Lange
Kohlhase M.
CEUR Workshop Proceedings English 2006 We propose the architecture of a semantic wiki for collaboratively building, editing and browsing a mathematical knowledge base. Its hyperlinked pages, containing mathematical theories, are stored as OMDoc, a markup format for mathematical knowledge representation. Our long-term objective is to develop software that, on the one hand, facilitates the creation of a shared, public collection of mathematical knowledge (e.g. for education). On the other hand, the software shall serve work groups of mathematicians as a tool for collaborative development of new theories. 0 0
An online ontology: WiktionaryZ Van Mulligen E.M.
Moller E.
Roes P.-J.
Weeber M.
Meijssen G.
Chichester C.
Mons B.
CEUR Workshop Proceedings English 2006 There is a great demand for online maintenance and refinement of knowledge on biomedical entities. Collaborative maintenance of large biomedical ontologies combines the intellectual capacity of millions of minds for updating and correcting the annotations of biomedical concepts with their semantic relationships according to the latest scientific insights. These relationships extend the specialization and participation relationships currently exploited in most ontology projects. The ontology layer has been developed on top of the Wikidata component and allows for presentation of these biomedical concepts in a similar way to Wikipedia pages. Each page contains all information on a biomedical concept with semantic relationships to other related concepts. A first version has been populated with data from the Unified Medical Language System (UMLS), SwissProt, GeneOntology, and Gemet. The various fields are editable online in a Wiki style and are maintained via a powerful versioning regimen. Next steps will include the definition of a set of formal rules for the ontology to enforce (onto)logical rigor. 0 0
Overcoming the brittleness bottleneck using Wikipedia: Enhancing text categorization with encyclopedic knowledge Evgeniy Gabrilovich
Shaul Markovitch
Proceedings of the National Conference on Artificial Intelligence English 2006 When humans approach the task of text categorization, they interpret the specific wording of the document in the much larger context of their background knowledge and experience. On the other hand, state-of-the-art information retrieval systems are quite brittle - they traditionally represent documents as bags of words, and are restricted to learning from individual word occurrences in the (necessarily limited) training set. For instance, given the sentence "Wal-Mart supply chain goes real time", how can a text categorization system know that Wal-Mart manages its stock with RFID technology? And having read that "Ciprofloxacin belongs to the quinolones group", how on earth can a machine know that the drug mentioned is an antibiotic produced by Bayer? In this paper we present algorithms that can do just that. We propose to enrich document representation through automatic use of a vast compendium of human knowledge - an encyclopedia. We apply machine learning techniques to Wikipedia, the largest encyclopedia to date, which surpasses in scope many conventional encyclopedias and provides a cornucopia of world knowledge. Each Wikipedia article represents a concept, and documents to be categorized are represented in the rich feature space of words and relevant Wikipedia concepts. Empirical results confirm that this knowledge-intensive representation brings text categorization to a qualitatively new level of performance across a diverse collection of datasets. Copyright © 2006, American Association for Artificial Intelligence. All rights reserved. 0 1
Wild, wild wikis: A way forward Robert Charles
Adigun Ranmi
Proceedings - Fifth International Conference on Creating, Connecting and Collaborating through Computing, C5 2007 English 2006 Wikis can be considered public domain knowledge sharing systems. They provide an opportunity for those who may not have the privilege to publish their thoughts through traditional methods. They are one of the fastest growing systems of online encyclopaedia. In this study, we consider the importance of wikis as a way of creating, sharing and improving public knowledge. We identify some of the problems associated with these public resources, including (a) identification of the identities of information and its creator, (b) accuracy of information, (c) justification of the credibility of authors, (d) vandalism of quality of information, and (e) weak control over the contents. A solution to some of these problems is sought through the use of an annotation model. The model assumes that contributions in wikis can be seen as annotations to the initial document. It proposes systematic control of contributors and contributions to the initiative and the keeping of records of what existed and what was done to initial documents. We believe that with this model, analysis can be done on the progress of wiki initiatives. We assume that, using this model, wikis can be better used for creation and sharing of knowledge for public use. 0 0