Data visualization

From WikiPapers

Data visualization is included as a keyword or extra keyword in 0 datasets, 1 tool and 21 publications.


There are no datasets for this keyword.


Tool Operating System(s) Language(s) Programming language(s) License Description Image
Wikichron Cross-platform English Python Affero GPL (code) WikiChron is a web tool for analyzing and visualizing the evolution of online wiki communities. It takes processed data from the history dumps of MediaWiki wikis, computes various metrics on that data, and plots them in interactive graphs. Different wikis can be compared in the same graph.

This tool helps researchers inspect the behavior of collaborative online communities, in particular wikis, and generate research hypotheses for further, deeper studies. WikiChron was designed from the outset to be easy to use and highly interactive. It comes with a set of already downloaded and processed wikis from Wikia (though any MediaWiki wiki is supported) and with more than thirty metrics to visualize and compare between wikis.

It can also be useful to wiki administrators who want to see, analyze and compare how activity on their wikis is developing.

WikiChron is available online here:


Title Author(s) Published in Language Date Abstract R C
A Platform for Visually Exploring the Development of Wikipedia Articles Erik Borra
David Laniado
Esther Weltevrede
Michele Mauri
Giovanni Magni
Tommaso Venturini
Paolo Ciuccarelli
Richard Rogers
Andreas Kaltenbrunner
ICWSM '15 - 9th International AAAI Conference on Web and Social Media English May 2015 When looking for information on Wikipedia, Internet users generally just read the latest version of an article. However, in its back-end there is much more: associated to each article are the edit history and talk pages, which together entail its full evolution. These spaces can typically reach thousands of contributions, and it is not trivial to make sense of them by manual inspection. This issue also affects Wikipedians, especially the less experienced ones, and constitutes a barrier for new editor engagement and retention. To address these limitations, Contropedia offers its users unprecedented access to the development of an article, using wiki links as focal points. 0 0
Self-sorting map: An efficient algorithm for presenting multimedia data in structured layouts Strong G.
Gong M.
IEEE Transactions on Multimedia English 2014 This paper presents the Self-Sorting Map (SSM), a novel algorithm for organizing and presenting multimedia data. Given a set of data items and a dissimilarity measure between each pair of them, the SSM places each item into a unique cell of a structured layout, where the most related items are placed together and the unrelated ones are spread apart. The algorithm integrates ideas from dimension reduction, sorting, and data clustering algorithms. Instead of solving the continuous optimization problem that other dimension reduction approaches do, the SSM transforms it into a discrete labeling problem. As a result, it can organize a set of data into a structured layout without overlap, providing a simple and intuitive presentation. The algorithm is designed for sorting all data items in parallel, making it possible to arrange millions of items in seconds. Experiments on different types of data demonstrate the SSM's versatility in a variety of applications, ranging from positioning city names by proximities to presenting images according to visual similarities, to visualizing semantic relatedness between Wikipedia articles. 0 0
Visualizing large-scale human collaboration in Wikipedia Biuk-Aghai R.P.
Pang C.-I.
Si Y.-W.
Future Generation Computer Systems English 2014 Volunteer-driven large-scale human-to-human collaboration has become common in the Web 2.0 era. Wikipedia is one of the foremost examples of such large-scale collaboration, involving millions of authors writing millions of articles on a wide range of subjects. The collaboration on some popular articles numbers hundreds or even thousands of co-authors. We have analyzed the co-authoring across entire Wikipedias in different languages and have found it to follow a geometric distribution in all the language editions we studied. In order to better understand the distribution of co-author counts across different topics, we have aggregated content by category and visualized it in a form resembling a geographic map. The visualizations produced show that there are significant differences of co-author counts across different topics in all the Wikipedia language editions we visualized. In this article we describe our analysis and visualization method and present the results of applying our method to the English, German, Chinese, Swedish and Danish Wikipedias. We have evaluated our visualization against textual data and found it to be superior in usability, accuracy, speed and user preference. © 2013 Elsevier B.V. All rights reserved. 0 0
Analyzing multi-dimensional networks within mediawikis Brian C. Keegan
Ceni A.
Smith M.A.
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English 2013 The MediaWiki platform supports popular socio-technical systems such as Wikipedia as well as thousands of other wikis. This software encodes and records a variety of relationships about the content, history, and editors of its articles such as hyperlinks between articles, discussions among editors, and editing histories. These relationships can be analyzed using standard techniques from social network analysis, however, extracting relational data from Wikipedia has traditionally required specialized knowledge of its API, information retrieval, network analysis, and data visualization that has inhibited scholarly analysis. We present a software library called the NodeXL MediaWiki Importer that extracts a variety of relationships from the MediaWiki API and integrates with the popular NodeXL network analysis and visualization software. This library allows users to query and extract a variety of multidimensional relationships from any MediaWiki installation with a publicly-accessible API. We present a case study examining the similarities and differences between different relationships for the Wikipedia articles about "Pope Francis" and "Social media." We conclude by discussing the implications this library has for both theoretical and methodological research as well as community management and outline future work to expand the capabilities of the library. Categories and Subject Descriptors H.4 [Information Systems Applications]: Miscellaneous; D.2.8 [Software Engineering]: Metrics - complexity measures, performance measures General Terms System. Copyright 2010 ACM. 0 0
Assessment of collaborative learning experiences by graphical analysis of wiki contributions Manuel Palomo-Duarte
Juan Manuel Dodero-Beardo
Inmaculada Medina-Bulo
Emilio J. Rodríguez-Posada
Iván Ruiz-Rube
Interactive Learning Environments English 2012 The widespread adoption of computers and Internet in our life has reached the classrooms, where Computer-Supported Collaborative Learning based on wikis offers new ways of collaboration and encourages student participation. When the number of contributions from students increases, traditional assessment procedures of e-learning settings suffer from scalability problems. In a wiki-based learning experience, automatic tools are required to support the assessment of such huge amounts of data. In this work we present StatMediaWiki, a tool that collects and aggregates information that helps to analyze a MediaWiki installation. It generates charts, tables and different statistics enabling easy analysis of wiki evolution. We have used StatMediaWiki in a Higher Education course and present the results obtained in this case study. 14 0
Feeling the pulse of a wiki: Visualization of recent changes in Wikipedia Biuk-Aghai R.P.
Chan R.C.K.
ACM International Conference Proceeding Series English 2012 Large wikis such as Wikipedia attract large numbers of editors continuously editing content. It is difficult to observe what editing activity goes on at any given moment, what editing patterns can be observed, and which are the currently active editors and articles. We introduce the design and implementation of an information visualization tool for streaming data on recent changes in wikis that aims to address this difficulty, show examples of our visualizations from English Wikipedia, and present several patterns of editing activity that we can visually identify using our tool. 0 0
ViDaX: An interactive semantic data visualisation and exploration tool Dumas B.
Broche T.
Hoste L.
Signer B.
Proceedings of the Workshop on Advanced Visual Interfaces AVI English 2012 We present the Visual Data Explorer (ViDaX), a tool for visualising and exploring large RDF data sets. ViDaX enables the extraction of information from RDF data sources and offers functionality for the analysis of various data characteristics as well as the exploration of the corresponding ontology graph structure. In addition to some basic data mining features, our interactive semantic data visualisation and exploration tool offers various types of visualisations based on the type of data. In contrast to existing semantic data visualisation solutions, ViDaX also offers non-expert users the possibility to explore semantic data based on powerful automatic visualisation and interaction techniques without the need for any low-level programming. To illustrate some of ViDaX's functionality, we present a use case based on semantic data retrieved from DBpedia, a semantic version of the well-known Wikipedia online encyclopedia, which forms a major component of the emerging linked data initiative. 0 0
A self organizing document map algorithm for large scale hyperlinked data inspired by neuronal migration Kotaro Nakayama
Yutaka Matsuo
Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011 English 2011 Web document clustering is one of the research topics that is being pursued continuously due to the large variety of applications. Since Web documents usually have variety and diversity in terms of domains, content and quality, one of the technical difficulties is to find a reasonable number and size of clusters. In this research, we pay attention to SOMs (Self Organizing Maps) because of their capability of visualized clustering that helps users to investigate characteristics of data in detail. The SOM is widely known as a "scalable" algorithm because of its capability to handle large numbers of records. However, it is effective only when the vectors are small and dense. Although several research efforts on making the SOM scalable have been conducted, technical issues on scalability and performance for sparse high-dimensional data such as hyperlinked documents still remain. In this paper, we introduce MIGSOM, an SOM algorithm inspired by a recent discovery on neuronal migration. The two major advantages of MIGSOM are its scalability for sparse high-dimensional data and its clustering visualization functionality. In this paper, we describe the algorithm and implementation, and show the practicality of the algorithm by applying MIGSOM to a huge scale real data set: Wikipedia's hyperlink data. 0 0
Modeling the effect of product architecture on mass-collaborative processes Le Q.
Panchal J.H.
Journal of Computing and Information Science in Engineering English 2011 Traditional product development efforts are primarily based on well-structured and hierarchical product development processes. The products are systematically decomposed into subsystems that are designed by dedicated teams with well-defined information flows. Over the last 2 decades, a new product development approach called mass-collaborative product development (MCPD) has emerged. The fundamental difference between a traditional product development process and a MCPD process is that the former is based on top-down decomposition while the latter is based on evolution and self-organization. The paradigm of MCPD has resulted in highly successful products such as Wikipedia, Linux, and Apache. Despite the success of various projects using MCPD, it is not well understood how the product architecture affects the evolution of products developed using such processes. Toward addressing this gap, we present an agent-based model to study the effect of product architectures in MCPD processes. The model is executed for different architectures ranging from slot architecture to bus architecture and the rates of product evolution are determined. The agent-based modeling approach allows us to study how (a) the degree of modularity of products and (b) the sequence of decoupling affect the evolution time of individual modules and overall products developed through MCPD processes. The approach is presented using the architecture of mobile phones as an illustrative example. This approach provides a simple and intuitive way to study the effects of product architecture on the MCPD processes. It is helpful in determining suitable strategies for product decomposition and module decoupling, and in identifying the product architectures that are suitable for MCPD processes. 0 0
Algorithm Visualization: The state of the field Shaffer C.A.
Cooper M.L.
Alon A.J.D.
Akbar M.
Stewart M.
Ponce S.
Edwards S.H.
ACM Transactions on Computing Education English 2010 We present findings regarding the state of the field of Algorithm Visualization (AV) based on our analysis of a collection of over 500 AVs. We examine how AVs are distributed among topics, who created them and when, their overall quality, and how they are disseminated. There does exist a cadre of good AVs and active developers. Unfortunately, we found that many AVs are of low quality, and coverage is skewed toward a few easier topics. This can make it hard for instructors to locate what they need. There are no effective repositories of AVs currently available, which puts many AVs at risk for being lost to the community over time. Thus, the field appears in need of improvement in disseminating materials, propagating known best practices, and informing developers about topic coverage. These concerns could be mitigated by building community and improving communication among AV users and developers. 0 0
Collaborative educational geoanalytics applied to large statistics temporal data Jern M. CSEDU 2010 - 2nd International Conference on Computer Supported Education, Proceedings English 2010 Recent advances in Web 2.0 graphics technologies have the potential to make a dramatic impact on developing collaborative geovisual analytics that analyse, visualize, communicate and present official statistics. In this paper, we introduce novel "storytelling" means for the experts to first explore large, temporal and multidimensional statistical data, then collaborate with colleagues and finally embed dynamic visualization into Web documents e.g. HTML, Blogs or MediaWiki to communicate essential gained insight and knowledge. The aim is to let the analyst (author) explore data and simultaneously save important discoveries and thus enable sharing of gained insights over the Internet. Through the story mechanism facilitating descriptive metatext, textual annotations hyperlinked through the snapshot mechanism and integrated with interactive visualization, the author can let the reader follow the analyst's way of logical reasoning. This emerging technology could in many ways change the terms and structures for learning. 0 0
Extending SMW+ with a linked data integration framework Christian Becker
Christian Bizer
Maike Erdmann
Greaves M.
CEUR Workshop Proceedings English 2010 In this paper, we present a project which extends a SMW+ semantic wiki with a Linked Data Integration Framework that performs Web data access, vocabulary mapping, identity resolution, and quality evaluation of Linked Data. As a result, a large collection of neurogenomics-relevant data from the Web can be flexibly transformed into a unified ontology, allowing unified querying, navigation, and visualization; as well as support for wiki-style collaboration, crowdsourcing, and commentary on chosen data sets. 0 0
Talking about data: Sharing richly structured information through blogs and wikis Benson E.
Marcus A.
Howahl F.
Karger D.
Proceedings of the 19th International Conference on World Wide Web, WWW '10 English 2010 The web has dramatically enhanced people's ability to communicate ideas, knowledge, and opinions. But the authoring tools that most people understand, blogs and wikis, primarily guide users toward authoring text. In this work, we show that substantial gains in expressivity and communication would accrue if people could easily share richly structured information in meaningful visualizations. We then describe several extensions we have created for blogs and wikis that enable users to publish, share, and aggregate such structured information using the same workflows they apply to text. In particular, we aim to preserve those attributes that make blogs and wikis so effective: one-click access to the information, one-click publishing of content, natural authoring interfaces, and the ability to easily copy-and-paste information and visualizations from other sources. 0 0
Visualizing large-scale RDF data using subsets, summaries, and sampling in oracle Sundara S.
Atre M.
Kolovski V.
Sanmay Das
ZongDa Wu
Chong E.I.
Srinivasan J.
Proceedings - International Conference on Data Engineering English 2010 The paper addresses the problem of visualizing large scale RDF data via a 3-S approach, namely, by using, 1) Subsets: to present only relevant data for visualisation; both static and dynamic subsets can be specified, 2) Summaries: to capture the essence of RDF data being viewed; summarized data can be expanded on demand thereby allowing users to create hybrid (summary-detail) fisheye views of RDF data, and 3) Sampling: to further optimize visualization of large-scale data where a representative sample suffices. The visualization scheme works with both asserted and inferred triples (generated using RDF(S) and OWL semantics). This scheme is implemented in Oracle by developing a plug-in for the Cytoscape graph visualization tool, which uses functions defined in an Oracle PL/SQL package, to provide fast and optimized access to Oracle Semantic Store containing RDF data. Interactive visualization of a synthesized RDF data set (LUBM 1 million triples), two native RDF datasets (Wikipedia 47 million triples and UniProt 700 million triples), and an OWL ontology (eClassOwl with a large class hierarchy including over 25,000 OWL classes, 5,000 properties, and 400,000 class-properties) demonstrates the effectiveness of our visualization scheme. 0 0
2009 5th International Conference on Collaborative Computing: Networking, Applications and Worksharing, CollaborateCom 2009 No author name available 2009 5th International Conference on Collaborative Computing: Networking, Applications and Worksharing, CollaborateCom 2009 English 2009 The proceedings contain 68 papers. The topics discussed include: multi-user multi-account interaction in groupware supporting single-display collaboration; supporting collaborative work through flexible process execution; dynamic data services: data access for collaborative networks in a multi-agent systems architecture; integrating external user profiles in collaboration applications; a collaborative framework for enforcing server commitments, and for regulating server interactive behavior in SOA-based systems; CASTLE: a social framework for collaborative anti-phishing databases; VisGBT: visually analyzing evolving datasets for adaptive learning; an IT appliance for remote collaborative review of mechanisms of injury to children in motor vehicle crashes; user contribution and trust in Wikipedia; and a new perspective on experimental analysis of N-tier systems: evaluating database scalability, multi-bottlenecks, and economical operation. 0 0
Vispedia: On-demand data integration for interactive visualization and exploration Bryan Chan
Justin Talbot
Wu L.
Sakunkoo N.
Mike Cammarano
Pat Hanrahan
SIGMOD-PODS'09 - Proceedings of the International Conference on Management of Data and 28th Symposium on Principles of Database Systems English 2009 Wikipedia is an example of the large, collaborative, semi-structured data sets emerging on the Web. Typically, before these data sets can be used, they must be transformed into structured tables via data integration. We present Vispedia, a Web-based visualization system which incorporates data integration into an iterative, interactive data exploration and analysis process. This reduces the upfront cost of using heterogeneous data sets like Wikipedia. Vispedia is driven by a keyword-query-based integration interface implemented using a fast graph search. The search occurs interactively over DBpedia's semantic graph of Wikipedia, without depending on the existence of a structured ontology. This combination of data integration and visualization enables a broad class of non-expert users to more effectively use the semi-structured data available on the Web. 0 0
Visualizing cooperative activities with ellimaps: The case of wikipedia Otjacques B.
Cornil M.
Feltz F.
Lecture Notes in Computer Science English 2009 Cooperation has become a key word in the emerging Web 2.0 paradigm. The nature and motivations of the various behaviours related to this type of cooperative activities remain however incompletely understood. The information visualization tools can play a crucial role from this perspective to analyse the collected data. This paper presents a prototype allowing visualizing some data about the Wikipedia history with a technique called ellimaps. In this context the recent CGD algorithm is used in order to increase the scalability of the ellimaps approach. 0 0
FolksoViz: A subsumption-based folksonomy visualization using wikipedia texts Kangpyo L.
Hyunwoo K.
Chungsu J.
Kim H.-J.
Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08 English 2008 In this paper, targeting tag data, we propose a method, FolksoViz, for deriving subsumption relationships between tags by using Wikipedia texts, and visualizing a folksonomy. To fulfill this method, we propose a statistical model for deriving subsumption relationships based on the frequency of each tag on the Wikipedia texts, as well as the TSD (Tag Sense Disambiguation) method for mapping each tag to a corresponding Wikipedia text. The derived subsumption pairs are visualized effectively on the screen. The experiment shows that the FolksoViz manages to find the correct subsumption pairs with high accuracy. 0 0
On visualizing heterogeneous semantic networks from multiple data sources Maureen
Aixin Sun
Lim E.-P.
Anwitaman Datta
Kuiyu Chang
Lecture Notes in Computer Science English 2008 In this paper, we focus on the visualization of heterogeneous semantic networks obtained from multiple data sources. A semantic network comprising a set of entities and relationships is often used for representing knowledge derived from textual data or database records. Although the semantic networks created for the same domain at different data sources may cover a similar set of entities, these networks could also be very different because of naming conventions, coverage, view points, and other reasons. Since digital libraries often contain data from multiple sources, we propose a visualization tool to integrate and analyze the differences among multiple social networks. Through a case study on two terrorism-related semantic networks derived from Wikipedia and Terrorism Knowledge Base (TKB) respectively, the effectiveness of our proposed visualization tool is demonstrated. 0 0
Vispedia*: Interactive visual exploration of wikipedia data via search-based integration Bryan Chan
Wu L.
Justin Talbot
Mike Cammarano
Pat Hanrahan
IEEE Transactions on Visualization and Computer Graphics English 2008 Wikipedia is an example of the collaborative, semi-structured data sets emerging on the Web. These data sets have large, non-uniform schema that require costly data integration into structured tables before visualization can begin. We present Vispedia, a Web-based visualization system that reduces the cost of this data integration. Users can browse Wikipedia, select an interesting data table, then use a search interface to discover, integrate, and visualize additional columns of data drawn from multiple Wikipedia articles. This interaction is supported by a fast path search algorithm over DBpedia, a semantic graph extracted from Wikipedia's hyperlink structure. Vispedia can also export the augmented data tables produced for use in traditional visualization systems. We believe that these techniques begin to address the "long tail" of visualization by allowing a wider audience to visualize a broader class of data. We evaluated this system in a first-use formative lab study. Study participants were able to quickly create effective visualizations for a diverse set of domains, performing data integration as needed. 0 0
Wikis as a cooperation and communication platform within product development Albers A.
Deigendesch T.
Drammer M.
Ellmer C.
Meboldt M.
Sauter C.
Proceedings of ICED 2007, the 16th International Conference on Engineering Design English 2007 Knowledge and information management gains growing importance especially in industrial product development processes. This contribution presents two concepts for structured information storage and access that also can be combined: Wikis and the Continuous Idea Storage. The objectives of Wikis and the application in product development process are discussed. Further on, the approach of the Continuous Idea Storage is introduced. All ideas, no matter if selected for realization or rejected, generated during product development are stored and annotated by additional information, e.g. why the idea was rejected. Finally, the contribution presents the application of both concepts in product development by student project teams in the context of the Karlsruhe education model for product development (KaLeP). 0 0