Hong Lin

From WikiPapers

Hong Lin is an author.


Only those publications related to wikis are shown here.
Title Keyword(s) Published in Language Date Abstract R C
Improving semi-supervised text classification by using wikipedia knowledge
Clustering Based Classification
Semi-supervised Text Classification
Lecture Notes in Computer Science English 2013 Semi-supervised text classification uses both labeled and unlabeled data to construct classifiers. The key issue is how to utilize the unlabeled data. Clustering based classification method outperforms other semi-supervised text classification algorithms. However, its achievements are still limited because the vector space model representation largely ignores the semantic relationships between words. In this paper, we propose a new approach to address this problem by using Wikipedia knowledge. We enrich document representation with Wikipedia semantic features (concepts and categories), propose a new similarity measure based on the semantic relevance between Wikipedia features, and apply this similarity measure to clustering based classification. Experiment results on several corpora show that our proposed method can effectively improve semi-supervised text classification performance. 0 0
Visitpedia: Wiki article visit log visualization for event exploration
Event Detection
Event evolution
Visual analytics
Wikipedia visit counts
Proceedings - 13th International Conference on Computer-Aided Design and Computer Graphics, CAD/Graphics 2013 English 2013 This paper proposes an interactive visualization tool, Visitpedia, to detect and analyze social events based on Wikipedia visit history. It helps users discover real-world events behind the data and study how these events evolve over time. Different from previous work based on on-line news or similar text corpora, we choose Wikipedia visit counts as our data source since the visit count data better reflect user concerns of social events. We tackle the event-based task from a time-series pattern perspective rather than semantic perspective. Various visualization and user interaction techniques are integrated in Visitpedia. Two case studies are conducted to demonstrate the effectiveness of Visitpedia. 0 0
Feature transformation method enhanced vandalism detection in wikipedia
Classification
Lecture Notes in Computer Science English 2012 A very example of web 2.0 application is Wikipedia, an online encyclopedia where anyone can edit and share information. However, blatantly unproductive edits greatly undermine the quality of Wikipedia. Their irresponsible acts force editors to waste time undoing vandalisms. For the purpose of improving information quality on Wikipedia and freeing the maintainer from such repetitive tasks, machine learning methods have been proposed to detect vandalism automatically. However, most of them focused on mining new features which seem to be inexhaustible to be discovered. Therefore, the question of how to make the best use of these features needs to be tackled. In this paper, we leverage feature transformation techniques to analyze the features and propose a framework using these methods to enhance detection. Experiment results on the public dataset PAN-WVC-10 show that our method is effective and it provides another useful method to help detect vandalism in Wikipedia. 0 0
Mining a multilingual association dictionary from Wikipedia for cross-language information retrieval
Information processing
Information retrieval
Web mining
Journal of the American Society for Information Science and Technology English 2012 Wikipedia is characterized by its dense link structure and a large number of articles in different languages, which make it a notable Web corpus for knowledge extraction and mining, in particular for mining the multilingual associations. In this paper, motivated by a psychological theory of word meaning, we propose a graph-based approach to constructing a cross-language association dictionary (CLAD) from Wikipedia, which can be used in a variety of cross-language accessing and processing applications. In order to evaluate the quality of the mined CLAD, and to demonstrate how the mined CLAD can be used in practice, we explore two different applications of the mined CLAD to cross-language information retrieval (CLIR). First, we use the mined CLAD to conduct cross-language query expansion; and, second, we use it to filter out translation candidates with low translation probabilities. Experimental results on a variety of standard CLIR test collections show that the CLIR retrieval performance can be substantially improved with the above two applications of CLAD, which indicates that the mined CLAD is of sound quality. 0 0
Combining multiple disambiguation methods for gene mention normalization
BioCreative II
Gene mention normalization
Gene symbol disambiguation
Web-based kernel
Expert Systems with Applications English 2011 The rapid growth of biomedical literature prompts pervasive concentrations of biomedical text mining community to explore methodology for accessing and managing this ever-increasing knowledge. One important task of text mining in biomedical literature is gene mention normalization which recognizes the biomedical entities in biomedical texts and maps each gene mention discussed in the text to unique organic database identifiers. In this work, we employ an information retrieval based method which extracts gene mention's semantic profile from PubMed abstracts for gene mention disambiguation. This disambiguation method focuses on generating a more comprehensive representation of gene mention rather than the organic clues such as gene ontology which has fewer co-occurrences with the gene mention. Furthermore, we use an existing biomedical resource as another disambiguation method. Then we extract features from gene mention detection system's outcome to build a false positive filter according to Wikipedia's retrieved documents. Our system achieved F-measure of 83.1% on BioCreative II GN test data. © 2011 Elsevier Ltd. All rights reserved. 0 0
A structured Wikipedia for mathematics: Mathematics in a web 2.0 world
Mathematics
Online collaboration
Web 2.0 technologies
ICSOFT 2010 - Proceedings of the 5th International Conference on Software and Data Technologies English 2010 In this paper, we propose a new idea for developing a collaborative online system for storing mathematical work similar to Wikipedia, but much more suitable for storing mathematical results and concepts. The main idea proposed in this paper is to design a system that would allow users to store mathematics in a structured manner, which would make related work easier to find. The proposed system would have users use indentation to add a hierarchical structure to mathematical results and concepts entered into the system. The hierarchical structure provided by the indentation of results and concepts would provide users with additional search functionality useful for finding related work. Additionally, the system would automatically link related results by using the structure provided by users, and also provide other useful functionality. The system would be flexible in terms of letting users decide how much structure to add to each mathematical result or concept to ensure that contributors are not overly burdened with having to add too much structure to each result. The system proposed in this paper serves as a starting point for discussion on new ideas to organize mathematical results and concepts, and many open questions remain for new research. 0 0
Building a Networked Environment in Wikis: The Evolving Phases of Collaborative Learning in a Wikibook Project
Journal of Educational Computing Research English 2009 Wikis, when used as an open editing tool, can have profound and subtle effects on students' collaborative learning process. Hailed as a collaborative learning and writing tool, many questions remain regarding the pedagogical impacts of using wikis in the classroom. Do students feel comfortable editing each others' wiki articles? Do students learn collaboratively and construct knowledge for the community? What challenges did they experience in a networked environment? This study addressed these questions using qualitative methods, including multiple semi-structured interviews and student reflective journals, for analysis. The findings challenge idealistic hypotheses that wiki work, without careful design and implementation, is naturally beneficial. It was also found that collaborative writing and learning were the exception rather than the norm among participants in the early stages of wiki work. It is recommended that instructors provide highly supportive learning experiences to teach students how to use wikis and how to work collaboratively when implementing wikis to maximize the benefits of this emerging tool. 16 0