Yulan Yan

From WikiPapers

Yulan Yan is an author.


Only those publications related to wikis are shown here.
Title Keyword(s) Published in Language Date Abstract R C
Multi-view bootstrapping for relation extraction by exploring web features and linguistic features Lecture Notes in Computer Science English 2010 Binary semantic relation extraction from Wikipedia is particularly useful for various NLP and Web applications. Currently frequent pattern mining-based methods and syntactic analysis-based methods are two types of leading methods for semantic relation extraction task. With a novel view on integrating syntactic analysis on Wikipedia text with redundancy information from the Web, we propose a multi-view learning approach for bootstrapping relationships between entities with the complementary between the Web view and linguistic view. On the one hand, from the linguistic view, linguistic features are generated from linguistic parsing on Wikipedia texts by abstracting away from different surface realizations of semantic relations. On the other hand, Web features are extracted from the Web corpus to provide frequency information for relation extraction. Experimental evaluation on a relational dataset demonstrates that linguistic analysis on Wikipedia texts and Web collective information reveal different aspects of the nature of entity-related semantic relationships. It also shows that our multi-view learning method considerably boosts the performance comparing to learning with only one view of features, with the weaknesses of one view complement the strengths of the other. 0 0
Efficient indices using graph partitioning in RDF triple stores Proceedings - International Conference on Data Engineering English 2009 With the advance of the Semantic Web, varying RDF data were increasingly generated, published, queried, and reused via the Web. For example, the DBpedia, a community effort to extract structured data from Wikipedia articles, broke 100 million RDF triples in its latest release. Initiated by Tim Berners-Lee, likewise, the Linking Open Data (LOD) project has published and interlinked many open licence datasets which consisted of over 2 billion RDF triples so far. In this context, fast query response over such large scaled data would be one of the challenges to existing RDF data stores. In this paper, we propose a novel triple indexing scheme to help RDF query engine fast locate the instances within a small scope. By considering the RDF data as a graph, we would partition the graph into multiple subgraph pieces and store them individually, over which a signature tree would be built up to index the URIs. When a query arrives, the signature tree index is used to fast locate the partitions that might include the matches of the query by its constant URIs. Our experiments indicate that the indexing scheme dramatically reduces the query processing time in most cases because many partitions would be early filtered out and the expensive exact matching is only performed over a quite small scope against the original dataset. 0 0
Unsupervised relation extraction by mining Wikipedia texts using information from the web ACL English 2009 0 0