Adrian Popescu


Adrian Popescu is an author.

Publications

Only those publications related to wikis are shown here.
Each entry lists the title, keyword(s), publication venue ("Published in"), language, date, abstract, and the wiki's R and C counts.

Fuzzy ontology alignment using background knowledge
Keywords: (fuzzy) ontology alignment, background knowledge, fuzzy ontologies, information retrieval, Wikipedia
Published in: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Language: English
Date: 2014
Abstract: We propose an ontology alignment framework with two core features: the use of background knowledge and the ability to handle vagueness both in the matching process and in the resulting concept alignments. The procedure is based on a generic reference vocabulary, which is used to fuzzify the ontologies to be matched. The choice of this vocabulary is in general problem-dependent, although Wikipedia represents a general-purpose source of knowledge that can be used in many cases and even allows cross-language matching. In the first step of our approach, each domain concept is represented as a fuzzy set of reference concepts. In the next step, the fuzzified domain concepts are matched to one another, resulting in fuzzy descriptions of the matches of the original concepts. Based on these concept matches, we propose an algorithm that produces a merged fuzzy ontology capturing what is common to the source ontologies. The paper describes experiments in the multimedia domain using ontologies containing tagged images, as well as an evaluation of the approach in an information retrieval setting. The fuzzy approach has been compared to a classical crisp alignment with the help of a ground truth created from human judgments.
R: 0, C: 0
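
The abstract describes representing each domain concept as a fuzzy set of reference concepts and then matching the fuzzified concepts against one another. A minimal sketch of that matching step, assuming the membership degrees have already been computed; the example concepts and the min/max (fuzzy Jaccard) overlap measure are illustrative choices, not the paper's exact formulation:

# Sketch: match two domain concepts, each represented as a fuzzy set of
# reference concepts (reference term -> membership degree in [0, 1]).
# The overlap measure is a fuzzy Jaccard index; the paper may use a
# different matching function.

def fuzzy_jaccard(a: dict[str, float], b: dict[str, float]) -> float:
    """Similarity of two fuzzy sets: sum of min memberships over sum of max."""
    terms = set(a) | set(b)
    inter = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    union = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return inter / union if union else 0.0

# Hypothetical fuzzified concepts from the two ontologies to be aligned.
concept_ontology_a = {"feline": 0.9, "pet": 0.6, "mammal": 0.4}
concept_ontology_b = {"feline": 0.8, "mammal": 0.5, "wild animal": 0.3}

print(fuzzy_jaccard(concept_ontology_a, concept_ontology_b))  # roughly 0.52
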
Mining the web for points of interest
Keywords: geo-localisation, geographic information extraction, location-based applications, points of interest
Published in: SIGIR'12 - Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval
Language: English
Date: 2012
Abstract: A point of interest (POI) is a focused geographic entity such as a landmark, a school, a historical building, or a business. Points of interest are the basis for most of the data supporting location-based applications. In this paper we propose to curate POIs from online sources by bootstrapping training data from Web snippets, seeded by POIs gathered from social media. This large corpus is used to train a sequential tagger to recognize mentions of POIs in text. Using Wikipedia data as the training data, we can identify POIs in free text with an accuracy that is 116% better than the state-of-the-art POI identifier in terms of precision, and 50% better in terms of recall. We show that, using Foursquare and Gowalla check-ins as seeds to bootstrap training data from Web snippets, we can improve precision by between 16% and 52%, and recall by between 48% and 187%, over the state of the art. The name of a POI is not sufficient, as the POI must also be associated with a set of geographic coordinates. Our method increases the number of POIs that can be localized nearly three-fold, from 134 to 395 in a sample of 400, with a median localization accuracy of less than one kilometer.
R: 0, C: 0
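
The abstract describes bootstrapping training data for a sequential tagger from Web snippets, seeded by POI names taken from social media check-ins. A rough sketch of the labeling step only; the seed names, snippets, and BIO scheme are illustrative, and the paper's actual tagger and features are not shown:

# Sketch: turn raw snippets into BIO-labelled token sequences by matching
# seed POI names (e.g. gathered from check-in services). The resulting
# sequences could then feed any sequential tagger (CRF, HMM, ...).

def bio_label(snippet: str, seed_pois: list[str]) -> list[tuple[str, str]]:
    tokens = snippet.split()
    labels = ["O"] * len(tokens)
    for poi in seed_pois:
        poi_tokens = poi.split()
        n = len(poi_tokens)
        for i in range(len(tokens) - n + 1):
            if [t.strip(".,") for t in tokens[i:i + n]] == poi_tokens:
                labels[i] = "B-POI"
                for j in range(i + 1, i + n):
                    labels[j] = "I-POI"
    return list(zip(tokens, labels))

seeds = ["Golden Gate Bridge"]   # hypothetical seed name from check-ins
snippet = "We walked across the Golden Gate Bridge at sunset."
print(bio_label(snippet, seeds))
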
Social media driven image retrieval
Keywords: Flickr, image retrieval, Wikipedia
Published in: Proceedings of the 1st ACM International Conference on Multimedia Retrieval, ICMR'11
Language: English
Date: 2011
Abstract: People often try to find an image using a short query, and images are usually indexed using short annotations. Matching the query vocabulary with the indexing vocabulary is a difficult problem when little text is available. Textual user-generated content on Web 2.0 platforms contains a wealth of data that can help solve this problem. Here we describe how to use Wikipedia and Flickr content to improve this match. The initial query is launched in Flickr, and we create a query model based on co-occurring terms. We also compute nearby concepts using Wikipedia and use these to expand the query. The final results are obtained by ranking the results for the expanded query using the similarity between their annotations and the Flickr model. Evaluation of these expansion and ranking techniques over the ImageCLEF 2010 Wikipedia Collection, which contains 237,434 images and their multilingual textual annotations, shows a consistent improvement over state-of-the-art methods.
R: 0, C: 0
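
The abstract outlines building a query model from terms that co-occur with the query in Flickr, expanding the query with nearby Wikipedia concepts, and ranking the results by similarity between their annotations and the Flickr model. A toy sketch of the final ranking step; the co-occurrence counts, expansion terms, and cosine similarity are illustrative assumptions, not the paper's exact models:

import math
from collections import Counter

# Sketch: rank candidate images by cosine similarity between their textual
# annotations and a query model built from Flickr co-occurrence counts.

def cosine(model: Counter, text: Counter) -> float:
    dot = sum(model[t] * text[t] for t in model)
    na = math.sqrt(sum(v * v for v in model.values()))
    nb = math.sqrt(sum(v * v for v in text.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical query model: terms co-occurring with "eiffel tower" in Flickr tags.
flickr_model = Counter({"eiffel": 40, "tower": 35, "paris": 20, "night": 8})

# Hypothetical candidates (image id -> annotation) returned for the expanded query,
# where the expansion terms were drawn from related Wikipedia concepts.
results = {
    "img1": "eiffel tower lit up at night in paris",
    "img2": "a tall communications tower in berlin",
}
ranked = sorted(results,
                key=lambda k: cosine(flickr_model, Counter(results[k].split())),
                reverse=True)
print(ranked)  # img1 should come first
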
Multimodal image retrieval over a large database
Published in: Lecture Notes in Computer Science
Language: English
Date: 2010
Abstract: We introduce a new multimodal retrieval technique which combines query reformulation and visual image reranking in order to deal with result sparsity and imprecision, respectively. Textual queries are reformulated using Wikipedia knowledge, and the results are then reordered using a k-NN based reranking method. We compare textual and multimodal retrieval and show that introducing visual reranking significantly improves performance.
R: 0, C: 0
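
The abstract mentions reordering the textually retrieved results with a k-NN based visual reranking. One simplified way such a reranking could work, assuming precomputed visual feature vectors; the features, distance, and value of k are illustrative assumptions rather than the paper's method:

import math

# Sketch: rerank the text-retrieved images so that images visually consistent
# with the other top results move up and visual outliers move down.

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_outlier_score(name, candidates, features, k=2):
    """Mean distance from an image to its k visually nearest other candidates."""
    dists = sorted(euclidean(features[name], features[o]) for o in candidates if o != name)
    return sum(dists[:k]) / k

# Hypothetical visual feature vectors and an initial text-only ranking.
features = {"imgA": [0.1, 0.9], "imgB": [0.8, 0.2], "imgC": [0.15, 0.85]}
text_ranking = ["imgA", "imgB", "imgC"]

reranked = sorted(text_ranking, key=lambda n: knn_outlier_score(n, text_ranking, features))
print(reranked)  # imgB, the visual outlier, drops to the end
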
Spatiotemporal mapping of Wikipedia concepts
Language: English
Date: 2010
Abstract: Space and time are important dimensions in the representation of a large number of concepts. However, no available resource provides spatiotemporal mappings of generic concepts. Here we present a link-analysis-based method for extracting the main locations and periods associated with all Wikipedia concepts. Relevant locations are selected from a set of geotagged articles, while relevant periods are discovered using a list of people with associated life periods. We analyze article versions over multiple languages and consider the strength of a spatial/temporal reference to be proportional to the number of languages in which it appears. To illustrate the utility of the spatiotemporal mapping of Wikipedia concepts, we present an analysis of cultural interactions and a temporal analysis of two domains. The Wikipedia mapping can also be used to perform rich spatiotemporal document indexing by extracting implicit spatial and temporal references from texts.
R: 0, C: 1
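
The abstract states that the strength of a spatial or temporal reference is proportional to the number of language versions in which it appears. A small sketch of that counting step over hypothetical per-language link data; the concept, article links, and place names are invented for illustration:

from collections import Counter

# Sketch: for one concept, count in how many language versions each linked
# geotagged article appears; more languages means a stronger spatial reference.

# Hypothetical links from the article about "Impressionism" in three language
# versions to geotagged articles.
links_per_language = {
    "en": ["Paris", "Le Havre", "London"],
    "fr": ["Paris", "Le Havre"],
    "de": ["Paris"],
}

strength = Counter()
for language, linked_places in links_per_language.items():
    for place in set(linked_places):   # count each place at most once per language
        strength[place] += 1

# The main locations are the references supported by the most languages.
print(strength.most_common())  # [('Paris', 3), ('Le Havre', 2), ('London', 1)]
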
Conceptual image retrieval over a large scale database
Keywords: image retrieval, large-scale database, query reformulation
Published in: Lecture Notes in Computer Science
Language: English
Date: 2009
Abstract: Image retrieval in large-scale databases is currently based on a textual chain matching procedure. However, this approach requires an accurate annotation of images, which is not the case on the Web. To tackle this issue, we propose a reformulation method that reduces the influence of noisy image annotations. We extract from WordNet and Wikipedia a ranked list of concepts related to the query terms and use them to expand the initial query. Then some visual concepts are used to re-rank the results for queries containing, explicitly or implicitly, visual cues. First evaluations on a diversified corpus of 150,000 images were convincing, since the proposed system was ranked 4th and 2nd at the WikipediaMM task of the ImageCLEF 2008 campaign [1].
R: 0, C: 0
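
The abstract describes extracting a ranked list of concepts related to the query terms from WordNet and Wikipedia and using them to expand the query. A minimal sketch of the WordNet half using NLTK; the query term, the choice of synonyms and hypernyms, and the frequency-based ranking are illustrative, and the Wikipedia side and the visual re-ranking are not shown:

from collections import Counter
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)  # one-off corpus download

def related_concepts(term: str, top_n: int = 5) -> list[str]:
    """Collect synonym and hypernym names for a term and rank them by frequency."""
    counts = Counter()
    for synset in wn.synsets(term):
        for lemma in synset.lemma_names():
            counts[lemma.replace("_", " ").lower()] += 1
        for hypernym in synset.hypernyms():
            for lemma in hypernym.lemma_names():
                counts[lemma.replace("_", " ").lower()] += 1
    counts.pop(term, None)   # drop the query term itself
    return [w for w, _ in counts.most_common(top_n)]

query = "church"             # hypothetical query term with a visual cue
expanded = [query] + related_concepts(query)
print(expanded)
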
Mining a multilingual geographical gazetteer from the Web
Published in: Proceedings - 2009 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2009
Language: English
Date: 2009
Abstract: Geographical gazetteers are necessary in a wide variety of applications. In the past, the construction of such gazetteers has been a tedious, manual process, and only recently have the first attempts been made to automate gazetteer creation. Here we describe our approach for mining accurate yet large-scale multilingual geographic information by successively filtering information found in heterogeneous data sources (Flickr, Wikipedia, Panoramio, Web pages indexed by search engines). By statistically cross-checking the information found in each source, we are able to identify new geographic objects and to indicate, for each one, its name, its GPS coordinates, its encompassing regions (city, region, country), the language of the name, its popularity, and the type of the object (church, bridge, etc.). We evaluate our approach by comparing, wherever possible, our multilingual gazetteer to other known attempts at automatically building a geographic database and to Geonames, a manually built gazetteer.
R: 0, C: 0
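
The abstract describes statistically cross-checking information found in heterogeneous sources in order to identify new geographic objects. A very simplified sketch of one such cross-check, assuming each source already yields (name, latitude, longitude) candidates; the sources, place names, distance threshold, and two-source agreement rule are illustrative assumptions:

import math

# Sketch: keep a candidate place name only if at least two sources report it
# with coordinates that agree within a small distance threshold.

def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6371 * 2 * math.asin(math.sqrt(a))

# Hypothetical candidates mined from different sources: name -> (lat, lon).
sources = {
    "flickr":    {"Pont Neuf": (48.8566, 2.3412)},
    "wikipedia": {"Pont Neuf": (48.8567, 2.3414), "Notre-Dame": (48.8530, 2.3499)},
    "panoramio": {"Pont Neuf": (48.8570, 2.3410)},
}

confirmed = []
names = {name for candidates in sources.values() for name in candidates}
for name in names:
    coords = [c[name] for c in sources.values() if name in c]
    agreeing = [p for p in coords if haversine_km(p, coords[0]) < 1.0]
    if len(agreeing) >= 2:   # confirmed by at least two agreeing sources
        confirmed.append(name)
print(confirmed)  # ['Pont Neuf']
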
Gazetiki: Automatic creation of a geographical gazetteer
Keywords: data mining, geographic gazetteer, information extraction, Panoramio, thesaurus, Wikipedia
Published in: Proceedings of the ACM International Conference on Digital Libraries
Language: English
Date: 2008
Abstract: Geolocalized databases are becoming necessary in a wide variety of application domains. Thus far, the creation of such databases has been a costly, manual process. This drawback has stimulated interest in automating their construction, for example by mining geographical information from the Web. Here we present and evaluate a new automated technique for creating and enriching a geographical gazetteer, called Gazetiki. Our technique merges disparate information from Wikipedia, Panoramio, and web search engines in order to identify geographical names, categorize these names, find their geographical coordinates, and rank them. The information produced in Gazetiki enhances and complements the Geonames database, using a similar domain model. We show that our method provides a richer structure and improved coverage compared to another known attempt at automatically building a geographic database and, where possible, we compare Gazetiki to Geonames.
R: 0, C: 0
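
The abstract says Gazetiki merges information from Wikipedia, Panoramio, and web search engines to identify geographic names, categorize them, find their coordinates, and rank them. A toy sketch of merging per-source evidence into a single record and ranking by a crude popularity count; the field names, popularity measure, and example data are invented for illustration and are not the paper's domain model:

from dataclasses import dataclass

# Sketch: merge per-source evidence about a place into one gazetteer record
# and rank the gazetteer by a simple popularity score (pooled hit counts).

@dataclass
class GazetteerEntry:
    name: str
    lat: float
    lon: float
    category: str
    popularity: int   # e.g. photo and search hits pooled across sources

# Hypothetical evidence already extracted from the different sources.
evidence = {
    "Sacré-Cœur": {"coords": (48.8867, 2.3431), "category": "church",
                   "panoramio_photos": 950, "web_hits": 4200},
    "Pont Neuf":  {"coords": (48.8566, 2.3412), "category": "bridge",
                   "panoramio_photos": 310, "web_hits": 1800},
}

gazetteer = [
    GazetteerEntry(name, e["coords"][0], e["coords"][1], e["category"],
                   e["panoramio_photos"] + e["web_hits"])
    for name, e in evidence.items()
]
gazetteer.sort(key=lambda entry: entry.popularity, reverse=True)
for entry in gazetteer:
    print(entry.name, entry.category, entry.popularity)
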