Gregory Grefenstette

From WikiPapers
Jump to: navigation, search

Gregory Grefenstette is an author.


Only those publications related to wikis are shown here.
Title Keyword(s) Published in Language DateThis property is a special property in this wiki. Abstract R C
Social media driven image retrieval Flickr
Image retrieval
Proceedings of the 1st ACM International Conference on Multimedia Retrieval, ICMR'11 English 2011 People often try to find an image using a short query and images are usually indexed using short annotations. Matching the query vocabulary with the indexing vocabulary is a difficult problem when little text is available. Textual user generated content in Web 2.0 platforms contains a wealth of data that can help solve this problem. Here we describe how to use Wikipedia and Flickr content to improve this match. The initial query is launched in Flickr and we create a query model based on co-occurring terms. We also calculate nearby concepts using Wikipedia and use these to expand the query. The final results are obtained by ranking the results for the expanded query using the similarity between their annotation and the Flickr model. Evaluation of these expansion and ranking techniques, over the Image CLEF 2010 Wikipedia Collection containing 237,434 images and their multilingual textual annotations, shows that a consistent improvement compared to state of the art methods. 0 0
Spatiotemporal mapping of Wikipedia concepts English 2010 Space and time are important dimensions in the representation of a large number of concepts. However there exists no available resource that provides spatiotemporal mappings of generic concepts. Here we present a link-analysis based method for extracting the main locations and periods associated to all Wikipedia concepts. Relevant locations are selected from a set of geotagged articles, while relevant periods are discovered using a list of people with associated life periods. We analyze article versions over multiple languages and consider the strength of a spatial/temporal reference to be proportional to the number of languages in which it appears. To illustrate the utility of the spatiotemporal mapping of Wikipedia concepts, we present an analysis of cultural interactions and a temporal analysis of two domains. The Wikipedia mapping can also be used to perform rich spatiotemporal document indexing by extracting implicit spatial and temporal references from texts. 0 1
Mining a multilingual geographical gazetteer from the Web Proceedings - 2009 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2009 English 2009 Geographical gazetteers are necessary in a wide variety of applications. In the past, the construction of such gazetteers has been a tedious, manual process and only recently have the first attempts to automate the gazetteers creation been made. Here we describe our approach for mining accurate but large-scale multilingual geographic information by successively filtering information found in heterogeneous data sources (Flickr, Wikipedia, Panoramio, Web pages indexed by search engines). Statistically crosschecking information found in each site, we are able to identify new geographic objects, and to indicate, for each one, its name, its GPS coordinates, its encompassing regions (city, region, country), the language of the name, its popularity, and the type of the object (church, bridge, etc.). We evaluate our approach by comparing, wherever possible, our multilingual gazetteer to other known attempts at automatically building a geographic database and to Geonames, a manually built gazetteer. 0 0
Gazetiki: Automatic creation of a geographical gazetteer Data mining
Geographic gazetteer
Information extraction
Proceedings of the ACM International Conference on Digital Libraries English 2008 Geolocalized databases are becoming necessary in a wide variety of application domains. Thus far, the creation of such databases has been a costly, manual process. This drawback has stimulated interest in automating their construction, for example, by mining geographical information from the Web. Here we present and evaluate a new automated technique for creating and enriching a geographical gazetteer, called Gazetiki. Our technique merges disparate information from Wikipedia, Panoramio, and web search, engines in order to identify geographical names, categorize these names, find their geographical coordinates and rank them. The information produced in Gazetiki enhances and complements the Geonames database, using a similar domain model. We show that our method provides a richer structure and an improved coverage compared to another known attempt at automatically building a geographic database and, where possible, we compare our Gazetiki to Geonames. Copyright 2008 ACM. 0 0