Automatic extraction of property norm-like data from large text corpora
Abstract Traditional methods for deriving property-based representations of concepts from text have focused on either extracting only a subset of possible relation types, such as hyponymy/hypernymy (e.g., car is-a vehicle) or meronymy/metonymy (e.g., car has wheels), or unspecified relations (e.g., car-petrol). We propose a system for the challenging task of automatic, large-scale acquisition of unconstrained, human-like property norms from large text corpora, and discuss the theoretical implications of such a system. We employ syntactic, semantic, and encyclopedic information to guide our extraction, yielding concept-relation-feature triples (e.g., car be fast, car require petrol, car cause pollution), which approximate property-based conceptual representations. Our novel method extracts candidate triples from parsed corpora (Wikipedia and the British National Corpus) using syntactically and grammatically motivated rules, then reweights triples with a linear combination of their frequency and four statistical metrics. We assess our system output in three ways: lexical comparison with norms derived from human-generated property norm data, direct evaluation by four human judges, and a semantic distance comparison with both WordNet similarity data and human-judged concept similarity ratings. Our system offers a viable and performant method of plausible triple extraction: our lexical comparison shows comparable performance to the current state-of-the-art, while subsequent evaluations exhibit the human-like character of our generated properties.
Bibtextype article  +
Doi 10.1111/cogs.12091  +
Has author Kelly C. + , Devereux B. + , Korhonen A. +
Has keyword Entropy + , Human evaluation + , Log-likelihood + , Natural Language Processing + , Pointwise mutual information + , Property norm + , Wikipedia + , Wordnet +
Issn 0364-0213  +
Issue 4  +
Language English +
Number of citations by publication 0  +
Number of references by publication 0  +
Pages 638–682  +
Published in Cognitive Science +
Title Automatic extraction of property norm-like data from large text corpora +
Type journal article  +
Volume 38  +
Year 2014 +
Creation date 6 November 2014 22:22:38  +
Categories Publications without license parameter  + , Publications without remote mirror parameter  + , Publications without archive mirror parameter  + , Publications without paywall mirror parameter  + , Journal articles  + , Publications without references parameter  + , Publications  +
Modification date 6 November 2014 22:22:38  +
Date 2014  +