Importance of semantic representation: Dataless classification
|Author(s)||Chang M.-W., Ratinov L., Roth D., Srikumar V.|
|Published in||Proceedings of the National Conference on Artificial Intelligence|
|Keyword(s)||Unknown (Extra: Competitive performances, Data sets, Labeled datums, Learning protocols, Semantic concepts, Semantic representations, String of words, Supervised learning algorithms, Text categorizations, Unlabeled datums, Wikipedia, Artificial intelligence, Bionics, Classifiers, Information theory, Learning algorithms, Learning systems, Semantics, Text processing, Knowledge based systems)|
Importance of semantic representation: Dataless classification is a 2008 conference paper written in English by Chang M.-W., Ratinov L., Roth D., and Srikumar V., and published in the Proceedings of the National Conference on Artificial Intelligence.
Traditionally, text categorization has been studied as the problem of training a classifier using labeled data. However, people can categorize documents into named categories without any explicit training because they know the meaning of the category names. In this paper, we introduce Dataless Classification, a learning protocol that uses world knowledge to induce classifiers without the need for any labeled data. Like humans, a dataless classifier interprets a string of words as a set of semantic concepts. We propose a model for dataless classification and show that the label name alone is often sufficient to induce classifiers. Using Wikipedia as our source of world knowledge, we get 85.29% accuracy on tasks from the 20 Newsgroup dataset and 88.62% accuracy on tasks from a Yahoo! Answers dataset without any labeled or unlabeled data from the datasets. With unlabeled data, we can further improve the results and achieve performance competitive with a supervised learning algorithm that uses 100 labeled examples. Copyright © 2008, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
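The idea in the abstract, interpreting both a document and a label name as a set of semantic concepts and then picking the closest label, can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny CONCEPTS dictionary is an invented stand-in for Wikipedia, the word-overlap scoring is an assumed simplification, and the function names (concept_vector, cosine, classify) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy knowledge base: concept name -> descriptive text.
# It stands in for Wikipedia articles; the contents are invented for illustration.
CONCEPTS = {
    "Ice hockey": "hockey puck ice rink goalie team league game season",
    "Baseball": "baseball pitcher bat inning team league game season",
    "Spacecraft": "space orbit rocket launch satellite shuttle mission",
    "Astronomy": "space star planet telescope orbit galaxy moon",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def concept_vector(text):
    """Map a string of words to weights over concepts.

    A concept's weight is the word overlap between the text and the
    concept's description, an assumed simplification of a real
    knowledge-based semantic representation.
    """
    words = Counter(tokenize(text))
    vector = {}
    for name, description in CONCEPTS.items():
        description_counts = Counter(tokenize(description))
        score = sum(n * description_counts[w] for w, n in words.items())
        if score:
            vector[name] = float(score)
    return vector


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(document, label_names):
    """Pick the label whose name is closest to the document in concept space.

    No labeled training documents are used; only the label names.
    """
    doc_vec = concept_vector(document)
    return max(label_names, key=lambda label: cosine(doc_vec, concept_vector(label)))


if __name__ == "__main__":
    labels = ["hockey", "space"]
    print(classify("the goalie stopped the puck late in the game", labels))
    print(classify("the rocket reached orbit after launch", labels))
```

In this toy setting the only supervision is the pair of label names; a realistic system would replace CONCEPTS with a representation derived from a large knowledge source such as Wikipedia.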
This publication has been cited 7 times, but no articles for the citing works are available in WikiPapers.