Automatically suggesting topics for augmenting text documents
|Author(s)||West R., Precup D., Pineau J.|
|Published in||International Conference on Information and Knowledge Management, Proceedings|
|Keyword(s)||Data mining, Eigenarticles, Principal component analysis, Topic suggestion, Wikipedia (Extra: Principal Components, Qualitative analysis, Quantitative evaluation, Text document, Text input, Algorithms, Knowledge management, Quality control)|
Automatically suggesting topics for augmenting text documents is a 2010 conference paper written in English by West R., Precup D., and Pineau J., published in International Conference on Information and Knowledge Management, Proceedings.
We present a method for automated topic suggestion. Given a plain-text input document, our algorithm produces a ranking of novel topics that could enrich the input document in a meaningful way. It can thus be used to assist human authors, who often fail to identify important topics relevant to the context of the documents they are writing. Our approach marries two algorithms originally designed for linking documents to Wikipedia articles, proposed by Milne and Witten and by West et al. While neither of them can suggest novel topics by itself, their combination does have this capability. The key step towards finding missing topics consists in generalizing from a large background corpus using principal component analysis. In a quantitative evaluation we conclude that our method achieves the precision of human editors when input documents are Wikipedia articles, and we complement this result with a qualitative analysis showing that the approach also works well on other types of input documents.
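The general idea of generalizing from a background corpus with principal component analysis can be illustrated with a small sketch. This is not the authors' implementation: the function name `suggest_topics`, the binary article-by-topic matrix, the toy data, and the choice of `k` are all assumptions made for illustration. The sketch projects a document's topic vector onto the top principal components of the corpus (the "eigenarticles" of the keywords), reconstructs it, and ranks the topics the document lacks by their reconstructed score.

```python
import numpy as np

def suggest_topics(corpus, doc, k):
    """Rank topics absent from `doc` by their PCA-reconstructed score.

    corpus: (n_articles, n_topics) 0/1 matrix (hypothetical background corpus)
    doc:    length n_topics 0/1 vector of topics already in the input document
    k:      number of principal components to keep (assumed hyperparameter)
    """
    corpus = np.asarray(corpus, dtype=float)
    doc = np.asarray(doc, dtype=float)
    mean = corpus.mean(axis=0)
    # Rows of vt are the principal directions of the centered corpus.
    _, _, vt = np.linalg.svd(corpus - mean, full_matrices=False)
    components = vt[:k]
    # Project the centered document onto the top-k components and map it back.
    recon = mean + (doc - mean) @ components.T @ components
    # Only topics the document does not yet contain are candidates.
    candidates = [j for j in range(len(doc)) if doc[j] == 0]
    return sorted(candidates, key=lambda j: recon[j], reverse=True)

# Toy corpus: topics 0-2 always co-occur, as do topics 3-5.
corpus = [[1, 1, 1, 0, 0, 0]] * 3 + [[0, 0, 0, 1, 1, 1]] * 3
doc = [1, 1, 0, 0, 0, 0]  # mentions topics 0 and 1, but not 2
ranking = suggest_topics(corpus, doc, k=1)
print(ranking[0])  # topic 2 ranks first
```

In this toy setting a single component captures the co-occurrence pattern, so `k=1` suffices; on a real corpus `k` would need tuning and should not exceed the rank of the centered matrix.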
This publication is probably cited by others, but no articles for them are available in WikiPapers. Cited 2 times.