Tweets mining using Wikipedia and impurity cluster measurement


Tweets mining using Wikipedia and impurity cluster measurement is a 2010 conference paper written in English by Chen Q., Shipper T., Khan L. and published in ISI 2010 - 2010 IEEE International Conference on Intelligence and Security Informatics: Public Safety and Security.

Abstract

Twitter is one of the fastest growing online social networking services. Tweets can be categorized into trends, and are related to tags and follower/following social relationships. The categorization is neither accurate nor effective due to the short length of tweet messages and the noisy data corpus. In this paper, we attempt to overcome these challenges with an extended feature vector along with a semi-supervised clustering technique. To achieve this goal, the training set is expanded with Wikipedia topic search results, and the feature set is extended. When building the clustering model and performing the classification, an impurity measurement is introduced into our classifier platform. Our experimental results show that the proposed techniques outperform other classifiers with reasonable precision and recall.
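The abstract does not define the impurity measurement it refers to. As a rough illustration only, the sketch below assumes two common choices, label entropy and Gini impurity, computed over the labeled points of each cluster in a semi-supervised setting. The function names, the use of `None` for unlabeled tweets, and the sample labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def _known(labels):
    """Drop unlabeled points (marked None, an assumption of this sketch)."""
    return [label for label in labels if label is not None]


def cluster_entropy(labels):
    """Entropy in bits of the known labels in one cluster (0.0 if none are known)."""
    known = _known(labels)
    if not known:
        return 0.0
    total = len(known)
    return sum(-(c / total) * log2(c / total) for c in Counter(known).values())


def gini_impurity(labels):
    """Gini impurity of the known labels in one cluster (0.0 if none are known)."""
    known = _known(labels)
    if not known:
        return 0.0
    total = len(known)
    return 1.0 - sum((c / total) ** 2 for c in Counter(known).values())


def weighted_impurity(clusters, measure=cluster_entropy):
    """Average impurity over clusters, weighted by each cluster's labeled-point count."""
    sizes = [len(_known(cluster)) for cluster in clusters]
    total = sum(sizes)
    if total == 0:
        return 0.0
    return sum(s * measure(c) for s, c in zip(sizes, clusters)) / total


# Hypothetical cluster assignments: each list holds the labels of one cluster.
pure_cluster = ["sports", "sports", None, "sports"]
mixed_cluster = ["sports", "politics", None, "politics", "sports"]

print(cluster_entropy(pure_cluster), cluster_entropy(mixed_cluster))
print(weighted_impurity([pure_cluster, mixed_cluster]))
```

A lower value indicates that the labeled points in a cluster mostly agree on one category, which is the usual reason for adding an impurity term to a clustering objective. Whether the paper uses entropy, Gini, or another measure is not stated here.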

Cited by

This publication has been cited 4 times, but no articles for the citing works are available in WikiPapers.