TY - GEN
T1 - ActNeT
T2 - SIAM International Conference on Data Mining, SDM 2013
AU - Hu, Xia
AU - Tang, Jiliang
AU - Gao, Huiji
AU - Liu, Huan
N1 - Publisher Copyright: Copyright © SIAM.
PY - 2013
Y1 - 2013
AB - Supervised learning, e.g., classification, plays an important role in processing and organizing microblogging data. In microblogging, it is easy to amass vast quantities of unlabeled data, but costly to obtain the labels that supervised learning algorithms require. To reduce the labeling cost, active learning is an effective way to select representative and informative instances to query for labels, thereby improving the learned model. Unlike traditional data, in which instances are assumed to be independent and identically distributed (i.i.d.), instances in microblogging are networked with each other. This presents both opportunities and challenges for applying active learning to microblogging data. Inspired by social correlation theories, we investigate whether social relations can help perform effective active learning on networked data. In this paper, we propose a novel Active learning framework for the classification of Networked Texts in microblogging (ActNeT). In particular, we study how to incorporate network information into text content modeling, and design strategies that take advantage of social network structure to select the most representative and informative instances from microblogging for labeling. Experimental results on Twitter datasets show the benefit of incorporating network information in active learning and that the proposed framework outperforms existing state-of-the-art methods.
UR - http://www.scopus.com/inward/record.url?scp=84937403769&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84937403769&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972832.34
DO - 10.1137/1.9781611972832.34
M3 - Conference contribution
T3 - Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013
SP - 306
EP - 314
BT - Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013
A2 - Ghosh, Joydeep
A2 - Obradovic, Zoran
A2 - Dy, Jennifer
A2 - Zhou, Zhi-Hua
A2 - Kamath, Chandrika
A2 - Parthasarathy, Srinivasan
PB - Society for Industrial and Applied Mathematics
Y2 - 2 May 2013 through 4 May 2013
ER -