TY - GEN
T1 - Integrating social media data for community detection
AU - Tang, Jiliang
AU - Wang, Xufei
AU - Liu, Huan
PY - 2012
Y1 - 2012
N2 - Community detection is an unsupervised learning task that discovers groups whose members share more similarities or interact more frequently among themselves than with people outside the group. In social media, link information can reveal heterogeneous relationships of various strengths, but it is often noisy. Since different sources of data in social media can provide complementary information (e.g., bookmarking and tagging data indicate user interests, and the frequency of commenting suggests the strength of ties), we propose to integrate social media data of multiple types to improve the performance of community detection. We present a joint optimization framework to integrate multiple data sources for community detection. Empirical evaluation on both synthetic data and real-world social media data shows that the proposed approach significantly improves performance. This work elaborates on the need for and challenges of multi-source integration of heterogeneous data types, and provides a principled way of performing multi-source community detection.
AB - Community detection is an unsupervised learning task that discovers groups whose members share more similarities or interact more frequently among themselves than with people outside the group. In social media, link information can reveal heterogeneous relationships of various strengths, but it is often noisy. Since different sources of data in social media can provide complementary information (e.g., bookmarking and tagging data indicate user interests, and the frequency of commenting suggests the strength of ties), we propose to integrate social media data of multiple types to improve the performance of community detection. We present a joint optimization framework to integrate multiple data sources for community detection. Empirical evaluation on both synthetic data and real-world social media data shows that the proposed approach significantly improves performance. This work elaborates on the need for and challenges of multi-source integration of heterogeneous data types, and provides a principled way of performing multi-source community detection.
KW - Community Detection
KW - Multi-source Integration
KW - Social Media Data
UR - http://www.scopus.com/inward/record.url?scp=84867631568&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867631568&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-33684-3_1
DO - 10.1007/978-3-642-33684-3_1
M3 - Conference contribution
SN - 9783642336836
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 20
BT - Modeling and Mining Ubiquitous Social Media - International Workshop, MSM 2011, MUSE 2011, Revised Selected Papers
T2 - 2nd International Workshop on Modeling and Mining Ubiquitous Social Media, MSM 2011, MUSE 2011
Y2 - 9 October 2011
ER -