TY - GEN
T1 - Exploring a scalable solution to identifying events in noisy Twitter streams
AU - Kumar, Shamanth
AU - Liu, Huan
AU - Mehta, Sameep
AU - Subramaniam, L. Venkata
N1 - Funding Information: This work was sponsored, in part, by the Office of Naval Research grant N000141410095. Publisher Copyright: © 2015 ACM.
PY - 2015/8/25
Y1 - 2015/8/25
AB - The unprecedented use of social media through smartphones and other web-enabled mobile devices has enabled the rapid adoption of platforms like Twitter. Event detection has found many applications on the web, including breaking news identification and summarization. The recent increase in the usage of Twitter during crises has attracted researchers to focus on detecting events in tweets. However, current solutions have focused on static Twitter data. The need to detect events in a streaming environment during fast-paced events such as a crisis presents new opportunities and challenges. In this paper, we investigate event detection in the context of real-time Twitter streams as observed in real-world crises. We highlight the key challenges in this problem: the informal nature of text, and the high-volume and high-velocity characteristics of Twitter streams. We present a novel approach to address these challenges using single-pass clustering and the compression distance to efficiently detect events in Twitter streams. Through experiments on large Twitter datasets, we demonstrate that the proposed framework is able to detect events in near real time and can scale to large and noisy Twitter streams.
UR - http://www.scopus.com/inward/record.url?scp=84962519624&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962519624&partnerID=8YFLogxK
U2 - 10.1145/2808797.2809389
DO - 10.1145/2808797.2809389
M3 - Conference contribution
T3 - Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
SP - 496
EP - 499
BT - Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
A2 - Pei, Jian
A2 - Tang, Jie
A2 - Silvestri, Fabrizio
PB - Association for Computing Machinery, Inc
T2 - IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015
Y2 - 25 August 2015 through 28 August 2015
ER -