TY - GEN
T1 - A novel measure for coherence in statistical topic models
AU - Morstatter, Fred
AU - Liu, Huan
N1 - Funding Information: This work is sponsored, in part, by Office of Naval Research (ONR) grant N000141410095. Publisher Copyright: © 2016 Association for Computational Linguistics.
PY - 2016
Y1 - 2016
N2 - Big data presents new challenges for understanding large text corpora. Topic modeling algorithms help understand the underlying patterns, or "topics", in data. Researchers often read these topics in order to gain an understanding of the underlying corpus. It is important to evaluate the interpretability of these automatically generated topics. Methods have previously been designed to use crowdsourcing platforms to measure interpretability. In this paper, we demonstrate the necessity of a key concept, coherence, when assessing the topics and propose an effective method for its measurement. We show that the proposed measure of coherence captures a different aspect of the topics than existing measures. We further study the automation of these topic measures for scalability and reproducibility, showing that these measures can be automated.
AB - Big data presents new challenges for understanding large text corpora. Topic modeling algorithms help understand the underlying patterns, or "topics", in data. Researchers often read these topics in order to gain an understanding of the underlying corpus. It is important to evaluate the interpretability of these automatically generated topics. Methods have previously been designed to use crowdsourcing platforms to measure interpretability. In this paper, we demonstrate the necessity of a key concept, coherence, when assessing the topics and propose an effective method for its measurement. We show that the proposed measure of coherence captures a different aspect of the topics than existing measures. We further study the automation of these topic measures for scalability and reproducibility, showing that these measures can be automated.
UR - http://www.scopus.com/inward/record.url?scp=84991772352&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84991772352&partnerID=8YFLogxK
U2 - 10.18653/v1/p16-2088
DO - 10.18653/v1/p16-2088
M3 - Conference contribution
T3 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Short Papers
SP - 543
EP - 548
BT - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Short Papers
PB - Association for Computational Linguistics (ACL)
T2 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Y2 - 7 August 2016 through 12 August 2016
ER -