TY - GEN
T1 - A balanced ensemble approach to weighting classifiers for text classification
AU - Fung, Gabriel Pui Cheong
AU - Yu, Jeffrey Xu
AU - Wang, Haixun
AU - Cheung, David W.
AU - Liu, Huan
PY - 2006
Y1 - 2006
AB - This paper studies the problem of constructing an effective heterogeneous ensemble classifier for text classification. One major challenge of this problem is to formulate a good combination function, which combines the decisions of the individual classifiers in the ensemble. We show that classification performance is affected by three weight components, all of which should be included in deriving an effective combination function: (1) global effectiveness, which measures the effectiveness of a member classifier in classifying a set of unseen documents; (2) local effectiveness, which measures the effectiveness of a member classifier in classifying the particular domain of an unseen document; and (3) decision confidence, which describes how confident a classifier is in its decision on a specific unseen document. We propose a new balanced combination function, called Dynamic Classifier Weighting (DCW), that incorporates these three components. The empirical study demonstrates that the new combination function is highly effective for text classification.
UR - http://www.scopus.com/inward/record.url?scp=84878065641&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84878065641&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2006.2
DO - 10.1109/ICDM.2006.2
M3 - Conference contribution
SN - 0769527019
SN - 9780769527017
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 869
EP - 873
BT - Proceedings - Sixth International Conference on Data Mining, ICDM 2006
T2 - 6th International Conference on Data Mining, ICDM 2006
Y2 - 18 December 2006 through 22 December 2006
ER -