Abstract
Exemplar-based clustering methods have been shown to be effective in many clustering problems. They adaptively determine the number of clusters and have the appealing advantage of not requiring the estimation of latent parameters, which is otherwise difficult for complicated parametric models and high-dimensional data. However, modeling an arbitrary underlying distribution of the data remains difficult for existing exemplar-based clustering methods. We present Pairwise Exemplar Clustering (PEC) to alleviate this problem by modeling the underlying cluster distributions more accurately with non-parametric kernel density estimation. Interpreting the clusters as classes from a supervised learning perspective, we search for an optimal partition of the data that balances two quantities: (1) the misclassification rate of the data partition, which separates the clusters; and (2) the sum of within-cluster dissimilarities, which controls the cluster size. The broadly used kernel form of cut turns out to be a special case of our formulation. Moreover, we optimize the corresponding objective function with a new efficient algorithm for message computation in a pairwise Markov random field (MRF). Experimental results on synthetic and real data demonstrate the effectiveness of our method.
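The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the PEC algorithm from the paper: it only scores a given partition and performs no MRF optimization. The Gaussian kernel, the `bandwidth` and `lam` parameters, the leave-one-out plug-in classifier, and the use of squared Euclidean distance as the dissimilarity are all assumptions made for illustration, as is the function name `partition_score`.

```python
import numpy as np


def partition_score(X, labels, bandwidth=1.0, lam=1.0):
    """Score a partition as (KDE misclassification rate) + lam * (within-cluster dissimilarity).

    Hypothetical helper for illustration only; the exact objective in the
    paper may differ.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances (assumed dissimilarity measure).
    D = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)

    # Gaussian kernel matrix (assumed kernel); zero the diagonal so each
    # point is classified without its own contribution (leave-one-out).
    K = np.exp(-D / (2.0 * bandwidth ** 2))
    np.fill_diagonal(K, 0.0)

    # For each cluster c, the kernel sum over its members is proportional to
    # (cluster prior) * (kernel density estimate of that cluster) at each point.
    clusters = np.unique(labels)
    scores = np.stack([K[:, labels == c].sum(axis=1) for c in clusters], axis=1)

    # Plug-in classifier: assign each point to the cluster with the largest score.
    predicted = clusters[np.argmax(scores, axis=1)]
    error_rate = float(np.mean(predicted != labels))

    # Sum of dissimilarities over unordered pairs that share a cluster.
    same = labels[:, None] == labels[None, :]
    within = float(D[same].sum() / 2.0)

    return error_rate + lam * within, error_rate, within


# Two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
good = partition_score(X, [0, 0, 0, 1, 1, 1])
bad = partition_score(X, [0, 1, 0, 1, 0, 1])
print("good partition:", good)
print("bad partition:", bad)
```

In this toy setting, the partition that follows the two groups scores lower on both terms than the interleaved one, which is the behavior the two-term objective is meant to reward.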
Original language | English (US) |
---|---|
Pages | 1204-1211 |
Number of pages | 8 |
State | Published - 2012 |
Externally published | Yes |
Event | 26th AAAI Conference on Artificial Intelligence, AAAI 2012 - Toronto, Canada; Duration: Jul 22 2012 → Jul 26 2012 |
Conference
Conference | 26th AAAI Conference on Artificial Intelligence, AAAI 2012 |
---|---|
Country/Territory | Canada |
City | Toronto |
Period | 7/22/12 → 7/26/12 |
ASJC Scopus subject areas
- Artificial Intelligence