Learning homophily couplings from non-IID data for joint feature selection and noise-resilient outlier detection

Guansong Pang, Longbing Cao, Ling Chen, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

39 Scopus citations

Abstract

This paper introduces a novel wrapper-based outlier detection framework (WrapperOD) and its instance (HOUR) for identifying outliers in noisy data (i.e., data with noisy features) with strong couplings between outlying behaviors. Existing subspace or feature selection-based methods are significantly challenged by such data, as their search for feature subset(s) is independent of outlier scoring and thus can be misled by noisy features. In contrast, HOUR takes a wrapper approach to iteratively optimize the feature subset selection and outlier scoring using a top-κ outlier ranking evaluation measure as its objective function. HOUR learns homophily couplings between outlying behaviors (i.e., abnormal behaviors are not independent; they bond together) in constructing a noise-resilient outlier scoring function to produce a reliable outlier ranking in each iteration. We show that HOUR (i) retains a 2-approximation outlier ranking to the optimal one; and (ii) significantly outperforms five state-of-the-art competitors on 15 real-world data sets with different noise levels in terms of AUC and/or P@n. The source code of HOUR is available at https://sites.google.com/site/gspangsite/sourcecode.
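
The wrapper idea in the abstract (alternating between outlier scoring and feature subset selection, guided by a top-κ ranking objective) can be illustrated with a minimal sketch. This is not HOUR: the absolute z-score scoring, the top-k margin objective, the backward-elimination loop, and the names `feature_scores` and `wrapper_select` are all assumptions made for illustration. The sketch does not model homophily couplings and carries no approximation guarantee.

```python
import statistics


def feature_scores(column):
    """Per-object absolute z-scores for one numeric feature (all 0 if constant)."""
    mu = statistics.fmean(column)
    sd = statistics.pstdev(column)
    if sd == 0:
        return [0.0] * len(column)
    return [abs(x - mu) / sd for x in column]


def wrapper_select(rows, k):
    """Hypothetical wrapper loop: score, rank, evaluate, drop one feature, repeat.

    Returns the feature subset with the best objective and its outlier ranking.
    """
    n, d = len(rows), len(rows[0])
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of rows")
    per_feature = [feature_scores([r[j] for r in rows]) for j in range(d)]
    subset = list(range(d))
    best = None
    while subset:
        # Outlier score: mean per-feature score over the current subset.
        scores = [sum(per_feature[j][i] for j in subset) / len(subset)
                  for i in range(n)]
        ranking = sorted(range(n), key=lambda i: scores[i], reverse=True)
        top, rest = ranking[:k], ranking[k:]
        # Assumed objective: score margin between the top-k and the rest.
        objective = (statistics.fmean(scores[i] for i in top)
                     - statistics.fmean(scores[i] for i in rest))
        if best is None or objective > best[0]:
            best = (objective, list(subset), ranking)
        if len(subset) == 1:
            break
        # Drop the feature that contributes least to the current top-k.
        relevance = {j: statistics.fmean(per_feature[j][i] for i in top)
                     for j in subset}
        subset.remove(min(subset, key=relevance.get))
    return best[1], best[2]


# Toy data (made up): rows 0 and 1 are outlying in features 0 and 1;
# feature 2 is noise that is unrelated to those outliers.
rows = [
    [10.0, 9.0, 0.5], [9.0, 10.0, -0.3], [0.1, 0.0, 3.0], [-0.1, 0.2, -3.0],
    [0.0, -0.2, 2.5], [0.2, 0.1, -2.5], [-0.2, 0.0, 1.0], [0.1, -0.1, -1.0],
    [0.0, 0.1, 2.0], [-0.1, 0.0, -2.0],
]
selected, ranking = wrapper_select(rows, k=2)
```

Because the subset search is driven by the ranking it produces, a feature that does not support the current top-k objects is discarded. In a filter-style approach the subset would be fixed before any scoring takes place.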

Original language: English (US)
Title of host publication: 26th International Joint Conference on Artificial Intelligence, IJCAI 2017
Editors: Carles Sierra
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 2585-2591
Number of pages: 7
ISBN (Electronic): 9780999241103
DOIs
State: Published - 2017
Event: 26th International Joint Conference on Artificial Intelligence, IJCAI 2017 - Melbourne, Australia
Duration: Aug 19 2017 → Aug 25 2017

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 0

Conference

Conference: 26th International Joint Conference on Artificial Intelligence, IJCAI 2017
Country/Territory: Australia
City: Melbourne
Period: 8/19/17 → 8/25/17

ASJC Scopus subject areas

  • Artificial Intelligence
