TY - GEN
T1 - CoSelect
T2 - SIAM International Conference on Data Mining, SDM 2013
AU - Tang, Jiliang
AU - Liu, Huan
N1 - Publisher Copyright: Copyright © SIAM.
PY - 2013
Y1 - 2013
AB - Feature selection is widely used in preparing high-dimensional data for effective data mining. Attribute-value data in traditional feature selection differs from social media data, although both can be large-scale. Social media data is inherently not independent and identically distributed (i.i.d.), but linked. Furthermore, there is a lot of noise. The quality of social media data can vary drastically. These unique properties present challenges as well as opportunities for feature selection. Motivated by these differences, we propose a novel feature selection framework, CoSelect, for social media data. In particular, CoSelect can exploit link information by applying social correlation theories, incorporate instance selection with feature selection, and select relevant instances and features simultaneously. Experimental results on real-world social media datasets demonstrate the effectiveness of our proposed framework and its potential in mining social media data.
UR - http://www.scopus.com/inward/record.url?scp=84942434123&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84942434123&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972832.77
DO - 10.1137/1.9781611972832.77
M3 - Conference contribution
T3 - Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013
SP - 695
EP - 703
BT - Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013
A2 - Ghosh, Joydeep
A2 - Obradovic, Zoran
A2 - Dy, Jennifer
A2 - Zhou, Zhi-Hua
A2 - Kamath, Chandrika
A2 - Parthasarathy, Srinivasan
PB - Society for Industrial and Applied Mathematics (SIAM)
Y2 - 2 May 2013 through 4 May 2013
ER -