TY - GEN
T1 - Feature selection with linked data in social media
AU - Tang, Jiliang
AU - Liu, Huan
PY - 2012
Y1 - 2012
N2 - Feature selection is widely used in preparing high-dimensional data for effective data mining. Increasingly popular social media data presents new challenges to feature selection. Social media data consists of (1) traditional high-dimensional, attribute-value data such as posts, tweets, comments, and images, and (2) linked data that describes the relationships between social media users as well as who post the posts, etc. The nature of social media also determines that its data is massive, noisy, and incomplete, which exacerbates the already challenging problem of feature selection. In this paper, we illustrate the differences between attribute-value data and social media data, investigate if linked data can be exploited in a new feature selection framework by taking advantage of social science theories, extensively evaluate the effects of user-user and user-post relationships manifested in linked data on feature selection, and discuss some research issues for future work.
AB - Feature selection is widely used in preparing high-dimensional data for effective data mining. Increasingly popular social media data presents new challenges to feature selection. Social media data consists of (1) traditional high-dimensional, attribute-value data such as posts, tweets, comments, and images, and (2) linked data that describes the relationships between social media users as well as who post the posts, etc. The nature of social media also determines that its data is massive, noisy, and incomplete, which exacerbates the already challenging problem of feature selection. In this paper, we illustrate the differences between attribute-value data and social media data, investigate if linked data can be exploited in a new feature selection framework by taking advantage of social science theories, extensively evaluate the effects of user-user and user-post relationships manifested in linked data on feature selection, and discuss some research issues for future work.
UR - http://www.scopus.com/inward/record.url?scp=84880191846&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880191846&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972825.11
DO - 10.1137/1.9781611972825.11
M3 - Conference contribution
SN - 9781611972320
T3 - Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012
SP - 118
EP - 128
BT - Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012
PB - Society for Industrial and Applied Mathematics Publications
T2 - 12th SIAM International Conference on Data Mining, SDM 2012
Y2 - 26 April 2012 through 28 April 2012
ER -