Feature selection with linked data in social media

Jiliang Tang, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

90 Scopus citations

Abstract

Feature selection is widely used in preparing high-dimensional data for effective data mining. Increasingly popular social media data presents new challenges to feature selection. Social media data consists of (1) traditional high-dimensional, attribute-value data such as posts, tweets, comments, and images, and (2) linked data that describes the relationships between social media users as well as who post the posts, etc. The nature of social media also determines that its data is massive, noisy, and incomplete, which exacerbates the already challenging problem of feature selection. In this paper, we illustrate the differences between attribute-value data and social media data, investigate if linked data can be exploited in a new feature selection framework by taking advantage of social science theories, extensively evaluate the effects of user-user and user-post relationships manifested in linked data on feature selection, and discuss some research issues for future work.

Original language: English (US)
Title of host publication: Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 118-128
Number of pages: 11
ISBN (Print): 9781611972320
State: Published - 2012
Event: 12th SIAM International Conference on Data Mining, SDM 2012 - Anaheim, CA, United States
Duration: Apr 26 2012 to Apr 28 2012

Publication series

Name: Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012

Conference

Conference: 12th SIAM International Conference on Data Mining, SDM 2012
Country/Territory: United States
City: Anaheim, CA
Period: 4/26/12 to 4/28/12

ASJC Scopus subject areas

  • Computer Science Applications
