TY - GEN
T1 - Using a random forest to inspire a neural network and improving on it
AU - Wang, Suhang
AU - Aggarwal, Charu
AU - Liu, Huan
N1 - Funding Information: This material is based upon work supported by, or in part by, the NSF grants #1614576 and IIS-1217466, and the ONR grant N00014-16-1-2257. Publisher Copyright: Copyright © by SIAM.
PY - 2017
Y1 - 2017
N2 - Neural networks have become very popular in recent years because of the striking success of deep learning in domains such as image and speech recognition. In many of these domains, specific neural network architectures, such as convolutional networks, fit the particular structure of the problem domain very well and can therefore perform extremely effectively. However, the success of neural networks is not universal across all domains. Indeed, for learning problems without any special structure, or in cases where the data is somewhat limited, neural networks are known to underperform relative to traditional machine learning methods such as random forests. In this paper, we show that a carefully designed neural network with a random forest structure can have better generalization ability. In fact, this architecture is more powerful than random forests, because the back-propagation algorithm reduces to a more powerful and generalized way of constructing a decision tree. Furthermore, the approach is efficient to train, requiring only a small constant factor of the number of training examples. This efficiency allows multiple neural networks to be trained in order to improve the generalization accuracy. Experimental results on 10 real-world benchmark datasets demonstrate the effectiveness of the proposed enhancements.
UR - http://www.scopus.com/inward/record.url?scp=85027852106&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85027852106&partnerID=8YFLogxK
DO - 10.1137/1.9781611974973.1
M3 - Conference contribution
T3 - Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017
SP - 1
EP - 9
BT - Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017
A2 - Chawla, Nitesh
A2 - Wang, Wei
PB - Society for Industrial and Applied Mathematics Publications
T2 - 17th SIAM International Conference on Data Mining, SDM 2017
Y2 - 27 April 2017 through 29 April 2017
ER -