
Analysis of Robust Measures in Random Forest Regression


Analysis of Robust Measures in Random Forest Regression is an extensive empirical analysis of a new method, Robust Random Forest Regression (RRFR). The application and analysis of this tree-based method have yet to be addressed and may provide additional insight into modeling complex data. Our approach is based on Random Forest Regression (RFR), with two major differences: a robust prediction statistic and a robust error statistic. The current methodology uses the node mean for prediction and the mean squared error (MSE) to derive the in-node and overall error. Herein, we introduce and assess the use of the node median for prediction and the mean absolute deviation (MAD) to derive the in-node and overall error. Extensive research has shown that the median is a better estimate of the center of a distribution in the presence of large or unbounded outliers, because the median inherently ignores these outliers and bases its prediction on the ordered, central value(s) of the data. We have shown that RRFR performs well under extreme conditions, namely with datasets that include unbounded outliers or heteroscedasticity.
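The contrast between the two pairs of node statistics can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `node_summary` and the sample values are hypothetical, and MAD is assumed here to mean the mean absolute deviation about the node median, which the abstract does not specify.

```python
from statistics import mean, median


def node_summary(values):
    """Return (mean, MSE, median, MAD) for the responses in a single node.

    Assumption: MAD is the mean absolute deviation about the node median.
    """
    node_mean = mean(values)
    # Classical in-node error: mean squared deviation from the node mean.
    mse = mean((v - node_mean) ** 2 for v in values)
    node_median = median(values)
    # Robust in-node error: mean absolute deviation from the node median.
    mad = mean(abs(v - node_median) for v in values)
    return node_mean, mse, node_median, mad


# Hypothetical node responses, without and with one extreme outlier.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 500.0]

for label, node in (("clean", clean), ("contaminated", contaminated)):
    node_mean, mse, node_median, mad = node_summary(node)
    print(f"{label}: mean={node_mean}, MSE={mse}, median={node_median}, MAD={mad}")
```

In this toy node, the single outlier moves the mean from 3.0 to 102.0 while the median stays at 3.0. The absolute-deviation error also grows, but only linearly in the size of the outlier, whereas the squared error grows quadratically.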


Attached file: RandomForrestRegression.pdf (378.17 kB; attached 03:32, 11 Jun 2008 by Admin)