Achieving Accurate Prediction Models: Less is Almost Always More

John D. Morris; Mary G. Lieberman

Vol. 33 No. 2 (2007), Articles

Published 2007-02-01

John D. Morris

Florida Atlantic University

Mary G. Lieberman

Florida Atlantic University

Keywords

Regression analysis

Abstract

Cross-validated prediction accuracy is posited as the ultimate criterion for prediction model performance. This study investigates and demonstrates, across a wide variety of data sets, the nearly ubiquitous benefit of optimal subset selection to classification model accuracy. In contrast to the popular “stepwise” methods often used (and abused) in the literature, the study takes cross-validated performance over all possible subsets as its only criterion of accuracy. The superiority of variable subsets is demonstrated for predictive discriminant analysis and logistic regression. Computer programs are also made available.
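The idea of scoring every predictor subset by its cross-validated classification accuracy can be illustrated with a minimal sketch. This is not the authors' program: the function names, the toy data, and the choice of leave-one-out cross-validation are assumptions made for illustration, and a simple nearest-class-mean rule stands in for predictive discriminant analysis or logistic regression so that the example needs only the Python standard library.

```python
from itertools import combinations


def nearest_mean_predict(train_X, train_y, x):
    """Classify x by the closest class mean (a stand-in classifier, an assumption)."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label in sorted(sums):
        mean = [s / counts[label] for s in sums[label]]
        dist = sum((a - b) ** 2 for a, b in zip(x, mean))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_hit_rate(X, y, subset):
    """Leave-one-out cross-validated proportion of correct classifications."""
    hits = 0
    for i in range(len(X)):
        # Fit on every case except case i, using only the chosen predictors.
        train_X = [[row[j] for j in subset] for k, row in enumerate(X) if k != i]
        train_y = [label for k, label in enumerate(y) if k != i]
        held_out = [X[i][j] for j in subset]
        if nearest_mean_predict(train_X, train_y, held_out) == y[i]:
            hits += 1
    return hits / len(X)


def best_subset(X, y):
    """Score every non-empty predictor subset; return the best and the full list."""
    p = len(X[0])
    results = []
    for size in range(1, p + 1):
        for subset in combinations(range(p), size):
            results.append((loo_hit_rate(X, y, subset), subset))
    # Highest hit rate first; ties go to the smaller subset.
    results.sort(key=lambda r: (-r[0], len(r[1]), r[1]))
    return results[0], results


# Hypothetical toy data: predictor 0 separates the groups, predictor 1 is noise.
X = [[0.0, 5], [0.2, -5], [0.4, 4], [0.1, -4],
     [1.0, -5], [1.2, 5], [0.9, -4], [1.1, 4]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

best, all_results = best_subset(X, y)
print("best subset:", best[1], "hit rate:", best[0])
print("full model hit rate:", loo_hit_rate(X, y, (0, 1)))
```

In this toy case the single informative predictor classifies every held-out case correctly, while the full two-predictor model does not. The search evaluates 2^p - 1 subsets, so an exhaustive sketch like this is practical only for a modest number of predictors.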

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.