Abstract
Mallows’ Cp is used herein to select maximally accurate subsets of predictor variables in logistic regression. Across a wide variety of data sets, leave-one-out cross-validated prediction accuracy, posited as the ultimate criterion for model performance, is used to compare the subsets selected by Mallows’ Cp with the optimal subsets. Losses in accuracy ranged from none in several data sets to a maximum of 10%. The performance of Cp-selected subsets can therefore be viewed as promising. It is also posited that one should consider parsimony and the richness of multiple optimal models.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright (c) 2008 Mary G. Lieberman, John D. Morris (Author)