Systematically Missing Data and Multiple Regression Analysis: An Empirical Comparison of Deletion and Imputation Techniques

Keywords

Regression analysis

Abstract

The purpose of this study was to investigate, within the context of a two-predictor multiple regression analysis with systematically missing data, the effectiveness of eight missing data treatments on the sample estimate of R² and each standardized regression coefficient. Furthermore, the study investigated whether sample size, proportion of systematically missing data above the mean of the regressor, and the percentage of missing data affected the effectiveness of the eight missing data treatments. One thousand samples of size 50, 100, and 200 were generated per data set. The percentages of missing data were 0%, 10%, 20%, 30%, 40%, 50%, and 60%, occurring either on one regressor or across both regressors. The proportions of missing data that were above the mean value of the regressors were 0.60, 0.70, 0.80, or 0.90. The data were analyzed by computing effect sizes obtained from the missing data treatment conditions relative to the complete sample condition (i.e., 0% missing data). The results suggest that the stochastic multiple regression imputation technique was the most effective treatment of the missing data. Listwise and pairwise deletion approaches were less effective than stochastic multiple regression imputation but were superior to the other techniques examined.
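
To make the design concrete, the following is a minimal sketch of one cell of such a simulation: a single sample with 30% of one regressor deleted, 80% of the deletions falling above the regressor's mean, treated by listwise deletion and by stochastic regression imputation. It is an illustration only and not the authors' code. The population correlations, the function names, the use of NumPy, and the choice to include the criterion y in the imputation model are all assumptions made for the example.

```python
import numpy as np

def make_sample(n, rng):
    # Assumed population correlations (order: x1, x2, y); illustrative only.
    corr = np.array([[1.0, 0.3, 0.5],
                     [0.3, 1.0, 0.5],
                     [0.5, 0.5, 1.0]])
    data = rng.multivariate_normal(np.zeros(3), corr, size=n)
    return data[:, 0], data[:, 1], data[:, 2]

def delete_systematically(x, pct_missing, prop_above_mean, rng):
    # Remove pct_missing of the values; prop_above_mean of the removed
    # values are drawn from cases above the sample mean of x.
    x = x.copy()
    n_missing = int(round(pct_missing * x.size))
    n_above = int(round(prop_above_mean * n_missing))
    above = np.flatnonzero(x > x.mean())
    below = np.flatnonzero(x <= x.mean())
    n_above = min(n_above, above.size)
    n_below = min(n_missing - n_above, below.size)
    drop = np.concatenate([rng.choice(above, n_above, replace=False),
                           rng.choice(below, n_below, replace=False)])
    x[drop] = np.nan
    return x

def fit_ols(X, y):
    # Ordinary least squares with an intercept; returns coefficients and residuals.
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta, y - X1 @ beta

def r_squared(X, y):
    _, resid = fit_ols(X, y)
    return 1.0 - resid.var() / y.var()

def stochastic_regression_impute(x1, x2, y, rng):
    # Predict missing x1 from x2 and y (an assumed imputation model),
    # then add a normal residual so imputed values keep realistic spread.
    miss = np.isnan(x1)
    others = np.column_stack([x2, y])
    beta, resid = fit_ols(others[~miss], x1[~miss])
    sd = resid.std(ddof=others.shape[1] + 1)
    filled = x1.copy()
    filled[miss] = beta[0] + others[miss] @ beta[1:] + rng.normal(0.0, sd, miss.sum())
    return filled

rng = np.random.default_rng(0)
x1, x2, y = make_sample(100, rng)
x1_miss = delete_systematically(x1, 0.30, 0.80, rng)

keep = ~np.isnan(x1_miss)
r2_complete = r_squared(np.column_stack([x1, x2]), y)
r2_listwise = r_squared(np.column_stack([x1_miss[keep], x2[keep]]), y[keep])
x1_imp = stochastic_regression_impute(x1_miss, x2, y, rng)
r2_imputed = r_squared(np.column_stack([x1_imp, x2]), y)

print(f"complete: {r2_complete:.3f}  listwise: {r2_listwise:.3f}  imputed: {r2_imputed:.3f}")
```

A single sample like this says little on its own. The study's comparison rests on repeating the procedure over one thousand samples per condition and expressing each treatment's estimates as effect sizes relative to the complete-sample condition.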

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Copyright (c) 1998 Lantry L. Brockmeier, Jeffrey D. Kromrey, Constance V. Hines (Author)
