Handling Missing Data: Deletion, EM and Multiple Imputation

A practical missing data analysis guide: MCAR, MAR and MNAR mechanisms, Little's test, why deletion fails, EM, multiple imputation and reporting rules.

Almost every field study comes back with gaps in the data; the real question is never whether values are missing but what you do about them. A sound missing data analysis starts by diagnosing the missingness mechanism and lets that diagnosis dictate the method. The default reflex — listwise deletion — usually throws away statistical power and biases the estimates at the same time. This guide walks through the decision chain from mechanism diagnosis to the EM algorithm and multiple imputation.

Three missingness mechanisms: MCAR, MAR, MNAR

How you should respond to missing values depends on why they are missing. The literature distinguishes three mechanisms:

  • MCAR (missing completely at random): missingness is unrelated to any variable. Example: the final page of the questionnaire was omitted from some booklets by a printing error. The most benign — and the rarest — case.
  • MAR (missing at random): missingness can be explained by observed variables in the data set. Example: older respondents skip the technology-use items more often, and age is recorded. EM and multiple imputation are valid here.
  • MNAR (missing not at random): missingness depends on the missing value itself. Example: high earners leave the income question blank precisely because their income is high. Standard methods are not enough; sensitivity analyses are required.
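To make the three mechanisms concrete, here is a minimal simulation sketch. The variables (age, income), the missingness probabilities and the sample size are all invented for illustration; the only point is how the complete-case mean behaves under each mechanism.

```python
import random
import statistics

random.seed(42)

# Hypothetical population: age is always observed, income may go missing.
n = 20_000
age = [random.uniform(20, 70) for _ in range(n)]
income = [1000 + 40 * a + random.gauss(0, 400) for a in age]

def observed_mean(values, keep):
    """Mean of the values whose keep-flag is True (a complete-case mean)."""
    return statistics.fmean([v for v, k in zip(values, keep) if k])

# MCAR: every income value has the same 30% chance of being missing.
keep_mcar = [random.random() > 0.30 for _ in range(n)]
# MAR: missingness depends only on an observed variable (age).
keep_mar = [random.random() > (0.50 if a > 45 else 0.10) for a in age]
# MNAR: missingness depends on the unobserved income value itself.
cutoff = statistics.median(income)
keep_mnar = [random.random() > (0.50 if y > cutoff else 0.10) for y in income]

true_mean = statistics.fmean(income)
bias_mcar = observed_mean(income, keep_mcar) - true_mean
bias_mar = observed_mean(income, keep_mar) - true_mean
bias_mnar = observed_mean(income, keep_mnar) - true_mean

print(f"MCAR bias: {bias_mcar:8.1f}")
print(f"MAR  bias: {bias_mar:8.1f}")
print(f"MNAR bias: {bias_mnar:8.1f}")
```

In this sketch the complete-case mean is close to the truth only under MCAR. Under MAR the bias could be removed by a method that uses age; under MNAR no observed variable carries the information needed to correct it.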

Little's MCAR test and diagnosing the mechanism

Little's MCAR test, available in SPSS's Missing Value Analysis (MVA) module, examines whether the missingness pattern is consistent with MCAR: p > 0.05 means MCAR cannot be rejected, while p < 0.05 signals systematic missingness. Two caveats matter. First, the test cannot distinguish MAR from MNAR — that distinction is untestable from the data and must be argued from the study design. Second, in large samples the test flags even trivial departures as significant. So supplement it by regressing missingness indicators (0/1) on the other variables and describing exactly who tends to have gaps.
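The supplementary check described above can be sketched as follows. The data are simulated and the variable names are hypothetical; a simple least-squares regression of the 0/1 indicator on one predictor (a linear probability model) stands in for the logistic regression you would more likely fit in practice.

```python
import random

random.seed(7)

# Hypothetical survey: age is fully observed, and older respondents
# skip the technology-use item more often.
n = 5_000
age = [random.uniform(18, 80) for _ in range(n)]
p_skip = [0.05 + 0.006 * (a - 18) for a in age]   # 5% at age 18, about 42% at age 80
missing = [1 if random.random() < p else 0 for p in p_skip]

def ols_line(x, y):
    """Slope and intercept of a simple least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Regress the missingness indicator (0/1) on age.
slope, intercept = ols_line(age, missing)
print(f"P(missing) changes by about {slope:.4f} per year of age")
```

A clearly non-zero slope says that missingness is related to an observed variable, which rules out MCAR and tells you which variable belongs in the imputation model.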

Why deletion and mean imputation fail

Listwise deletion removes a participant entirely if any variable is missing. In a ten-variable model, even 5 per cent independent missingness per variable can melt away 40 per cent of the sample, a dramatic loss of power. Worse, under MAR the remaining cases form a selected subsample, so the coefficient estimates can be biased. Pairwise deletion uses a different n for every analysis and can produce inconsistent, even non-positive-definite covariance matrices. Mean imputation is the worst of all: it piles an artificial spike at the centre of the distribution, shrinks the variance, pulls correlations towards zero and makes standard errors look smaller than they are. There is no defensible use case for mean imputation in the modern literature. Note, too, that listwise deletion is the default behaviour of SPSS's regression and ANOVA procedures: unless you actively choose another method and report it in the methods section, you have made that choice without noticing.
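Two of the claims above are easy to check numerically. The sketch below first reproduces the sample-loss arithmetic for ten variables with 5 per cent independent missingness each, then shows how mean imputation shrinks the standard deviation. The simulated scores and the 30 per cent missingness rate are assumptions made for illustration.

```python
import random
import statistics

def complete_case_share(p_missing, n_variables):
    """Share of cases with no missing value when each variable is
    independently missing with probability p_missing."""
    return (1 - p_missing) ** n_variables

share = complete_case_share(0.05, 10)
print(f"complete cases: {share:.1%}, lost: {1 - share:.1%}")
# prints: complete cases: 59.9%, lost: 40.1%

# Mean imputation: fill 600 of 2,000 hypothetical scores with the observed mean.
random.seed(1)
values = [random.gauss(50, 10) for _ in range(2_000)]
observed = values[:1_400]                      # treat the last 30% as missing
filled = observed + [statistics.fmean(observed)] * 600

sd_observed = statistics.stdev(observed)
sd_filled = statistics.stdev(filled)
print(f"SD observed: {sd_observed:.2f}, SD after mean imputation: {sd_filled:.2f}")
```

The filled-in values add cases without adding any spread, so the standard deviation drops by a factor of about 0.84 here, and every statistic built on it inherits the distortion.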

Figure: Absolute relative bias in a regression coefficient (%) under a MAR scenario (illustrative simulation values)
Listwise deletion 18; pairwise deletion 15; mean imputation 35; EM 6; multiple imputation 3

How EM and multiple imputation work

The EM (Expectation–Maximisation) algorithm iterates two steps until convergence: in the E-step, expected values of the missing data (more precisely, of the sums and cross-products that involve them) are computed from the current parameter estimates; in the M-step, the parameters (means, covariances) are re-estimated using those expectations. The result is efficient parameter estimation under MAR. Its weakness is that it yields a single completed data set carrying no imputation uncertainty, so standard errors computed from EM-filled data are too small.
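The two steps can be sketched for the simplest case: two normally distributed variables, where x is complete and y is missing more often when x is high (MAR). The data, the missingness probabilities and the convergence tolerance are assumptions chosen for illustration; this is not how SPSS or any particular package implements EM.

```python
import random

random.seed(3)

# Hypothetical data: x always observed; y goes missing more often when x > 0.
n = 5_000
x = [random.gauss(0, 1) for _ in range(n)]
y_full = [0.8 * xi + random.gauss(0, 0.6) for xi in x]      # true mean of y is 0
y = [None if random.random() < (0.8 if xi > 0 else 0.1) else yi
     for xi, yi in zip(x, y_full)]

# x is complete, so its mean and variance are estimated once from all rows.
mu_x = sum(x) / n
s_xx = sum((xi - mu_x) ** 2 for xi in x) / n

# Starting values for y's parameters come from the complete cases.
obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
mu_y = sum(yi for _, yi in obs) / len(obs)
cc_mean_y = mu_y                                            # kept for comparison
s_yy = sum((yi - mu_y) ** 2 for _, yi in obs) / len(obs)
s_xy = sum((xi - mu_x) * (yi - mu_y) for xi, yi in obs) / len(obs)

for iteration in range(500):
    beta = s_xy / s_xx                  # current slope of y on x
    resid_var = s_yy - beta * s_xy      # current Var(y | x)
    # E-step: expected y, y^2 and x*y per row, given the current parameters.
    sum_y = sum_yy = sum_xy = 0.0
    for xi, yi in zip(x, y):
        if yi is None:
            ey = mu_y + beta * (xi - mu_x)
            eyy = ey * ey + resid_var   # E[y^2] = E[y]^2 + Var(y | x)
        else:
            ey, eyy = yi, yi * yi
        sum_y += ey
        sum_yy += eyy
        sum_xy += xi * ey
    # M-step: re-estimate mean, variance and covariance from the expectations.
    new_mu_y = sum_y / n
    new_s_yy = sum_yy / n - new_mu_y ** 2
    new_s_xy = sum_xy / n - mu_x * new_mu_y
    change = max(abs(new_mu_y - mu_y), abs(new_s_yy - s_yy), abs(new_s_xy - s_xy))
    mu_y, s_yy, s_xy = new_mu_y, new_s_yy, new_s_xy
    if change < 1e-10:
        break

print(f"complete-case mean of y: {cc_mean_y:.3f}")
print(f"EM estimate of mean y:   {mu_y:.3f}")
```

Note the `resid_var` term in the E-step: filling in only the conditional mean and ignoring that residual variance would understate the variance of y in much the same way mean imputation does.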

Multiple imputation fixes that gap: missing values are filled in m times with a model that includes a random component (current advice: at least m = 20, more when the missingness rate is high), the analysis is run on each completed data set, and the results are combined using Rubin's rules. Rubin's rules combine the average within-imputation variance W and the between-imputation variance B into a total variance T = W + (1 + 1/m)B, so the uncertainty created by missingness is honestly reflected in the standard errors. The imputation model should include every variable used in the analysis plus auxiliary variables that predict missingness. After imputation, diagnostic plots comparing the distributions of imputed and observed values are the practical way to show the imputation model behaved sensibly.
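The pooling step is short enough to sketch by hand. The function name and the input numbers below are hypothetical, and m = 3 is used only to keep the arithmetic readable; the function takes the per-imputation estimates and their squared standard errors and returns the pooled estimate with the within, between and total variance.

```python
import statistics

def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)        # W: average sampling variance
    between = statistics.variance(estimates)    # B: variance across imputations
    total = within + (1 + 1 / m) * between      # T = W + (1 + 1/m) * B
    return pooled, within, between, total

# Hypothetical regression coefficient from three imputed data sets.
estimates = [1.0, 1.2, 0.8]
variances = [0.04, 0.05, 0.03]                  # squared standard errors

pooled, within, between, total = pool_rubin(estimates, variances)
pooled_se = total ** 0.5
print(f"pooled estimate {pooled:.2f}, pooled SE {pooled_se:.3f}")
```

With these numbers W and B are both 0.04, so the total variance is about 0.093 and the pooled standard error is noticeably larger than the roughly 0.2 a single imputed data set would report.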

Comparison of missing data methods
Method | Valid under | Pros | Cons
Listwise deletion | MCAR only (at low missingness rates) | Simple, available everywhere | Severe power loss; biased under MAR
Pairwise deletion | MCAR only | Uses more of the data | Inconsistent n across analyses; risk of broken covariance matrices
Mean imputation | None (indefensible) | Fast | Shrinks variance, distorts correlations, deflates standard errors
EM algorithm | MCAR and MAR | Efficient parameter estimates | Single data set; standard errors too small
Multiple imputation (MI) | MCAR and MAR | Reflects uncertainty; valid standard errors | Computational and reporting overhead; careful model-building needed

Reporting your missing data analysis

Reviewers look for a minimum set of four elements: (1) percentages of missingness per variable and the number of missing-data patterns, (2) a mechanism assessment (Little's test result plus a reasoned MAR argument), (3) the method and its details (for MI: the number of imputations m, the variables in the imputation model, the software), and (4) a sensitivity comparison against the complete-case analysis. Where MNAR is a serious concern (sensitive questions such as income or substance use), sensitivity analyses such as delta-adjusted multiple imputation strengthen the report by showing how robust the conclusions are to the missingness assumption. On the software side, SPSS handles diagnosis in the MVA module and imputation plus pooling in the Multiple Imputation menu; in R, the mice package is the field standard and its chained-equations approach handles categorical variables flexibly. If you are unsure which analysis your design calls for, see our APA 7 reporting guide for how the results should appear in print.
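The first reporting element, missingness per variable and the number of missing-data patterns, can be tabulated in a few lines. The records and variable names below are invented for illustration, with None marking a missing value.

```python
from collections import Counter

# Hypothetical survey records; None marks a missing value.
rows = [
    {"age": 34, "income": 4200, "tech_use": 3},
    {"age": 61, "income": None, "tech_use": None},
    {"age": 47, "income": None, "tech_use": 4},
    {"age": 29, "income": 3100, "tech_use": 5},
    {"age": 70, "income": 2800, "tech_use": None},
    {"age": 52, "income": None, "tech_use": 2},
]
variables = ["age", "income", "tech_use"]

# Percentage of missing values per variable.
pct_missing = {
    v: 100 * sum(r[v] is None for r in rows) / len(rows) for v in variables
}

# A pattern is the tuple of missing/observed flags across the variables.
patterns = Counter(tuple(r[v] is None for v in variables) for r in rows)
n_patterns = len(patterns)

# Percentage of cases with at least one missing value.
n_incomplete = sum(count for flags, count in patterns.items() if any(flags))
pct_incomplete = 100 * n_incomplete / len(rows)

for v in variables:
    print(f"{v:9s} {pct_missing[v]:5.1f}% missing")
print(f"{n_patterns} patterns, {pct_incomplete:.1f}% incomplete cases")
```

The same three numbers (per-variable rates, pattern count, share of incomplete cases) are what the methods section needs before any imputation details.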

Deleting cases does not solve a missing data problem; it merely hides it and converts it into bias.

Frequently Asked Questions

What percentage of missing data is acceptable?

There is no single percentage threshold; scattered missingness below 5 per cent rarely changes the conclusions whatever method you use. The mechanism matters more than the rate: 3 per cent MNAR missingness can be more dangerous than 15 per cent MAR. Above roughly 10 per cent, multiple imputation plus a sensitivity analysis is the standard expectation.

Little's MCAR test came out significant — does that mean MNAR?

No. A significant result only means MCAR is rejected; the data cannot distinguish MAR from MNAR. If observed variables can explain the missingness, the MAR assumption is defensible and multiple imputation remains valid.

How many imputed data sets (m) should I generate?

The old m = 5 advice is outdated; the current standard is at least m = 20. A practical rule is to match m to the percentage of incomplete cases: with 30 per cent missingness, choosing around m = 30 stabilises the pooled standard errors.
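One way to turn that rule of thumb into a default, assuming the floor of 20 and the percentage rule are simply combined (the function name is hypothetical):

```python
import math

def suggested_m(pct_incomplete_cases, floor=20):
    """Number of imputations: the percentage of incomplete cases,
    rounded up, but never below the floor."""
    return max(floor, math.ceil(pct_incomplete_cases))

print(suggested_m(30))   # 30
print(suggested_m(8))    # 20
```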

What missing data support does Celsus offer?

Celsus provides end-to-end support: missingness pattern diagnosis and Little's test, mechanism justification, setting up multiple imputation in SPSS or R mice, pooling via Rubin's rules, and writing the methods section for your thesis or journal article. Every step is delivered as reproducible syntax.
