Journal of the Royal Statistical Society. Series B. Vol. 39, No. 1, pp. 1-38
A. P. Dempster, N. M. Laird, and D. B. Rubin
Abstract: A broadly applicable algorithm for computing maximum likelihood estimates from incomplete data is presented at various levels of generality. Theory showing the monotone behaviour of the likelihood and convergence of the algorithm is derived. Many examples are sketched, including missing value situations, applications to grouped, censored or truncated data, finite mixture models, variance component estimation, hyperparameter estimation, iteratively reweighted least squares and factor analysis.