| Markov chain Monte Carlo | |
|---|---|
| Name | Markov chain Monte Carlo |
| Class | Probabilistic algorithm |
| First described | 1949 |
| Key people | Nicholas Metropolis, W. Keith Hastings, Stuart Geman, Donald Geman |
| Related fields | Bayesian statistics, Computational physics, Machine learning |
Markov chain Monte Carlo is a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain samples from that distribution by observing the chain after a number of steps. The methods are foundational in Bayesian statistics and are widely used in fields ranging from computational physics to machine learning.
The core idea involves using a Markov chain to generate a sequence of sample values, where the probability of the next value depends only on the current one. This sequence, after a sufficient burn-in period, approximates samples from a target distribution. The approach is particularly powerful for evaluating complex multidimensional integrals common in Bayesian inference, where direct sampling or analytical solutions are infeasible. Early groundbreaking work was conducted at the Los Alamos National Laboratory, leading to the famous Metropolis algorithm.
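The idea above can be sketched as a minimal random-walk Metropolis sampler. This is an illustrative implementation, not code from any particular library; the function name `metropolis_sample` and its parameters are chosen here for the example. It targets a standard normal distribution known only up to a constant, which is exactly the situation (unnormalized density) where MCMC is useful.

```python
import math
import random

def metropolis_sample(log_target, n_steps, x0=0.0, step=1.0, burn_in=1000, seed=0):
    """Random-walk Metropolis: propose x' = x + Normal(0, step), accept with
    probability min(1, p(x')/p(x)), and discard the burn-in prefix."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for i in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        # Acceptance test done in log space: only the ratio p(x')/p(x) matters,
        # so the normalizing constant of the target cancels out.
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
        if i >= burn_in:
            samples.append(x)
    return samples

# Target: standard normal, specified only via its log density -x^2/2.
draws = metropolis_sample(lambda x: -0.5 * x * x, n_steps=20000)
mean = sum(draws) / len(draws)                  # near 0 for this target
var = sum(d * d for d in draws) / len(draws)    # near 1 for this target
```

Note that successive draws are correlated, so the retained samples approximate the target distribution collectively rather than being independent.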
The methodology relies on constructing a Markov chain that is ergodic and satisfies detailed balance with respect to the target distribution. Key theoretical concepts include the Chapman–Kolmogorov equation and the Perron–Frobenius theorem, which underpin the convergence behavior. The Hammersley–Clifford theorem provides a foundational link to Gibbs sampling, another prominent technique. Ensuring the chain is irreducible and aperiodic guarantees that it will eventually converge to the unique stationary distribution.
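Detailed balance can be verified directly on a small finite-state example. The sketch below (illustrative, not from any library) builds the Metropolis transition matrix for a three-state target with a uniform proposal, then checks both detailed balance and the stationarity it implies:

```python
# Hypothetical three-state target distribution.
pi = [0.5, 0.3, 0.2]
n = len(pi)

# Metropolis transition matrix under a uniform proposal over all states:
# P[i][j] = (1/n) * min(1, pi[j]/pi[i]) for j != i, remainder on the diagonal.
P = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(n):
        if i != j:
            P[i][j] = (1.0 / n) * min(1.0, pi[j] / pi[i])
    P[i][i] = 1.0 - sum(P[i])

# Detailed balance: pi[i] * P[i][j] == pi[j] * P[j][i] for every pair of states.
for i in range(n):
    for j in range(n):
        assert abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < 1e-12

# Stationarity follows: applying P leaves pi unchanged (pi P = pi).
pi_next = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
```

Detailed balance is a sufficient (not necessary) condition for stationarity; summing the balance equation over `i` gives `pi P = pi` immediately.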
The Metropolis–Hastings algorithm, developed by Nicholas Metropolis and extended by W. Keith Hastings, is the most general and widely used framework. Gibbs sampling, popularized by Stuart Geman and Donald Geman, is a special case particularly efficient when the full conditional distributions are easy to sample. More recent developments include the Hamiltonian Monte Carlo method, which incorporates concepts from Hamiltonian dynamics, and the No-U-Turn Sampler developed by Matthew D. Hoffman and Andrew Gelman. Other variants include slice sampling and reversible-jump MCMC.
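Gibbs sampling is easiest to see on a case where the full conditionals are known exactly. The sketch below (an illustrative example, with the function name and parameters invented here) samples a bivariate standard normal with correlation rho, whose conditionals are x | y ~ N(rho*y, 1 - rho^2) and y | x ~ N(rho*x, 1 - rho^2):

```python
import math
import random

def gibbs_bivariate_normal(rho, n_steps, burn_in=1000, seed=0):
    """Gibbs sampling alternates exact draws from each full conditional,
    so no accept/reject step is needed."""
    rng = random.Random(seed)
    x = y = 0.0
    cond_sd = math.sqrt(1.0 - rho * rho)
    samples = []
    for i in range(n_steps):
        x = rng.gauss(rho * y, cond_sd)   # draw x given the current y
        y = rng.gauss(rho * x, cond_sd)   # draw y given the updated x
        if i >= burn_in:
            samples.append((x, y))
    return samples

pairs = gibbs_bivariate_normal(rho=0.8, n_steps=20000)
# For zero-mean, unit-variance marginals, E[xy] estimates the correlation.
est_corr = sum(a * b for a, b in pairs) / len(pairs)
```

Because each update conditions on the latest value of the other coordinate, strong correlation between coordinates slows mixing, which is one motivation for Hamiltonian methods.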
These methods revolutionized Bayesian statistics, enabling practical computation of posterior distributions for complex models in fields like genetics and pharmacology. In computational physics, they are essential for simulations in lattice gauge theory and statistical mechanics. Within machine learning, they are used for training restricted Boltzmann machines and in deep learning for approximate Bayesian neural networks. Additional applications span astrophysics, ecology, and financial mathematics.
Assessing convergence is critical, as chains may require a long burn-in period before sampling from the stationary distribution. Common diagnostic tools include the Gelman–Rubin statistic, which compares multiple chains, and plots of autocorrelation functions. The effective sample size quantifies the number of independent samples obtained. Theoretical work on mixing time and spectral gap provides insights into convergence rates, while practical checks often involve monitoring trace plots and Geweke's diagnostic.
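A minimal version of the Gelman–Rubin diagnostic can be sketched as follows (this follows the commonly published form of the statistic; the function name is chosen here for illustration). It pools the within-chain variance W and the between-chain variance B; values of the resulting R-hat close to 1 suggest the chains have mixed:

```python
import math
import random
import statistics

def gelman_rubin(chains):
    """Gelman-Rubin potential scale reduction factor for equal-length chains:
    compares between-chain and within-chain variance estimates."""
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    W = statistics.fmean(statistics.variance(c) for c in chains)  # within-chain
    B = n * statistics.variance(means)                            # between-chain
    var_hat = (n - 1) / n * W + B / n   # pooled variance estimate
    return math.sqrt(var_hat / W)

# Four independent chains drawn from the same target should give R-hat near 1.
rng = random.Random(1)
chains = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
rhat = gelman_rubin(chains)
```

In practice the chains would come from an MCMC sampler started at dispersed initial values; an R-hat well above 1 indicates the chains are still exploring different regions.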
The origins trace to the Monte Carlo work of Stanislaw Ulam and John von Neumann at Los Alamos in the late 1940s, with the first concrete algorithm, the Metropolis algorithm, published in 1953 by Nicholas Metropolis and colleagues at the Los Alamos National Laboratory. The field expanded significantly with the 1970 paper by W. Keith Hastings generalizing the method. The introduction of Gibbs sampling in 1984 by Stuart and Donald Geman and its adoption in Bayesian image analysis by Julian Besag spurred widespread use in statistics. The 1990s saw further refinement with the BUGS project and the growing adoption of Hamiltonian Monte Carlo, originally introduced as hybrid Monte Carlo in lattice field theory.
Category:Monte Carlo methods Category:Markov chains Category:Computational statistics