| Metropolis–Hastings algorithm | |
|---|---|
| Name | Metropolis–Hastings algorithm |
| Class | Markov chain Monte Carlo |
| Year | 1953, 1970 |
| Authors | Nicholas Metropolis, W. Keith Hastings |
| Time | Varies |
| Space | Varies |
The Metropolis–Hastings algorithm is a foundational Markov chain Monte Carlo method for obtaining a sequence of random samples from a probability distribution when direct sampling is difficult. Developed from work by Nicholas Metropolis and extended by W. Keith Hastings, it constructs a Markov chain that has the desired distribution as its stationary distribution. This algorithm is a cornerstone of Bayesian statistics and computational physics, enabling inference on complex probabilistic models.
The algorithm was first introduced in a 1953 paper by Nicholas Metropolis, Arianna W. Rosenbluth, Marshall Rosenbluth, Augusta H. Teller, and Edward Teller for statistical mechanical problems. W. Keith Hastings later generalized it in 1970, leading to its modern form. It is a key method within the broader class of Markov chain Monte Carlo techniques, which are essential for performing integration and optimization in high-dimensional spaces. The algorithm's versatility allows it to sample from distributions known only up to a normalizing constant, a common scenario in Bayesian inference.
The core idea is to construct a Markov chain whose states are samples from a target distribution. This is achieved by using a proposal distribution to suggest new states and a carefully designed acceptance probability to decide whether to transition. The acceptance rule ensures the chain satisfies detailed balance, a sufficient condition for the target to be the stationary distribution. The choice of proposal distribution, such as a Gaussian distribution or Cauchy distribution, is critical for efficiency but does not alter the theoretical convergence guarantees.
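Writing α for the acceptance probability, the detailed balance condition enforced by the acceptance rule states that the probability flow from x to x′ equals the reverse flow:

```latex
% Detailed balance: flow x -> x' equals flow x' -> x under stationarity
\pi(x)\, q(x' \mid x)\, \alpha(x' \mid x) \;=\; \pi(x')\, q(x \mid x')\, \alpha(x \mid x')
```

Summing both sides over x shows that π is left invariant by the transition kernel, which is why detailed balance is a sufficient condition for stationarity.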
Given a target density π and a proposal density q, the algorithm initializes at an arbitrary state x₀. For each iteration t, it first draws a candidate x* from q(·|xₜ). It then computes the acceptance probability α = min(1, (π(x*)q(xₜ|x*))/(π(xₜ)q(x*|xₜ))). A random number u is drawn from a Uniform(0,1) distribution. If u ≤ α, the chain moves to xₜ₊₁ = x*; otherwise, it remains at xₜ₊₁ = xₜ. This process repeats, generating a sequence {x₀, x₁, ...} that, after a burn-in period, provides correlated samples from π.
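The iteration described above can be sketched in a few lines of Python. This is a minimal illustration, not a production sampler: it uses a symmetric Gaussian random-walk proposal (so the Hastings correction cancels), works on log densities for numerical stability, and targets a standard normal known only up to its normalizing constant. The function and variable names are illustrative.

```python
import math
import random

def metropolis_hastings(log_target, x0, proposal_std=1.0, n_samples=5000, seed=0):
    """Random-walk Metropolis-Hastings sampler with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Draw a candidate x* from q(. | x) = Normal(x, proposal_std^2).
        x_star = x + rng.gauss(0.0, proposal_std)
        # Symmetric proposal: q(x_t | x*) = q(x* | x_t), so the Hastings
        # correction cancels and alpha = min(1, pi(x*) / pi(x_t)).
        log_alpha = min(0.0, log_target(x_star) - log_target(x))
        if rng.random() < math.exp(log_alpha):
            x = x_star        # accept the candidate
        samples.append(x)     # on rejection, the chain repeats the current state
    return samples

# Target known only up to a normalizing constant: standard normal.
log_pi = lambda x: -0.5 * x * x

chain = metropolis_hastings(log_pi, x0=5.0)
post_burnin = chain[1000:]  # discard burn-in
sample_mean = sum(post_burnin) / len(post_burnin)
```

Note that rejected proposals still append the current state: repeated states are part of the correct sample, and dropping them would bias the chain away from π.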
Under mild regularity conditions, the Markov chain produced is ergodic, guaranteeing convergence to the stationary distribution regardless of the starting state. The rate of convergence depends heavily on the relationship between the proposal and target distributions; poor choices can lead to high autocorrelation and slow mixing. Theoretical analysis often relies on concepts from Markov chain theory, such as geometric ergodicity. Diagnostics like the Gelman–Rubin statistic and analysis of autocorrelation function plots are used in practice to assess convergence.
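The autocorrelation diagnostic mentioned above can be computed directly from a scalar chain. The sketch below uses the standard biased (divide-by-n) estimator of the sample autocorrelation function; the names are illustrative.

```python
import math

def autocorr(chain, max_lag=50):
    """Sample autocorrelation function of a scalar chain, lags 0..max_lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    acf = []
    for lag in range(max_lag + 1):
        # Biased (divide-by-n) estimator, the convention used in ACF plots.
        cov = sum((chain[t] - mean) * (chain[t + lag] - mean)
                  for t in range(n - lag)) / n
        acf.append(cov / var)
    return acf

# A slowly varying series is strongly autocorrelated at short lags,
# mimicking a poorly mixing chain.
acf = autocorr([math.sin(0.05 * i) for i in range(400)], max_lag=20)
```

A well-mixing chain shows the ACF dropping quickly toward zero with lag; a slowly decaying ACF signals high autocorrelation and a small effective sample size.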
The Metropolis–Hastings algorithm is a generalization of several important methods. The original Metropolis algorithm assumes a symmetric proposal distribution. Gibbs sampling is a special case where proposals are drawn from full conditional distributions, leading to an acceptance probability of one. Other related methods include Hamiltonian Monte Carlo, which uses Hamiltonian dynamics to propose distant states, and slice sampling. It also forms the basis for more advanced frameworks such as reversible-jump Markov chain Monte Carlo.
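For the special cases above, the general acceptance ratio simplifies. With a symmetric proposal the q terms cancel, recovering the original Metropolis rule; with a full-conditional proposal the ratio is exactly one, recovering Gibbs sampling:

```latex
% Symmetric proposal, q(x^* \mid x_t) = q(x_t \mid x^*) (Metropolis):
\alpha = \min\!\left(1, \frac{\pi(x^*)}{\pi(x_t)}\right)

% Full-conditional proposal for component i (Gibbs sampling):
% q(x^* \mid x_t) = \pi(x_i^* \mid x_{t,-i}) \quad\Longrightarrow\quad \alpha = 1
```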
Its primary application is in Bayesian statistics for sampling from posterior distributions, enabling inference for models used in fields from machine learning to computational biology. In statistical physics, it simulates systems like the Ising model. It is used in image analysis, signal processing, and econometrics for estimating complex models. The algorithm's ability to handle high-dimensional integrals makes it indispensable for problems in astrophysics, particle physics, and quantum chemistry.
Category:Monte Carlo methods Category:Markov chains Category:Statistical algorithms Category:Bayesian statistics