| Anderson–Darling test | |
|---|---|
| Name | Anderson–Darling test |
| Purpose | Assessing whether a sample comes from a specified probability distribution |
| Introduced | 1952 |
| Authors | Theodore Wilbur Anderson, Donald A. Darling |
| Type | Goodness-of-fit test |
The Anderson–Darling test is a statistical goodness-of-fit procedure introduced by Theodore Wilbur Anderson and Donald A. Darling in 1952 to evaluate whether a sample of data is consistent with a hypothesized distribution. It is more sensitive to the tails of a distribution than alternatives such as the Kolmogorov–Smirnov test, and has been adapted for use with the normal, exponential, Weibull, log-normal and other parametric families. Widely used across fields, it appears in analyses ranging from biostatistics in National Institutes of Health studies to financial modeling at institutions such as Goldman Sachs and JPMorgan Chase.
The test was proposed in a paper by Theodore Wilbur Anderson and Donald A. Darling, building on earlier work in empirical distribution function theory by Andrey Kolmogorov, Nikolai Smirnov and their contemporaries, and on asymptotic theory developed by Jerzy Neyman and Egon Pearson. It emerged during a period of expansion in statistical inference, alongside developments at Princeton University, Stanford University, the University of Chicago, and institutions such as Bell Labs, where applied testing saw rapid uptake. Early adoption occurred in actuarial science at Lloyd's of London and in agricultural experiments guided by researchers at Iowa State University and the University of California, Davis. Extensions were motivated by needs in meteorology at the National Oceanic and Atmospheric Administration and in reliability engineering at General Electric.
The Anderson–Darling statistic is derived from the integrated squared difference between the empirical distribution function and the specified cumulative distribution function, with a weighting that emphasizes the tail regions. For an ordered sample x(1) ≤ x(2) ≤ ... ≤ x(n) and a hypothesized cumulative distribution function F, the statistic is

A^2 = −n − (1/n) Σ_{i=1}^{n} (2i − 1) [log F(x(i)) + log(1 − F(x(n+1−i)))]

Its rationale relates to likelihood-ratio ideas used by Ronald A. Fisher and to score approaches developed by Harold Hotelling and William Sealy Gosset. The formula is typically presented alongside adjustments for parameter estimation, as in methods influenced by Karl Pearson and Sir Francis Galton.
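A minimal Python sketch of this computation, assuming a fully specified F with no estimated parameters (the helper name anderson_darling_stat and the synthetic sample are illustrative, not from the original paper):

```python
import numpy as np
from scipy.stats import norm

def anderson_darling_stat(x, cdf):
    """Compute A^2 for a fully specified hypothesized CDF (no estimated parameters)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    u = cdf(x)                      # F(x(1)), ..., F(x(n)); values must lie in (0, 1)
    i = np.arange(1, n + 1)
    # A^2 = -n - (1/n) * sum over i of (2i - 1) [log F(x(i)) + log(1 - F(x(n+1-i)))]
    return -n - np.sum((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1]))) / n

rng = np.random.default_rng(0)
sample = rng.normal(size=200)
print(anderson_darling_stat(sample, norm.cdf))  # small values are consistent with H0
```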
The null distribution of the Anderson–Darling statistic depends on whether the parameters of the hypothesized distribution are fully specified or estimated from the sample, a distinction treated in the literature by S. C. Choi, D. T. Loynes, and later by researchers affiliated with the University of Oxford and the Massachusetts Institute of Technology. Critical values and p-values are tabulated or approximated via asymptotic expansions and Monte Carlo simulation techniques popularized at Los Alamos National Laboratory and Argonne National Laboratory. For the normal family, practitioners often use adjusted statistics with tables from works associated with George E. P. Box and David Cox; modern implementations rely on computational packages developed within the R and Python ecosystems.
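For the fully specified case, a Monte Carlo approximation of the p-value can be sketched as follows (the standard normal null, replication count, and helper names are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

def a2(x, cdf):
    """A^2 statistic for a fully specified CDF (formula as in the text)."""
    u = cdf(np.sort(x))
    i = np.arange(1, len(x) + 1)
    return -len(x) - np.sum((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1]))) / len(x)

def mc_pvalue(sample, n_sim=10_000, seed=1):
    """Monte Carlo p-value for H0: sample ~ N(0, 1) with known parameters."""
    rng = np.random.default_rng(seed)
    n = len(sample)
    observed = a2(np.asarray(sample, dtype=float), norm.cdf)
    null_stats = np.array([a2(rng.normal(size=n), norm.cdf) for _ in range(n_sim)])
    # One-sided p-value: fraction of simulated null statistics >= observed value
    return np.mean(null_stats >= observed)

print(mc_pvalue(np.random.default_rng(0).normal(size=100), n_sim=2000))
```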
Multiple variants extend the original formulation to composite hypotheses, censored data, multivariate distributions, and discrete distributions. Notable adaptations include the Anderson–Darling k-sample test used in comparative studies at Harvard University and Yale University, censored-sample versions used in reliability studies at NASA and the European Space Agency, and multivariate generalizations inspired by work at Carnegie Mellon University and ETH Zurich. Bayesian adaptations have been proposed by researchers affiliated with Columbia University and the University of Cambridge, while robust and bootstrap-based enhancements have roots in methodology from the University of Toronto and the University of Washington.
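SciPy exposes the k-sample variant as scipy.stats.anderson_ksamp; a minimal usage sketch with synthetic data (the group sizes and location shift are illustrative):

```python
import numpy as np
from scipy.stats import anderson_ksamp

rng = np.random.default_rng(2)
groups = [rng.normal(loc=0.0, size=80), rng.normal(loc=0.3, size=80)]

result = anderson_ksamp(groups)
# significance_level is SciPy's approximate p-value, capped to [0.001, 0.25]
print(result.statistic, result.critical_values, result.significance_level)
```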
Applications span diverse domains: in biostatistics, for clinical trial diagnostics at the Mayo Clinic and Johns Hopkins University; in finance, for value-at-risk model validation at BlackRock and Morgan Stanley; in environmental science, for extreme-value modeling at the United States Geological Survey and Imperial College London; in engineering, for lifetime analysis at Siemens and Boeing; and in quality control at Toyota and General Motors. Example uses include testing the normality of residuals in regression analyses performed at the University of California, Berkeley and validating exponential assumptions in survival analyses at the Fred Hutchinson Cancer Research Center.
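As a concrete instance of the residual-normality check mentioned above, the following sketch fits an ordinary least squares line and passes the residuals to scipy.stats.anderson, which tests normality with mean and variance estimated from the data (the synthetic regression is illustrative):

```python
import numpy as np
from scipy.stats import anderson

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=100)

# Ordinary least squares fit, then residual diagnostics
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

res = anderson(residuals, dist='norm')
print(res.statistic)         # compare against res.critical_values
print(res.critical_values)   # tabulated at the res.significance_level percent levels
```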
Computation of the Anderson–Darling statistic and its p-value is available in statistical software maintained by projects such as the R Project (packages like 'nortest' and 'kSamples') and in Python libraries such as SciPy. Efficient routines leverage numerical linear algebra from libraries developed at Netlib and algorithm repositories associated with Lawrence Berkeley National Laboratory. For large samples, asymptotic formulas referenced in works from Princeton University Press and Cambridge University Press provide approximations; for small or complex samples, Monte Carlo resampling or a parametric bootstrap executed on high-performance computing clusters at the National Center for Atmospheric Research or Oak Ridge National Laboratory is common.
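For the composite case in which parameters are estimated, a parametric-bootstrap sketch that re-estimates the parameters on every replicate might look like the following (the replicate count and helper names are illustrative assumptions, not a standard library API):

```python
import numpy as np
from scipy.stats import norm

def a2(x, cdf):
    """A^2 statistic for a fully specified CDF (formula as in the text)."""
    u = cdf(np.sort(x))
    i = np.arange(1, len(x) + 1)
    return -len(x) - np.sum((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1]))) / len(x)

def bootstrap_pvalue(sample, n_boot=2000, seed=4):
    """Parametric bootstrap p-value for H0: normal with unknown mu, sigma."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = len(sample)

    def stat(x):
        mu, sigma = x.mean(), x.std(ddof=1)  # re-estimate parameters from x
        return a2(x, norm(mu, sigma).cdf)

    observed = stat(sample)
    mu0, sigma0 = sample.mean(), sample.std(ddof=1)
    boot = np.array([stat(rng.normal(mu0, sigma0, size=n)) for _ in range(n_boot)])
    return np.mean(boot >= observed)

print(bootstrap_pvalue(np.random.default_rng(0).normal(2.0, 3.0, size=150)))
```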
Category:Statistical tests