LLMpedia: the first transparent, open encyclopedia generated by LLMs

Kolmogorov–Smirnov test

Generated by DeepSeek V3.2
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: Wiener process (hop 4)
Expansion funnel: raw 41 → dedup 0 → NER 0 → enqueued 0
Kolmogorov–Smirnov test
Name: Kolmogorov–Smirnov test
Caption: Illustration of the Kolmogorov–Smirnov statistic. The red line is an empirical distribution function, the blue line is the cumulative distribution function of a hypothesized distribution, and the black arrow is the K–S statistic.
Type: Nonparametric statistics
Inventors: Andrey Kolmogorov, Nikolai Smirnov
Year: 1933, 1939
Purpose: Goodness-of-fit test, two-sample comparison
Based on: Empirical distribution function
Related: Anderson–Darling test, Cramér–von Mises criterion, Lilliefors test

The Kolmogorov–Smirnov test is a nonparametric statistical procedure used to assess the goodness of fit of a sample to a reference probability distribution, or to compare the distributions of two independent samples. Introduced by Andrey Kolmogorov and later extended by Nikolai Smirnov, it quantifies the distance between empirical distribution functions. The test is widely applied in fields such as astronomy, econometrics, and biostatistics owing to its simplicity and distribution-free nature.

Definition and formulation

The test statistic is defined as the supremum of the absolute differences between two cumulative distribution functions. For the one-sample case, it compares the empirical distribution function \(F_n(x)\) of the sample with a specified theoretical cumulative distribution function \(F(x)\):

\[ D_n = \sup_x \lvert F_n(x) - F(x) \rvert. \]

The foundational work by Andrey Kolmogorov established the limiting distribution of \(\sqrt{n}\,D_n\), a result deeply connected to the Brownian bridge process. The mathematical formulation relies on the Glivenko–Cantelli theorem, which guarantees the almost-sure uniform convergence of the empirical distribution function to \(F\). Key properties of the test were further elucidated by William Feller and Boris Gnedenko, linking it to the theory of stochastic processes.
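Because \(F_n\) is a step function that jumps only at the order statistics, the supremum is attained just before or just after one of those jumps. A minimal sketch of computing \(D_n\) directly from sorted data (the helper name `ks_statistic` is an illustrative choice, not a library function; `scipy.stats.kstest` computes the same quantity):

```python
import numpy as np
from scipy import stats

def ks_statistic(sample, cdf):
    """One-sample K-S statistic D_n = sup_x |F_n(x) - F(x)|.

    F_n jumps at the order statistics, so the supremum is attained
    just after a jump (F_n above F) or just before one (F above F_n).
    """
    x = np.sort(np.asarray(sample))
    n = len(x)
    f = cdf(x)                                     # theoretical CDF at the order statistics
    d_plus = np.max(np.arange(1, n + 1) / n - f)   # F_n exceeds F right after a jump
    d_minus = np.max(f - np.arange(0, n) / n)      # F exceeds F_n just before a jump
    return max(d_plus, d_minus)

rng = np.random.default_rng(0)
sample = rng.normal(size=200)
d = ks_statistic(sample, stats.norm.cdf)
```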

One-sample test

The one-sample K–S test evaluates the null hypothesis that a data sample comes from a specified continuous probability distribution, such as the normal distribution or the exponential distribution. The procedure involves calculating the maximum vertical deviation between the empirical cumulative distribution function and the theoretical one. Critical values for determining statistical significance are derived from the Kolmogorov distribution. This test is sensitive to differences in both location and shape of the distributions. It is often applied in fields like reliability engineering to test failure data against models like the Weibull distribution. A notable modification for testing normality with estimated parameters is the Lilliefors test, developed by Hubert Lilliefors.
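The procedure above can be sketched with SciPy's `scipy.stats.kstest`; the exponential data, scale, and seed are arbitrary illustrative choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = rng.exponential(scale=2.0, size=500)

# Null hypothesis fully specified: Exponential with scale 2
# (no parameters estimated from the data).
good = stats.kstest(data, stats.expon(scale=2.0).cdf)

# A badly mis-specified null (standard normal) is rejected decisively.
bad = stats.kstest(data, stats.norm.cdf)

# Caveat: if the reference parameters were instead estimated from
# `data` itself, these p-values would no longer be valid; that is the
# situation the Lilliefors test addresses.
```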

Two-sample test

The two-sample K–S test, due to Nikolai Smirnov, assesses whether two independent samples are drawn from the same underlying distribution, without specifying what that distribution is. It operates by comparing the empirical distribution functions of the two samples. The test is consistent against all alternatives in which the distributions differ, making it a popular omnibus test. It has been used extensively in machine learning for comparing datasets, in cosmology for comparing redshift distributions, and in clinical trials for comparing treatment and control groups. The test is implemented in major statistical software packages, including R and Python.
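A brief sketch of the two-sample test using SciPy's `scipy.stats.ks_2samp` (sample sizes, the one-standard-deviation shift, and the seed are arbitrary choices for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(loc=0.0, scale=1.0, size=300)
b = rng.normal(loc=0.0, scale=1.0, size=300)   # same distribution as a
c = rng.normal(loc=1.0, scale=1.0, size=300)   # shifted by one standard deviation

same = stats.ks_2samp(a, b)   # no reference distribution is specified
diff = stats.ks_2samp(a, c)
# `diff` yields a large statistic and a tiny p-value; `same` does not.
```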

Kolmogorov distribution

The Kolmogorov distribution is the limiting distribution of \(\sqrt{n}\,D_n\) under the null hypothesis when the theoretical distribution is continuous and fully specified. Its derivation is a classic result in mathematical statistics, intimately related to the Kolmogorov–Smirnov theorem. The distribution function is expressed as an infinite series, which can also be written in terms of the Jacobi theta function. Norbert Wiener's work on the Wiener process provided a foundational connection, since the limiting statistic is the supremum of the absolute value of a Brownian bridge. Tables of critical values for this distribution appear in standard statistical references such as the NIST/SEMATECH e-Handbook of Statistical Methods. The distribution is also pivotal in the theory of empirical processes.
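The series can be evaluated directly. The sketch below assumes the standard form \(K(x) = 1 - 2\sum_{k\ge 1} (-1)^{k-1} e^{-2k^2x^2}\) and compares it with SciPy's `kstwobign`, which implements the same limiting law:

```python
import numpy as np
from scipy.stats import kstwobign

def kolmogorov_cdf(x, terms=100):
    """Kolmogorov distribution function via its alternating series:
    K(x) = 1 - 2 * sum_{k>=1} (-1)**(k-1) * exp(-2 * k**2 * x**2).
    The series converges very fast; a handful of terms suffices."""
    k = np.arange(1, terms + 1)
    return 1.0 - 2.0 * np.sum((-1.0) ** (k - 1) * np.exp(-2.0 * k**2 * x**2))

# K(1.358) is close to 0.95, the source of the familiar asymptotic
# critical value 1.358 / sqrt(n) for a 5% one-sample test.
```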

Limitations and alternatives

A primary limitation is that the statistic is most sensitive near the center of the distribution rather than in the tails, making the test less powerful than specialized alternatives for detecting discrepancies such as variance shifts. The standard critical values are also invalid when the parameters of the reference distribution are estimated from the data, a problem addressed for the normal case by the Lilliefors test. For tail-sensitive alternatives, the Anderson–Darling test and the Cramér–von Mises criterion are often more powerful. In the two-sample case, the test can be less powerful than the Wilcoxon rank-sum test for detecting pure location shifts. Developments in computational statistics have produced permutation-based versions that overcome some of these limitations. The test remains a staple of exploratory data analysis and appears in standard statistics textbooks.
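A permutation-based two-sample version of the kind mentioned above can be sketched as follows; the helper names `ks_stat` and `permutation_ks_test`, the sample sizes, and the permutation count are all illustrative choices, not a standard implementation:

```python
import numpy as np

def ks_stat(a, b):
    """Two-sample K-S statistic: largest gap between the two empirical
    distribution functions, evaluated at the pooled jump points."""
    grid = np.sort(np.concatenate([a, b]))
    fa = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    fb = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return np.max(np.abs(fa - fb))

def permutation_ks_test(a, b, n_perm=999, seed=0):
    """Permutation p-value: reshuffle the pooled sample into groups of
    the original sizes and count permuted statistics at least as large
    as the observed one (with the standard +1 correction)."""
    rng = np.random.default_rng(seed)
    observed = ks_stat(a, b)
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_stat(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

rng = np.random.default_rng(7)
x = rng.normal(0.0, 1.0, size=80)
y = rng.normal(1.0, 1.0, size=80)   # location shift of one standard deviation
d, p = permutation_ks_test(x, y)
```

Because the null distribution is generated by reshuffling the data themselves, this version remains valid in settings (such as ties from discrete data) where the classical asymptotic critical values are only approximate.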