Binomial distribution

Name	Binomial distribution
Type	Discrete probability distribution
Support	{0,1,2,...,n}
Parameters	n (number of trials), p (success probability)
Mean	np
Variance	np(1−p)

Contents

Definition and probability mass function
Properties and moments
Parameter estimation and inference
Relationships to other distributions
Applications and examples
Generalizations and extensions

The binomial distribution models the number of successes in a fixed number n of independent Bernoulli trials, each with the same success probability p. It is central to probability theory and statistics, with uses ranging from quality control at General Electric factories to clinical trials at Mayo Clinic and survey sampling by the Pew Research Center. Its mathematical development runs from the work of Jacob Bernoulli and Abraham de Moivre to later contributions by Pierre-Simon Laplace and the axiomatic foundations of probability laid by Andrey Kolmogorov.

Definition and probability mass function

For a nonnegative integer n and p in [0,1], the probability mass function (pmf) gives the probability of observing exactly k successes, k = 0,1,...,n:

P(X = k) = C(n,k) p^k (1−p)^(n−k), where C(n,k) = n!/(k!(n−k)!) is the binomial coefficient.

The coefficient counts the ways to choose which k of the n labeled trials are successes, and p^k (1−p)^(n−k) is the probability of any one such sequence of outcomes. Binomial coefficients are the entries of Pascal's triangle, which Blaise Pascal studied in his treatise on the arithmetical triangle and applied in his correspondence with Pierre de Fermat on games of chance. The same formulation is used in experiments at Bell Labs and in statistical protocols at the Centers for Disease Control and Prevention.
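The counting argument above translates directly into a few lines of code. The following is a minimal sketch using only the Python standard library; the function name `binomial_pmf` and the coin-flip example are illustrative choices, not part of any particular library.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0  # outside the support {0, 1, ..., n}
    # Number of ways to place k successes, times the probability
    # of any one sequence with k successes and n - k failures.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative example: distribution of heads in 10 fair coin flips.
probs = [binomial_pmf(k, 10, 0.5) for k in range(11)]
total = sum(probs)  # should be 1 up to rounding error
```

For large n, multiplying a huge integer coefficient by tiny powers can lose precision or underflow; working with logarithms is a common alternative in that regime.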

Properties and moments

The distribution has mean np and variance np(1−p). Higher moments and cumulants can be derived from generating functions: the probability generating function is G(z) = (1−p+pz)^n and the moment generating function is M(t) = (1−p+pe^t)^n, compact expressions used in reliability studies at NASA and actuarial models at Lloyd's of London. Mode, skewness, and kurtosis depend on n and p: the mode is ⌊(n+1)p⌋ when (n+1)p is not an integer, the skewness is (1−2p)/√(np(1−p)), and the excess kurtosis is (1−6p(1−p))/(np(1−p)). For fixed p in (0,1), the standardized count is asymptotically normal as n→∞. This is the de Moivre-Laplace theorem, a special case of the central limit theorem that was proved in greater generality by Aleksandr Lyapunov and developed further in the work of Harald Cramér.
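The closed forms for the mean and variance can be checked numerically by summing over the pmf. The sketch below does this for one arbitrary parameter choice (n = 20, p = 0.3, picked only for illustration) and also compares the standard skewness formula (1−2p)/√(np(1−p)); the helper names are hypothetical.

```python
from math import comb, sqrt


def pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def moments_by_summation(n, p):
    """Mean, variance, and skewness computed directly from the pmf."""
    ks = range(n + 1)
    mean = sum(k * pmf(k, n, p) for k in ks)
    var = sum((k - mean) ** 2 * pmf(k, n, p) for k in ks)
    third = sum((k - mean) ** 3 * pmf(k, n, p) for k in ks)
    return mean, var, third / var**1.5


n, p = 20, 0.3  # illustrative values, not from the text
mean, var, skew = moments_by_summation(n, p)

# Closed-form counterparts for comparison.
closed_mean = n * p
closed_var = n * p * (1 - p)
closed_skew = (1 - 2 * p) / sqrt(n * p * (1 - p))
```

The summed and closed-form values should agree up to floating-point rounding; the positive skewness for p < 1/2 reflects the longer right tail.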

Parameter estimation and inference

The maximum likelihood estimate of p from k observed successes in n trials is k/n, a simple estimator used in surveys by Gallup and clinical research at Johns Hopkins University. Confidence intervals for p can be constructed exactly with the Clopper-Pearson method, which rests on the confidence-interval theory of Jerzy Neyman and bears the name of Egon Pearson, or approximately with the Wald interval or the Wilson score interval introduced by Edwin B. Wilson. The Wald interval can have poor coverage when n is small or p is near 0 or 1, where the Wilson interval generally behaves better. Hypothesis tests comparing proportions use likelihood-ratio and score tests, as in the randomized trials summarized in Cochrane reviews and in regulatory analyses by the Food and Drug Administration.
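As a rough illustration of the approximate intervals mentioned above, the following sketch computes Wald and Wilson intervals with the standard library only. The function names and the default 95% level are assumptions made for the example; an exact interval would additionally need beta-distribution quantiles, which the standard library does not provide.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(k, n, conf=0.95):
    """Normal-approximation interval centered on the MLE k/n."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = k / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)


def wilson_interval(k, n, conf=0.95):
    """Wilson score interval, obtained by inverting the score test."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = k / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# With zero observed successes the Wald interval collapses to a point,
# while the Wilson interval still has positive width.
wald_zero = wald_interval(0, 10)
wilson_zero = wilson_interval(0, 10)
```

The k = 0 case shows why the Wald interval is often avoided for small samples: it reports no uncertainty at all about p.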

Relationships to other distributions

A binomial variable is the sum of n independent Bernoulli variables with a common p, so the Bernoulli distribution, named for Jacob Bernoulli, is the n = 1 case. As n→∞ and p→0 with np→λ, the binomial converges to the Poisson distribution with mean λ, the rare-event limit studied by Siméon Denis Poisson. For large n and p not too close to 0 or 1, it is well approximated by the normal distribution with mean np and variance np(1−p), a classical result due to Abraham de Moivre and Pierre-Simon Laplace. The beta distribution is the conjugate prior for p, which is central to Bayesian analyses used at institutions like Oxford University and Cambridge University. Averaging the binomial over a beta-distributed p gives the beta-binomial compound distribution, which models overdispersion and is used in ecology by researchers at the Smithsonian Institution.
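Two of these relationships are easy to see numerically. The sketch below, with hypothetical helper names, measures how the gap between the binomial pmf and the Poisson pmf shrinks as n grows with np held at λ = 2, and shows the standard conjugate update in which a Beta(a, b) prior and k successes in n trials give a Beta(a + k, b + n − k) posterior.

```python
from math import comb, exp, factorial


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)


def max_abs_gap(n, lam, kmax=20):
    """Largest pointwise pmf difference over k = 0..min(n, kmax)."""
    p = lam / n  # keep the mean n*p fixed at lam
    return max(
        abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(min(n, kmax) + 1)
    )


# The gap should shrink as n grows with the mean held at 2.
gaps = [max_abs_gap(n, 2.0) for n in (10, 100, 1000)]


def beta_posterior(a, b, k, n):
    """Conjugate update: Beta(a, b) prior, k successes in n trials."""
    return a + k, b + (n - k)


# Uniform Beta(1, 1) prior with 7 successes in 10 trials.
a_post, b_post = beta_posterior(1, 1, 7, 10)
posterior_mean = a_post / (a_post + b_post)
```

The posterior mean (a + k)/(a + b + n) sits between the prior mean and the maximum likelihood estimate k/n, moving toward k/n as n increases.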

Applications and examples

Practical applications include pass/fail testing in manufacturing at Toyota, clinical endpoint counts in trials at the National Institutes of Health, and polling outcomes reported by The New York Times and BBC News. In genetics, counts of inherited alleles follow binomial models, in the tradition of Gregor Mendel's experiments and in studies by researchers at the Max Planck Society; in sports statistics, the number of hits over n at-bats is modeled as binomial in analyses by ESPN statisticians. Quality-control charts at Western Electric and reliability testing in aerospace at the European Space Agency use binomial-based decision rules.
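A binomial-based decision rule of the kind used in pass/fail testing can be sketched as a simple acceptance sampling plan: inspect a sample and accept the lot if the number of defectives is at most some threshold. The plan parameters below (sample of 50, at most 2 defectives) are hypothetical, and the binomial model assumes the lot is large relative to the sample so that draws are approximately independent.

```python
from math import comb


def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def acceptance_probability(defect_rate, sample_size=50, max_defects=2):
    """Chance that a lot with the given defect rate passes inspection."""
    return binom_cdf(max_defects, sample_size, defect_rate)


# Points on the operating characteristic curve of the hypothetical plan.
curve = {rate: acceptance_probability(rate) for rate in (0.01, 0.05, 0.10)}
```

Plotting acceptance probability against defect rate gives the plan's operating characteristic curve, which shows how sharply it separates good lots from bad ones.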

Generalizations and extensions

Extensions include the multinomial distribution for trials with more than two categorical outcomes, used in contexts like market research at Nielsen and election forecasting at FiveThirtyEight; the beta-binomial addresses overdispersion in ecological surveys by teams at the US Geological Survey. The negative binomial distribution counts the failures observed before a fixed number of successes, and the Poisson-binomial distribution allows each trial its own success probability; both are applied in epidemiology at the World Health Organization and in insurance risk modeling at Munich Re. Compound and hierarchical models with binomial components underpin Bayesian hierarchical modeling frameworks used at Stanford University and in machine learning at Google.
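For the nonidentical-trial case, the Poisson-binomial pmf has no simple closed form, but it can be built up one trial at a time. The following is a minimal dynamic-programming sketch (one common approach, not the only one); the function name is hypothetical.

```python
def poisson_binomial_pmf(probs):
    """pmf of the number of successes in independent trials whose
    success probabilities are given by probs (possibly all different)."""
    dist = [1.0]  # with zero trials, P(0 successes) = 1
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)  # this trial fails
            nxt[k + 1] += mass * p    # this trial succeeds
        dist = nxt
    return dist


# With equal probabilities the result reduces to the binomial pmf.
equal = poisson_binomial_pmf([0.5] * 4)

# Unequal probabilities: the mean is still the sum of the p values.
mixed = poisson_binomial_pmf([0.1, 0.5, 0.9])
mixed_mean = sum(k * mass for k, mass in enumerate(mixed))
```

The update costs time proportional to the square of the number of trials, which is adequate for modest n.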

Category:Probability distributions