
Noisy-channel coding theorem

Name: Noisy-channel coding theorem
Field: Information theory
Introduced: 1948
Originator: Claude Shannon
Disciplines: Mathematics; Electrical engineering


The noisy-channel coding theorem is a foundational result in information theory that characterizes the maximum rate at which information can be communicated reliably over a noisy transmission medium. Formulated by Claude Shannon in 1948 while at Bell Telephone Laboratories, it introduced the concept of channel capacity and separated the tasks of source coding and channel coding. The theorem shaped postwar communication research in academia and industry and underpins practical developments throughout telecommunications, including the work of standards bodies such as the IEEE and the ITU.

History and development

Shannon presented the theorem in "A Mathematical Theory of Communication" (1948), building on earlier work on signaling limits by Harry Nyquist and Ralph Hartley and on the statistical communication theory of Norbert Wiener. Extensions and practical developments soon followed from Richard Hamming, Peter Elias, David Slepian, Jack Wolf, Andrew Viterbi and others at Bell Labs, MIT and allied laboratories. Work in the 1950s and 1960s intersected with the algebraic coding theory of Marcel Golay, Solomon Golomb, and Irving S. Reed and Gustave Solomon (authors of Reed–Solomon codes), and with the convolutional and low-density parity-check codes studied by Robert Gallager at MIT; decades later, the turbo codes of Claude Berrou and colleagues brought practical systems close to the Shannon limit. The theorem's influence also extends to cryptography and information-theoretic security research.

Formal statement

For a discrete memoryless channel defined by input alphabet X, output alphabet Y and transition probabilities P(Y|X), Shannon proved that there exists a quantity C, the channel capacity, such that for any rate R < C and any ε > 0 there exist encoding and decoding schemes whose probability of decoding error is below ε for sufficiently large block length n; conversely, at any rate R > C the error probability is bounded away from zero. The capacity is given by C = max_{P_X} I(X;Y), where the mutual information I(X;Y) is computed from the input distribution P_X and the channel law P(Y|X). Shannon's formulation used concepts and notation later standardized in textbooks such as Cover and Thomas's Elements of Information Theory and David MacKay's Information Theory, Inference, and Learning Algorithms.
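
The maximization over P_X can be carried out numerically. The following sketch uses the classical Blahut–Arimoto alternating-optimization algorithm (a standard method for this problem, not discussed above) to estimate the capacity of an arbitrary discrete memoryless channel; the BSC test matrix with p = 0.1 is an illustrative assumption.

```python
import numpy as np

def blahut_arimoto(P, tol=1e-9, max_iter=10_000):
    """Estimate the capacity (bits/use) of a DMC with transition matrix
    P[x, y] = P(Y = y | X = x) via Blahut-Arimoto alternating updates."""
    n_in = P.shape[0]
    p_x = np.full(n_in, 1.0 / n_in)          # start from the uniform input law
    for _ in range(max_iter):
        q_y = p_x @ P                        # output distribution under p_x
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(P > 0, np.log2(P / q_y), 0.0)
        D = np.sum(P * log_ratio, axis=1)    # D[x] = KL(P(.|x) || q_y) in bits
        new_p = p_x * np.exp2(D)             # multiplicative reweighting step
        new_p /= new_p.sum()
        if np.max(np.abs(new_p - p_x)) < tol:
            p_x = new_p
            break
        p_x = new_p
    q_y = p_x @ P
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(P > 0, np.log2(P / q_y), 0.0)
    return float(np.sum(p_x[:, None] * P * log_ratio)), p_x

# Illustrative channel: binary symmetric channel with crossover p = 0.1.
p = 0.1
bsc = np.array([[1 - p, p],
                [p, 1 - p]])
C, p_x = blahut_arimoto(bsc)
print(f"estimated capacity: {C:.6f} bits/use")   # ~0.531 = 1 - H_2(0.1)
```

For the BSC the iteration recovers the uniform input distribution, matching the closed-form value 1 - H_2(0.1) ≈ 0.531.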

Proof outline and implications

Shannon's achievability proof uses random coding arguments, typical sequences, and the asymptotic equipartition property (AEP). The argument constructs random codebooks drawn according to a capacity-achieving input distribution, employs jointly typical decoding, and bounds error probabilities using the union bound and properties of typical sets; these techniques rest on the foundations of probability theory due to Andrey Kolmogorov and were formalized in later expositions such as that of Thomas M. Cover and Joy A. Thomas. The converse uses Fano's inequality to relate decoding error to mutual information, bounding every achievable rate by I(X;Y). These results guided the design of practical communication systems in industrial research laboratories, implied the source–channel separation principle reflected in standards such as IEEE 802 and 3GPP, and inspired decades of algorithmic advances in coding and decoding.
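
A minimal Monte Carlo sketch of the random-coding argument on a BSC: draw a random codebook at a rate below capacity, decode by minimum Hamming distance (maximum-likelihood decoding for the BSC, standing in here for joint typicality), and observe the block error rate fall as the block length grows. All parameters (p = 0.11, R = 0.25, the block lengths, the trial count) are illustrative choices, not values from the original proof.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_code_block_error(n, R, p, trials=200):
    """Empirical block error rate of a random binary code of rate R on a
    BSC(p), decoded by minimum Hamming distance (ML for the BSC)."""
    M = 2 ** int(R * n)                             # number of messages, ~2^{nR}
    errors = 0
    for _ in range(trials):
        codebook = rng.integers(0, 2, size=(M, n))  # fresh random codebook
        msg = rng.integers(M)
        noise = rng.binomial(1, p, size=n)          # BSC flips each bit w.p. p
        y = codebook[msg] ^ noise
        dists = np.count_nonzero(codebook ^ y, axis=1)
        if int(np.argmin(dists)) != msg:
            errors += 1
    return errors / trials

# BSC(0.11) has capacity ~0.5 bits/use, so R = 0.25 is safely below capacity;
# the block error rate should shrink as the block length n grows.
p, R = 0.11, 0.25
for n in (12, 24, 36, 48):
    print(f"n = {n:2d}: block error rate ~ {random_code_block_error(n, R, p):.3f}")
```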

Capacity of common channel models

For the binary symmetric channel (BSC) with crossover probability p, the capacity is 1 - H_2(p), where H_2 is the binary entropy function. For the binary erasure channel (BEC) with erasure probability ε, the capacity is 1 - ε. For the discrete-time additive white Gaussian noise (AWGN) channel with average power constraint P and noise variance N per channel use, the capacity is (1/2) log2(1 + P/N) bits per channel use; equivalently, over a continuous-time channel of bandwidth W with noise power spectral density N_0/2, C = W log2(1 + P/(N_0 W)) bits per second. These formulas are central to engineering work on deep-space, satellite and wireless links, from NASA's Jet Propulsion Laboratory to commercial chipmakers. Extensions to fading channels, multiple-input multiple-output (MIMO) channels and network information theory build on contributions from Emre Telatar, Emmanuel Biglieri, Gerhard Kramer and others.
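
The three closed-form capacities just stated are immediate to evaluate; the sketch below implements them directly (the sample parameters are arbitrary).

```python
import math

def bsc_capacity(p):
    """Capacity of a binary symmetric channel: 1 - H_2(p) bits per use."""
    if p in (0.0, 1.0):
        return 1.0
    h2 = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return 1.0 - h2

def bec_capacity(eps):
    """Capacity of a binary erasure channel: 1 - eps bits per use."""
    return 1.0 - eps

def awgn_capacity(snr):
    """Capacity of the discrete-time AWGN channel with SNR = P/N:
    (1/2) * log2(1 + P/N) bits per channel use."""
    return 0.5 * math.log2(1.0 + snr)

print(f"BSC(0.11):     {bsc_capacity(0.11):.4f} bits/use")   # ~0.5
print(f"BEC(0.30):     {bec_capacity(0.30):.4f} bits/use")   # 0.7
print(f"AWGN, P/N = 1: {awgn_capacity(1.0):.4f} bits/use")   # 0.5
```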

Coding strategies and achievability

Shannon's random coding existence proof motivated the search for explicit constructive schemes. Early explicit codes include the Hamming codes of Richard Hamming, the Reed–Solomon codes of Irving S. Reed and Gustave Solomon, the convolutional codes developed by Peter Elias, and the low-density parity-check (LDPC) codes introduced by Robert G. Gallager in the early 1960s. Later breakthroughs approach the Shannon capacity with practical encoding and decoding complexity: the turbo codes of Claude Berrou and Alain Glavieux, the revival of LDPC codes in work by David MacKay, and the polar codes of Erdal Arıkan, adopted for control channels in 3GPP 5G. Practical coding interacts closely with modulation and is embedded in standards such as IEEE 802.11, ITU-T recommendations and the 3GPP specifications.
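
As a concrete instance of the earliest class mentioned above, here is a minimal sketch of the (7,4) Hamming code, which corrects any single bit error. The systematic generator and parity-check matrices below follow one common convention; other column orderings are equally valid.

```python
import numpy as np

# Systematic generator and parity-check matrices for the (7,4) Hamming code.
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def encode(msg4):
    """Map 4 message bits to a 7-bit codeword (arithmetic mod 2)."""
    return (np.array(msg4) @ G) % 2

def decode(word7):
    """Correct up to one bit flip via the syndrome, return the 4 message bits."""
    syndrome = (H @ word7) % 2
    if syndrome.any():
        # For a single flip, the syndrome equals the column of H at the error.
        err_pos = int(np.argmax((H.T == syndrome).all(axis=1)))
        word7 = word7.copy()
        word7[err_pos] ^= 1
    return word7[:4]            # systematic code: message is the first 4 bits

msg = [1, 0, 1, 1]
cw = encode(msg)
cw[2] ^= 1                      # flip one bit in transit
print(decode(cw))               # -> [1 0 1 1]
```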

Converse theorem and limits

The converse establishes that rates above capacity cannot achieve arbitrarily small error probability. Formal converse proofs use Fano's inequality and related information measures; refinements include strong converses and the finite-blocklength bounds developed by Yury Polyanskiy, H. Vincent Poor and Sergio Verdú. Finite-blocklength and moderate-deviations analyses connect the theorem to classical limit theorems of probability, including the large-deviation results pioneered by Harald Cramér, and inform design trade-offs in space communication systems and telecommunication standards.
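
For the BSC, the widely used normal approximation from the finite-blocklength literature reads log2 M*(n, ε) ≈ nC - sqrt(nV) Q^{-1}(ε) + (1/2) log2 n, where V = p(1 - p) (log2((1 - p)/p))^2 is the channel dispersion. The sketch below evaluates this approximation; the parameters p = 0.11 and ε = 10^-3 are illustrative.

```python
import math
from statistics import NormalDist

def bsc_normal_approx_rate(n, p, eps):
    """Normal approximation to the best achievable rate of a BSC(p) at block
    length n and block error probability eps (Polyanskiy-Poor-Verdu):
    R ~ C - sqrt(V/n) * Q^{-1}(eps) + log2(n) / (2n)."""
    C = 1 + p * math.log2(p) + (1 - p) * math.log2(1 - p)   # 1 - H_2(p)
    V = p * (1 - p) * math.log2((1 - p) / p) ** 2           # channel dispersion
    q_inv = NormalDist().inv_cdf(1 - eps)                   # Q^{-1}(eps)
    return C - math.sqrt(V / n) * q_inv + math.log2(n) / (2 * n)

p, eps = 0.11, 1e-3
C = 1 + p * math.log2(p) + (1 - p) * math.log2(1 - p)
print(f"capacity: {C:.4f} bits/use")
for n in (100, 1000, 10_000, 100_000):
    print(f"n = {n:6d}: achievable rate ~ {bsc_normal_approx_rate(n, p, eps):.4f}")
```

The gap to capacity shrinks roughly like 1/sqrt(n), which quantifies how large a block length a system needs before the asymptotic promise of the theorem becomes practically relevant.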

Category:Information theory