| Shannon entropy | |
|---|---|
| Name | Shannon entropy |
| Field | Information theory |
| Introduced | 1948 |
| Introduced by | Claude Shannon |
Shannon entropy is a foundational measure in information theory, introduced by Claude Shannon in his 1948 paper "A Mathematical Theory of Communication", written at Bell Labs. It quantifies the average uncertainty, or information content, of a discrete probability distribution, and it underpins results in communication theory, cryptography, statistical mechanics, and computer science.
For a discrete random variable X taking values with probabilities p_1, ..., p_n, Shannon entropy is defined as H(X) = -∑_i p_i log p_i, with the convention that 0 log 0 = 0. The choice of logarithm base sets the unit: base 2 gives bits, base e gives nats, and base 10 gives hartleys. Equivalently, H(X) is the expected value of the surprisal -log p(x), the amount of information conveyed by observing an outcome of probability p(x).
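As a quick illustration of the definition, here is a minimal Python sketch; the function name and the example distributions are illustrative choices, not from Shannon's paper.

```python
import math

def shannon_entropy(probs, base=2):
    """H = -sum(p * log p) over outcomes with p > 0 (convention: 0 log 0 = 0).

    base=2 gives bits, base=math.e gives nats, base=10 gives hartleys.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(shannon_entropy([0.25] * 4))       # 2.0 bits: uniform on 4 outcomes
print(shannon_entropy([0.7, 0.2, 0.1]))  # ≈ 1.157 bits: skew lowers entropy
```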
Entropy is nonnegative, and among distributions on n outcomes it is maximized by the uniform distribution, where H = log n. It is a concave function of the probability vector, a property exploited in optimization and in proofs of coding theorems. Entropy satisfies a chain rule, H(X, Y) = H(X) + H(Y | X), and conditioning never increases it: H(Y | X) ≤ H(Y). The difference I(X; Y) = H(Y) - H(Y | X) is the mutual information between X and Y. Entropy can be interpreted as expected surprisal, as a lower bound on the average length of lossless codes, and, up to Boltzmann's constant, as the Gibbs entropy of statistical mechanics.
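The chain rule can be checked numerically on a small joint distribution. A minimal sketch, where the 2x2 joint table is an arbitrary illustrative example:

```python
import math

def H(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative joint distribution p(x, y) on a 2x2 alphabet.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

# Marginal p(x), obtained by summing the joint over y.
px = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}

H_joint = H(joint.values())
H_x = H(px.values())
# Conditional entropy H(Y|X) = -sum p(x,y) log2 p(y|x), with p(y|x) = p(x,y)/p(x).
H_y_given_x = -sum(p * math.log2(p / px[x]) for (x, _), p in joint.items())

assert abs(H_joint - (H_x + H_y_given_x)) < 1e-12  # chain rule holds
print(H_joint, H_x, H_y_given_x)
```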
Common examples include a fair coin (uniform two-outcome distribution) with entropy 1 bit, a fair six-sided die with entropy log2(6) ≈ 2.585 bits, and biased distributions, whose entropy is strictly below the uniform value and approaches 0 as the bias becomes extreme. The continuous analogue is differential entropy, h(X) = -∫ f(x) log f(x) dx, which, unlike discrete entropy, can be negative. Among densities with a fixed variance, the Gaussian maximizes differential entropy, a result central to maximum-entropy formulations.
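The worked examples above, plus the Gaussian maximum-entropy value, in the same sketch style; the helper names are ours, and the Gaussian formula 0.5 log2(2πeσ²) is the standard closed form:

```python
import math

def shannon_entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy_bits([0.5, 0.5]))  # fair coin: 1.0 bit
print(shannon_entropy_bits([1/6] * 6))   # fair die: log2(6) ≈ 2.585 bits
print(shannon_entropy_bits([0.9, 0.1]))  # biased coin: ≈ 0.469 bits

def gaussian_diff_entropy_bits(sigma):
    """Differential entropy of N(mu, sigma^2): 0.5 * log2(2*pi*e*sigma^2)."""
    return 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)

print(gaussian_diff_entropy_bits(1.0))   # ≈ 2.047 bits at unit variance
```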
Entropy sets the fundamental limit in Shannon's source coding theorem: no lossless code can achieve an expected length below H bits per symbol, and codes exist whose expected length is within 1 bit of H. Huffman coding attains the optimum among symbol-by-symbol prefix codes, and arithmetic coding approaches the entropy rate even more closely; both appear in compression standards from ISO and ITU. Shannon's channel coding theorem characterizes channel capacity as the maximum mutual information between channel input and output, and rate-distortion theory extends these ideas to lossy compression.
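To make the source coding bound concrete, here is a minimal Huffman coding sketch; the heap-based construction and all names are illustrative. For the dyadic distribution below, the average code length meets the entropy exactly (1.75 bits):

```python
import heapq
import math

def huffman_code(probs):
    """Build a Huffman prefix code (symbol -> bitstring) for a distribution."""
    # Heap entries: (probability, unique tiebreak, tuple of symbols in subtree).
    heap = [(p, i, (sym,)) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    codes = {sym: "" for sym in probs}
    counter = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)  # two least probable subtrees
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1:                     # prepend a bit as we merge upward
            codes[s] = "0" + codes[s]
        for s in syms2:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return codes

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
codes = huffman_code(probs)
avg_len = sum(p * len(codes[s]) for s, p in probs.items())
H = -sum(p * math.log2(p) for p in probs.values())
print(codes, avg_len, H)  # dyadic probabilities: avg_len == H == 1.75 bits
```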
Generalizations include the Rényi entropies H_α = (1/(1-α)) log ∑_i p_i^α, introduced by Alfréd Rényi, which recover Shannon entropy in the limit α → 1; the α → ∞ limit is the min-entropy H_∞ = -log max_i p_i, the standard measure of guessing difficulty in cryptographic randomness extraction. Tsallis entropy, proposed by Constantino Tsallis, provides a nonextensive generalization applied in statistical physics. The quantum analogue is von Neumann entropy, S(ρ) = -Tr(ρ log ρ), central to quantum information theory. Conditional, joint, and relative entropy (the Kullback–Leibler divergence, D(P‖Q) = ∑_i p_i log(p_i/q_i)) are the workhorses of statistical inference.
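A sketch of these generalizations for discrete distributions; the function names and the example distribution are illustrative:

```python
import math

def renyi_entropy(probs, alpha, base=2):
    """Rényi entropy H_alpha = log(sum p_i^alpha) / (1 - alpha) for alpha != 1.

    alpha -> 1 recovers Shannon entropy; alpha -> inf gives min-entropy.
    """
    if alpha == 1:
        return -sum(p * math.log(p, base) for p in probs if p > 0)
    return math.log(sum(p ** alpha for p in probs), base) / (1 - alpha)

def min_entropy(probs, base=2):
    """Min-entropy H_inf = -log(max p_i), bounding an attacker's best guess."""
    return -math.log(max(probs), base)

def kl_divergence(p, q, base=2):
    """Relative entropy D(P||Q) = sum p_i log(p_i/q_i); needs q_i > 0 where p_i > 0."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)

probs = [0.7, 0.2, 0.1]
print(renyi_entropy(probs, 2))              # collision entropy, ≈ 0.889 bits
print(min_entropy(probs))                   # ≈ 0.515 bits
print(kl_divergence(probs, [1/3] * 3))      # distance from uniform, ≈ 0.428 bits
```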
Practically, entropy guides the design and analysis of source coding and channel utilization in telecommunications, and the assessment of randomness in cryptography, where the entropy of a key or seed bounds an attacker's guessing advantage. In machine learning, entropy-based impurity criteria drive decision tree induction, for example in ID3 and C4.5, which choose splits to maximize information gain. Entropy coding underlies image and audio compression standards from MPEG and ISO, and the formal identity between Shannon entropy and Gibbs entropy keeps the concept central to statistical mechanics.
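As an example of the decision-tree use, here is a minimal information-gain computation on a hypothetical binary split; the labels and the split are made up for illustration:

```python
import math
from collections import Counter

def H(labels):
    """Empirical entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy reduction from splitting `labels` into the given subgroups."""
    n = len(labels)
    return H(labels) - sum(len(g) / n * H(g) for g in groups)

# Hypothetical split: a feature separates a mixed node into purer children.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4
print(information_gain(parent, [left, right]))  # ≈ 0.278 bits gained
```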