| Theil index | |
|---|---|
| Name | Theil index |
| Type | Inequality measure |
| Introduced | 1967 |
| Introduced by | Henri Theil |
| Related | Gini coefficient, Atkinson index, Shannon entropy |
The Theil index is an information-theoretic measure of statistical dispersion used to quantify inequality in distributions of income, wealth, or other nonnegative resources. It was proposed by Henri Theil and connects concepts from information theory and econometrics to produce a scalar summary of unequal allocation across individuals or groups. The index can be decomposed by population subgroup and has been applied to inequality across countries, regions, and firms, including studies of the United States, the United Kingdom, Germany, India, and Brazil.
Henri Theil introduced the index in 1967, in his book Economics and Information Theory, as part of his work on statistical applications of entropy, around the time of his move from the Netherlands School of Economics in Rotterdam to the University of Chicago. Early adopters in development economics and labor studies included researchers at the World Bank, the OECD, and the United Nations, whose comparative studies with the Gini coefficient and the Atkinson index popularized its use. Empirical work in the 1970s and 1980s by scholars at Harvard University, the London School of Economics, and the University of California, Berkeley expanded applications to cross-country inequality, firm-level wage dispersion, and regional analyses involving provinces and states such as California and Maharashtra. Over time, methodological contributions from researchers at Stanford University, Princeton University, and the Massachusetts Institute of Technology integrated the index into decomposition and bootstrap inference frameworks.
Let a population of n units have nonnegative values x_i (i = 1,...,n) with mean μ. The Theil index T is defined using a normalized log-ratio form derived from entropy. In its common form:
T = (1/n) Σ_{i=1}^n (x_i/μ) ln(x_i/μ).
More generally, when units carry weights w_i (for example, survey weights or population shares), the weighted form used in international comparisons is:
T = Σ_{i=1}^n w_i (x_i/μ) ln(x_i/μ),
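where Σ w_i = 1 and μ = Σ w_i x_i is the weighted mean. This form is the Kullback–Leibler divergence of the income shares w_i x_i/μ from the population shares w_i, which ties the index to Shannon entropy. For a log-normal distribution with log-variance σ², often used as a model of income data, T = σ²/2. The index is nonnegative, equals zero under perfect equality (all x_i = μ), and in the unweighted case reaches its maximum ln n when a single unit holds everything. Terms with x_i = 0 are taken as zero, the limit of r ln r as r → 0. For continuous distributions, the integral analogue is T = ∫ (x/μ) ln(x/μ) f(x) dx over x ≥ 0, where f is the density.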
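The two formulas above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation; the function name `theil_t` and the choice to rescale weights so they sum to 1 are assumptions made for the example.

```python
import math

def theil_t(values, weights=None):
    """Theil T index; weights default to equal and are rescaled to sum to 1."""
    if weights is None:
        weights = [1.0] * len(values)  # equal weights give the (1/n) form
    if len(values) == 0 or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    if min(values) < 0 or min(weights) < 0:
        raise ValueError("values and weights must be nonnegative")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must not all be zero")
    shares = [w / total_weight for w in weights]
    mu = sum(w * x for w, x in zip(shares, values))  # weighted mean
    if mu <= 0:
        raise ValueError("the weighted mean must be positive")
    t = 0.0
    for w, x in zip(shares, values):
        if x > 0:  # a zero value contributes 0, the limit of r ln r as r -> 0
            r = x / mu
            t += w * r * math.log(r)
    return t

print(theil_t([5, 5, 5, 5]))    # perfect equality: 0.0
print(theil_t([0, 0, 0, 10]))   # one unit holds everything: ln 4, about 1.386
```

Passing integer frequency weights is equivalent to repeating observations, so `theil_t([1, 3], [3, 1])` matches `theil_t([1, 1, 1, 3])` up to floating-point rounding.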
The Theil index satisfies scale invariance and population replication invariance: multiplying all x_i by a positive constant leaves T unchanged, as does replicating the population. It also satisfies the Pigou–Dalton transfer principle: a transfer from a richer to a poorer unit that does not reverse their ranking lowers T. Unlike the Gini coefficient, it is additively decomposable into within-group and between-group components. For a partition of the population into groups g = 1,...,G with group means μ_g and population shares p_g, the decomposition is:
T = Σ_{g=1}^G p_g (μ_g/μ) ln(μ_g/μ) + Σ_{g=1}^G p_g T_g (μ_g/μ),
where the first term is the between-group component and the second is the within-group component (T_g denotes the Theil index within group g). Each group's weight p_g μ_g/μ is its share of the total resource. This additive property made the index attractive to analysts at the IMF, the ILO, and national statistical offices such as the ONS (United Kingdom) and Statistics Netherlands for regional and sectoral studies. The index relates to entropy measures used in ecology and information theory. It belongs to the generalized entropy family of inequality indices (the case α = 1) and, under specific distributional assumptions such as log-normality, can be mapped to other inequality indices.
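The decomposition and the two invariance properties can be checked numerically. The sketch below assumes unweighted data and uses made-up incomes for two hypothetical regions; the names `theil_decomposition`, `north`, and `south` are illustrative only.

```python
import math

def theil_t(values):
    """Unweighted Theil T; zero values contribute nothing (limit of r ln r)."""
    mu = sum(values) / len(values)
    return sum((x / mu) * math.log(x / mu) for x in values if x > 0) / len(values)

def theil_decomposition(groups):
    """Return (between, within) for a dict mapping group label to a list of values."""
    pooled = [x for vals in groups.values() for x in vals]
    n = len(pooled)
    mu = sum(pooled) / n
    between = within = 0.0
    for vals in groups.values():
        p_g = len(vals) / n            # population share of the group
        mu_g = sum(vals) / len(vals)   # group mean
        if mu_g == 0:
            continue  # a group holding nothing has zero share and adds nothing
        share_g = p_g * mu_g / mu      # the group's share of the total
        between += share_g * math.log(mu_g / mu)
        within += share_g * theil_t(vals)
    return between, within

# Hypothetical incomes for two regions (illustrative numbers only).
groups = {"north": [10, 20, 30], "south": [40, 80, 120, 160]}
pooled = groups["north"] + groups["south"]
between, within = theil_decomposition(groups)
print(between + within, theil_t(pooled))  # the two values agree up to rounding

# Scale invariance and replication invariance, again up to rounding.
print(theil_t([3 * x for x in pooled]), theil_t(pooled + pooled))
```

Changing the partition changes how the total splits between the two components but not the total itself, which is one reason the choice of groups deserves scrutiny.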
Empirical estimation often uses survey weights and requires care with zero or missing values. A zero value produces the term 0 · ln 0, which is set to its limit of 0; a naive implementation that evaluates ln(0) directly fails, which is why some analysts instead add a small positive offset, at the cost of altering the data. Large-sample properties are usually derived under independent sampling; variance estimation can employ linearization, bootstrap, or jackknife methods implemented in software from centers at Harvard Kennedy School, the University of Michigan, and Columbia University. In panel contexts, analysts decompose changes in the index over time, using growth-accounting approaches of the kind found in studies by the European Central Bank and the Federal Reserve. When incomes are heavy-tailed, finite-sample bias can be substantial, prompting the use of bias-correction methods developed in the econometric literature, including NBER working papers, and applied in evaluations by World Bank country teams.
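As a rough illustration of bootstrap variance estimation, the sketch below resamples observations with replacement and reports a standard error and a percentile interval. It assumes simple random sampling with no survey weights or design effects, which real survey data usually violate; the function name `bootstrap_theil`, the simulated log-normal incomes, and the default of 1000 resamples are all assumptions made for the example.

```python
import math
import random

def theil_t(values):
    """Unweighted Theil T; zero values contribute nothing (limit of r ln r)."""
    mu = sum(values) / len(values)
    return sum((x / mu) * math.log(x / mu) for x in values if x > 0) / len(values)

def bootstrap_theil(values, n_boot=1000, seed=0):
    """Return (standard error, (lower, upper)) from a percentile bootstrap."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(values)
    draws = []
    for _ in range(n_boot):
        resample = rng.choices(values, k=n)  # sample n units with replacement
        if sum(resample) > 0:  # skip degenerate all-zero resamples
            draws.append(theil_t(resample))
    draws.sort()
    mean = sum(draws) / len(draws)
    se = math.sqrt(sum((d - mean) ** 2 for d in draws) / (len(draws) - 1))
    lower = draws[int(0.025 * (len(draws) - 1))]  # approximate 2.5th percentile
    upper = draws[int(0.975 * (len(draws) - 1))]  # approximate 97.5th percentile
    return se, (lower, upper)

# Simulated log-normal incomes (hypothetical data, not from any survey).
rng = random.Random(42)
incomes = [rng.lognormvariate(10.0, 0.6) for _ in range(500)]
estimate = theil_t(incomes)
se, (lower, upper) = bootstrap_theil(incomes)
print(estimate, se, lower, upper)
```

With heavy-tailed data the percentile interval can be unreliable, so this plain bootstrap is best read as a starting point rather than a substitute for design-based or bias-corrected methods.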
Applications of the Theil index span macro, meso, and micro analyses: cross-country income inequality comparisons among blocs such as the European Union and the BRICS, firm-level wage dispersion in sectors tracked by the International Labour Organization, regional inequality within countries such as China, Nigeria, and Mexico, and decompositions of disparities in educational attainment in OECD case studies. In labor economics it is used to measure within-firm and between-firm wage inequality in data sets such as the PSID and administrative labour market data from national agencies. Environmental economists have used entropy-based measures in resource allocation studies at NASA and UNEP; urban economists have applied the index to spatial segregation analyses in cities such as New York City and Mumbai.
Critics point to the index's sensitivity to extreme values and the need to handle zeros carefully, issues discussed in Econometrica, the Journal of Political Economy, and policy critiques by the OECD. The Theil index can overweight high incomes relative to bounded indices such as the Gini coefficient, and its interpretation as an "entropy distance" may be too abstract for policymaking audiences compared with the Lorenz-curve-based measures used in World Bank reports. The decomposition relies on group definitions that can be arbitrary, prompting concerns, raised in comparative studies by the UNDP and by scholars at Yale University, about robustness to classification choices. Finally, while the index is statistically tractable, results depend on the survey design and imputation strategies used by statistical offices such as Statistics Canada and Eurostat.