
Set cover problem

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Name: Set cover problem
Field: Theoretical computer science
Introduced: 1972
Notable: Richard Karp, Leonid Levin
Complexity: NP-hard (optimization version), NP-complete (decision version)

The set cover problem is a classical combinatorial optimization problem. Its decision version is one of the 21 problems that Richard Karp proved NP-complete in 1972, building on the theory of NP-completeness developed independently by Stephen Cook and Leonid Levin. The problem asks for a smallest subcollection of a family of sets that covers a given universe, with connections to Donald Knuth's algorithmic work on the related exact cover problem and to research at institutions such as Bell Labs and MIT. It is a standard example in approximation theory, a topic explored at conferences such as STOC and FOCS, and it appears in textbooks authored by Christos Papadimitriou and by Michael Garey and David S. Johnson.

Definition

Given a finite universe U and a family S of subsets of U, the task is to select a subfamily C ⊆ S of minimum cardinality such that the union of the sets in C equals U. A cover exists exactly when the union of all sets in S equals U. Early formalizations appear in papers by Jack Edmonds, and Richard Karp proved the decision version NP-complete by a reduction from Vertex cover; the related Set packing problem appears on the same list. Variants are presented in monographs by Alfred V. Aho and John Hopcroft that relate Set cover to decision problems studied at Princeton University.
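The definition can be made concrete with a small exhaustive search. The following is a minimal sketch, not an efficient method: the names `is_cover` and `minimum_cover_bruteforce` and the sample instance are illustrative assumptions, and the running time is exponential in the number of sets.

```python
from itertools import combinations


def is_cover(universe, chosen):
    """Return True if the union of the chosen sets equals the universe."""
    covered = set()
    for s in chosen:
        covered |= s
    return covered == set(universe)


def minimum_cover_bruteforce(universe, family):
    """Try subfamilies in order of increasing size and return the first cover.

    Returns None when even the whole family fails to cover the universe.
    """
    for size in range(len(family) + 1):
        for chosen in combinations(family, size):
            if is_cover(universe, chosen):
                return list(chosen)
    return None


# Hypothetical instance: U = {1, ..., 5} and four subsets of U.
U = {1, 2, 3, 4, 5}
S = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(minimum_cover_bruteforce(U, S))
```

On this instance no single set covers U, so the search settles on a subfamily of two sets.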

Complexity and Computational Hardness

The decision version, which asks whether a cover of size at most k exists, is NP-complete. It lies in NP because a proposed cover can be verified in polynomial time, and its hardness is proved by reduction within the framework developed by Stephen Cook and Leonid Levin. Inapproximability bounds derive from the PCP theorem, associated with researchers such as Sanjeev Arora and Subhash Khot, and from the interactive-proof techniques of László Babai and Shafi Goldwasser, among others; they have since been strengthened, including by researchers at IBM Research and Microsoft Research. Set cover is hard to approximate within a logarithmic factor: with n = |U|, no polynomial-time algorithm achieves an approximation ratio of (1 - o(1)) ln n unless P = NP. Uriel Feige first proved this threshold under the stronger assumption that NP is not contained in DTIME(n^{O(log log n)}); Irit Dinur and David Steurer later weakened the assumption to P ≠ NP.
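Membership of the decision version in NP rests on the fact that a claimed cover can be checked quickly. The sketch below illustrates such a check under the assumption that the certificate is a list of indices into the family; the function name `verify_certificate` and the encoding are hypothetical choices made for this example.

```python
def verify_certificate(universe, family, k, indices):
    """Check that `indices` names at most k sets of `family` covering `universe`.

    Runs in time polynomial in the size of the instance.
    """
    chosen = set(indices)  # ignore repeated indices
    if len(chosen) > k:
        return False
    if any(i < 0 or i >= len(family) for i in chosen):
        return False
    covered = set()
    for i in chosen:
        covered |= family[i]
    return covered >= set(universe)


# Hypothetical instance and certificates.
U = {1, 2, 3, 4, 5}
S = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(verify_certificate(U, S, 2, [0, 3]))  # sets 0 and 3 together cover U
print(verify_certificate(U, S, 2, [0, 1]))  # element 5 stays uncovered
```

Checking a certificate is easy; the hardness of the problem lies in finding one among exponentially many candidate subfamilies.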

Approximation Algorithms

The greedy algorithm, which repeatedly selects the set that covers the most still-uncovered elements, gives an H_n-approximation, where n = |U| and H_n = 1 + 1/2 + ... + 1/n ≤ ln n + 1 is the n-th harmonic number. This guarantee was analyzed by David Johnson and is presented in textbooks and surveys by Umesh Vazirani and Vijay Vazirani. Rounding techniques using linear programming relaxations were developed in papers by Kannan Varadarajan and applied in algorithmic frameworks from Noga Alon and Avi Wigderson, while primal-dual methods build on combinatorial optimization work by Jack Edmonds and Richard Karp. Randomized rounding and multiplicative weights methods studied at the Courant Institute and Stanford University further refine approximation trade-offs. Conditional lower bounds, based on complexity assumptions such as P ≠ NP and on reductions used by Uriel Feige, delineate the limits of polynomial-time algorithms.
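The greedy rule is short enough to sketch directly. The code below is an illustrative implementation under simple assumptions (sets are Python `set` objects, ties are broken by lowest index); the function name `greedy_set_cover` and the sample instance are not taken from any particular source.

```python
def greedy_set_cover(universe, family):
    """Return indices of sets chosen by the greedy rule.

    At each step, pick the set that covers the most still-uncovered elements.
    Raises ValueError if the family cannot cover the universe.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(
            range(len(family)),
            key=lambda i: len(family[i] & uncovered),
            default=None,
        )
        if best is None or not (family[best] & uncovered):
            raise ValueError("family does not cover the universe")
        chosen.append(best)
        uncovered -= family[best]
    return chosen


# Hypothetical instance where greedy is not optimal:
# sets 1 and 2 alone cover U, but greedy is drawn to the large set 0 first.
U = {1, 2, 3, 4, 5, 6}
S = [{1, 2, 3, 4}, {1, 3, 5}, {2, 4, 6}]
print(greedy_set_cover(U, S))
```

Here greedy uses three sets while the optimum uses two, which is consistent with the H_n guarantee: the bound limits how far greedy can stray from the optimum but does not promise optimality.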

Variants and Generalizations

Weighted Set cover assigns a cost to each set and asks for a cover of minimum total cost, a modification treated in monographs by Alfred V. Aho and in thesis work at Carnegie Mellon University. Hitting set, which is equivalent to Set cover with the roles of elements and sets exchanged, and Dominating set, a special case on graphs, are closely related problems studied by Paul Erdős and László Lovász. Geometric set cover and its connections to VC-dimension were developed by researchers at IBM Research and in lectures by Valtr. Online Set cover and streaming models have been explored in collaborations involving Jon Kleinberg and Éva Tardos, and parameterized complexity approaches reference work by Rod Downey and Michael Fellows. Special cases such as k-Set cover and budgeted Set cover link to optimization themes at INRIA and the Max Planck Institute.
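For the weighted variant, a common adaptation of the greedy idea picks the set with the lowest cost per newly covered element. The sketch below assumes non-negative costs given as a list parallel to the family; the function name `weighted_greedy_set_cover` and the example costs are hypothetical.

```python
def weighted_greedy_set_cover(universe, family, costs):
    """Greedy heuristic for weighted set cover.

    Repeatedly picks the set minimizing cost / (number of newly covered
    elements). Returns (chosen indices, total cost).
    """
    uncovered = set(universe)
    chosen = []
    total = 0.0
    while uncovered:
        # Only sets that still cover something new are candidates.
        candidates = [i for i in range(len(family)) if family[i] & uncovered]
        if not candidates:
            raise ValueError("family does not cover the universe")
        best = min(
            candidates,
            key=lambda i: costs[i] / len(family[i] & uncovered),
        )
        chosen.append(best)
        total += costs[best]
        uncovered -= family[best]
    return chosen, total


# Hypothetical instance: one expensive set covers everything,
# two cheap sets cover half of the universe each.
U = {1, 2, 3, 4}
S = [{1, 2, 3, 4}, {1, 2}, {3, 4}]
costs = [5, 1, 1]
print(weighted_greedy_set_cover(U, S, costs))
```

With all costs equal to 1, the rule reduces to the unweighted greedy choice of the set covering the most uncovered elements.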

Applications

Set cover models feature in computational biology problems addressed at the Broad Institute and in network design questions studied by AT&T Labs and Cisco Systems. Information retrieval and document summarization implementations draw on techniques from Google Research and from academic groups at Carnegie Mellon University and the University of California, Berkeley. Sensor placement, facility location, and resource allocation formulations based on Set cover have been applied in industrial projects at Siemens and General Electric, while security and monitoring deployments reference standards developed with contributions from DARPA- and NSF-funded research.

Examples and Formulations

Classic textbook examples describe an instance by a 0–1 incidence matrix, with one row per element and one column per set, and ask for the fewest columns that together have a 1 in every row; such incidence matrices were studied by Claude Shannon, and matrix covering formulations were used by George Dantzig in linear programming. Integer programming formulations convert Set cover into 0–1 programs that can be handled by solvers such as IBM's CPLEX and Gurobi. Reductions from Vertex cover, which is a special case, and to Hitting set, which is an equivalent reformulation, illustrate relationships highlighted in the textbook by Michael Garey and David S. Johnson. Practical instances come from test-suite minimization in software engineering projects at Microsoft Research and from route-planning datasets used in competitions hosted by ACM.
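The 0–1 programming formulation can be written out explicitly. In the sketch below, U and S are the universe and family from the definition, while the indicator variable x_T (equal to 1 when the set T is selected) is notation introduced here for illustration.

```latex
\begin{align*}
\text{minimize}   \quad & \sum_{T \in S} x_T \\
\text{subject to} \quad & \sum_{T \in S \,:\, e \in T} x_T \ge 1
                          && \text{for all } e \in U, \\
                        & x_T \in \{0, 1\}
                          && \text{for all } T \in S.
\end{align*}
```

Each constraint states that at least one selected set contains the element e. Relaxing the last line to 0 ≤ x_T ≤ 1 yields the linear programming relaxation on which rounding techniques operate.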

Category:Combinatorial optimization