| HistFactory | |
|---|---|
| Name | HistFactory |
| Type | Statistical model framework |
| Developed by | ATLAS Collaboration |
| Domain | High-energy physics |
| First release | 2011 |
| License | Open-source (research) |
HistFactory is a statistical model construction framework used in high-energy physics to build binned likelihoods and to combine measurements for hypothesis testing and parameter estimation. It interfaces with tools from the CERN ecosystem and with common analysis packages to enable reproducible inference workflows across experiments such as ATLAS, CMS, and LHCb, and in collaborations involving Fermilab or SLAC National Accelerator Laboratory. The framework is widely employed in searches and measurements using datasets from accelerators such as the Large Hadron Collider and detectors such as ATLAS, CMS, and ALICE.
HistFactory provides a templated approach for specifying channels, samples, and systematic uncertainties with binned histograms, allowing experiments to translate detector-level distributions from simulation and data into a coherent model compatible with statistical tools such as RooFit and RooStats. It was designed to support complex combinations, such as those required for the discovery of the Higgs boson and for precision measurements of masses and couplings in processes including ttbar production, H→γγ, and H→ZZ→4ℓ. HistFactory's design reflects practices from the ATLAS, CMS, and LHCb collaborations and incorporates ideas used in analyses at Tevatron experiments such as CDF and DØ.
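A minimal sketch of such a specification, expressed in the pyhf JSON format, is shown below; the channel and sample names, bin contents, and the ±10% normalization uncertainty are purely illustrative assumptions, not values from any experiment.

```python
# A minimal sketch of a HistFactory-style specification in the pyhf JSON format.
# Channel/sample names and yields are illustrative assumptions only.
spec = {
    "channels": [
        {
            "name": "signal_region",
            "samples": [
                {
                    "name": "signal",
                    "data": [5.0, 10.0],  # expected signal yields per bin
                    "modifiers": [
                        # unconstrained normalization factor: the signal strength (POI)
                        {"name": "mu", "type": "normfactor", "data": None}
                    ],
                },
                {
                    "name": "background",
                    "data": [50.0, 60.0],  # expected background yields per bin
                    "modifiers": [
                        # constrained +/-10% normalization systematic
                        {"name": "bkg_norm", "type": "normsys",
                         "data": {"hi": 1.1, "lo": 0.9}}
                    ],
                },
            ],
        }
    ]
}
```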
The core formalism encodes a binned likelihood as a product of Poisson terms, one per bin, with nuisance parameters representing systematic uncertainties constrained by auxiliary measurements modeled with additional probability density functions. This construction is compatible with the frequentist and Bayesian procedures implemented in frameworks such as RooStats and HistFitter and in external tools such as BAT and pyhf. HistFactory supports the profile-likelihood-ratio tests used in the CLs method and the asymptotic formulae of Cowan, Cranmer, Gross, and Vitells; it is also used in global fits similar in spirit to those of the Gfitter group and to the combined electroweak fits of the LEP Electroweak Working Group.
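Schematically, following the notation of the general HistFactory construction, the likelihood for observed bin counts n_cb and auxiliary measurements a_p can be written as:

```latex
\mathcal{L}(\mu,\boldsymbol{\theta}) =
  \prod_{c\,\in\,\mathrm{channels}}\ \prod_{b\,\in\,\mathrm{bins}}
    \mathrm{Pois}\!\left(n_{cb}\,\middle|\,\nu_{cb}(\mu,\boldsymbol{\theta})\right)
  \;\times\;
  \prod_{p\,\in\,\mathrm{constrained\ parameters}} c_p\!\left(a_p \mid \theta_p\right),
\qquad
\nu_{cb}(\mu,\boldsymbol{\theta}) = \sum_{s\,\in\,\mathrm{samples}} \nu_{scb}(\mu,\boldsymbol{\theta})
```

Here μ denotes the parameters of interest (for example a signal strength), θ the nuisance parameters, ν_cb the expected yield in bin b of channel c summed over samples s, and c_p the constraint terms for the auxiliary measurements.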
HistFactory has reference implementations based on XML schemas and bindings in ROOT and Python. The primary implementations interoperate with RooFit objects inside ROOT and with pure-Python stacks such as pyhf, which targets a JSON schema for likelihood preservation in the spirit of efforts by the HEPData and CERN Open Data initiatives. Integrations include tools such as Combine from the CMS Collaboration and HistFitter for ATLAS analyses, along with continuous-integration setups at institutions such as GridPP and NERSC. The software ecosystem connects to workflow managers and data services used at facilities including WLCG sites, CERN Openlab, and compute centers such as the Fermilab Scientific Computing Division.
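As a rough sketch of the pure-Python route (assuming pyhf >= 0.7 is installed and reusing the illustrative `spec` from the example above), a model can be built from the JSON specification and fit directly:

```python
# A minimal sketch, assuming pyhf >= 0.7; `spec` is the illustrative specification above.
import pyhf

model = pyhf.Model(spec, poi_name="mu")        # build the HistFactory model from the JSON spec
data = [52.0, 63.0] + model.config.auxdata     # observed bin counts plus auxiliary measurements

# maximum-likelihood estimate of the signal strength and nuisance parameters
best_fit = pyhf.infer.mle.fit(data, model)
print("fitted signal strength:", float(best_fit[model.config.poi_index]))
```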
HistFactory is applied to searches for new phenomena, including analyses targeting supersymmetry, dark matter, and resonances such as hypothetical Z′ bosons, and to precision measurements such as determinations of the top quark mass, the W boson mass, and Higgs boson properties. It supports combined measurements across channels, for example leptonic and hadronic final states in studies of diboson processes, single top production, and flavor physics results from LHCb and from experiments at KEK such as Belle II. Collaborations use HistFactory models when producing legacy likelihoods for reinterpretation by external groups working on global fits, such as the Global and Regional Anomaly Detection efforts, and by phenomenology groups at institutions including CERN Theory and university groups at MIT, Harvard University, the University of Oxford, and the University of Chicago.
Validation of HistFactory-based models typically involves closure tests against full detector simulation from experiment-maintained toolchains such as GEANT4-based workflows, cross-checks with alternative statistical treatments such as Markov chain Monte Carlo samplers (for example emcee or Stan), and reproduction of published results from the ATLAS and CMS collaborations. Performance profiling considers memory use and computational scaling for large combinations, with benchmarks run on infrastructure provided by CERN IT and national computing centers such as GridPP and NCSA, and with comparisons against lightweight, vectorized implementations such as pyhf, which exploit modern accelerators and libraries such as NumPy and TensorFlow to speed up likelihood evaluation.
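A minimal closure-test sketch along these lines (illustrative only, reusing the `spec` above, and not any experiment's validation procedure) builds an Asimov dataset from known parameter values and checks that the fit recovers them:

```python
# Illustrative closure test: generate Asimov data at nominal parameter values
# and verify that the maximum-likelihood fit recovers the injected signal strength.
import pyhf

model = pyhf.Model(spec, poi_name="mu")
true_pars = model.config.suggested_init()        # nominal parameter values (mu = 1)
asimov_data = model.expected_data(true_pars)     # expected main + auxiliary data

fitted = pyhf.infer.mle.fit(asimov_data, model)
idx = model.config.poi_index
print("injected mu:", true_pars[idx], " fitted mu:", float(fitted[idx]))
```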
Limitations arise from the binned nature of HistFactory templates, which can incur information loss compared with unbinned approaches such as the matrix element method or machine-learning-based unbinned estimators developed by university labs and projects collaborating with DeepMind. Extensions and successor projects address these issues: the JSON schema and the pyhf project enable lightweight, accelerator-friendly likelihoods, and efforts to support morphing and machine-learning-based surrogate models link to research from institutions such as CERN Theory, Stanford University, and Caltech, and industrial partners such as Google and NVIDIA. Preservation and reproducibility initiatives by HEPData, CERN Open Data, and analysis preservation frameworks from collaborations such as ATLAS Analysis Preservation continue to build on HistFactory concepts for long-term reuse.
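As a small illustration of the accelerator-friendly direction (a sketch assuming pyhf is installed, with the JAX or TensorFlow extras needed for the commented-out backends; `spec` and the data are the illustrative values from above):

```python
# Sketch of pyhf's switchable computational backends; backend names are pyhf's own
# identifiers, and the non-default backends require optional dependencies.
import pyhf

pyhf.set_backend("numpy")          # default CPU backend
# pyhf.set_backend("jax")          # vectorized/accelerator backend, if jax is installed
# pyhf.set_backend("tensorflow")   # TensorFlow backend, if tensorflow is installed

model = pyhf.Model(spec, poi_name="mu")
data = [52.0, 63.0] + model.config.auxdata

# observed CLs for a signal-strength hypothesis mu = 1, using the profile-likelihood
# test statistic and asymptotic formulae
cls_obs = pyhf.infer.hypotest(1.0, data, model, test_stat="qtilde")
print("observed CLs at mu = 1:", float(cls_obs))
```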
Category:Statistical models in particle physics