HistFitter — LLMpedia

HistFitter
Name	HistFitter
Title	HistFitter
Developer	CERN collaborators, ATLAS collaboration, Institute for Research and Innovation in Software for High Energy Physics
Released	2012
Programming language	Python, C++ (built on ROOT)
Operating system	Linux
Genre	Statistical software, High energy physics
License	GPL

Contents

Overview
Design and Features
Usage and Workflow
Statistical Methods and Models
Implementation and Architecture
Validation and Performance
History and Development

HistFitter is a statistical data-analysis framework developed for searches and measurements in high-energy physics by collaborators at CERN, principally within the ATLAS collaboration. It provides tools to build likelihood models, combine datasets, perform hypothesis tests, and compute confidence intervals through interfaces to established packages such as RooStats, RooFit, and ROOT. The framework has been used in analyses at the Large Hadron Collider and integrated into workflows that involve GEANT4-based simulation campaigns and data reconstruction chains such as those associated with CERN Open Data.

Overview

HistFitter is an analysis-level framework that bridges the experiment-specific data formats produced by ATLAS and the statistical engines developed within the CERN software ecosystem. It orchestrates model building, systematic-uncertainty handling, and hypothesis evaluation for searches similar to those for the Higgs boson and for beyond-Standard-Model signals such as supersymmetry, particle dark matter, and extra dimensions. Analyses using HistFitter typically contribute to publications in journals such as Physical Review Letters and the Journal of High Energy Physics, and to presentations at conferences such as ICHEP and Moriond. The software runs on computing infrastructure such as the grid resources coordinated by the Worldwide LHC Computing Grid and works alongside experiment-specific workload management systems such as ATLAS PanDA.

Design and Features

HistFitter's design emphasizes reproducibility and modularity, allowing physicists familiar with experiments such as ATLAS and CMS to encode complex analysis logic. Key features include the automated construction of combined likelihoods across control, validation, and signal regions, and the handling of systematic uncertainties through nuisance parameters whose inputs come from sources such as jet energy scale calibrations and b-tagging scale factors. It interfaces with statistical engines including BAT (Bayesian Analysis Toolkit), HistFactory, and RooStats, and can follow recommendations issued by groups such as the LHC Higgs Cross Section Working Group (LHCHXSWG). The architecture supports templated analyses compatible with data products from Athena and CMSSW, and with reconstruction outputs validated against datasets from LHC Run 1 and Run 2 and against upgrade studies for the High-Luminosity LHC.
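The idea of a combined likelihood across regions can be illustrated with a minimal sketch in plain Python. It is not HistFitter's API: the function names, the region labels, and all yields below are hypothetical, and each region is reduced to a single bin with one signal and one background template. A background-dominated control region (CR) constrains the background normalization `mu_bkg`, which is then extrapolated to the signal region (SR) where the signal strength `mu_sig` is measured.

```python
import math

def poisson_nll(observed, expected):
    """Negative log of the Poisson probability, dropping the constant log(n!) term."""
    return expected - observed * math.log(expected)

def combined_nll(mu_sig, mu_bkg, data, templates):
    """Sum the per-region Poisson terms of a simultaneous CR + SR fit."""
    total = 0.0
    for region, n_obs in data.items():
        sig, bkg = templates[region]
        total += poisson_nll(n_obs, mu_sig * sig + mu_bkg * bkg)
    return total

# Hypothetical yields: (signal, background) expectation per region.
templates = {"CR": (0.0, 200.0), "SR": (5.0, 10.0)}
data = {"CR": 220, "SR": 18}

# With two single-bin regions and two free parameters the maximum-likelihood
# solution is analytic: the CR fixes mu_bkg, and the SR then fixes mu_sig.
mu_bkg_hat = data["CR"] / templates["CR"][1]
mu_sig_hat = (data["SR"] - mu_bkg_hat * templates["SR"][1]) / templates["SR"][0]

print(f"mu_bkg = {mu_bkg_hat:.3f}, mu_sig = {mu_sig_hat:.3f}")
```

A realistic model would have several bins and samples per region plus constrained nuisance parameters, so the minimum would be found numerically rather than in closed form.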

Usage and Workflow

Typical workflows begin with histogram templates produced by event selection algorithms implemented as ROOT macros or Athena jobs, then proceed to model specification steps guided by HistFitter's configuration language. Users commonly integrate calibration constants and scale factors from groups such as the ATLAS Jet/Etmiss group and the ATLAS b-tagging working groups, combine channels modeled after searches such as Higgs to diphoton (H → γγ) or SUSY multijet analyses, and submit fits to local clusters or to batch systems such as the CERN batch service and HTCondor. Output products (fit diagnostics, limit plots, and parameter scans) are formatted for inclusion in documentation such as INSPIRE-HEP records and for the internal review procedures of bodies such as the ATLAS Publication Committee.

Statistical Methods and Models

HistFitter encodes likelihood-based inference methods common to high-energy physics, implementing profile likelihood ratio tests, CLs procedures, and Bayesian credible intervals via wrappers to RooStats and BAT (Bayesian Analysis Toolkit). It supports the construction of HistFactory-style models for counting and binned measurements, inspired by examples from LEP combinations and Tevatron searches. Systematic uncertainty parametrizations follow prescriptions similar to the recommendations of groups such as the LHC Statistics Forum and implement the nuisance parameter constraints (Gaussian, log-normal, gamma) used in results published by ATLAS and CMS. The framework enables frequentist hypothesis testing for signal strength parameters, comparable to the approach described in the paper "Asymptotic formulae for likelihood-based tests of new physics", and can run toy-based (pseudo-experiment) treatments for coverage studies of the kind used in analyses presented at EPS-HEP.
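The CLs procedure mentioned above can be sketched for the simplest possible case: a single-bin counting experiment with a perfectly known background and the observed count itself as the test statistic. This is an illustration in standard-library Python, not the profile-likelihood implementation that RooStats provides; the function names are hypothetical and systematic uncertainties are ignored.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total

def cls(n_obs, signal, background):
    """CLs = CL_{s+b} / CL_b using the observed count as the test statistic."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b

def upper_limit(n_obs, background, alpha=0.05, hi=100.0, tol=1e-6):
    """Bisect for the signal yield at which CLs falls to alpha."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, background) > alpha:
            lo = mid  # signal hypothesis not yet excluded, move up
        else:
            hi = mid  # excluded at this yield, move down
    return 0.5 * (lo + hi)

# Hypothetical observation: 3 events on an expected background of 2.5.
print(f"95% CL upper limit on the signal yield: {upper_limit(3, 2.5):.2f}")
```

Dividing by CL_b is what protects against excluding signals to which the analysis has no sensitivity: with zero observed events the limit is -ln(0.05), about 3 events, regardless of the background expectation.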

Implementation and Architecture

HistFitter is implemented primarily in Python and relies on ROOT for histogram handling and fitting backends. It uses configuration files and modular classes to represent channels, samples, and systematics, and integrates with the model-building utilities of HistFactory and with fitting engines such as RooFit and RooStats. The codebase is kept under version control on platforms used at CERN, with continuous integration practices aligned with the software quality standards advocated by organizations such as IRIS-HEP. Interfaces allow HistFitter to run inside Singularity or Docker images for reproducible environments and to submit jobs through middleware such as gLite and PanDA.
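The channel, sample, and systematic hierarchy can be pictured with a small data model. The sketch below uses Python dataclasses and is purely illustrative: the class names mirror the concepts in the text, but the attributes, methods, and numbers are assumptions and do not reproduce HistFitter's own classes. Each systematic is treated as an overall normalization effect with piecewise-linear interpolation in its nuisance parameter `theta`.

```python
from dataclasses import dataclass, field

@dataclass
class Systematic:
    name: str
    up: float    # relative yield change at +1 sigma, e.g. 0.10 for +10%
    down: float  # relative yield change at -1 sigma, e.g. -0.08 for -8%

    def scale(self, theta):
        """Piecewise-linear interpolation of the yield scale factor in theta."""
        shift = self.up if theta >= 0 else -self.down
        return 1.0 + shift * theta

@dataclass
class Sample:
    name: str
    nominal: list                      # nominal bin contents
    systematics: list = field(default_factory=list)

    def expected(self, thetas):
        """Bin contents after applying every systematic at the given thetas."""
        factor = 1.0
        for syst in self.systematics:
            factor *= syst.scale(thetas.get(syst.name, 0.0))
        return [factor * y for y in self.nominal]

@dataclass
class Channel:
    name: str
    samples: list = field(default_factory=list)

    def expected(self, thetas):
        """Total expectation per bin, summed over all samples."""
        per_sample = [s.expected(thetas) for s in self.samples]
        return [sum(bins) for bins in zip(*per_sample)]

# Hypothetical two-bin signal region with one background and one signal sample.
jes = Systematic("JES", up=0.10, down=-0.08)
ttbar = Sample("ttbar", [50.0, 20.0], [jes])
signal = Sample("signal", [2.0, 6.0])
sr = Channel("SR", [ttbar, signal])

print(sr.expected({}))            # nominal prediction
print(sr.expected({"JES": 1.0}))  # prediction with JES shifted by +1 sigma
```

Keeping systematics attached to samples, and samples attached to channels, lets the same nuisance parameter name correlate an uncertainty across every place it appears.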

Validation and Performance

Validation of HistFitter models includes closure tests, pull and impact studies, and ensemble tests that mirror the validation workflows used in major collaborations such as ATLAS and CMS. Performance benchmarks compare asymptotic approximations against toy-based limits and evaluate computational scaling on infrastructure such as the Worldwide LHC Computing Grid and local HPC clusters funded by agencies in CERN member states. Publications and internal notes often document comparisons against alternative statistical toolkits, highlighting compatibility with recommendations from the LHC Statistics Group and with results on advanced computational techniques reported at venues such as NeurIPS.
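An ensemble test of the kind described above can be sketched with pseudo-experiments. The example is a simplified stand-in written with the Python standard library, under the assumption of a single Poisson-distributed count: real pull studies use fitted nuisance parameters and their post-fit uncertainties. All names and numbers are hypothetical. For an unbiased estimate with a correctly sized uncertainty, the pull distribution should have a mean near 0 and a width near 1.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def pull_ensemble(expected, n_toys, seed=12345):
    """Pulls (n - expected) / sqrt(expected) over an ensemble of pseudo-experiments."""
    rng = random.Random(seed)
    sigma = math.sqrt(expected)
    return [(sample_poisson(expected, rng) - expected) / sigma for _ in range(n_toys)]

pulls = pull_ensemble(expected=100.0, n_toys=5000)
mean = sum(pulls) / len(pulls)
width = math.sqrt(sum((p - mean) ** 2 for p in pulls) / (len(pulls) - 1))
print(f"pull mean = {mean:.3f}, pull width = {width:.3f}")
```

A mean significantly away from 0 would point to a biased estimator, and a width away from 1 to over- or under-estimated uncertainties.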

History and Development

HistFitter's development began in the early 2010s within analysis teams preparing searches at the Large Hadron Collider, with contributions from researchers affiliated with institutions such as CERN, University of Oxford, Lund University, University of Manchester, University of California, Berkeley, Institute of Physics (Poland), and other groups in the ATLAS collaboration. Its evolution has been shaped by software projects including ROOT, RooFit, RooStats, and HistFactory, and by community efforts such as IRIS-HEP and the LHC Statistics Forum. The tool has been cited in analysis notes and in conference proceedings at ICHEP and Moriond, and has been integrated into the analysis ecosystems that produced results such as Higgs boson property measurements and searches for supersymmetry.

Category:Statistical software