LLMpedia: The first transparent, open encyclopedia generated by LLMs

pyhf

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: HistFactory (Hop 5)
Expansion funnel: 30 extracted → 0 after dedup → 0 after NER → 0 enqueued
pyhf
Name: pyhf
Developer: CERN, ATLAS, CMS, Fermilab, Brookhaven National Laboratory, Lawrence Berkeley National Laboratory, University of California, Berkeley, Imperial College London, University of Oxford
Released: 2017
Latest release: 1.x (example)
Programming language: Python
Operating system: Linux, macOS, Windows
License: BSD license

pyhf is an open-source software library implementing the HistFactory statistical model in pure Python for high-energy physics. It provides a JSON-based model specification, exposes a tensor-backed likelihood evaluation and fitting API, and targets reproducible statistical inference for experiments such as ATLAS and CMS. pyhf emphasizes interoperability with scientific ecosystems including NumPy, SciPy, TensorFlow, and JAX and is used across institutions like CERN, Fermilab, and Lawrence Berkeley National Laboratory.

Overview

pyhf implements a likelihood-based framework suitable for searches and measurements in particle physics experiments including ATLAS, CMS, and LHCb. The project encodes statistical models described by the HistFactory schema into a machine-readable JSON format, enabling integration with tools such as ROOT and workflows from Gaudi or Rivet. By providing multiple numerical backends—NumPy, TensorFlow, PyTorch, JAX—pyhf supports CPU and accelerator execution used by researchers at CERN, Brookhaven National Laboratory, Fermilab, and academic groups at University of Oxford and Imperial College London.
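The machine-readable format can be illustrated with a minimal HistFactory-style JSON specification for a single counting channel: one signal sample scaled by an unconstrained normalization factor and one background sample with an uncorrelated per-bin uncertainty. The channel, sample, and modifier names here are illustrative, not taken from any published analysis.

```json
{
  "channels": [
    {
      "name": "singlechannel",
      "samples": [
        {
          "name": "signal",
          "data": [5.0],
          "modifiers": [
            {"name": "mu", "type": "normfactor", "data": null}
          ]
        },
        {
          "name": "background",
          "data": [50.0],
          "modifiers": [
            {"name": "uncorr_bkguncrt", "type": "shapesys", "data": [7.0]}
          ]
        }
      ]
    }
  ]
}
```

Because the specification is plain JSON, it can be validated against a schema, archived alongside a publication, and rebuilt into an identical statistical model by any conforming implementation.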

Design and Architecture

pyhf’s architecture separates model specification, computational backend, and optimizer selection to achieve portability across platforms like Linux, macOS, and Windows. The core design centers on a JSON schema derived from HistFactory used by collaborations such as ATLAS and CMS; models reference channels and samples akin to implementations in ROOT workspaces. Tensor abstractions allow substitution of array libraries (for example, NumPy, TensorFlow, PyTorch, JAX) and leverage automatic differentiation available in TensorFlow and JAX for gradient-based optimizers common in packages like SciPy. The modular backend allows deployment on heterogeneous compute infrastructure from local clusters at Lawrence Berkeley National Laboratory to cloud services used by projects connected with Fermilab.
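The backend-substitution idea can be sketched without pyhf itself: if the likelihood code only touches a small set of array operations behind a shim, swapping the shim swaps the compute engine. The class and function names below (`PurePythonBackend`, `poisson_loglike`) are hypothetical illustrations, not pyhf's actual tensorlib interface.

```python
import math

class PurePythonBackend:
    """Minimal array shim. The likelihood below calls only these three
    operations, so a NumPy/JAX-backed shim with the same methods could be
    substituted without touching the model code. (Illustrative only.)"""
    def sum(self, xs):
        return sum(xs)
    def log(self, xs):
        return [math.log(x) for x in xs]
    def mul(self, xs, ys):
        return [x * y for x, y in zip(xs, ys)]

def poisson_loglike(backend, observed, expected):
    # log L = sum_i [ n_i * log(lam_i) - lam_i - log(n_i!) ]
    log_lam = backend.log(expected)
    term = backend.mul(observed, log_lam)
    return (backend.sum(term)
            - backend.sum(expected)
            - sum(math.lgamma(n + 1.0) for n in observed))

be = PurePythonBackend()
ll = poisson_loglike(be, observed=[52.0], expected=[55.0])
```

The design choice this mirrors is that the statistical model stays backend-agnostic while the shim decides where (CPU, GPU) and how (eager, compiled) the arithmetic runs.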

Usage and API

Users construct models by loading HistFactory-style JSON, creating a pyhf model object, and performing likelihood evaluation, fitting, and hypothesis testing via a concise API. Typical workflows mirror statistical routines used in analyses by ATLAS and CMS: model import, parameter initialization, maximum-likelihood fits, and profile likelihood ratio scans compatible with the CLs method and frequentist procedures used at CERN. The API exposes functions for expected and observed limits, Asimov datasets, and test statistics employed in publications from collaborations such as ATLAS and CMS. Interoperability layers allow using optimizers from SciPy or machine-learning optimizers from TensorFlow and PyTorch.
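The core routine behind such a fit can be sketched in plain Python: maximize the Poisson likelihood of a single counting bin with expected yield mu·s + b over the signal strength mu. This is a conceptual stdlib sketch of the statistics, not pyhf code; in practice pyhf delegates the minimization to SciPy-style numerical optimizers rather than the crude grid search used here.

```python
import math

def nll(mu, n_obs, s, b):
    """Negative log-likelihood of one Poisson counting bin with
    expected yield mu*s + b (constant n! term dropped)."""
    lam = mu * s + b
    return lam - n_obs * math.log(lam)

def fit_mu(n_obs, s, b, lo=0.0, hi=10.0, steps=100000):
    # Grid-search MLE for the signal strength mu (illustrative only).
    best = min(range(steps + 1),
               key=lambda i: nll(lo + (hi - lo) * i / steps, n_obs, s, b))
    return lo + (hi - lo) * best / steps

# With 60 observed events, s=5 expected signal, b=50 expected background,
# the analytic maximum-likelihood estimate is (60 - 50) / 5 = 2.0.
mu_hat = fit_mu(n_obs=60.0, s=5.0, b=50.0)
```

Setting the derivative of the NLL to zero gives s − n·s/(mu·s + b) = 0, i.e. mu·s + b = n, which is why the fitted signal strength lands at (n − b)/s.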

Performance and Implementation Details

pyhf’s pure-Python implementation emphasizes predictable numerical behavior and reproducibility for analyses at CERN and national labs. Performance-critical sections are vectorized using NumPy and offloaded to TensorFlow or JAX for GPU acceleration where available on infrastructures at Fermilab and Lawrence Berkeley National Laboratory. The library supports automatic differentiation through TensorFlow and JAX to compute gradients and Hessians for profile likelihood methods, improving convergence with optimizers such as those in SciPy. Benchmarks comparing backends show the familiar trade-offs of high-performance scientific computing, such as per-evaluation overhead on small models versus throughput gains on large ones, reminiscent of choices made at institutions like Imperial College London and the University of Oxford.
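The role of gradients can be shown without any autodiff library: for a single-bin Poisson negative log-likelihood, the derivative in the signal strength mu has a closed form, and a finite-difference check confirms what an autodiff backend would compute mechanically for a full model. This is an illustrative stdlib sketch, not pyhf code.

```python
import math

def nll(mu, n, s, b):
    # Single-bin Poisson NLL with expected yield mu*s + b (n! term dropped).
    lam = mu * s + b
    return lam - n * math.log(lam)

def grad_nll(mu, n, s, b):
    # Closed-form d(NLL)/d(mu) = s - n*s/(mu*s + b); autodiff systems
    # derive the analogous gradient for every parameter of a full model.
    return s - n * s / (mu * s + b)

def fd_grad(mu, n, s, b, eps=1e-6):
    # Central finite difference: the slower, noisier alternative that
    # gradient-based optimizers fall back on without autodiff.
    return (nll(mu + eps, n, s, b) - nll(mu - eps, n, s, b)) / (2.0 * eps)

analytic = grad_nll(1.0, 60.0, 5.0, 50.0)
numeric = fd_grad(1.0, 60.0, 5.0, 50.0)
```

Exact gradients matter here because profile likelihood scans repeat many constrained fits, and each fit converges in fewer iterations when the optimizer sees analytic rather than finite-difference derivatives.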

Applications and Integration

pyhf is applied in limit-setting, significance estimation, and parameter estimation for analyses produced by ATLAS and CMS, and in reinterpretation efforts by groups at CERN and Fermilab. The JSON model format enables sharing and preservation of statistical models used in conference notes and peer-reviewed papers from collaborations including ATLAS and CMS. Integrations exist with analysis ecosystems such as ROOT workspaces, interpretation frameworks such as RECAST, and tools developed within CERN and by academic partners at the University of Oxford.
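Because the model is plain JSON, preservation reduces to a serialization round-trip, which can be shown with only the standard library. The tiny specification below is illustrative, not taken from any published workspace.

```python
import json

# Illustrative one-sample spec in the HistFactory-style JSON layout.
spec = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {"name": "signal", "data": [5.0],
                 "modifiers": [{"name": "mu", "type": "normfactor",
                                "data": None}]}
            ],
        }
    ]
}

# Serialize for publication or archival, then reload elsewhere:
# the reconstructed specification is byte-for-byte equivalent in content.
text = json.dumps(spec, indent=2, sort_keys=True)
reloaded = json.loads(text)
```

A stable, sorted serialization like this is what makes published likelihoods diffable and checksummable, which is the practical basis for reinterpretation workflows.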

Development and Community

The project is developed openly with contributions from researchers at CERN, Fermilab, Lawrence Berkeley National Laboratory, Imperial College London, University of Oxford, and other institutions. Development discussions and issue tracking follow modern open-source practices popular in communities around NumPy, SciPy, and TensorFlow. pyhf’s user base includes experimentalists preparing results for CERN seminars, postgraduate researchers in groups affiliated with Imperial College London and University of Oxford, and software engineers at labs like Brookhaven National Laboratory.

History and Releases

pyhf was initiated to provide a lightweight, language-agnostic implementation of the HistFactory statistical model used by ATLAS and CMS. Early releases focused on JSON interoperability with tools such as ROOT and adoption by analysis note authors at CERN and Fermilab. Subsequent versions added multiple computational backends, automatic differentiation support via TensorFlow and JAX, and performance tuning appreciated by teams at Lawrence Berkeley National Laboratory and academic groups at Imperial College London.

Category:Statistical software Category:High-energy physics software