Generated by GPT-5-mini

| PyMC | |
|---|---|
| Name | PyMC |
| Developer | PyMC Community |
| Released | 2003 |
| Programming language | Python |
| Operating system | Cross-platform |
| License | Apache License 2.0 |
PyMC is an open-source probabilistic programming library for Bayesian statistical modeling and probabilistic machine learning. It provides tools for defining probabilistic models, performing Bayesian inference, and conducting posterior analysis, and is used alongside the scientific Python ecosystem, including NumPy, SciPy, Pandas, Matplotlib, and Jupyter Notebook.
PyMC originated in the early 2000s as a research tool for Bayesian inference, influenced by developments in Markov chain Monte Carlo methods and probabilistic graphical models. Early work drew on ideas from researchers connected to Stanford University, University of Oxford, University of Cambridge, Harvard University, and Columbia University, and on the BUGS family of packages (BUGS, WinBUGS, OpenBUGS); later development proceeded alongside JAGS, RStan, and TensorFlow Probability. Over time, contributions came from academics and engineers affiliated with University of Washington, Imperial College London, New York University, and Massachusetts Institute of Technology, and from industry groups at Google, Microsoft Research, Amazon Web Services, and Uber Technologies. Major milestones paralleled the release cycles of Python, NumPy, Theano, and PyTensor, and the broader transition to modern probabilistic programming practices seen in projects like Edward, Pyro, TensorFlow, and JAX.
PyMC implements Bayesian modeling primitives, Markov chain Monte Carlo samplers, and variational inference methods that integrate with scientific tooling from NumPy, SciPy, Xarray, and ArviZ. Its sampler suite includes the No-U-Turn Sampler (NUTS), an adaptive variant of Hamiltonian Monte Carlo introduced by Matthew Hoffman and Andrew Gelman, together with adaptive Metropolis algorithms of the kind popularized by Roberts and Rosenthal. Variational inference interfaces echo research from Michael I. Jordan, David Blei, Yarin Gal, and Diederik P. Kingma. For diagnostics and visualization it interoperates with tools inspired by projects from contributors at Utrecht University, University of Oxford, Imperial College London, and Carnegie Mellon University.
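To make the idea of a Markov chain Monte Carlo sampler concrete, here is a minimal random-walk Metropolis sketch in plain Python. It is not PyMC's implementation and is far simpler than NUTS; the coin-flip data (7 heads, 3 tails), the uniform prior, and the function names `log_posterior` and `metropolis` are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(theta, heads=7, tails=3):
    """Unnormalized log posterior of a coin bias under a uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero density outside the unit interval
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def metropolis(log_density, start, n_samples, step=0.2, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    current = start
    current_logp = log_density(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_logp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_logp - current_logp))
        if rng.random() < accept_prob:
            current, current_logp = proposal, proposal_logp
        samples.append(current)
    return samples


draws = metropolis(log_posterior, start=0.5, n_samples=20_000)
kept = draws[2_000:]  # discard early draws as burn-in
posterior_mean = sum(kept) / len(kept)
print(f"estimated posterior mean: {posterior_mean:.3f}")
print(f"exact Beta(8, 4) mean:    {8 / 12:.3f}")
```

Because the uniform prior is conjugate here, the exact posterior is Beta(8, 4), which gives a reference value to compare the sampled mean against. Gradient-based samplers such as NUTS replace the blind random-walk proposal with trajectories guided by the gradient of the log density, which usually mixes much better in high dimensions.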
PyMC is implemented in Python on top of a numerical backend: originally Theano, later Aesara and then PyTensor, with interoperability with JAX and Numba. Its computational graph and automatic differentiation components draw lineage from systems developed at Montreal Institute for Learning Algorithms, Google Research, University of Toronto, and Facebook AI Research. Its probabilistic programming abstractions resemble constructs found in the work of Richard McElreath, Andrew Gelman, and Bradley Efron, and in statistical practice influenced by John Tukey. The package modularizes distributions, stochastic nodes, and deterministic transforms, and works with data structures common to Pandas and array programming idioms from NumPy. Parallel sampling and performance improvements reflect approaches used by teams at Amazon Web Services, Google, and supercomputing centers such as Lawrence Berkeley National Laboratory.
Typical workflows combine model specification, inference, and posterior analysis using notebooks popularized by Project Jupyter, reproducibility practices from GitHub, and continuous integration techniques used by Travis CI and GitLab. Examples often mirror case studies from textbooks and courses associated with Columbia University, Harvard University, Princeton University, and University College London, and from MOOCs on Coursera and edX. Users implement hierarchical models, time-series models, and survival models reflecting methods taught by Andrew Gelman and co-authors and applied in research at National Institutes of Health, NASA, European Space Agency, and World Health Organization. Code snippets typically rely on Pandas for data handling, Matplotlib and Seaborn for visualization, and ArviZ for diagnostics and posterior predictive checks cited in literature from University of Cambridge and Imperial College London.
PyMC development is driven by a community of academic researchers, industry practitioners, and open-source contributors affiliated with institutions such as University of California, Berkeley, University of Chicago, ETH Zurich, Max Planck Society, Microsoft Research, Google Research, and startups incubated in Silicon Valley and Cambridge (UK). The project governance and contribution model follow norms similar to those used by major open-source projects hosted on GitHub, with code review practices influenced by engineering groups at Mozilla, Red Hat, and Canonical. Regular workshops, tutorials, and conference presentations appear at venues including NeurIPS, ICML, AISTATS, JSM, USEARCH, and gatherings organized by PyData and the Python Software Foundation.
PyMC is used across scientific domains and industries including biostatistics at National Institutes of Health, epidemiology at Centers for Disease Control and Prevention, astronomy at European Southern Observatory, finance at Goldman Sachs, J.P. Morgan, and algorithmic trading firms, as well as machine learning research at DeepMind, OpenAI, and Facebook AI Research. Applied fields include ecological modeling in work associated with University of British Columbia, climate science at NASA Goddard Space Flight Center, social science studies from Princeton University and Stanford University, and clinical trials research coordinated by World Health Organization and Bill & Melinda Gates Foundation. Industries leveraging PyMC-like tools include pharmaceuticals (e.g., Pfizer, Roche), technology firms (e.g., Microsoft, Amazon), and consulting groups such as McKinsey & Company and Boston Consulting Group.