| ArviZ | |
|---|---|
| Name | ArviZ |
| Programming language | Python |
| Operating system | Cross-platform |
| License | Apache License 2.0 |
ArviZ is an open-source Python library for exploratory analysis of Bayesian models, diagnostics, and visualization. It provides tools for posterior analysis, prior and posterior predictive checks, convergence diagnostics, and model comparison, integrating with probabilistic programming frameworks and scientific computing ecosystems.
ArviZ interoperates with probabilistic programming systems such as PyMC, Stan, TensorFlow Probability, Edward, Pyro, NumPyro, JAX, Greta, Turing.jl, WinBUGS, OpenBUGS, JAGS, Soss, and BeanMachine, and with scientific libraries like NumPy, SciPy, pandas, matplotlib, seaborn, bokeh, holoviews, plotly, and Altair. It emphasizes reproducible workflows and integrates with environments including Jupyter Notebook, JupyterLab, Google Colab, Visual Studio Code, PyCharm, and Binder.
ArviZ includes functionality for posterior summarization, diagnostics, and plotting. It implements widely used quantities such as the Gelman–Rubin statistic (R-hat), effective sample size, leave-one-out cross-validation (LOO), and the widely applicable information criterion (WAIC), along with techniques related to Bayesian model averaging. Visualization routines support traceplots, pairplots, energy plots, ridge plots, autocorrelation, posterior predictive checks, and forest plots compatible with Seaborn, matplotlib, bokeh, plotly, and Altair. Diagnostics and statistical summaries reference concepts used in work by Andrew Gelman, Donald Rubin, Bradley Efron, Herman Chernoff, David Spiegelhalter, and Aki Vehtari.
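To make the R-hat diagnostic concrete, the following is a minimal standard-library sketch of the classic Gelman–Rubin potential scale reduction factor. It is not ArviZ's implementation: ArviZ's own `rhat` function defaults to a rank-normalized variant, so its numbers may differ. The function name `gelman_rubin` and the simulated chains are illustrative assumptions.

```python
import random
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic (non-split, non-rank-normalized) R-hat for equal-length chains.

    `gelman_rubin` is a hypothetical helper for illustration, not an ArviZ function.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")

    # W: average of the within-chain sample variances
    w = mean([variance(c) for c in chains])
    # B / n: sample variance of the per-chain means
    b_over_n = variance([mean(c) for c in chains])
    # Pooled estimate of the marginal posterior variance
    var_plus = (n - 1) / n * w + b_over_n
    return (var_plus / w) ** 0.5


random.seed(0)
# Four chains sampling the same distribution: R-hat should be close to 1
mixed = [[random.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
# Two chains stuck around 0 and two around 3: R-hat should be well above 1
stuck = [[random.gauss(shift, 1) for _ in range(1000)] for shift in (0, 0, 3, 3)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(f"mixed chains: {rhat_mixed:.3f}, stuck chains: {rhat_stuck:.3f}")
```

Values near 1 suggest the chains agree with one another, while values clearly above 1 indicate that between-chain variance dominates and the sampler has likely not converged.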
ArviZ uses a central data structure, InferenceData, built on xarray and designed to interface with inference engines and with arrays and data frames from libraries like xarray, pandas, and NumPy. The design pattern favors immutable data containers and functional transformations inspired by projects such as dask, xarray, and scikit-learn. Backends and adapters provide bridges to systems including PyMC, Stan, Pyro, and TensorFlow Probability, while plotting backends permit rendering via matplotlib and interactive backends like bokeh and plotly. The project follows contribution and governance models similar to NumPy, SciPy, and pandas to manage issues, pull requests, and continuous integration with services like GitHub, Travis CI, GitLab, and CircleCI.
Typical workflows convert model outputs into an ArviZ InferenceData object, which is then passed to functions that compute diagnostics such as R-hat, effective sample size, LOO, and WAIC, and to plotting routines that create trace, pair, and posterior predictive plots. API design reflects influences from xarray data handling and exposes functions and objects compatible with pandas Series and DataFrame usage. Integration examples often reference reproducible research tools like Jupyter Notebook and RStudio when combining ArviZ with languages such as R via interfaces like rpy2 or bridges to Stan through CmdStanPy and RStan.
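The workflow described above can be sketched as follows. This assumes an ArviZ 0.x release, where `az.from_dict` accepts a `posterior` keyword and arrays are shaped `(chain, draw)`; later major versions may use a different signature. The parameter name `mu` and the simulated draws are hypothetical stand-ins for real sampler output.

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(0)
# Hypothetical sampler output: 4 chains x 500 draws of one scalar parameter "mu"
draws = rng.normal(loc=1.0, scale=0.5, size=(4, 500))

# Convert the raw draws into an InferenceData object (ArviZ 0.x signature assumed)
idata = az.from_dict(posterior={"mu": draws})

# Convergence diagnostics are returned as xarray Datasets keyed by variable name
rhat = az.rhat(idata)
ess = az.ess(idata)

# Tabular summary as a pandas DataFrame with one row per parameter
summary = az.summary(idata)
print(summary)
```

Plotting functions such as `az.plot_trace(idata)` accept the same object, so the conversion step is done once and reused across diagnostics and plots.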
Development occurs on platforms such as GitHub, with collaboration among contributors from academia and industry, including users affiliated with Columbia University, University of Oxford, Harvard University, Stanford University, Massachusetts Institute of Technology, Google, Uber, Microsoft, and research groups at institutions like Imperial College London and University College London. The community engages via channels such as Discourse, GitHub Issues, Slack, Gitter, and conference presentations at venues like PyCon, SciPy, NeurIPS, AISTATS, ISBA, JSM, and ICML. Documentation and tutorials are provided through ReadTheDocs-style sites and workshops at summer schools hosted by organizations such as The Carpentries.
ArviZ is used in applied research across fields represented by institutions like NASA, NOAA, World Health Organization, Centers for Disease Control and Prevention, Google DeepMind, and companies such as Facebook, Amazon, Airbnb, and Spotify. Case studies include hierarchical modeling in ecology with collaborators from University of Cambridge, epidemiological modeling with teams at Imperial College London and London School of Hygiene & Tropical Medicine, and neuroimaging analysis in projects affiliated with UCL Institute of Neurology and MIT McGovern Institute. It supports workflows in published work appearing in journals such as Journal of the Royal Statistical Society, Statistics and Computing, Nature Methods, Journal of Machine Learning Research, and conference proceedings from NeurIPS and ICML.
The project originated from efforts by contributors active in the Bayesian workflow community and developers connected to PyMC and Stan ecosystems, with influences from reproducible research advocates like Andrew Gelman and Aki Vehtari. Releases follow semantic versioning and are available via package managers like PyPI and channels such as conda-forge, with continuous integration and test suites modeled after practices used in NumPy and SciPy. Major releases have added support for backends including NumPyro and TensorFlow Probability and expanded plotting backends to interactive libraries such as plotly and bokeh.