LLMpedia: The first transparent, open encyclopedia generated by LLMs

autoregressive integrated moving average

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Name: Autoregressive integrated moving average
Abbreviation: ARIMA
Introduced: 1970s
Introduced by: George E. P. Box; Gwilym M. Jenkins
Applications: Forecasting; Signal processing; Econometrics

ARIMA (autoregressive integrated moving average) is a class of statistical models for analyzing and forecasting time-ordered data. It combines autoregressive terms, differencing (integration), and moving average terms to model temporal dependence in a univariate series. ARIMA models are used in policy analysis at institutions such as the Federal Reserve System, the International Monetary Fund, and the World Bank, in telemetry processing at the National Aeronautics and Space Administration, and in Nobel Prize-winning econometric research. The approach was systematized and popularized by George E. P. Box and Gwilym M. Jenkins, and it draws on classical statistics in the tradition of Karl Pearson, on the signal-processing theory of Norbert Wiener, and on stochastic modelling from the era of John von Neumann.

Introduction

ARIMA belongs to a lineage of stochastic models that includes autoregressive (AR) and moving average (MA) models and seasonal forms such as SARIMA. Its methodological ancestry runs through Andrey Markov's work on dependent sequences, G. Udny Yule's autoregressive schemes, and the filtering ideas of Rudolf E. Kálmán. Practitioners at institutions such as the U.S. Bureau of Labor Statistics, the Organisation for Economic Co-operation and Development, the European Central Bank, and the Bank of England, and at corporations such as IBM, employ ARIMA for short- to medium-term forecasting. The methodology also connects to the model-selection theory advanced by Hirotugu Akaike, Gideon Schwarz, and collaborators of George E. P. Box, whose information-criterion principles are used by agencies including the National Institute of Standards and Technology.

Definition and Mathematical Formulation

An ARIMA(p,d,q) model describes a time series {X_t} through an autoregressive polynomial φ(B) of order p, the differencing operator (1−B)^d, and a moving average polynomial θ(B) of order q: φ(B)(1−B)^d X_t = θ(B)ε_t. Here B is the backshift operator, defined by B X_t = X_{t−1}, d is a non-negative integer giving the number of differences, and ε_t is typically assumed to be white noise. With d = 0 the model reduces to ARMA(p,q). The AR component parallels earlier work by Andrey Markov and G. Udny Yule, while the MA component connects to the innovations processes of Norbert Wiener's signal theory and to the Wold decomposition. Seasonal extensions such as SARIMA add seasonal autoregressive, differencing, and moving average polynomials in B^s, where s is the seasonal period; they have been applied in forecasting for the International Civil Aviation Organization and for United Nations statistical services.
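The definition can be made concrete with a small simulation. The sketch below assumes the sign convention φ(B) = 1 − φB and θ(B) = 1 + θB (conventions differ between texts and software), so that for p = d = q = 1 the model reads X_t = (1+φ)X_{t−1} − φX_{t−2} + ε_t + θε_{t−1}. The function names, parameter values, and the initial condition X_0 = 0 are illustrative choices, not part of any standard library.

```python
import random


def difference(x, d=1):
    """Apply the differencing operator (1 - B)^d to a sequence of numbers."""
    x = list(x)
    for _ in range(d):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


def simulate_arima_111(n, phi, theta, sigma=1.0, seed=0):
    """Simulate (1 - phi B)(1 - B) X_t = (1 + theta B) eps_t.

    Returns (x, w): w is the ARMA(1,1) series of first differences and
    x is its running sum, started from the assumed value X_0 = 0.
    """
    rng = random.Random(seed)
    w, w_prev, eps_prev = [], 0.0, 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, sigma)
        w_t = phi * w_prev + eps + theta * eps_prev  # ARMA(1,1) recursion
        w.append(w_t)
        w_prev, eps_prev = w_t, eps
    x = [0.0]
    for w_t in w:
        x.append(x[-1] + w_t)  # integrate once: X_t = X_{t-1} + W_t
    return x, w


x, w = simulate_arima_111(n=200, phi=0.6, theta=0.3)
recovered = difference(x, d=1)  # differencing undoes the integration step
```

Differencing the simulated series once returns the underlying ARMA(1,1) series up to floating-point rounding, which is the sense in which the "I" in ARIMA denotes integration.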

Model Identification and Estimation

Identification follows the methodology of George E. P. Box and Gwilym M. Jenkins: the correlogram (sample autocorrelation function) and partial correlogram (sample partial autocorrelation function) suggest candidate orders, since the partial autocorrelations of an AR(p) process cut off after lag p and the autocorrelations of an MA(q) process cut off after lag q. This is often augmented by information criteria such as the Akaike information criterion and the Bayesian information criterion, developed by Hirotugu Akaike and Gideon Schwarz respectively. Estimation methods include maximum likelihood, conditional sum of squares, and the innovations algorithm, which is related to Norbert Wiener's work and to Wiener–Kolmogorov filtering. Estimation in large-scale settings has benefited from computational advances associated with John von Neumann and from algorithms popularized by Donald Knuth. State-space formulations permit use of the Kalman filter (see Rudolf E. Kálmán) for recursive estimation and smoothing.
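The identification and estimation steps can be sketched in plain Python. The block below computes a sample correlogram, derives the partial correlogram with the Durbin-Levinson recursion, and estimates a zero-mean AR(1) coefficient by conditional sum of squares. All function names are hypothetical, and the simulated AR(1) series with coefficient 0.7 is an assumed test case rather than real data.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations rho_0..rho_max_lag (biased estimator, divisor n)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(v * v for v in dev) / n
    acf = [1.0]
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        acf.append(ck / c0)
    return acf


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = sample_acf(x, max_lag)
    pacf = [1.0]
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf


def fit_ar1_css(w):
    """Conditional sum-of-squares fit of w_t = phi * w_{t-1} + eps_t (zero mean assumed)."""
    num = sum(w[t] * w[t - 1] for t in range(1, len(w)))
    den = sum(w[t - 1] ** 2 for t in range(1, len(w)))
    phi = num / den
    resid = [w[t] - phi * w[t - 1] for t in range(1, len(w))]
    sigma2 = sum(e * e for e in resid) / len(resid)
    return phi, sigma2


# Assumed example: a simulated AR(1) series with phi = 0.7.
rng = random.Random(42)
series, prev = [], 0.0
for _ in range(2000):
    prev = 0.7 * prev + rng.gauss(0.0, 1.0)
    series.append(prev)

acf = sample_acf(series, 5)
pacf = sample_pacf(series, 5)
phi_hat, sigma2_hat = fit_ar1_css(series)
```

For an AR(1) series the partial correlogram should be close to zero beyond lag 1, while the correlogram decays gradually; that contrast is what suggests p = 1 and q = 0 in this example.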

Diagnostic Checking and Model Selection

Diagnostic checking relies on residual analysis, the Ljung–Box test, and out-of-sample validation, as practiced by agencies such as the U.S. Census Bureau and the Federal Reserve Board. The Ljung–Box statistic tests whether the residual autocorrelations up to a chosen lag are jointly zero, in the hypothesis-testing tradition of Sir Ronald Fisher and Jerzy Neyman; a well-specified model should leave residuals that resemble white noise. Cross-validation strategies echo statistical theory influenced by C. R. Rao. Model selection also borrows ensemble and shrinkage ideas popularized by Leo Breiman and Robert Tibshirani, and modern practice includes automated selection routines in platforms developed by Microsoft and Google.
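As a rough illustration of these checks, the sketch below computes the Ljung–Box statistic Q = n(n+2) Σ_{k=1..h} ρ_k² / (n−k) from a residual series, together with a Gaussian AIC based on the residual variance. Under the null hypothesis, Q for residuals of a fitted ARMA(p,q) model is approximately chi-square with h − p − q degrees of freedom; the p-value computation is omitted here. The function names and the simulated inputs are assumptions made for the example, and the AIC formula is a conditional Gaussian approximation rather than an exact likelihood.

```python
import math
import random


def ljung_box_q(resid, h):
    """Ljung-Box Q statistic using residual autocorrelations at lags 1..h."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [v - mean for v in resid]
    c0 = sum(v * v for v in dev)
    total = 0.0
    for k in range(1, h + 1):
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        rho_k = ck / c0
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


def gaussian_aic(sigma2, n, n_params):
    """AIC = -2 log L + 2k, with the Gaussian log-likelihood evaluated at sigma2."""
    neg2_loglik = n * (math.log(2.0 * math.pi * sigma2) + 1.0)
    return neg2_loglik + 2 * n_params


# Assumed example: white-noise residuals versus strongly autocorrelated ones.
rng = random.Random(7)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]
correlated, prev = [], 0.0
for e in white:
    prev = 0.8 * prev + e
    correlated.append(prev)

q_white = ljung_box_q(white, h=10)
q_correlated = ljung_box_q(correlated, h=10)
```

A large Q, as for the autocorrelated series here, indicates that the model has left structure in the residuals; comparing AIC values across candidate orders then penalizes additional parameters that do not reduce the residual variance enough.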

Extensions and Variants

Extensions include seasonal ARIMA (SARIMA), ARIMAX (ARIMA with exogenous regressors), and fractionally integrated models, which allow non-integer d and are linked to long-memory research by Benoît Mandelbrot and Clive Granger. Hybrids with stochastic volatility and GARCH-type models draw on concepts from Robert Engle and Tim Bollerslev. State-space equivalents enable structural time series decompositions of the kind used in macroeconomic research, including work by Christopher A. Sims and Angus Deaton. Multivariate generalizations (VARMA) connect to the vector autoregression framework advanced by Sims and to the modeling approach of David Hendry, while Bayesian treatments build on the work of Thomas Bayes and on modern computational methods developed by Radford Neal and Andrew Gelman.
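Two of these extensions change only the differencing operator, which makes them easy to sketch. Seasonal differencing applies (1 − B^s), and fractional integration expands (1 − B)^d for non-integer d as a power series Σ_k π_k B^k with π_0 = 1 and π_k = π_{k−1}(k − 1 − d)/k. The function names and the truncation length below are illustrative assumptions.

```python
def seasonal_difference(x, s):
    """Apply (1 - B^s): subtract the value one seasonal period earlier."""
    return [x[t] - x[t - s] for t in range(s, len(x))]


def frac_diff_weights(d, n_weights):
    """First n_weights coefficients pi_k of the expansion of (1 - B)^d."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def fractional_difference(x, d, n_weights=50):
    """Truncated fractional difference: y_t = sum_k pi_k * x_{t-k}.

    Terms that would need observations before the start of the sample
    are dropped, so the first few outputs are only approximations.
    """
    pi = frac_diff_weights(d, n_weights)
    out = []
    for t in range(len(x)):
        k_max = min(t, n_weights - 1)
        out.append(sum(pi[k] * x[t - k] for k in range(k_max + 1)))
    return out


quarterly = [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0]
deseasonalized = seasonal_difference(quarterly, s=4)
half_weights = frac_diff_weights(0.5, 4)
```

With d = 1 the weights collapse to 1 and −1 followed by zeros, recovering ordinary first differencing; for 0 < d < 1 the weights decay slowly, which is how fractional integration captures long memory.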

Applications

ARIMA and its variants are applied in macroeconomic forecasting at institutions such as the International Monetary Fund and the World Bank, in demand forecasting at Walmart and Amazon, in energy load forecasting for National Grid (Great Britain), and in epidemiological time series analysis at the World Health Organization. They are also used for short-term forecasting of prices on exchanges such as the New York Stock Exchange and the Chicago Mercantile Exchange, and in climatology studies involving datasets from the National Oceanic and Atmospheric Administration and the European Space Agency.

Practical Implementation and Software

ARIMA modeling is implemented in software across several ecosystems: in R, through packages such as stats and forecast, influenced by work at the University of Auckland and Monash University; in Python, through libraries such as statsmodels and pmdarima, maintained by contributors from NumFOCUS and companies such as Anaconda, Inc.; and in commercial packages from SAS Institute, StataCorp, and MathWorks (MATLAB). Cloud platforms from Amazon Web Services, Google Cloud Platform, and Microsoft Azure offer managed services and APIs for time-series forecasting that can incorporate ARIMA-based pipelines.
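As an example of library usage, the following is a minimal sketch of fitting and forecasting with the `ARIMA` class from `statsmodels.tsa.arima.model`. It assumes statsmodels is installed; the import is guarded so the snippet still loads without it. The helper names `fit_and_forecast` and `demo_series`, the simulated random-walk data, and the order (1, 1, 0) are hypothetical choices for illustration.

```python
import random

try:
    from statsmodels.tsa.arima.model import ARIMA
except ImportError:  # statsmodels is an optional third-party dependency
    ARIMA = None


def fit_and_forecast(series, order=(1, 1, 0), steps=5):
    """Fit ARIMA(p, d, q) to a sequence and return (forecasts, AIC)."""
    if ARIMA is None:
        raise ImportError("this example requires the statsmodels package")
    result = ARIMA(series, order=order).fit()
    forecasts = [float(v) for v in result.forecast(steps=steps)]
    return forecasts, float(result.aic)


def demo_series(n=200, seed=0):
    """Simulated random walk used only as stand-in data."""
    rng = random.Random(seed)
    level, out = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        out.append(level)
    return out


if __name__ == "__main__" and ARIMA is not None:
    forecasts, aic = fit_and_forecast(demo_series())
    print("forecasts:", forecasts)
    print("AIC:", aic)
```

In practice the order would be chosen by the identification and diagnostic steps described earlier, or by an automated search such as the one pmdarima provides.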

Category:Time series models