| Gaussian process | |
|---|---|
| Name | Gaussian process |
| Field | Statistics, Machine learning |
| Introduced | 19th-century roots; formalized in the 20th century |
| Notable figures | Carl Friedrich Gauss, Andrey Kolmogorov, Norbert Wiener |
A Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution. It is used as a prior over functions in Bayesian inference and appears across statistics, signal processing, and machine learning, connecting classical work by Carl Friedrich Gauss, Andrey Kolmogorov, Norbert Wiener, and Ronald Fisher with modern developments at institutions such as the Massachusetts Institute of Technology, the University of Cambridge, and Google DeepMind. Gaussian processes underpin methods in spatial analysis, time series, and nonparametric regression, and they interface with approaches ranging from Bayesian inference in the tradition of Thomas Bayes to kernel machines developed alongside work at Bell Labs, the University of Toronto, and Stanford University.
The Gaussian process concept builds on the multivariate normal distribution studied by Carl Friedrich Gauss, extended by foundational probability work within Andrey Kolmogorov's axiomatic framework, and applied to continuous-time processes by Norbert Wiener in the context of the Wiener process. In modern practice, researchers at the University of Cambridge, Imperial College London, Princeton University, the University of Oxford, and companies such as Google and Microsoft Research apply Gaussian processes to problems inspired by work from David MacKay, Christopher Bishop, and Carl Rasmussen. The model is specified by a mean function and a covariance function, and it naturally yields closed-form posteriors for regression and for the conjugate models studied at Bell Labs and in textbooks used at Harvard University and ETH Zurich.
Formally, a Gaussian process is defined by a mean function m(x) and a covariance function k(x, x'), extending the finite-dimensional Gaussian vectors studied by Carl Friedrich Gauss and generalized in Andrey Kolmogorov's work on stochastic processes. Its key properties are marginalization and conditioning, directly analogous to the corresponding results for the multivariate normal distribution used in analyses at Princeton University and the University of Chicago. The Karhunen–Loève expansion links Gaussian processes to eigenfunction decompositions studied in the context of David Hilbert's work and spectral methods used at the Institute for Advanced Study; Mercer's theorem connects kernels to reproducing kernel Hilbert spaces familiar from research at the University of Toronto and Cornell University. Consistency, stationarity, and ergodicity properties reference results developed in the literature connected to Andrey Kolmogorov, Norbert Wiener, and researchers at Columbia University.
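Written out, the defining property and the Gaussian conditioning identity it inherits from the multivariate normal distribution are:

```latex
% A GP is specified by m and k; every finite restriction is multivariate normal.
f \sim \mathcal{GP}(m, k)
\;\Longleftrightarrow\;
\bigl(f(x_1), \ldots, f(x_n)\bigr)^{\top} \sim \mathcal{N}(\mathbf{m}, \mathbf{K}),
\qquad \mathbf{m}_i = m(x_i), \quad \mathbf{K}_{ij} = k(x_i, x_j).

% Conditioning a jointly Gaussian vector (f_A, f_B) on the observation f_B = b:
f_A \mid f_B = b \;\sim\; \mathcal{N}\!\bigl(
  \mathbf{m}_A + \mathbf{K}_{AB}\mathbf{K}_{BB}^{-1}(b - \mathbf{m}_B),\;
  \mathbf{K}_{AA} - \mathbf{K}_{AB}\mathbf{K}_{BB}^{-1}\mathbf{K}_{BA}
\bigr).
```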
Covariance functions (kernels) such as the squared exponential, Matérn, rational quadratic, periodic, and linear kernels are central, and were formalized in contexts studied by researchers at Bell Labs, the University of Cambridge, and Carnegie Mellon University. The Matérn family connects to work on smoothness and regularity by Bertil Matérn, after whom it is named, and by applied mathematicians at ETH Zurich and the University of California, Berkeley. Kernel design and composition techniques draw on the theory of reproducing kernel Hilbert spaces developed at Princeton University and applied in kernel methods alongside work at the University of Toronto and the Royal Institute of Technology. Hyperparameterization and priors for kernels have been advanced in Bayesian treatments by scholars associated with University College London, the University of Oxford, and Stanford University.
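As an illustrative sketch, not tied to any particular library, two of the kernels named above (squared exponential and Matérn 3/2) can be written as plain functions that build a covariance matrix; the function names and the hyperparameters `lengthscale` and `variance` are illustrative choices for this example.

```python
import numpy as np

def squared_exponential(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared exponential (RBF) kernel: k(x, x') = s^2 exp(-(x - x')^2 / (2 l^2))."""
    d = np.subtract.outer(x1, x2)                    # pairwise differences
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def matern32(x1, x2, lengthscale=1.0, variance=1.0):
    """Matern 3/2 kernel: k(r) = s^2 (1 + sqrt(3) r / l) exp(-sqrt(3) r / l)."""
    r = np.abs(np.subtract.outer(x1, x2))
    a = np.sqrt(3.0) * r / lengthscale
    return variance * (1.0 + a) * np.exp(-a)

# Example: covariance matrix of a GP prior on a 1-D grid, with jitter for stability,
# and three draws from the corresponding zero-mean prior.
x = np.linspace(0.0, 5.0, 50)
K = squared_exponential(x, x) + 1e-8 * np.eye(len(x))
samples = np.random.multivariate_normal(np.zeros(len(x)), K, size=3)
```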
Gaussian process regression yields a closed-form posterior mean and covariance under Gaussian noise, a result used in statistical modeling at Harvard University, Stanford University, and Imperial College London. For classification, approximate inference methods such as the Laplace approximation, expectation propagation, and variational inference were advanced by researchers including groups at the University of Cambridge, Microsoft Research, and Max Planck Institute labs. Marginal likelihood optimization for hyperparameter learning follows the Bayesian model selection tradition traced to Thomas Bayes and developed in modern contexts at the Massachusetts Institute of Technology and the University of Cambridge. Cross-validation and information criteria used alongside Gaussian processes are standard in curricula at Princeton University and Columbia University.
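A minimal sketch of exact Gaussian process regression with Gaussian noise, using the standard Cholesky-based formulas for the posterior mean, posterior variance, and log marginal likelihood. The function name `gp_regression` and the parameter `noise_var` are assumptions of this example, and it reuses the `squared_exponential` kernel sketched above; it is not a description of any specific library's API.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def gp_regression(X_train, y_train, X_test, kernel, noise_var=1e-2):
    """Exact GP regression: posterior mean, pointwise variance, and log marginal likelihood."""
    K = kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = kernel(X_train, X_test)            # cross-covariance, shape (n_train, n_test)
    K_ss = kernel(X_test, X_test)

    L = cho_factor(K, lower=True)            # Cholesky factorization of the noisy Gram matrix
    alpha = cho_solve(L, y_train)            # K^{-1} y

    mean = K_s.T @ alpha                     # posterior mean at test inputs
    v = cho_solve(L, K_s)                    # K^{-1} K_s
    cov = K_ss - K_s.T @ v                   # posterior covariance at test inputs
    log_marginal = (-0.5 * y_train @ alpha
                    - np.sum(np.log(np.diag(L[0])))
                    - 0.5 * len(y_train) * np.log(2.0 * np.pi))
    return mean, np.diag(cov), log_marginal
```

In practice the log marginal likelihood returned here is maximized with respect to the kernel hyperparameters and the noise variance, which is the hyperparameter-learning step described above.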
Exact inference scales cubically with the number of data points, motivating scalable approximations developed at Google DeepMind, the University of Cambridge, Carnegie Mellon University, and the University of California, Berkeley. Sparse Gaussian process methods using inducing points, variational inducing-point approaches, stochastic variational inference, and structured kernel interpolation link to work at Google, the University of Toronto, and ETH Zurich. Matrix identities and numerical linear algebra from research at the Courant Institute and Stanford University inform Krylov methods, conjugate gradients, and preconditioning strategies. Approximations such as the subset of regressors, FITC (fully independent training conditional), PITC (partially independent training conditional), and kernel approximations via random Fourier features trace to contributions from groups at Bell Labs, the University of California, Los Angeles, and the University of Washington.
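To illustrate one of the kernel approximations named above, the sketch below implements random Fourier features for the squared exponential kernel. The function name and the `num_features` parameter are illustrative assumptions for this example rather than a statement about any cited group's implementation.

```python
import numpy as np

def random_fourier_features(X, num_features=200, lengthscale=1.0, seed=0):
    """Map inputs so that z(x)^T z(x') approximates exp(-||x - x'||^2 / (2 l^2))."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                                   # treat 1-D input as (n, 1)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(d, num_features))  # spectral samples
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)             # random phases
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

# Regression on these features reduces to Bayesian linear regression, costing
# O(n m^2) for m features and n data points instead of O(n^3) for exact inference.
```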
Gaussian processes are applied in geostatistics (kriging), which originated in studies connected to Danie G. Krige and is practiced at institutions such as the United States Geological Survey and the British Geological Survey. They are used for surrogate modeling in engineering projects at NASA, the European Space Agency, and Airbus; for Bayesian optimization in experimental design employed by teams at Google DeepMind, Uber ATG, and OpenAI; and in time-series modeling tasks pursued at Bloomberg LP and the National Institutes of Health. Other domains include spatial epidemiology at the Centers for Disease Control and Prevention, climate modeling at the National Oceanic and Atmospheric Administration, and neuroscience studies at the Max Planck Institute for Brain Research.
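As a hedged illustration of the Bayesian-optimization use mentioned above, the expected improvement acquisition function can be computed from a GP surrogate's posterior mean and variance. The helper below assumes minimization and could consume the posterior returned by the `gp_regression` sketch earlier; the function name and the exploration parameter `xi` are assumptions of this example, not a description of any team's pipeline.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mean, var, best_observed, xi=0.01):
    """Expected improvement (for minimization) given GP posterior mean and variance."""
    std = np.sqrt(np.maximum(var, 1e-12))     # guard against zero variance
    improvement = best_observed - mean - xi   # expected gain over the incumbent
    z = improvement / std
    return improvement * norm.cdf(z) + std * norm.pdf(z)
```

The next evaluation point is chosen by maximizing this acquisition over candidate inputs, balancing exploitation (low posterior mean) against exploration (high posterior variance).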
Extensions include deep Gaussian processes influenced by deep learning advances at Google Brain and University of Oxford, multi-output and convolutional Gaussian processes with links to work at University of Cambridge and Imperial College London, and connections to support vector machines and kernel ridge regression developed at University of Toronto and ETH Zurich. Combinations with state-space models draw on literature from Kalman filter research and control theory at MIT Lincoln Laboratory and Stanford Research Institute, while stochastic process generalizations relate to Lévy processes and Markov random fields studied at Courant Institute and Institut des Hautes Études Scientifiques.
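As one concrete instance of the state-space connection, the Matérn kernel with smoothness ν = 1/2 (the Ornstein–Uhlenbeck covariance) corresponds to a linear stochastic differential equation; the parameterization below is a standard identity, written here as a sketch.

```latex
% Matern-1/2 (Ornstein-Uhlenbeck) kernel and its one-dimensional state-space form.
k(t, t') = \sigma^2 \exp\!\left(-\frac{|t - t'|}{\ell}\right)
\quad\Longleftrightarrow\quad
\mathrm{d}f(t) = -\frac{1}{\ell}\, f(t)\,\mathrm{d}t
  + \sqrt{\frac{2\sigma^2}{\ell}}\;\mathrm{d}W(t),
\qquad \operatorname{Var}[f(t)] = \sigma^2 .
```

On sorted one-dimensional inputs, this representation allows Gaussian process regression to be carried out by Kalman filtering and smoothing in time linear in the number of observations.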