Gaussian Process Regression

Jan 6, 2021

Gaussian processes (GPs) are a flexible class of nonparametric machine learning models commonly used for modeling spatial and time series data. A common application of GPs is regression. For example, given incomplete geographical weather data, such as temperature or humidity, how can one recover values at unobserved locations? If one has good reason to believe the data is normally distributed, then using a GP model could be a judicious choice. In what follows, we introduce the mechanics behind the GP model and then illustrate its use in recovering missing data.

The GP Model

Formally, a GP is a stochastic process, or a distribution over functions. The premise is that the function values are themselves random variables. When modeling a function as a Gaussian process, one assumes that its values at any finite set of input points jointly follow a multivariate normal distribution.
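
To make that finite-dimensional picture concrete, here is a minimal sketch (not from the original article) of drawing function values from a GP prior at a finite set of inputs. The squared-exponential (RBF) kernel and its hyperparameters are illustrative assumptions, not a prescription.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    # Squared-exponential (RBF) covariance between two 1-D input arrays.
    sq_dists = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / length_scale**2)

# Any finite set of inputs induces a multivariate normal over the function values.
x = np.linspace(0.0, 5.0, 50)
K = rbf_kernel(x, x)

# Draw three functions from the GP prior (zero mean, covariance K);
# the small diagonal jitter keeps the covariance numerically positive definite.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(
    mean=np.zeros_like(x), cov=K + 1e-8 * np.eye(len(x)), size=3
)
```

Each row of `samples` is one plausible function evaluated at the 50 input points, which is exactly the "distribution over functions" view restricted to a finite grid.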

Why is this assumption useful? It turns out that Gaussian distributions are nice to work with because of their tractability. Indeed, the Gaussian family is self-conjugate and enjoys a number of properties such as being closed under marginalization and closed under conditioning. As such, GPs naturally jibe with Bayesian machine learning, which usually involves specification of priors — asserting one’s prior belief…
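
Closure under conditioning is what makes GP regression practical: conditioning the joint Gaussian over observed and unobserved locations yields a closed-form posterior mean and covariance. The sketch below illustrates that conditioning step, reusing the `rbf_kernel` helper from the sketch above; the noise level, hyperparameters, and synthetic sine data are assumptions made purely for illustration.

```python
import numpy as np

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2, length_scale=1.0):
    # Standard GP regression: condition the joint Gaussian over training and
    # test function values on the noisy training observations.
    K = rbf_kernel(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, length_scale)
    K_ss = rbf_kernel(x_test, x_test, length_scale)
    K_inv = np.linalg.inv(K)  # fine for a small example; use Cholesky in practice
    mean = K_s.T @ K_inv @ y_train
    cov = K_ss - K_s.T @ K_inv @ K_s
    return mean, cov

# Toy stand-in for sparse observations (e.g. temperature readings at a few sites).
x_obs = np.array([0.5, 1.5, 3.0, 4.2])
y_obs = np.sin(x_obs)
x_new = np.linspace(0.0, 5.0, 100)  # locations where data is missing
mu, cov = gp_posterior(x_obs, y_obs, x_new)
```

Here `mu` gives the recovered values at the unobserved locations and the diagonal of `cov` quantifies the remaining uncertainty, which grows as one moves away from the observations.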
