Sparse and Variational Gaussian Process: What To Do When Data is Large

Big data is the cure for many machine learning problems. But one person’s cure can be another’s poison. Big data makes many Bayesian methods impractically expensive. We need to do something, or Bayesian methods will be left behind by the big data revolution.

In this article, I will explain why big data makes a very popular Bayesian machine learning method, the Gaussian Process, unaffordably expensive. Then I will present the Bayesian solution, the Sparse and Variational Gaussian Process model (SVGP model), which brings the Gaussian Process back into the game.

Some notations

Medium supports Unicode in text. This allows me to write many math subscript notations such as X₁ and Xₙ. But some other subscripts, such as an asterisk, have no Unicode subscript form.

So in the text, I will use an underscore “_” to lead such subscripts, such as X_* and X_*1.

If some math notations render as question marks on your phone, please try to read this article from a computer. This is a known issue with some Unicode rendering.

The Gaussian Process regression model

Suppose we have some training data (X, Y). Both X and Y are float vectors of length n, so n is the number of training data points. We want to find a regression function from X to Y. This is a typical regression task. We can use the Gaussian Process regression model (GPR) to find such a function.
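
To make this setup concrete, here is a minimal NumPy sketch of exact Gaussian Process regression on toy data. It is only an illustration under my own assumptions, not the model used later in this article: the sine-shaped toy data, the squared-exponential kernel, the lengthscale, the noise variance, and the names `rbf_kernel`, `gpr_posterior_mean` and `X_new` are all hypothetical choices made for the example.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D input vectors (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def gpr_posterior_mean(X, Y, X_new, noise_variance=0.01):
    """Posterior mean of a zero-mean GP at X_new, given training data (X, Y)."""
    n = X.shape[0]
    # The n x n kernel matrix over all training points, plus observation noise.
    K = rbf_kernel(X, X) + noise_variance * np.eye(n)
    # Solve K @ alpha = Y through a Cholesky factorization of K.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))
    # Weighted sum of kernel evaluations between new and training points.
    return rbf_kernel(X_new, X) @ alpha


# Toy training data: X and Y are float vectors of length n.
rng = np.random.default_rng(0)
n = 50
X = np.linspace(0.0, 6.0, n)
Y = np.sin(X) + 0.1 * rng.standard_normal(n)

# Evaluate the learned regression function at a few new locations.
X_new = np.array([1.0, 2.5, 4.0])
mean = gpr_posterior_mean(X, Y, X_new)
print(mean)
```

Note that the sketch builds and factorizes an n × n matrix over all training points. That step is what becomes costly as n grows.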

