How do you deal with a distribution over an infinite number of functions?

When people are first introduced to **Gaussian Processes**, they usually hear something like “Gaussian Processes allow you to work with an infinite space of functions in regression tasks”. This is quite a hard thing to process. In fact, Gaussian Processes are very simple at their core, and it all starts with the (multivariate) normal (Gaussian) distribution, which has certain nice properties that we can exploit for GPs. As a rule of thumb for this article: when in doubt, take a look at the GP equations, and the text should become clear(er).

**What are Gaussian Processes good for?** Well, from a theory perspective they are universal function approximators, so the answer would be: anything where you need to fit some kind of function. But there are **some computational considerations**, to which we will get later, that influence the decision of whether or not to use them, along with the availability of data.

**Where do Gaussian Processes fit in the big picture of Machine Learning?** First of all, GPs are non-parametric models, meaning there is no fixed set of model parameters that we update based on training data; instead, predictions are computed directly from the training data itself. Another example of a non-parametric algorithm is k-Nearest Neighbors. So, all in all, in the most basic case of GPs we have no gradients and no objective function that we optimize over directly.
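To make the “non-parametric” point concrete, here is a minimal k-Nearest-Neighbors regressor on toy 1-D data (a sketch; the function and data names are made up for illustration). The training set is stored as-is, and every prediction is computed directly from it: no learned parameters, no gradient updates.

```python
import numpy as np

def knn_predict(x_train, y_train, x_query, k=3):
    """Predict by averaging the targets of the k nearest training points.
    Nothing is learned: the training data itself is the 'model'."""
    dists = np.abs(x_train - x_query)   # distance from the query to every training point
    nearest = np.argsort(dists)[:k]     # indices of the k closest training points
    return y_train[nearest].mean()      # average their targets

# Toy 1-D regression data (roughly sin(x))
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.array([0.0, 0.8, 0.9, 0.1, -0.8])

print(knn_predict(x_train, y_train, x_query=1.5))
```

A GP prediction works in the same spirit: it is a closed-form computation over the stored training data (via Gaussian conditioning), not the evaluation of a model whose parameters were fit by gradient descent.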

Second of all, GPs lend themselves nicely to the Bayesian perspective in machine learning (my advice: when you see “Bayesian”, think quantifying the uncertainty in a prediction, or the amount of information it carries).

So before we jump into GPs, we need to cover these nice properties of Gaussian distributions. As you will see later, GPs don’t bring much more to the table beyond well-known results for multivariate Gaussian distributions. For simplicity, let’s just look at the bivariate case of the Gaussian, which is defined by a mean vector μ and a covariance matrix Σ:
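Written out for the bivariate case, with means μ₁, μ₂, variances σ₁², σ₂², and covariance σ₁₂ between the two components, this is the standard definition:

```latex
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},
\begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}
\right)
```

The off-diagonal entry σ₁₂ is what couples the two variables; it is exactly this coupling that GPs later exploit when conditioning on observed data.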
