Estimation means finding the optimal parameter using historical data, whereas prediction uses the data to compute the random value of the unseen data.

The highlighted words in the above statement need some context setting before we proceed further:

We need a lot of historical data to learn dependencies for machine learning and modelling. The data typically consists of multiple observations, each made up of multiple variables. Each multivariate observation x is a realization of a random variable X, whose distribution belongs to a finite set of possible distributions called **‘the states of nature’**.

Estimation is the process of inferring the true state of nature. Loosely speaking, estimation is related to model building, i.e. finding the parameter that best describes the multivariate distribution of the historical data. For example, suppose we have five independent variables, X1, X2, …, X5, and Y as the target variable. Estimation then involves finding f(x), the closest approximation of the true state of nature, denoted by g(θ).
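To make the distinction concrete, here is a minimal sketch in plain Python. It assumes, purely for illustration, that the true state of nature is a linear relationship between five variables and Y with a little Gaussian noise. The coefficient values in `true_theta`, the sample size, and the helper names (`solve_linear_system`, `estimate_parameters`, `predict`) are all hypothetical and not taken from any particular library. Estimation recovers the parameters from the historical data; prediction then applies them to an unseen observation.

```python
import random

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def estimate_parameters(X, y):
    """Estimation: least-squares fit via the normal equations (X'X) theta = X'y."""
    rows = [[1.0] + list(x) for x in X]  # prepend 1.0 for the intercept
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * y_i for r, y_i in zip(rows, y)) for i in range(k)]
    return solve_linear_system(XtX, Xty)

def predict(theta, x_new):
    """Prediction: apply the parameters to one observation of X1..X5."""
    return theta[0] + sum(t * v for t, v in zip(theta[1:], x_new))

rng = random.Random(0)

# Hypothetical "true state of nature": intercept followed by five coefficients.
true_theta = [1.0, 2.0, -1.0, 0.5, 0.0, 3.0]

# Simulated historical data: 500 observations of X1..X5 and a noisy target Y.
X_hist = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(500)]
y_hist = [predict(true_theta, x) + rng.gauss(0, 0.1) for x in X_hist]

# Estimation: learn theta_hat from the historical data.
theta_hat = estimate_parameters(X_hist, y_hist)

# Prediction: compute the value of Y for an unseen observation.
x_new = [rng.gauss(0, 1) for _ in range(5)]
y_pred = predict(theta_hat, x_new)

print("estimated parameters:", [round(t, 3) for t in theta_hat])
print("prediction for unseen x:", round(y_pred, 3))
```

In practice a numerical library would replace the hand-written solver; it is spelled out here only so the sketch runs without extra dependencies. The estimated parameters should land close to `true_theta`, but they will not match it exactly because the historical data is noisy.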

Estimation, Prediction and Forecasting
