Estimation means finding the optimal **parameter** using historical data, whereas prediction uses the data to compute the **random value** of unseen data.

The highlighted words in the statement above need some context before we proceed:

We need a lot of historical data to learn dependencies for machine learning and modelling. The data typically consists of multiple observations, where each observation is made up of multiple variables. Each multivariate observation x is a realization of a random variable X, whose distribution belongs to a finite set of possible distributions called **‘the states of nature’**.

**Estimation is the process of finding the best approximation of the true state of nature**. Loosely speaking, estimation is related to model building, i.e. **finding the most appropriate parameter that best describes the multivariate distribution of the historical data**. For example, suppose we have five independent variables, X1, X2, …, X5, and Y as the target variable. Estimation then involves finding f(x), the closest approximation of the true state of nature, denoted by g(θ).
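To make the distinction concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the true state of nature is a linear relationship between X1 to X5 and Y with Gaussian noise. The names `theta_true`, `theta_hat`, and `f`, the coefficient values, and the simulated data are all hypothetical and not part of the original discussion. Ordinary least squares stands in for the estimation step; other estimators would fit the same pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true state of nature": a linear rule with fixed coefficients.
# In a real problem this is unknown; we simulate it here for illustration.
theta_true = np.array([1.5, -2.0, 0.5, 3.0, -1.0])

# Simulated historical data: 500 observations of X1..X5 and a noisy target Y.
n_obs = 500
X = rng.normal(size=(n_obs, 5))
y = X @ theta_true + rng.normal(scale=0.1, size=n_obs)

# Estimation: find the parameter vector that best describes the historical
# data (here, the least-squares solution).
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)


def f(x_new):
    """Fitted model: our approximation of the true state of nature."""
    return x_new @ theta_hat


# Prediction: use the fitted model to compute values of Y for unseen data.
x_unseen = rng.normal(size=(3, 5))
y_pred = f(x_unseen)

print("estimated parameters:", np.round(theta_hat, 2))
print("predictions for unseen data:", np.round(y_pred, 2))
```

In this sketch, `theta_hat` is the output of estimation (a fixed parameter), while `y_pred` is the output of prediction (values of a random quantity for observations the model has not seen).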

