Loss functions measure the difference between the predicted output and the actual output. To see how they fit into neural networks, read:
In this article, I’ll explain the main loss functions for regression, along with their advantages and disadvantages, so you can select the right one for your project.
The choice of loss function depends fundamentally on the nature of our dependent variable: before selecting one, we must check whether it is numeric (a regression task) or probabilistic (a classification task).
When dealing with numeric variables, we have to measure the loss numerically. Knowing that a prediction is wrong is not enough; we must calculate how far it deviates from the actual value, so we can train our network accordingly.
The main loss functions for this are:
MAE (Mean Absolute Error) is the simplest error function: it calculates the absolute difference (discarding the sign) between the actual and predicted values and takes its mean.
MAE = (1/n) Σ |yᵢ − ŷᵢ| (equation from Data Vedas)
The following figure shows that the MAE increases linearly with an increase in error.
Image by author
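As a minimal sketch, MAE can be computed in a few lines of NumPy (the helper name `mae` is my own, not from any particular library):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error: mean of |actual - predicted|."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

# Example: errors of 1 and 2 average out to 1.5.
print(mae([3.0, 5.0], [2.0, 7.0]))  # → 1.5
```

Because the penalty grows linearly, doubling an error exactly doubles its contribution to the loss.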
MAPE (Mean Absolute Percentage Error) is similar to MAE, with one key difference: it calculates the error as a percentage rather than in raw values. This makes MAPE independent of the scale of our variables.
MAPE = (100/n) Σ |(yᵢ − ŷᵢ) / yᵢ| (equation from JIBC)
The following figure shows that the MAPE also increases linearly with an increase in error.
Image by author
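A quick sketch of MAPE along the same lines (again, the helper name `mape` is my own; note that the formula is undefined wherever the actual value is zero):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error; undefined where y_true == 0."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Same relative miss (10%) on both points, despite very different scales.
print(mape([100.0, 2000.0], [90.0, 2200.0]))  # ≈ 10.0
```

This scale-independence is exactly why MAPE is popular for comparing models across datasets with different units.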
In MSE (Mean Squared Error), we square the error and then take its mean. This is a quadratic scoring method: the penalty is proportional not to the error (as in MAE) but to the square of the error, which gives relatively higher weight (penalty) to large errors and outliers while smoothing the gradient for smaller errors.
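A minimal sketch of MSE, computed as (1/n) Σ (yᵢ − ŷᵢ)² (the helper name `mse` is my own):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: mean of (actual - predicted)**2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Errors of 1 and 2 become 1 and 4 after squaring,
# so the larger error dominates the loss.
print(mse([3.0, 5.0], [2.0, 7.0]))  # → 2.5
```

Compare this with the MAE of 1.5 on the same points: squaring has pulled the loss toward the larger error, which is exactly the outlier-sensitivity described above.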
#machine-learning #loss-function #deep-learning #artificial-intelligence