Developing an effective machine learning algorithm on a skewed dataset can be tricky. Take, for example, a dataset of fraudulent activities in a bank, or one for cancer detection. In such a dataset, about 99% of the examples show no fraudulent activity or no cancer. If the model predicts 1 for cancer and 0 for no cancer, you can easily cheat by predicting 0 all the time and get 99% accuracy. That gives a 99% accurate machine learning algorithm that never detects cancer: a person who has cancer will never get treatment. In the bank, no action will ever be taken against fraudulent activities. So for a skewed dataset like this, accuracy alone cannot tell us whether the algorithm is working well.
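
To make the arithmetic concrete, here is a minimal sketch in plain Python. The dataset is made up for illustration (1,000 patients, 10 of them with cancer), and the helper names `accuracy` and `recall` are hypothetical, not from any library. It shows that the "always predict 0" shortcut scores 99% accuracy while catching none of the real cancer cases.

```python
# Hypothetical skewed dataset: 1,000 patients, 10 of whom have cancer (label 1).
y_true = [1] * 10 + [0] * 990

# The "cheating" classifier: always predict 0 (no cancer).
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual positives (label 1) that were predicted as 1."""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    false_neg = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    actual_pos = true_pos + false_neg
    return true_pos / actual_pos if actual_pos else 0.0


acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)

print(f"accuracy = {acc:.2f}")  # accuracy = 0.99
print(f"recall   = {rec:.2f}")  # recall   = 0.00
```

The accuracy looks excellent, but the recall of zero says that not one of the 10 cancer patients was found, which is exactly the failure that accuracy hides.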


A Complete Understanding of Precision, Recall, and F Score Concepts