It’s fair to say that machine learning as a science is as remarkable as it is complex. At its core, it’s highly technical and mathematical. But you can’t begin to scratch the surface of machine learning unless you can explain its technical concepts in non-technical terms. For example, if you interview for a data scientist position, you’ll almost certainly be asked, “Can you explain underfitting and overfitting in machine learning?” The interviewers aren’t looking for you to pull out graphs; they want to know that you understand the concept. So, what exactly are underfitting and overfitting in machine learning?

Overfitting

Let’s start with an easy-to-understand example. Say you train your dog to raise his paw when you wave your left hand. You’ve got a smart dog, and he learns it quickly and gets it right every time. However, there’s a problem. When your friend tries to emulate your left-handed wave, your dog looks blankly at them. Confused, you demonstrate the trick again, only this time you use your right hand. Your dog still stares blankly, but why? It turns out your dog only learned to do the trick when he sees your left hand make the motion. He also only learned to do it when you make the gesture, so he will never respond to your friend, no matter which hand your friend uses. This is overfitting.

Put simply, overfitting is when a machine learning model adapts itself too closely to the data it was trained on. It occurs when a machine learning algorithm or statistical model captures the noise in the data, and it shows up as low bias but high variance. In machine learning, bias refers to the error that comes from overly simplistic or faulty assumptions, so high bias means the algorithm has missed significant trends among the features. Variance refers to how much the model changes in response to the particular training data it saw. In a high-variance situation, the model fits the training data very well but performs poorly on new data sets because it has captured the randomness in the training data along with the real trends. To put it another way, overfitting happens when the model fits the training data so closely that it is no longer effective at its real goal: making good predictions on data it has not seen.
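To make the low-bias, high-variance idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the ground truth y = 2x + 1, the noise level, the sample sizes, and the helper names (make_data, fit_line, fit_memorizer, mse) are all hypothetical. It compares a simple straight-line fit with a "memorizer" that, like the dog in the example, only knows the exact cases it was trained on.

```python
import random


def make_data(n, rng, noise):
    """Sample n points from an assumed ground truth y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for a straight line; returns a predict function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """1-nearest-neighbour lookup: answers with the label of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(predict, xs, ys):
    """Mean squared error of a predict function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable
x_train, y_train = make_data(100, rng, noise=2.0)
x_test, y_test = make_data(1000, rng, noise=2.0)

models = {
    "line": fit_line(x_train, y_train),
    "memorizer": fit_memorizer(x_train, y_train),
}

# Record the error on the data each model was trained on and on unseen data.
report = {
    name: {
        "train": mse(predict, x_train, y_train),
        "test": mse(predict, x_test, y_test),
    }
    for name, predict in models.items()
}

for name, errors in report.items():
    print(f"{name:>9}: train MSE = {errors['train']:.3f}, test MSE = {errors['test']:.3f}")
```

The memorizer reproduces every training label, noise included, so its training error is zero, yet its error on unseen points is noticeably worse than the straight line's. That gap between training and test performance is the signature of overfitting described above.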

