Techniques for handling underfitting and overfitting in Machine Learning

In one of my earlier articles, I talked about the bias-variance trade-off: how bias and variance relate to model complexity, and what underfitting and overfitting look like. I would encourage you to read that article if you are not familiar with these terms.

In this article, I’ll talk about various techniques that can be used to handle overfitting and underfitting. I’ll briefly describe both problems first, then discuss the techniques for handling them.

Introduction

*Underfitting* happens when the model has a very high bias and is unable to capture the complex patterns in the data. This leads to higher training and validation errors, since the model is not complex enough to fit the underlying data. In the example above, the data has a second-order relation but the model is linear, so it won’t be able to capture that relation.

*Overfitting* is the opposite, in the sense that the model is too complex (or of too high an order) and captures even the noise in the data. In this case one would observe a very low training error. However, the model would fail to generalise to both the validation and test sets.

We want to find the optimal fit, where the training and validation errors are both low and the gap between them is small. Such a model should generalize better than in the other two cases.
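The three situations above can be reproduced on a toy problem. The following is a minimal sketch, not taken from the earlier article: the synthetic data, the noise level, and the polynomial degrees (1, 2 and 10) are all assumptions chosen for illustration. It fits polynomials of increasing order to noisy second-order data and compares the training and validation errors.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical ground truth: a second-order relation plus Gaussian noise.
    x = rng.uniform(-3.0, 3.0, size=n)
    y = 0.5 * x**2 + x + 2.0 + rng.normal(scale=0.5, size=n)
    return x, y

x_train, y_train = make_data(20)   # small training set, easy to overfit
x_val, y_val = make_data(200)      # held-out validation set

def mse(model, x, y):
    # Mean squared error of a fitted polynomial on (x, y).
    return float(np.mean((model(x) - y) ** 2))

errors = {}
for degree in (1, 2, 10):
    # Least-squares polynomial fit of the given order.
    model = Polynomial.fit(x_train, y_train, deg=degree)
    errors[degree] = (mse(model, x_train, y_train), mse(model, x_val, y_val))
    print(f"degree {degree:2d}: train MSE = {errors[degree][0]:.3f}, "
          f"validation MSE = {errors[degree][1]:.3f}")
```

The linear model (degree 1) should show a high error on both sets, which is the underfitting case. The training error can only go down as the degree increases, so the degree-10 model has the lowest training error, but its validation error will typically be noticeably worse than its training error. The exact numbers depend on the random seed.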

How to handle underfitting

  1. In this situation, the best strategy is to increase the model complexity, either by increasing the number of parameters of your deep learning model or by raising the order of your model. Underfitting happens because the model is simpler than needed and fails to capture the patterns in the data. Increasing the model complexity will improve training performance. If we use a large enough model, it can even achieve a training error of zero, i.e. the model will memorize the data and suffer from overfitting. The goal is to hit the sweet spot between the two.
  2. Try to train the model for more epochs. Ensure that the loss is decreasing gradually over the course of the training. Otherwise, it is highly likely that there is some kind of bug or problem in the training code/logic itself.
  3. If you aren’t shuffling the data after every epoch, it can harm the model performance. Ensuring that you are shuffling the data is a good check to perform at this point.
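The three checks above can be sketched in a few lines. This is an illustrative example under stated assumptions, not a reference implementation: the helper names (`polynomial_features`, `train`), the learning rate, the batch size and the synthetic data are all hypothetical. It trains a polynomial model with mini-batch gradient descent, reshuffles the data at the start of every epoch, and records the training loss after each epoch so that you can check it is decreasing.

```python
import numpy as np

def polynomial_features(x, degree):
    # Columns [1, x, x^2, ..., x^degree]; a higher degree is a more complex model.
    return np.vander(x, N=degree + 1, increasing=True)

def train(x, y, degree, epochs, lr=0.1, batch_size=16, seed=0):
    rng = np.random.default_rng(seed)
    features = polynomial_features(x, degree)
    weights = np.zeros(degree + 1)
    history = []
    for _ in range(epochs):
        order = rng.permutation(len(x))  # fresh shuffle at every epoch
        for start in range(0, len(x), batch_size):
            batch = order[start:start + batch_size]
            error = features[batch] @ weights - y[batch]
            grad = 2.0 * features[batch].T @ error / len(batch)
            weights -= lr * grad
        # Training loss (MSE) on the full training set after this epoch.
        history.append(float(np.mean((features @ weights - y) ** 2)))
    return weights, history

# Hypothetical second-order data, as in the underfitting example.
data_rng = np.random.default_rng(1)
x = data_rng.uniform(-1.0, 1.0, size=200)
y = 2.0 + x + 3.0 * x**2 + data_rng.normal(scale=0.1, size=200)

_, linear_history = train(x, y, degree=1, epochs=50)     # too simple
_, quadratic_history = train(x, y, degree=2, epochs=50)  # matches the data

print(f"degree 1: first epoch {linear_history[0]:.3f}, last epoch {linear_history[-1]:.3f}")
print(f"degree 2: first epoch {quadratic_history[0]:.3f}, last epoch {quadratic_history[-1]:.3f}")
```

If the loss of the degree-2 model did not fall over the epochs, that would point to a bug in the training loop rather than to underfitting. The degree-1 model stops improving at a much higher loss no matter how long it trains, which is the signal that more model complexity is needed.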
