Introduction

This article continues my previous article, in which I showed how a Multiple Linear Regression model is built and how the information from its diagnostic plots leads us to Orthogonal Polynomial Regression and a better model for the given data set (the Advertising data set).

In the previous article, I built an Orthogonal Polynomial model to avoid the problem of multicollinearity. In this article, I will first create a multicollinearity problem deliberately by introducing raw polynomial features of the predictors TV and Radio, and then show how to tackle it using the Ridge, Lasso and Elastic-Net regression techniques.
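
To see why raw polynomial features cause trouble, here is a minimal sketch on simulated data. The variable tv and its range are assumptions made only for illustration; this is not the Advertising data. It compares the correlation between a predictor and its square with the correlation between the corresponding orthogonal polynomial terms, using base R's poly().

```r
set.seed(1)                            # reproducible simulated data
tv <- runif(200, min = 0, max = 300)   # hypothetical stand-in for a predictor such as TV

raw_terms  <- poly(tv, degree = 2, raw = TRUE)   # columns: tv and tv^2
orth_terms <- poly(tv, degree = 2)               # orthogonal polynomial columns

# Correlation between the first- and second-degree terms in each basis
cor_raw  <- cor(raw_terms[, 1], raw_terms[, 2])
cor_orth <- cor(orth_terms[, 1], orth_terms[, 2])

cat("Correlation, raw terms:       ", round(cor_raw, 3), "\n")
cat("Correlation, orthogonal terms:", round(cor_orth, 3), "\n")
```

The raw terms should come out strongly correlated, while the orthogonal terms are uncorrelated by construction. That is the multicollinearity this article introduces on purpose.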

This article consists of the following sections:

  1. Loading Required Libraries
  2. Loading Outlier Free Data set
  3. Recap (of Previous Article)
  4. Creating Multicollinearity Problem
  5. Fitting Polynomial Regression (Note : Not Orthogonal Polynomial)
  6. Checking Assumptions
  7. Making Predictions Using Polynomial Regression Model
  8. Average Performance of Polynomial Regression Model
  9. Comparison Between Polynomial and Orthogonal Polynomial Model
  10. Data Preparation for further analysis
  11. Tackling Multicollinearity by Ridge/Lasso/Elastic-Net Regression
  12. Comparison of Different Models (Polynomial, Orthogonal Polynomial, Ridge, Lasso, Elastic-Net)
  13. Obtaining Best Model
  14. Conclusion

I am going to use the Kaggle online platform for the analysis. You may use any other R environment, such as RStudio or a standard R installation from CRAN.

1. Loading Required Libraries

It is not necessary to load all the libraries at the beginning, but I am doing so for simplicity. I am also loading one additional library, glmnet, for Ridge/Lasso/Elastic-Net Regression.

## Loading Libraries
library(tidyverse)
library(caret)
library(car)
library(lmtest)
library(olsrr)
library(glmnet)       ## For Ridge/Lasso/Elastic-Net Regression
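
If you work outside Kaggle, some of these packages may not be installed yet. The following is a small sketch, not part of the original notebook, that checks which of them are available before you call library(). The names required, installed and missing_pkgs are introduced here only for illustration.

```r
# Packages loaded in this article
required <- c("tidyverse", "caret", "car", "lmtest", "olsrr", "glmnet")

# requireNamespace() returns TRUE/FALSE without attaching the package
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)   # uncomment to install them from CRAN
} else {
  message("All required packages are available.")
}
```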

2. Loading Outlier Free Data set

The outlier-free data set is available for download from my previous notebook, where it is already stored in the following R objects:

  1. data
  2. train.data1 and test.data

If you don't know how to load data in an online Kaggle R session, read from here.

## Loading the outlier-free data set
data <- read.csv("../input/outlier-free-advertising-data-set/outlier free advertising data.csv", header = TRUE)

## Loading the outlier-free train and test data, already split in the previous notebook
train.data1 <- read.csv("../input/traindata1-and-testdata-for-further-analysis/train.data1.csv", header = TRUE)
test.data <- read.csv("../input/traindata1-and-testdata-for-further-analysis/test.data.csv", header = TRUE)
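
After loading, it can be worth confirming that the train and test sets line up. The sketch below uses a hypothetical helper, check_split(), which is not from the original notebook, and demonstrates it on two toy data frames with made-up values so that it runs without the Kaggle input files.

```r
# Hypothetical helper (an assumption for illustration, not from the notebook):
# summarises whether two data frames share columns and contain missing values
check_split <- function(train, test) {
  stopifnot(is.data.frame(train), is.data.frame(test))
  list(
    same_columns = identical(names(train), names(test)),
    n_train      = nrow(train),
    n_test       = nrow(test),
    n_missing    = sum(is.na(train)) + sum(is.na(test))
  )
}

# Toy stand-ins for train.data1 and test.data; the values are illustrative only
toy_train <- data.frame(TV = c(230.1, 44.5, 17.2), Radio = c(37.8, 39.3, 45.9))
toy_test  <- data.frame(TV = c(151.5, 180.8),      Radio = c(41.3, 10.8))

result <- check_split(toy_train, toy_test)
str(result)
```

In the notebook itself, the equivalent call would be check_split(train.data1, test.data).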

