# Back to Machine Learning Basics - Regularization


In this article, we focus on the performance of machine learning algorithms and how to improve it. We explore terms such as _bias_ and _variance_, and how to balance them in order to achieve better performance. We learn about overfitting and underfitting, ways to avoid them, and how to improve model performance with regularization techniques such as _Lasso_ and _Ridge_.
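As a quick preview of what regularization does, the sketch below fits plain linear regression, Ridge and Lasso on synthetic data in which only two of ten features carry signal. The data, the `alpha` values and the variable names are assumptions chosen for illustration; this is not the Boston data and not a tuned model.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic data (an assumption for illustration):
# 10 features, of which only the first two influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

ols = LinearRegression().fit(X, y)    # no penalty
ridge = Ridge(alpha=10.0).fit(X, y)   # L2 penalty shrinks all coefficients
lasso = Lasso(alpha=0.5).fit(X, y)    # L1 penalty can set coefficients to exactly zero

print("OLS   coefficient norm:", np.linalg.norm(ols.coef_))
print("Ridge coefficient norm:", np.linalg.norm(ridge.coef_))
print("Lasso zeroed coefficients:", int(np.sum(lasso.coef_ == 0)))
```

Ridge pulls the whole coefficient vector toward zero, while Lasso tends to drop uninformative features entirely, which is why it is often used for feature selection as well.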

### Dataset and Prerequisites

The data that we use in this article is the famous Boston Housing Dataset. This dataset is composed of 14 features and contains information collected by the U.S. Census Service concerning housing in the area of Boston, Mass. It is a small dataset with only 506 samples.
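A minimal sketch of loading the data with pandas, assuming you have a local CSV copy of the dataset. The file name `boston_housing.csv`, the target column name `MEDV` and the helper `load_housing` are assumptions for illustration, so adjust them to match your copy.

```python
import os

import pandas as pd


def load_housing(path, target="MEDV"):
    """Read the housing CSV and split it into features X and target y."""
    data = pd.read_csv(path)
    X = data.drop(columns=[target])  # every column except the target
    y = data[target]                 # the value we want to predict
    return X, y


# Hypothetical file name; point this at wherever your copy of the dataset lives.
CSV_PATH = "boston_housing.csv"
if os.path.exists(CSV_PATH):
    X, y = load_housing(CSV_PATH)
    print(X.shape, y.shape)
```

Keeping the target separate from the features from the start makes the later train/test split and scaling steps easier to follow.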

For the purpose of this article, make sure that you have installed the following _Python_ libraries: NumPy, pandas, Matplotlib and scikit-learn.

Once they are installed, make sure that you have imported all the modules used in this tutorial.

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import Lasso, Ridge, ElasticNet, SGDRegressor, LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.base import clone
```

Apart from that, it helps to be at least familiar with the basics of linear algebra, calculus and probability.

