Logistic regression is a popular machine learning technique used when the dependent variable is categorical. This article focuses primarily on the implementation of multiclass logistic regression. I assume you already know how to implement binary classification with logistic regression; if not, please see the links at the end to learn the concepts of machine learning and the implementation of basic logistic regression.

The implementation of multiclass classification follows the same ideas as binary classification. As you know, in binary classification we replace the two classes with 1 and 0. In the one-vs-all method, when we work with a particular class, that class is denoted by 1 and the rest of the classes become 0. This will become clearer as you implement it, so I suggest you keep coding and running the code as you read.
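The one-vs-all relabeling described above can be sketched as follows. This is a minimal illustration with made-up labels, not the course dataset: for each class, we build a binary target vector that is 1 for that class and 0 for everything else.

```python
import numpy as np

# Hypothetical labels for six examples, three classes
y = np.array([1, 2, 3, 2, 1, 3])
classes = np.unique(y)

# One binary target vector per class: 1 where y equals that class, else 0
binary_targets = {k: (y == k).astype(int) for k in classes}

print(binary_targets[2])  # [0 1 0 1 0 0]
```

Each of these binary vectors is then used to train one binary logistic regression classifier.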

Python Implementation

Here I am going to show the implementation step by step.

1. Import the necessary packages and the dataset. I took the dataset from Andrew Ng’s Machine Learning course in Coursera. This is a handwriting recognition dataset. The labels run from 1 to 10, where 10 represents the digit 0. From the dataset of pixels, we need to recognize the digit.
import pandas as pd
import numpy as np
xl = pd.ExcelFile('ex3d1.xlsx')
df = pd.read_excel(xl, 'X', header=None)
y = pd.read_excel(xl, 'y', header=None)

2. Define the hypothesis function that takes theta and the input variables and returns the predicted output.

def hypothesis(theta, X):
    # Sigmoid function; the tiny subtraction keeps the output strictly
    # below 1, so np.log(1 - y1) in the cost function never hits log(0)
    return 1 / (1 + np.exp(-(np.dot(theta, X.T)))) - 0.0000001
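A quick sanity check of the hypothesis on hypothetical toy data (redefined here so the snippet runs on its own): with theta set to all zeros, the sigmoid of every dot product is 0.5, so each prediction should be just under 0.5 because of the small subtraction.

```python
import numpy as np

def hypothesis(theta, X):
    return 1 / (1 + np.exp(-(np.dot(theta, X.T)))) - 0.0000001

# Hypothetical toy data: 2 examples, 3 features (bias column first)
X = np.array([[1.0, 0.5, -0.5],
              [1.0, -1.0, 2.0]])
theta = np.zeros(3)

print(hypothesis(theta, X))  # both values ≈ 0.4999999
```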

3. Build the cost function that takes the input variables, output variable, and theta. It returns the cost of the hypothesis, that is, a measure of how far the predictions are from the original outputs.

def cost(X, y, theta):
    y1 = hypothesis(theta, X)  # predicted probabilities
    # Cross-entropy (negative log-likelihood) averaged over all examples
    return -(1/len(X)) * np.sum(y*np.log(y1) + (1-y)*np.log(1-y1))
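To check the cost function, here is a self-contained sketch on hypothetical toy data: with theta all zeros, every prediction is about 0.5, so the average cross-entropy should be close to -log(0.5) ≈ 0.693 no matter what the labels are.

```python
import numpy as np

def hypothesis(theta, X):
    return 1 / (1 + np.exp(-(np.dot(theta, X.T)))) - 0.0000001

def cost(X, y, theta):
    y1 = hypothesis(theta, X)  # predicted probabilities
    return -(1/len(X)) * np.sum(y*np.log(y1) + (1-y)*np.log(1-y1))

# Hypothetical toy data: 3 examples, 2 features (bias column first)
X = np.array([[1.0, 0.5],
              [1.0, -1.0],
              [1.0, 2.0]])
y = np.array([1, 0, 1])
theta = np.zeros(2)

print(round(cost(X, y, theta), 4))  # ≈ 0.6931, i.e. -log(0.5)
```

As training drives theta toward values that fit the data, this number should fall well below 0.693.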


Multiclass Classification With Logistic Regression