In this article, we will build a logistic regression model for classifying whether a patient has diabetes or not. The main focus here is that we will only use python to build functions for reading the file, normalizing data, optimizing parameters, and more. So you will be getting in-depth knowledge of how everything from reading the file to make predictions works.
If you are new to machine learning, or not familiar with logistic regression or gradient descent, don’t worry I’ll try my best to explain these in layman’s terms. There are more tutorials out there that explain the same concepts. But what makes this tutorial unique is its short and beginners friendly high-level description of the code snippets. So, let’s start by looking at some theoretical concepts that are important in order to understand the working of our model.
Logistic Regression is the entry-level supervised machine learning algorithm used for classification purposes. It is one of those algorithms that everyone should be aware of. Logistic Regression is somehow similar to linear regression but it has different cost function and prediction function(hypothesis).
It is the activation function that squeezes the output of the function in the range between 0 and 1 where values less than 0.5 represent class 0 and values greater than or equal to 0.5 represents class 1.
Cost function finds the error between the **_actual value _**and predicted value of our algorithm. It should be as minimum as possible. In the case of linear regression, the formula is:-
But this formula cannot be used for logistic regression because the hypothesis here is a nonconvex function that means there are chances of finding the local minima and thus avoiding the global minima. If we use the same formula then our plot will look like this:-
So, in order to avoid this, we have smoothened the curve with the help of log and our cost function will look like this:-
where m=number of examples or rows in the dataset, xᶦ=feature values of iᵗʰ example, yᶦ=actual outcome of iᵗʰ example. After using this, our plot for the cost function will look like this:-
#machine-learning #python #artificial-intelligence #programming #logistic-regression