The Naive Bayes algorithm is a classification technique based on Bayes' theorem. It assumes that the presence of a feature in a class is unrelated to the presence of any other feature. The algorithm relies on the posterior probability of the class given a predictor, as shown in the following formula:

P(c|x) = P(x|c) · P(c) / P(x)

where:

  • P(c|x): the posterior probability of the class given the predictor
  • P(x|c): the probability of the predictor given the class, also known as the likelihood
  • P(c): the prior probability of the class
  • P(x): the prior probability of the predictor
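
As a small numeric illustration of the formula, the sketch below computes the posterior for two classes. The function name `posterior`, the class labels, and all numbers are made up for this example. P(x) is not given directly, so it is obtained by summing P(x|c) · P(c) over the classes.

```python
def posterior(likelihoods, priors):
    """Return P(c|x) for every class c using Bayes' theorem.

    likelihoods: dict mapping class -> P(x|c)
    priors:      dict mapping class -> P(c)
    """
    # P(x) = sum over all classes of P(x|c) * P(c)
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}


# Hypothetical numbers, chosen only for illustration.
priors = {"spam": 0.25, "ham": 0.75}
likelihoods = {"spam": 0.5, "ham": 0.125}

result = posterior(likelihoods, priors)
print(result)
```

The posteriors always sum to 1, because P(x) acts as a normalizing constant. Here the class with the smaller prior still ends up with the larger posterior, because its likelihood is four times higher.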

The Naive Bayes classifier is easy to implement and can perform well even with a small training data set. It is also fast, both to train and to predict, which makes it a common first choice for classification. Scikit-learn offers several Naive Bayes variants for different types of problems. One of them is Gaussian Naive Bayes, which is used when the features are continuous variables and assumes that the features follow a Gaussian distribution. Applying the open-source model to data is straightforward, but a good analyst has to understand how the model is built in order to apply it to appropriate data.
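
To make the Gaussian assumption concrete, here is a minimal sketch of the normal density that would play the role of the likelihood P(x|c) for a continuous feature. The helper name `gaussian_pdf` and its parameters are assumptions for illustration, and it uses only the standard `math` module.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    exponent = -((x - mean) ** 2) / (2 * std ** 2)
    return math.exp(exponent) / (std * math.sqrt(2 * math.pi))


# The density peaks at the mean and falls off symmetrically around it.
peak = gaussian_pdf(0.0, mean=0.0, std=1.0)
print(peak)
print(gaussian_pdf(1.0, mean=0.0, std=1.0))
print(gaussian_pdf(-1.0, mean=0.0, std=1.0))
```

For a given class, the mean and standard deviation would be estimated from the training samples of that class, one pair per feature.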

The best way to understand a model is to build one from scratch. All the following methods are defined in a GaussianNBClassifier class. Let’s have some fun!

1. Instantiate the class

We will use only the NumPy library for arithmetic operations.

import numpy as np


class GaussianNBClassifier:
    """Gaussian Naive Bayes classifier built from scratch."""

    def __init__(self):
        pass

2. Separate classes

According to Bayes' theorem, we need to know the prior probability of each class. To calculate it, we first have to assign each row of feature values to its class. We can do this by separating the samples by class and saving them in a dictionary.

def separate_classes(self, X, y):
    """Group the rows of X by their class label in y."""
    separated_classes = {}
    for i in range(len(X)):
        feature_values = X[i]
        class_name = y[i]
        # Start a new list the first time a class label is seen.
        if class_name not in separated_classes:
            separated_classes[class_name] = []
        separated_classes[class_name].append(feature_values)
    return separated_classes
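
To see what the dictionary looks like, the sketch below runs a standalone copy of the method (without `self`) on a tiny made-up data set. The last step, estimating each class prior as the fraction of samples in that class, is an assumption about how the dictionary would be used next; it is not taken from the text above.

```python
def separate_classes(X, y):
    """Standalone copy of the method above: group rows of X by label."""
    separated_classes = {}
    for i in range(len(X)):
        separated_classes.setdefault(y[i], []).append(X[i])
    return separated_classes


# Toy data (hypothetical): four samples with two features each.
X = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.1], [6.0, 9.0]]
y = [0, 0, 0, 1]

separated = separate_classes(X, y)
print(separated)

# Assumed next step: class prior = share of the samples in that class.
priors = {label: len(rows) / len(X) for label, rows in separated.items()}
print(priors)
```

Plain lists are used here to keep the example dependency-free; the same indexing works on NumPy arrays.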

