We all face the problem of spam in our inboxes. So I had an idea 💡!

Let’s build a spam classifier in Python that can tell whether a given message is spam or not!

We can do this with a simple yet powerful theorem from probability theory called Bayes’ Theorem. Mathematically, it is expressed as:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

Bayes’ Theorem
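To get a feel for the theorem before applying it to whole messages, here is a minimal sketch for a single word. The function name `bayes_posterior` and all the probabilities are made up for illustration; they do not come from any real dataset.

```python
def bayes_posterior(prior_a, likelihood_b_given_a, evidence_b):
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    if evidence_b <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood_b_given_a * prior_a / evidence_b


# Hypothetical numbers, chosen only for illustration.
p_spam = 0.4             # P(spam)
p_word_given_spam = 0.5  # P("free" | spam)
p_word_given_ham = 0.1   # P("free" | not spam)

# The law of total probability gives P("free").
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = bayes_posterior(p_spam, p_word_given_spam, p_word)
print(round(p_spam_given_word, 3))  # prints 0.769
```

With these assumed numbers, seeing the word "free" raises the probability of spam from 0.4 to about 0.77.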

Problem Statement

We have a message m = (w1, w2, ..., wn), where w1, w2, ..., wn are the unique words contained in the message. We need to find

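$$P(\text{spam} \mid w_1, w_2, \ldots, w_n) = \frac{P(w_1, w_2, \ldots, w_n \mid \text{spam})\, P(\text{spam})}{P(w_1, w_2, \ldots, w_n)}$$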

If we assume that the occurrence of a word is independent of all other words, we can simplify the above expression to

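$$P(\text{spam} \mid w_1, w_2, \ldots, w_n) = \frac{P(\text{spam}) \prod_{i=1}^{n} P(w_i \mid \text{spam})}{P(w_1, w_2, \ldots, w_n)}$$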
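Under the independence assumption, the numerator for a class is just the class prior multiplied by one probability per unique word. A rough sketch of that product follows; the helper name `naive_bayes_score` and the per-word probabilities are hypothetical, picked only to make the arithmetic easy to follow.

```python
def naive_bayes_score(words, word_probs, class_prior):
    """Return P(class) * product of P(w_i | class) over the unique words."""
    score = class_prior
    for w in set(words):  # a message is treated as a set of unique words
        score *= word_probs[w]
    return score


# Hypothetical values of P(word | spam), for illustration only.
p_word_given_spam = {"win": 0.6, "free": 0.5, "meeting": 0.05}

# 0.4 * 0.6 * 0.5, i.e. approximately 0.12
score = naive_bayes_score(["win", "free"], p_word_given_spam, class_prior=0.4)
print(score)
```

This sketch raises a `KeyError` for a word it has no probability for. A real classifier needs some policy for unseen words, which is a design choice rather than part of the formula.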

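To classify the message, we have to determine which of the following is greater:

$$P(\text{spam} \mid w_1, w_2, \ldots, w_n) \quad \text{or} \quad P(\text{not spam} \mid w_1, w_2, \ldots, w_n)$$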
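Putting the comparison into code could look roughly like the sketch below. A few things here are my own assumptions and not part of the derivation: the toy training messages, the function names, add-one smoothing (so an unseen word does not force a probability of zero), and summing logarithms instead of multiplying raw probabilities (to avoid numeric underflow on long messages). The evidence term in the denominator is the same for both classes, so it is left out of the comparison.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label) pairs, label is 'spam' or 'ham'."""
    doc_counts = Counter()  # number of messages per class
    word_counts = {"spam": Counter(), "ham": Counter()}  # messages containing each word
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(set(text.lower().split()))
    return doc_counts, word_counts


def log_score(words, label, doc_counts, word_counts):
    """log P(label) + sum of log P(w | label) over the given words."""
    total = sum(doc_counts.values())
    score = math.log(doc_counts[label] / total)
    for w in words:
        # Add-one smoothing (an assumption): (count + 1) / (class messages + 2)
        p = (word_counts[label][w] + 1) / (doc_counts[label] + 2)
        score += math.log(p)
    return score


def classify(text, doc_counts, word_counts):
    words = set(text.lower().split())
    spam = log_score(words, "spam", doc_counts, word_counts)
    ham = log_score(words, "ham", doc_counts, word_counts)
    return "spam" if spam > ham else "ham"


# Toy training data, invented for illustration.
toy = [
    ("win a free prize now", "spam"),
    ("free entry win cash", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("see you at the meeting", "ham"),
]
doc_counts, word_counts = train(toy)
print(classify("win free cash", doc_counts, word_counts))
print(classify("meeting at lunch", doc_counts, word_counts))
```

Here "ham" simply means "not spam". Ties go to "ham" in this sketch, which is an arbitrary choice.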


Building Spam Classifier-NLP in Python From Scratch