
# The Ironic Sophistication of Naive Bayes Classifiers

#### Filtering spam with Multinomial Naive Bayes (From Scratch)

In the first half of 2020, more than 50% of all email traffic on the planet was spam. Spammers typically receive 1 reply for every 12,500,000 emails sent, which doesn’t sound like much until you realize that more than 15 billion spam emails are sent every day. Spam costs businesses 20–200 billion dollars per year, and that number is only expected to grow. What can we do to save ourselves from spam?

#### Naive Bayes Classifiers

In probability theory and statistics, Bayes’ theorem (alternatively Bayes’s theorem, Bayes’s law or Bayes’s rule) describes the probability of an event, based on prior knowledge of conditions that might be related to the event.

For example, if the risk of developing health problems is known to increase with age, Bayes’s theorem allows the risk to an individual of a known age to be assessed more accurately than simply assuming that the individual is typical of the population as a whole.

#### A Naive Bayes Classifier is a probabilistic classifier that applies Bayes’ theorem with strong (naive) independence assumptions between features.

• Probabilistic classifier: a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes, rather than only outputting the most likely class that the observation should belong to.
• Independence: Two events are **independent** if the occurrence of one does not affect the probability of occurrence of the other (equivalently, does not affect the odds). This assumption of independence between features is what makes Naive Bayes naive! In the real world, the independence assumption is often violated, but naive Bayes classifiers still tend to perform very well.
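To make the two ideas above concrete, here is a minimal from-scratch sketch of a multinomial Naive Bayes spam filter. It is not the article's own implementation: the toy corpus, the function names `train` and `predict_proba`, whitespace tokenization, and add-one (Laplace) smoothing via the `alpha` parameter are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per class and word occurrences per class.

    docs is a list of (text, label) pairs (hypothetical toy format).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict_proba(text, model, alpha=1.0):
    """Return a probability distribution over classes for one text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        # Start from the log prior P(class).
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # assumption: words never seen in training are ignored
            # Naive step: each word contributes independently, with smoothing.
            count = word_counts[label][word]
            score += math.log((count + alpha) / (total_words + alpha * len(vocab)))
        log_scores[label] = score
    # Normalize the log scores into probabilities (log-sum-exp trick).
    top = max(log_scores.values())
    unnormalized = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(unnormalized.values())
    return {k: v / norm for k, v in unnormalized.items()}


# Hypothetical toy corpus, for illustration only.
TOY_DOCS = [
    ("win cash prize now", "spam"),
    ("cheap loans win big", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

model = train(TOY_DOCS)
probs = predict_proba("win a cash prize", model)
prediction = max(probs, key=probs.get)
```

Note that `predict_proba` returns a full distribution over the classes (the "probabilistic classifier" property), and the per-word sum of log probabilities is where the independence assumption enters.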

#naive-bayes-classifier #python #naive-bayes #naive-bayes-from-scratch #naive-bayes-in-python


## Naive Bayes Classifier

### Introduction

The Naïve Bayes algorithm is a supervised machine learning classification technique based on Bayes’ theorem with strong independence assumptions between the features. It is mainly used for binary or multi-class classification and remains one of the best methods for text and document categorization.

For example, a vegetable may be considered to be a tomato if it is red, round and 2 inches in diameter. A naive Bayes classifier considers each of these features to contribute independently to the probability that this vegetable is a tomato, regardless of any possible correlations between the color, roundness, and diameter features.
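The tomato example can be sketched in a few lines. Every number below is a made-up illustration, not measured data, and the second class (`cucumber`) and the names `PRIORS`, `LIKELIHOODS` and `naive_score` are assumptions introduced only so there is something to compare against.

```python
import math

# Hypothetical priors P(class) and per-feature likelihoods P(feature | class).
PRIORS = {"tomato": 0.5, "cucumber": 0.5}
LIKELIHOODS = {
    "tomato": {"red": 0.9, "round": 0.8, "about_2_inches": 0.7},
    "cucumber": {"red": 0.05, "round": 0.1, "about_2_inches": 0.2},
}


def naive_score(label, features):
    """Prior times the product of per-feature likelihoods.

    Multiplying the likelihoods treats the features as independent
    given the class, which is exactly the naive assumption.
    """
    return PRIORS[label] * math.prod(LIKELIHOODS[label][f] for f in features)


observed = ["red", "round", "about_2_inches"]
scores = {label: naive_score(label, observed) for label in PRIORS}

# Normalize the scores so they sum to 1 and pick the most probable class.
total = sum(scores.values())
posteriors = {label: s / total for label, s in scores.items()}
best = max(posteriors, key=posteriors.get)
```

Any correlation between color, roundness and diameter is ignored here: each feature simply contributes its own factor to the product.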

#naive-bayes-in-python #machine-learning #artificial-intelligence #naive-bayes-classifier #data-science

## What is the Naïve Bayes Algorithm?

The Naïve Bayes algorithm is a popular supervised machine learning algorithm for classification. It classifies data by computing conditional probabilities. It is widely used in Natural Language Processing (NLP) and in use cases such as real-time prediction, multi-class prediction, recommendation systems, text classification, and sentiment analysis. The algorithm is scalable and easy to implement for large data sets.

The algorithm is based on **Bayes’ theorem**, named after Thomas Bayes. Bayes’ theorem helps us find the probability of a hypothesis given our prior knowledge.

Let’s look at the equation for Bayes’ theorem:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$

where $H$ is the hypothesis and $E$ is the observed evidence.

Naïve Bayes is a simple but surprisingly powerful predictive modeling algorithm. A Naïve Bayes classifier calculates the probability of each possible outcome given the observed features, then selects the outcome with the highest probability.
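As a small worked sketch of Bayes’ theorem, the snippet below computes the probability that an email is spam given that it contains a particular word. The helper name `bayes_posterior` and all three input probabilities are hypothetical values chosen for illustration.

```python
def bayes_posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    if evidence <= 0:
        raise ValueError("P(E) must be positive")
    return likelihood * prior / evidence


# Hypothetical numbers, for illustration only.
p_spam = 0.5             # P(spam): prior probability that an email is spam
p_word_given_spam = 0.4  # P(word | spam)
p_word_given_ham = 0.05  # P(word | not spam)

# Law of total probability gives the evidence term P(word).
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = bayes_posterior(p_spam, p_word_given_spam, p_word)
p_ham_given_word = bayes_posterior(1 - p_spam, p_word_given_ham, p_word)

# The classifier picks the outcome with the highest posterior.
outcome = "spam" if p_spam_given_word > p_ham_given_word else "not spam"
```

With these assumed inputs, seeing the word raises the probability of spam from the prior of 0.5 to 8/9.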

## Applications of Naïve Bayes Algorithm

1. Real-time prediction: Naïve Bayes is fast and an eager learner, which makes it well suited to real-time predictions.

2. Multi-class prediction: A Naïve Bayes classifier can predict the probability of each of several classes of a target variable.

3. Spam filtering: Naïve Bayes is widely used for text classification, and spam filtering in email is where it is used most.

4. Text classification / sentiment analysis: Because it handles multi-class problems well and relies on the independence assumption, Naïve Bayes often achieves a high success rate in text classification. It is therefore also used in sentiment analysis.

5. **Recommendation System:** A Naïve Bayes classifier and collaborative filtering together build a recommendation system that uses machine learning and data mining techniques to filter unseen information and predict whether a user would like a given resource or not.

#data-driven-investor #data-science #naive-bayes-classifier #machine-learning #python #naïve bayes

## Conditional Probability | Bayes Theorem | Naïve Bayes Classifier

Have you ever wondered:

How suspicious or marketing emails are automatically put in the junk/spam folder and not the primary inbox?

How we can predict whether it will rain today?

If you have these questions, read on and you will get a fair idea of how these predictions are made, which is very useful in day-to-day life.

In real life we all face situations where, based on certain conditions and prior knowledge, we want to predict whether a certain event will happen. For example, I am a cricket fan. Based on today’s weather conditions, such as temperature and humidity, and prior knowledge of how often it rains in these conditions, I would like to calculate the probability that a match will take place, and place my bets accordingly :D

With this article I will try to demystify how a Naive Bayes Classifier works. We will proceed as follows:

• Basic terminologies, events and probabilities
• Bayes theorem, basics and formula derivation
• Naive Bayes Classification with examples
• Merits, Demerits and assumptions of using Naive Bayes method

### Basic Definitions and terminology:

• **Independent events:** Events that take place in a series such that the outcome of the first event does not affect the success or failure of the second.
• For example: If we roll a die 3 times and want the probability of getting three 6’s in a row, it is 1/6 * 1/6 * 1/6. The first roll does not affect the probability of getting a 6 in subsequent rolls.
• **Dependent events:** If the occurrence of one event affects the occurrence of a second event, we call them dependent events.
• For example: If we draw four cards at random without replacement from a deck of 52 cards, the probability of getting four queens in a row is 4/52 * 3/51 * 2/50 * 1/49. The probability of drawing a queen changes from 4/52 to 3/51 because we have already removed a card, and that card was a queen. Similarly, it goes down to 1/49 on the 4th draw.
• **Conditional Probability:** The probability of an event calculated under a condition, i.e. the probability of event A happening when event B has already taken place.
• For example: If we draw 2 cards one by one without replacement from a deck, the probability of drawing a queen on the second draw, given that the first card was a queen, is 3/51.
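The three examples in the list above can be checked with exact fractions. This is a small sketch using Python's standard `fractions` module; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Independent events: three 6's in a row with a fair die.
three_sixes = Fraction(1, 6) ** 3

# Dependent events: four queens in a row, drawing without replacement.
four_queens = (
    Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50) * Fraction(1, 49)
)

# Conditional probability: queen on the second draw given a queen on the first,
# computed as P(both queens) / P(first is a queen).
p_first_queen = Fraction(4, 52)
p_both_queens = Fraction(4, 52) * Fraction(3, 51)
second_queen_given_first = p_both_queens / p_first_queen

print(three_sixes)               # 1/216
print(four_queens)               # 1/270725
print(second_queen_given_first)  # 1/17
```

The last value, 1/17, is 3/51 in lowest terms: after one queen is removed, 3 queens remain among 51 cards.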

## Equation of Conditional Probability

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

Let’s go through an example to get a clear picture.

Problem:

A purse contains 4, 5 and 3 coins of denominations Rs. 2, Rs. 5 and Rs. 10 respectively. Two coins are drawn one after another, without replacing the first coin. Find P(drawing Rs. 2, then Rs. 2).

Solution:

There are four Rs. 2 coins and 12 coins in total, so

P(Rs. 2 coin in first draw) = 4/12, i.e. 1/3

The result of the first draw affects the probability of the second draw: after the first draw we are left with 3 coins of Rs. 2 and 11 coins in total.

P(Rs. 2 coin in second draw | Rs. 2 coin in first draw) = 3/11

Finally, P(drawing Rs. 2, then Rs. 2) = (4/12) * (3/11) = 1/11
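One way to double-check this result is to enumerate every ordered pair of draws and count the favourable ones. The sketch below assumes each coin is equally likely to be drawn; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# The purse: 4 coins of Rs. 2, 5 coins of Rs. 5, 3 coins of Rs. 10.
coins = [2] * 4 + [5] * 5 + [10] * 3

# Every ordered way to draw two distinct coins (by position in the purse).
ordered_draws = list(permutations(range(len(coins)), 2))

# Draws where both the first and the second coin are Rs. 2.
favourable = [
    (i, j) for i, j in ordered_draws if coins[i] == 2 and coins[j] == 2
]

p_two_then_two = Fraction(len(favourable), len(ordered_draws))
print(len(favourable), len(ordered_draws), p_two_then_two)  # 12 132 1/11
```

The count 12 out of 132 matches the product (4/12) * (3/11) from the solution.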

#predictions #data-science #towards-data-science #naive-bayes-classifier #data analysis

## Naive Bayes Classification

lity, Conditional Probability, and Bayes theorem.

### Random Experiment

It is an experiment or a process for which the outcome cannot be predicted with certainty.

E.g.- While tossing a coin we can’t conclude in advance if the output will be heads or tails.

### Sample Space

The set consisting of all possible outcomes of a random experiment is known as Sample space.

E.g.-

The sample space of tossing a coin is S = {H, T}

The sample space of rolling a die is S = {1, 2, 3, 4, 5, 6}

### Event

Any subset of the sample space is known as an event.

E.g.- Obtaining heads when tossing a coin is an event, corresponding to the subset {H} of the sample space.
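Sample spaces and events map naturally onto Python sets. The sketch below assumes all outcomes are equally likely (a fair coin and a fair die); the helper `probability` and the `even_roll` event are illustrative additions.

```python
from fractions import Fraction

# Sample spaces from the examples above.
coin_space = {"H", "T"}
die_space = {1, 2, 3, 4, 5, 6}


def probability(event, sample_space):
    """P(event) for equally likely outcomes: favourable / total."""
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))


heads = {"H"}          # event: the coin shows heads
even_roll = {2, 4, 6}  # event: the die shows an even number

print(probability(heads, coin_space))     # 1/2
print(probability(even_roll, die_space))  # 1/2
```

The subset check in `probability` reflects the definition directly: anything that is not a subset of the sample space is not an event of that experiment.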

#bayes-theorem #naive-bayes-classifier #machine-learning #supervised-learning