Naive Bayes Part-1

Naive Bayes algorithms are based on Bayes’ Rule, which we discussed in the previous lessons. They work very well for natural language problems such as document classification and spam filtering. This lesson covers the details behind them.

We'll cover the following...

Naive Bayes
Mathematical intuition

Assumption of Naive Bayes
Applying Bayes’ Theorem
Applying the Independence Assumption

Applying the Mathematical Intuition

Calculations

Naive Bayes

The Naive Bayes classifier is based on Bayes’ Rule, which is described as follows.

“Bayes’ theorem (alternatively Bayes’ law or Bayes’ rule) describes the probability of an event, based on prior knowledge of conditions that might be related to the event.”

Mathematically, Bayes’ theorem is written as:

$P(A | B) = \frac{P(B | A) P(A)}{P(B)}$

$P(B)$ is the probability of $B$. It is called the Evidence.
$P(A | B)$ is the conditional probability of $A$, given that $B$ has occurred. It is called the Posterior Probability, meaning the probability of an event after the evidence is seen.
$P(B | A)$ is the conditional probability of $B$, given that $A$ has occurred. It is called the Likelihood.
$P(A)$ is the probability of $A$. It is called the Prior Probability, meaning the probability of an event before the evidence is seen.
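
To make the four terms concrete, here is a minimal sketch in Python. The function name `bayes_posterior` and the spam-filter numbers are illustrative assumptions, not values from this lesson.

```python
def bayes_posterior(prior, likelihood, evidence):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("evidence P(B) must be positive")
    return likelihood * prior / evidence


# Hypothetical numbers: A = "email is spam", B = "email contains the word 'offer'".
p_spam = 0.2              # prior P(A)
p_word_given_spam = 0.5   # likelihood P(B|A)
p_word_given_ham = 0.05   # P(B|not A), needed only to compute the evidence

# Evidence P(B), obtained with the law of total probability.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = bayes_posterior(p_spam, p_word_given_spam, p_word)
print(round(p_spam_given_word, 3))  # 0.714
```

With these assumed numbers, seeing the word raises the probability of spam from the prior of 0.2 to a posterior of about 0.71.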

Naive Bayes methods go with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.

Mathematical intuition

We will use a fictional dataset that records whether golf was played under different weather conditions, as seen below.

Outlook	Temperature	Humidity	Windy	Play Golf
Rainy	Hot	High	False	No
Rainy	Hot	High	True	No
Overcast	Hot	High	False	Yes
Sunny	Mild	High	False	Yes
Sunny	Cool	Normal	False	Yes
Sunny	Cool	Normal	True	No
Overcast	Cool	Normal	True	Yes
Rainy	Mild	High	False	No
Rainy	Cool	Normal	False	Yes
Sunny	Mild	Normal	False	Yes
Rainy	Mild	Normal	True	Yes
Overcast	Mild	High	True	Yes
Overcast	Hot	Normal	False	Yes
Sunny	Mild	High	True	No

In the above dataset, the independent features ($X$) are Outlook, Temperature, Humidity, and Windy.
The dependent variable ($y$) is Play Golf.
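
As a sketch of how this table might be represented in code, the snippet below stores each row as a tuple, separates the features from the target, and counts the classes to estimate the prior $P(y)$. The variable names (`rows`, `X`, `y`, `priors`) are assumptions made for illustration.

```python
from collections import Counter

# Columns: Outlook, Temperature, Humidity, Windy, Play Golf
rows = [
    ("Rainy", "Hot", "High", False, "No"),
    ("Rainy", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Sunny", "Mild", "High", False, "Yes"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Rainy", "Mild", "High", False, "No"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Sunny", "Mild", "High", True, "No"),
]

X = [row[:4] for row in rows]  # independent features
y = [row[4] for row in rows]   # dependent variable (Play Golf)

# Prior probability of each class, estimated from relative frequencies.
class_counts = Counter(y)
priors = {label: count / len(y) for label, count in class_counts.items()}

print(class_counts["Yes"], class_counts["No"])  # 9 5
print(priors)
```

Of the 14 rows, 9 are labelled Yes and 5 are labelled No, so the estimated priors are 9/14 and 5/14.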

Assumption of Naive Bayes

Naive Bayes algorithms assume that the input features are independent of one another given the class, and that each feature makes an equal contribution to the outcome (Play Golf). These assumptions rarely hold exactly in real-world data, but the algorithms often work well in practice.
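
Stated formally, for a class $y$ and $n$ features $x_1, x_2, ..., x_n$, the independence assumption can be written as the factorization below. This is the standard textbook form of the assumption, given here as a sketch rather than as this lesson's own derivation.

$P(x_1, x_2, ..., x_n | y) = \prod_{i=1}^{n} P(x_i | y)$

For the golf data this means, for example, that $P(\text{Hot}, \text{High} | \text{Yes})$ is treated as $P(\text{Hot} | \text{Yes}) P(\text{High} | \text{Yes})$.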

Applying Bayes’ Theorem

Applying Bayes’ theorem to the class variable and the features, we get the following representation.

$P(y|X) = \frac{P(X|y)P(y)}{P(X)}$

where $y$ is the class variable and $X$ is a feature vector (of size $n$):

$X = (x_1, x_2, x_3, ..., x_n)$

From the above table, taking the third row as an example:

$X = (\text{Overcast}, \text{Hot}, \text{High}, \text{False})$
$y = \text{Yes}$ ...
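
As a rough illustration, Bayes’ theorem can be applied directly to the example vector $X = (\text{Overcast}, \text{Hot}, \text{High}, \text{False})$ by counting whole rows, without any independence assumption. The helper name `posterior` is hypothetical, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Columns: Outlook, Temperature, Humidity, Windy, Play Golf
rows = [
    ("Rainy", "Hot", "High", False, "No"),
    ("Rainy", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Sunny", "Mild", "High", False, "Yes"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Rainy", "Mild", "High", False, "No"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Sunny", "Mild", "High", True, "No"),
]


def posterior(x, label):
    """Estimate P(y=label | X=x) by counting rows that match x exactly."""
    n = len(rows)
    class_rows = [r for r in rows if r[4] == label]
    prior = Fraction(len(class_rows), n)                       # P(y)
    likelihood = Fraction(sum(r[:4] == x for r in class_rows),
                          len(class_rows))                     # P(X|y)
    evidence = Fraction(sum(r[:4] == x for r in rows), n)      # P(X)
    return likelihood * prior / evidence


x = ("Overcast", "Hot", "High", False)
print(posterior(x, "Yes"))  # 1
print(posterior(x, "No"))   # 0
```

Here $P(X|\text{Yes}) = 1/9$, $P(\text{Yes}) = 9/14$, and $P(X) = 1/14$, so the posterior is 1. Only one row in the table matches this exact combination, and a combination that never appears would make $P(X)$ zero and the division fail. This sparsity is one reason to estimate $P(X|y)$ from individual features instead of whole rows.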
