Naive Bayes: Part-1
Explore the Naive Bayes classification method, focusing on Bayes' theorem and the independence assumptions. Understand how to calculate class probabilities and apply these concepts to real datasets for making predictions.
Naive Bayes
The Naive Bayes classifier is based on Bayes’ rule, which is stated as follows:
Bayes’ theorem (alternatively Bayes’ law or Bayes’ rule) describes the probability of an event, based on prior knowledge of conditions that might be related to the event.
Bayes’ theorem is written as:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

- $P(B)$ is the probability of $B$. It is called the evidence.
- $P(A \mid B)$ is the conditional probability of $A$, given that $B$ has occurred. It is called the posterior probability, meaning the probability of an event after evidence is seen.
- $P(B \mid A)$ is the conditional probability of $B$, given that $A$ has occurred. It is called the likelihood.
- $P(A)$ is the probability of $A$. It is called the prior probability, meaning the probability of an event before evidence is seen.
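As a minimal sketch of how the four quantities fit together, the snippet below computes a posterior from a prior and a likelihood, writing $A$ for the event of interest and $B$ for the observed evidence. The function name `posterior` and the numbers are hypothetical, chosen only for illustration; the evidence is obtained with the law of total probability.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Return P(A|B) given P(A), P(B|A), and P(B|not A)."""
    # Evidence P(B) via the law of total probability.
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    # Bayes' rule: posterior = likelihood * prior / evidence.
    return likelihood * prior / evidence


# Hypothetical numbers, not taken from any dataset.
p = posterior(prior=0.01, likelihood=0.9, likelihood_if_not=0.05)
print(round(p, 4))  # 0.1538
```

Even with a high likelihood, a small prior keeps the posterior low, which is why the prior matters when classes are imbalanced.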
Naive Bayes methods make the “naive” assumption that every pair of features is conditionally independent given the value of the class variable.
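In its usual working form, the assumption lets the joint likelihood of the features factor into one term per feature. Writing $y$ for the class and $x_1, \dots, x_n$ for the features (notation assumed here for illustration):

$$P(x_1, x_2, \dots, x_n \mid y) = \prod_{i=1}^{n} P(x_i \mid y)$$

Each factor $P(x_i \mid y)$ can be estimated separately from the data, which is what keeps the method simple.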
Mathematical intuition
We will use a fictional dataset about playing golf, as shown below:
| Outlook | Temperature | Humidity | Windy | Play Golf |
|---|---|---|---|---|
| Rainy | Hot | High | False | No |
| Rainy | Hot | High | True | No |
| Overcast | Hot | High | False | Yes |
| Sunny | Mild | High | False | Yes |
| Sunny | Cool | Normal | False | Yes |
| Sunny | Cool | Normal | True | No |
| Overcast | Cool | Normal | True | Yes |
| Rainy | Mild | High | False | No |
| Rainy | Cool | Normal | False | Yes |
| Sunny | Mild | Normal | False | Yes |
| Rainy | Mild | Normal | True | Yes |
| Overcast | Mild | High | True | Yes |
| Overcast | Hot | Normal | False | Yes |
| Sunny | Mild | High | True | No |
- In the above dataset, the independent features ($X$) are Outlook, Temperature, Humidity, and Windy.
- In the above dataset, the dependent feature ($y$) is Play Golf.
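A short sketch of how the class priors could be estimated from the table: count how often each Play Golf value occurs and divide by the number of rows. The variable names are illustrative only.

```python
from collections import Counter

# Play Golf column of the table above, in row order.
play_golf = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
             "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

counts = Counter(play_golf)
total = len(play_golf)

# Prior probability of each class = relative frequency in the data.
priors = {label: count / total for label, count in counts.items()}

print(counts["Yes"], counts["No"])  # 9 5
```

So the priors are 9/14 for Yes and 5/14 for No.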
Assumption of Naive Bayes
Naive Bayes algorithms assume that the input features are independent of one another given the class, and that each feature makes an equal contribution to the outcome (Play Golf). These assumptions rarely hold exactly in real-world data, but the method often works well in practice.
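To see what the independence assumption buys, the sketch below estimates the likelihood of one set of feature values under each class as a product of per-feature frequencies counted from the table. The helper names `conditional` and `naive_likelihood` are hypothetical, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# The golf table: (Outlook, Temperature, Humidity, Windy, Play Golf).
ROWS = [
    ("Rainy", "Hot", "High", False, "No"),
    ("Rainy", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Sunny", "Mild", "High", False, "Yes"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Rainy", "Mild", "High", False, "No"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Sunny", "Mild", "High", True, "No"),
]


def conditional(feature_index, value, label):
    """Estimate P(feature = value | Play Golf = label) by counting rows."""
    in_class = [row for row in ROWS if row[-1] == label]
    matches = sum(1 for row in in_class if row[feature_index] == value)
    return Fraction(matches, len(in_class))


def naive_likelihood(features, label):
    """P(features | label) as a product of per-feature terms (naive assumption)."""
    result = Fraction(1)
    for index, value in enumerate(features):
        result *= conditional(index, value, label)
    return result


# Feature values of the first row of the table.
first_row = ("Rainy", "Hot", "High", False)
for label in ("Yes", "No"):
    print(label, naive_likelihood(first_row, label))
```

For these feature values the likelihood is 8/729 under Yes and 48/625 under No, so the data favours No before the class prior is even applied.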
Applying Bayes’ theorem
Applying Bayes’ theorem to the dataset, we get the following representation:

$$P(y \mid X) = \frac{P(X \mid y)\,P(y)}{P(X)}$$

where $y$ is the class variable and $X$ is the feature vector (of size $n$):

$$X = (x_1, x_2, \dots, x_n)$$
From the above table, take the first row:
<br> ...