Trusted answers to developer questions

Related Tags

classification
algorithms
machine learning

# What are Naive Bayes classifiers?

Educative Answers Team


In machine learning, Naive Bayes classifiers are widely used for classification because they are easy to implement and, when their independence assumption holds, can match or outperform more sophisticated predictors. Naive Bayes classifiers are based on Bayes’ theorem and assume that, given the class, the presence or absence of one feature does not influence the presence or absence of any other feature.

## Types

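• Gaussian Naive Bayes classifier: used when features are continuous (not discrete) and are assumed to be normally distributed within each class.

• Multinomial Naive Bayes classifier: used when features follow a multinomial distribution, such as word counts in a document.

• Bernoulli Naive Bayes classifier: used when features are boolean, such as whether a word is present or absent.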

## Derivation

Let’s take a look at the mathematics behind Naive Bayes classifiers.

The equation for Bayes’ theorem is:

$P(class|X) = P(X|class)P(class)/P(X)$

A class variable is something that the classifier is trying to classify. For instance, when trying to classify an email as spam or not, “is spam” is the class variable.

In the equation above, $class$ is the class variable and $X$ is the set of features: $X = (x_1, x_2, ..., x_n)$.
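As a quick illustration of Bayes’ theorem with a single feature, the sketch below computes the probability that an email is spam given that it contains a particular word. All probabilities are made-up numbers chosen for illustration, and the variable names (`p_spam`, `p_word_given_spam`, `posterior_spam`) are hypothetical.

```python
# Hypothetical numbers, chosen only to illustrate Bayes' theorem.
p_spam = 0.2                 # P(class = spam)
p_ham = 1 - p_spam           # P(class = not spam)
p_word_given_spam = 0.5      # P(x | spam): the word appears in half of all spam
p_word_given_ham = 0.05      # P(x | not spam)

# P(x), obtained with the law of total probability over both classes.
p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham

# Bayes' theorem: P(spam | x) = P(x | spam) * P(spam) / P(x)
posterior_spam = p_word_given_spam * p_spam / p_word
print(round(posterior_spam, 3))
```

With these assumed numbers, seeing the word raises the probability of spam from the prior of 0.2 to roughly 0.71.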

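Under the naive assumption that the features are independent of one another given the class, $P(X|class)$ factors into a product of per-feature terms, so Bayes’ theorem can be rewritten as:

$P(class|x_1 ... x_n) = P(x_1|class)... P(x_n|class)P(class)/P(x_1 ... x_n)$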

Notice that for a given input, the denominator does not depend on the class: it is the same for every outcome of the class variable. Hence, the denominator can be ignored when comparing classes.

$P(class|x_1 ... x_n) \propto P(x_1|class)... P(x_n|class)P(class)$
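To see why dropping the denominator is harmless, the following sketch scores two classes for one email with two features. The probabilities are hypothetical. The point is only that dividing every score by the same constant does not change which class scores highest.

```python
# Hypothetical per-class probabilities for one email with two observed features.
prior = {"spam": 0.2, "ham": 0.8}
likelihoods = {
    "spam": [0.5, 0.4],   # P(x_1 | spam), P(x_2 | spam)
    "ham": [0.05, 0.1],   # P(x_1 | ham),  P(x_2 | ham)
}

# Unnormalized score: P(x_1 | class) * P(x_2 | class) * P(class)
scores = {}
for label in prior:
    score = prior[label]
    for p in likelihoods[label]:
        score *= p
    scores[label] = score

# Dividing by the sum of the scores yields values that sum to 1,
# but the ordering of the classes is already fixed by the raw scores.
total = sum(scores.values())
posteriors = {label: s / total for label, s in scores.items()}

best_by_score = max(scores, key=scores.get)
best_by_posterior = max(posteriors, key=posteriors.get)
print(best_by_score, best_by_posterior)
```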

The predicted class is the outcome of the class variable with the maximum posterior probability, found using:

$\widehat{class} = argmax_{class}(P(x_1|class)... P(x_n|class)P(class))$
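The argmax rule can be sketched from scratch for categorical features as follows. This is a minimal illustration rather than a reference implementation: the function names `train` and `predict` and the toy dataset are hypothetical. The sketch also makes two choices not covered above: it sums log-probabilities instead of multiplying probabilities (to avoid numerical underflow), and it uses add-one smoothing so that a feature value never seen with a class does not zero out the whole product.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Collect the counts needed to estimate P(class) and P(x_i | class)."""
    class_counts = Counter(labels)
    # feature_counts[(i, label)][value] = rows of `label` whose i-th feature == value
    feature_counts = defaultdict(Counter)
    values = defaultdict(set)  # distinct values observed for each feature index
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values

def predict(row, model):
    """Return the class maximizing log P(class) + sum_i log P(x_i | class)."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Add-one smoothing: an assumption of this sketch, not part of the derivation.
            numerator = feature_counts[(i, label)][value] + 1
            denominator = count + len(values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data (hypothetical): features are (contains "offer", contains "meeting").
rows = [(1, 0), (1, 0), (1, 1), (0, 1), (0, 1), (0, 0)]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
model = train(rows, labels)
print(predict((1, 0), model))
print(predict((0, 1), model))
```

Because the logarithm is monotonic, taking the argmax of the summed logs selects the same class as taking the argmax of the product.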

Note: Different Naive Bayes classifiers make different assumptions regarding the distribution of $P(x_i | class)$.
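As a rough sketch of what these differing assumptions look like, the functions below give one possible form of $P(x_i | class)$ for each of the three variants listed under Types. The function names and parameters are hypothetical. In practice the parameters (`mean`, `variance`, `p`, `probs`) would be estimated per class from training data.

```python
import math

def gaussian_likelihood(x, mean, variance):
    """Gaussian variant: normal density of a continuous feature value x within a class."""
    exponent = -((x - mean) ** 2) / (2 * variance)
    return math.exp(exponent) / math.sqrt(2 * math.pi * variance)

def bernoulli_likelihood(x, p):
    """Bernoulli variant: x is 0 or 1, and p = P(x_i = 1 | class)."""
    return p if x == 1 else 1 - p

def multinomial_likelihood(counts, probs):
    """Multinomial variant, up to a factor that does not depend on the class.

    counts[i] is how often item i occurs; probs[i] = P(item i | class).
    """
    result = 1.0
    for count, p in zip(counts, probs):
        result *= p ** count
    return result

print(gaussian_likelihood(0.0, 0.0, 1.0))
print(bernoulli_likelihood(0, 0.3))
print(multinomial_likelihood([2, 1], [0.5, 0.5]))
```

The multinomial coefficient is left out of `multinomial_likelihood` because it depends only on the counts, not on the class, so it cancels when classes are compared.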

## Applications

Some applications that use Naive Bayes classifiers are:

• Spam Filtering

• Text Analysis

• Recommendation Systems
