Related Tags

naive bayes
python
machine learning
course related

# What are Bayesian networks?

Abdul Mateen

“A Bayesian network signifies the causal probabilistic connection among a set of random variables, their conditional dependencies, and it provides a compact representation of a joint probability distribution.” (Murphy, 1998).

### Overview

A Bayesian network consists of two essential parts: a directed acyclic graph and a set of conditional probability distributions.

• The directed acyclic graph is a collection of random variables outlined by nodes.
• A conditional probability distribution is set for each node in the graph. Specifically, the conditional probability distribution of a node (random variable) is restricted for every possible outcome of the preceding causal node(s).

Bayesian networks formulate on the same intuition as the Naïve Bayes classifier, but in contradiction to Naïve Bayes, Bayesian networks are not limited to only representing independent features. Bayesian networks enable us to add as many dependencies that seem consistent.

### Example

For clarification, consider the following example.

Assume we try to turn on our computer, but the computer does not start (observation/evidence). We want to understand which of the viable causes of computer failure is more likely. This simplified illustration only assumes two possible causes of this misfortune: electricity failure and computer malfunction.

Directed Acyclic Graph

The two causes in this example are considered independent because there is no edge between them, but this assumption is not generally necessary. Unless there is a cycle in the graph, Bayesian networks can capture as many causal relations as is essential to credibly describe the real-life situation.

The aim is to estimate the posterior conditional probability distribution of viable unobserved causes given the empirical evidence, i.e., P [Cause | Evidence].

The entire idea of Bayesian networks is formed on the Bayes theorem, which helps us to signify the conditional probability distribution of cause, provided the experimental evidence using the converse conditional probability of observing evidence given the reason:

$P (Cause | Evidence) = P (Evidence | Cause) . \frac{P (Cause)}{P (Evidence)}$

Given a node’s parents, any node in a Bayesian network is conditionally independent of all nondescendants. Therefore, the joint probability distribution of all random variables in the graph factorizes into a sequence of conditional probability distributions of random variables given their parents.

### Implementation

Here’s the practical implementation of Bayesian networks.

First, let’s look at how to initialize a Bayesian network by quickly implementing the Monty Hall Problem. The Monty Hall problem arose from the game show Let’s Make a Deal, where a guest had to pick which one of three doors had a reward behind it. The twist was that after the guest chose, the host (originally Monty Hall) would then open one of the doors the guest did not pick and ask if the guest wanted to switch which door they had chosen.

To create the Bayesian network in pomegranate, we first design the distributions that live in each node in the graph. For a discrete Bayesian network, we use Discrete Distribution objects for the root nodes and Conditional Probability Table objects for the inner and leaf nodes. The columns in a Conditional Probability Table correspond to the order in which the parents (the second argument) are specified. The last column is the value the Conditional Probability Table takes itself. In the case below, the first column corresponds to the value guest takes, then the value prize takes, and then the matter that monty takes. B, C, and A refer to the probability that Monty reveals door A, given that the guest has chosen door B, and that the prize is actually behind door C, or P(Monty=A|Guest=B, Prize=C).

from pomegranate import *
# The guests initial door selection is completely random
guest = DiscreteDistribution({'A': 1./3, 'B': 1./3, 'C': 1./3})

# The door the prize is behind is also completely random
prize = DiscreteDistribution({'A': 1./3, 'B': 1./3, 'C': 1./3})

# Monty is dependent on both the guest and the prize.
monty = ConditionalProbabilityTable([
['A', 'A', 'A', 0.0],['A', 'A', 'B', 0.5],
['A', 'A', 'C', 0.5],['A', 'B', 'A', 0.0],
['A', 'B', 'B', 0.0],['A', 'B', 'C', 1.0],
['A', 'C', 'A', 0.0],['A', 'C', 'B', 1.0],
['A', 'C', 'C', 0.0],['B', 'A', 'A', 0.0],
['B', 'A', 'B', 0.0],['B', 'A', 'C', 1.0],
['B', 'B', 'A', 0.5],['B', 'B', 'B', 0.0],
['B', 'B', 'C', 0.5],['C', 'B', 'A', 1.0],
['C', 'B', 'B', 0.0],['C', 'B', 'C', 0.0],
['C', 'C', 'A', 0.5],['C', 'C', 'B', 0.5],
['C', 'C', 'C', 0.0],['B', 'C', 'A', 1.0],
['B', 'C', 'B', 0.0],['B', 'C', 'C', 0.0],
['C', 'A', 'A', 0.0],['C', 'A', 'B', 1.0],
['C', 'A', 'C', 0.0],],[guest, prize])

# State objects hold both the distribution, and a high level name.
s1 = Node(guest, name="guest")
s2 = Node(prize, name="prize")
s3 = Node(monty, name="monty")

# Create the Bayesian network object with a useful name
model = BayesianNetwork("Monty Hall Problem")

# Add the three states to the network

# Add edges which represent conditional dependencies, where the second node is
# conditionally dependent on the first node (Monty is dependent on both guest and prize)
# Finding the probalibities
model.bake()
# Probability for Valid Case
print("The Probability if User said door A, then Monty opened door B, but the car was behind door C : ",model.probability([['A', 'B', 'C']]))
# Probability for Invalid Case
print("The Probability if User said door A, then Monty opened door B, but the car was behind door B : ",model.probability([['A', 'B', 'B']]))

RELATED TAGS

naive bayes
python
machine learning
course related

CONTRIBUTOR

Abdul Mateen