Bayesian Network

Rupesh Sharma
Brillio Data Science
5 min read · Jun 28, 2022

Many of us are familiar with the Naïve Bayes algorithm used for classification problems.

Naïve Bayes assumes that the features are independent of one another given the class, which is often unrealistic.

In some cases, we want a model in which the variables are not necessarily independent. Bayesian Networks let us build such a model and represent the relationships between the variables.

Before diving into Bayesian Networks, let’s look at a few concepts needed to understand them.

Conditional Probability

Conditional Probability is the probability of an event A given that another event B has occurred. It is written P(A|B) and, when P(B) > 0, equals P(A, B) / P(B).
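As a quick illustration, the sketch below computes a conditional probability by counting outcomes. The fair six-sided die, the events "roll is 2" (A) and "roll is even" (B), and the helper name prob are assumptions chosen for this example, not part of the article's use case.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (illustrative assumption).
outcomes = range(1, 7)

def prob(event):
    """Probability of an event (a predicate over outcomes) under a fair die."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def is_two(o):
    return o == 2

def is_even(o):
    return o % 2 == 0

# P(A | B) = P(A and B) / P(B)
p_b = prob(is_even)
p_a_and_b = prob(lambda o: is_two(o) and is_even(o))
p_a_given_b = p_a_and_b / p_b

print(p_a_given_b)  # 1/3
```

Knowing that the roll is even shrinks the sample space to {2, 4, 6}, so the probability of a 2 rises from 1/6 to 1/3.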

Joint Probability

Joint Probability is the probability of two or more events occurring together.

For events x1, x2, x3, ..., xn, the joint probability of all these events occurring together is:

P[x1, x2, ..., xn] = P[x1 | x2, ..., xn] · P[x2, ..., xn]

= P[x1 | x2, ..., xn] · P[x2 | x3, ..., xn] · ... · P[xn-1 | xn] · P[xn]
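The factorization can be checked numerically. The following is a minimal sketch that assumes a made-up joint distribution over three binary variables; the weights and the helper names marginal and chain_rule are hypothetical and serve only to show that the product of conditionals reproduces the joint.

```python
from itertools import product
import math

# Hypothetical joint distribution over three binary variables (x1, x2, x3).
# The weights are made up for illustration and normalised to sum to 1.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
total = sum(weights)
joint = {
    states: w / total
    for states, w in zip(product([0, 1], repeat=3), weights)
}

def marginal(fixed):
    """Sum the joint over every assignment consistent with `fixed`,
    a dict mapping variable index -> value."""
    return sum(
        p for states, p in joint.items()
        if all(states[i] == v for i, v in fixed.items())
    )

def chain_rule(x1, x2, x3):
    """P[x1 | x2, x3] * P[x2 | x3] * P[x3], each term built from marginals."""
    p_x3 = marginal({2: x3})
    p_x2_x3 = marginal({1: x2, 2: x3})
    p_x1_given_x2_x3 = joint[(x1, x2, x3)] / p_x2_x3
    p_x2_given_x3 = p_x2_x3 / p_x3
    return p_x1_given_x2_x3 * p_x2_given_x3 * p_x3

# The product of conditionals should match the joint for every assignment.
for states in joint:
    assert math.isclose(chain_rule(*states), joint[states])
print("chain rule holds for all 8 assignments")
```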

Bayes’ theorem

Bayes’ theorem describes the probability of an event based on another event that has already occurred. It relates the two conditional probabilities P(A|B) and P(B|A).

It states that the conditional probability of an event A, given the occurrence of another event B, is equal to the product of the likelihood of B given A and the probability of A, divided by the probability of B. It is given as:

P(A|B) = P(B|A) · P(A) / P(B)
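Here is a small sketch of Bayes’ theorem in code, P(A|B) = P(B|A) · P(A) / P(B), with P(B) expanded over A and not-A. The function name bayes_posterior and the numbers (a condition with 1% prevalence, a test with a 90% true-positive and 5% false-positive rate) are hypothetical and chosen only for illustration.

```python
def bayes_posterior(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A | B) = P(B | A) * P(A) / P(B),
    where P(B) = P(B | A) * P(A) + P(B | not A) * P(not A)."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Hypothetical numbers: A = "has the condition", B = "test is positive".
posterior = bayes_posterior(p_b_given_a=0.9, p_a=0.01, p_b_given_not_a=0.05)
print(round(posterior, 4))  # 0.1538
```

Even with a fairly accurate test, the posterior stays low here because the prior P(A) is small.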

Directed Acyclic Graph

A Directed Acyclic Graph (DAG) is a directed graph that has no directed cycles. In other words, it is made up of a set of vertices (nodes) and edges (also called arcs), with each edge pointing from one vertex to another in such a way that following those directions never leads back to the starting vertex. Each arc represents a direct conditional dependency between two nodes. Consider, for example, a graph with four nodes A, B, C and D and arcs A → B, B → D and C → D.

A and C are independent of each other, and so are B and C. B is dependent on A. This can be written as the conditional probability P(B|A). D is dependent on both B and C, which can be written as P(D|B,C).

D is conditionally independent of A given B: A influences D only through B, so once B is known, A adds no further information about D. So, P(D|B,A) = P(D|B).
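The acyclicity requirement can be checked in code. The sketch below assumes the arcs A → B, B → D and C → D implied by the dependencies described above, stores the graph as a node-to-parents mapping, and uses the standard-library graphlib module (Python 3.9+). The helper name is_dag is hypothetical.

```python
from graphlib import TopologicalSorter, CycleError

# Each node maps to the set of its parents (the nodes with an arc into it).
# Arcs assumed from the description: A -> B, B -> D, C -> D.
parents = {
    "A": set(),
    "C": set(),
    "B": {"A"},
    "D": {"B", "C"},
}

def is_dag(parent_map):
    """Return True if the graph has no directed cycle."""
    try:
        # Consuming static_order() raises CycleError when a cycle exists.
        list(TopologicalSorter(parent_map).static_order())
        return True
    except CycleError:
        return False

print(is_dag(parents))  # True

# Adding the arc D -> A would close the loop A -> B -> D -> A.
with_cycle = {**parents, "A": {"D"}}
print(is_dag(with_cycle))  # False
```

Storing parents rather than children is a convenient choice for Bayesian networks, since each node's conditional probabilities are defined in terms of its parents.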

Let’s discuss Bayesian networks in detail now!

Bayesian Networks

A Bayesian Network (also called a Bayes network, Bayes net, belief network, or judgment network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies (based on Bayes’ theorem) using a directed acyclic graph (DAG). Bayesian networks are often used to aid decision making. They are well suited to taking an event that occurred and predicting the likelihood that any one of several possible known causes was the contributing factor.

As an example, a Bayesian network can model the probabilistic relationship between weather conditions and wind speed to compute the probability of rain on a particular day.

Bayesian Network Use Case

Jacob has a smartwatch that monitors his health. The watch has an alarm that responds to an abnormal heartbeat. It also responds to a major panic attack. As soon as the alarm receives a signal of an abnormal heartbeat or a panic attack, it calls a nearby hospital and an emergency number. However, a call sometimes fails due to network issues.

We need to find the probability that the alarm senses some abnormality even though there was neither an abnormal heartbeat nor a panic attack, and calls both the hospital and the emergency contact.

The Directed Acyclic Graph for the Bayesian network of the above use case has arcs from Abnormal heartbeat (AH) and Panic attack (PA) to Alarm (A), and from Alarm to Emergency call (E) and Hospital call (H).

In the graph, Abnormal heartbeat and Panic attack are the parent nodes of Alarm. Emergency call and Hospital call each depend only on Alarm.

We need to specify the conditional probabilities for each node in a conditional probability table (CPT).

For Boolean variables, the CPT of a node with k parent nodes has 2^k rows, one for each combination of parent values. Hence, a node with two parents, such as Alarm, has a CPT with 4 conditional probabilities.
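To see where the 2^k count comes from, the sketch below enumerates every True/False combination of a Boolean node's parents. The function name cpt_parent_rows is hypothetical; the parent labels AH and PA follow the use case.

```python
from itertools import product

def cpt_parent_rows(parent_names):
    """List every True/False combination of the parents of a Boolean node."""
    return [
        dict(zip(parent_names, values))
        for values in product([True, False], repeat=len(parent_names))
    ]

# The Alarm node has two parents, so its CPT needs 2**2 = 4 rows.
rows = cpt_parent_rows(["AH", "PA"])
for row in rows:
    print(row)
print(len(rows))  # 4
```

Each row then needs one probability, for example P(A=True) given that combination of parent values; the probability of A=False follows as its complement.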

Let’s assume probabilities of panic attack and abnormal heartbeat as:

P(AH=True) = 0.002, Probability of having abnormal heartbeat

P(AH=False) = 0.998, Probability of no abnormal heartbeat

P(PA=True) = 0.001, Probability of having panic attack

P(PA=False) = 0.999, Probability of no panic attack

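Let’s assume conditional probabilities for Alarm A given its parents, P(A | AH, PA). The entry used in the calculation below is P(A=True | AH=False, PA=False) = 0.01.

Let’s assume conditional probabilities for Emergency Number Call E, P(E | A). The entry used in the calculation below is P(E=True | A=True) = 0.93.

Let’s assume conditional probabilities for Hospital Call H, P(H | A). The entry used in the calculation below is P(H=True | A=True) = 0.79.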

Using the joint distribution, the probability that the alarm senses some abnormality even though there was neither an abnormal heartbeat nor a panic attack, and calls both the hospital and the emergency contact, can be written as:

P[H, E, A, ¬AH, ¬PA] = P[H | E, A, ¬AH, ¬PA] · P[E, A, ¬AH, ¬PA]

= P[H | E, A, ¬AH, ¬PA] · P[E | A, ¬AH, ¬PA] · P[A, ¬AH, ¬PA]

= P[H | A] · P[E | A] · P[A, ¬AH, ¬PA] (since H and E each depend only on A)

= P[H | A] · P[E | A] · P[A | ¬AH, ¬PA] · P[¬AH, ¬PA]

= P[H | A] · P[E | A] · P[A | ¬AH, ¬PA] · P[¬AH | ¬PA] · P[¬PA]

= P[H | A] · P[E | A] · P[A | ¬AH, ¬PA] · P[¬AH] · P[¬PA] (since AH and PA are independent)

= 0.79 × 0.93 × 0.01 × 0.998 × 0.999

≈ 0.007325
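The arithmetic can be reproduced in a few lines of Python. This sketch only multiplies the five factors from the calculation above; the variable names are chosen here for readability.

```python
import math

# Factors taken from the calculation above.
p_h_given_a = 0.79             # P[H | A]
p_e_given_a = 0.93             # P[E | A]
p_a_given_no_ah_no_pa = 0.01   # P[A | ¬AH, ¬PA]
p_no_ah = 0.998                # P[¬AH]
p_no_pa = 0.999                # P[¬PA]

joint = math.prod([
    p_h_given_a,
    p_e_given_a,
    p_a_given_no_ah_no_pa,
    p_no_ah,
    p_no_pa,
])
print(f"{joint:.7f}")  # 0.0073250
```

The probability is small mainly because the alarm rarely fires when neither trigger is present.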

Conclusion

We have discussed how a Bayesian Network works and how a DAG is used to represent conditional relationships between the variables in a Bayesian Network.

Bayesian networks can be used for spam filtering, disease diagnosis, optimized web search, biomonitoring (monitoring the quality of chemical doses) and many other domains where we want to find conditional dependence as well as independence between variables.

Finally, to build a Bayesian Network in Python, we can use libraries such as pybbn, pgmpy, pomegranate, bnlearn, or PyMC.

Thank you for reading!

