
Probabilistic Graphical Models — Introduction

Oct 13, 2017


Probabilistic graphical models (PGMs) are frameworks used to create probabilistic models of complex real-world scenarios and represent them in a compact graphical form. This definition in itself is quite abstract and involves several terms that each need their own space, so let's take these terms one by one.

Model
A model is a declarative representation of a real-world scenario or problem that we want to analyse. Declarative means it is not derived but declared or defined, either by a domain expert using domain knowledge of the problem or by statistical learning algorithms applied to a historical data set, and is then represented using a mathematical tool such as a graph, or even simply by an equation.
For example, in linear regression:

Y = θᵀX

where Y is the outcome we want to predict and X is the feature vector that affects it. So here we assume (model) Y as a linear function of the input X, parameterised by θ (theta).
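As a quick illustration, here is what that declared model looks like in code. This is only a sketch: the parameter values, feature vectors, and the helper name predict are all made up for illustration.

```python
# A minimal sketch of the declared model Y = theta^T X.
# The parameter values and inputs below are invented purely for illustration.
import numpy as np

theta = np.array([0.5, 2.0, -1.0])   # hypothetical parameters: bias, weight for x1, weight for x2

def predict(X):
    """The declared model: Y is a linear function of X, parameterised by theta."""
    X = np.column_stack([np.ones(len(X)), X])   # prepend a constant bias feature
    return X @ theta

X_new = np.array([[1.0, 3.0],                   # two hypothetical feature vectors
                  [2.0, 0.5]])
print(predict(X_new))                           # the model's outputs for each row
```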

Why do we need to model the problem? For two main reasons:
1. It allows us to translate an unstructured real-world problem into a structured mathematical representation.
2. It allows us to separate the problem (the model representation) from its solution (the algorithm). Once we have a mathematical model for our problem we can apply any algorithm to solve it, e.g. the model above can be solved using gradient descent or any other method, and the model itself doesn't change.
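To make the second point concrete, here is a small sketch on synthetic data (assumed purely for illustration) that fits the same linear model with two different algorithms, a closed-form least-squares solve and gradient descent; only the solver changes, the model does not.

```python
# Sketch: one model (Y = theta^T X), two algorithms. The synthetic data below is
# assumed purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])   # bias + 2 features
true_theta = np.array([1.0, 2.0, -0.5])
y = X @ true_theta + rng.normal(scale=0.1, size=100)             # noisy observations

# Algorithm 1: closed-form least squares.
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# Algorithm 2: gradient descent on the same model's squared error.
theta_gd = np.zeros(3)
for _ in range(5000):
    grad = X.T @ (X @ theta_gd - y) / len(y)   # gradient of the mean squared error
    theta_gd -= 0.1 * grad

print(theta_ls)   # both solvers recover roughly the same parameters,
print(theta_gd)   # because the underlying model never changed
```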


Probabilistic
The problems we are generally interested in solving, and the kinds of queries we want to make, are probabilistic in nature because of uncertainty. Many factors contribute to this uncertainty: for example, incomplete knowledge of the problem, noisy observations, or attributes that affect the outcome but can't be (or haven't been) included in the model.

Graphical
In the example above we used a purely mathematical representation of the model, but real-world scenarios are complex and often involve a large number of variables. A graphical representation often helps us visualise the model better, and we can then use graph theory to reduce the number of relevant combinations of the participating variables, so that the high-dimensional probability distribution is represented more compactly.

There are some basic terms and concepts from probability and statistics that I will be using to discuss PGMs. Those are covered in a separate post, Basic Probability Theory and Statistics.

A model is generally useful if it helps us understand real-world phenomena and allows us to make useful predictions about how the world will behave under a certain set of conditions. The queries we often want to run on our models are of the form: predict what is likely to happen given that something else has already happened or been observed. To answer such queries we need to consider all the components (random variables) of that state of the world and represent them somehow. We have already learned about the joint distribution, which represents all possible outcomes of a set of events. So we do have a tool that can represent our model and cover all possible scenarios: the joint probability distribution table (for the time being, restricting ourselves to discrete random variables only). Then why are we discussing PGMs?
The problem is that even for a simple model the number of variables n can be in the hundreds, and if each variable can take on average d values, the size of the joint distribution table grows as d^n, which quickly becomes enormous. So tables are not a feasible way to represent complex models; even though they are conceptually sufficient, computational limitations don't allow them. We need some other representation, and graphs, because of their sparse nature, are a good fit. There are a number of advantages to using graphs to represent such probabilistic models.
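A quick back-of-the-envelope calculation shows how fast d^n blows up; the choice of d = 3 values per variable below is just an assumption for illustration.

```python
# How many entries a full joint distribution table needs: d**n, for n variables
# of d values each. d = 3 is an arbitrary choice for illustration.
d = 3
for n in (5, 10, 20, 50, 100):
    print(f"n = {n:3d} variables -> {d**n:.3e} table entries")
```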

  1. Even though a large number of random variables may be involved, they do not necessarily all depend on each other. If two random variables are independent, we don't need to consider all combinations of their values in the joint distribution table, which means far fewer rows in the table (see the sketch after this list). Graph theory has an inherent capability to represent such dependences and independences.
  2. A graph representation is flexible in the sense that one doesn't need to obtain all the knowledge about the world to build a model. One can start with one's current understanding of the world and build a model, and it will behave based on the knowledge it has. As one acquires more knowledge it can be applied incrementally to the model (adding or updating nodes and edges), and the model will produce improved results based on the new information.
  3. As graphs are a standard mathematical structure, they not only allow us to encode a probability distribution but also provide a very clear interface for querying the model with prediction queries.
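Here is a tiny sketch of the first point. The probabilities are invented for illustration; the idea is only that when two variables are independent we can store their marginals separately and multiply, instead of storing the full joint table.

```python
# Sketch (made-up numbers): if A and B are independent, we store P(A) and P(B)
# separately and recover any joint entry by multiplying them.
p_a = {0: 0.3, 1: 0.7}     # hypothetical marginal distribution of A
p_b = {0: 0.6, 1: 0.4}     # hypothetical marginal distribution of B

def joint(a, b):
    """P(A=a, B=b) under the independence encoded by the missing edge A-B."""
    return p_a[a] * p_b[b]

print(joint(1, 0))   # 0.7 * 0.6 = 0.42
# Storage here: 2 + 2 numbers instead of a 4-row joint table. With n mutually
# independent variables of d values each, that is n*d numbers instead of d**n rows.
```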

As PGMs are based on graph theory, they fall into two categories: those built on directed graphs and those built on undirected graphs. The family of PGMs that uses directed graphs is Bayesian networks, and the family that uses undirected graphs is Markov networks. Even though both are derived from graph theory, there are many differences in their intuitions, and each deserves its own space, so I'll discuss them in separate posts.
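As a small preview of the directed case, here is a hand-rolled two-node Bayesian network A -> B. All probability values are invented for illustration, and in practice a dedicated library would handle the bookkeeping; the point is only that the directed graph encodes the factorisation P(A, B) = P(A) * P(B | A).

```python
# A minimal, hand-rolled sketch of a directed PGM (Bayesian network) with two
# nodes, A -> B. All probability values below are invented for illustration.
p_a = {0: 0.6, 1: 0.4}                      # P(A)
p_b_given_a = {0: {0: 0.9, 1: 0.1},         # P(B | A = 0)
               1: {0: 0.2, 1: 0.8}}         # P(B | A = 1)

# The edge A -> B encodes the factorisation P(A, B) = P(A) * P(B | A).
def joint(a, b):
    return p_a[a] * p_b_given_a[a][b]

# A prediction query: P(B = 1), obtained by summing out A.
p_b1 = sum(joint(a, 1) for a in p_a)
print(p_b1)   # 0.6 * 0.1 + 0.4 * 0.8 = 0.38
```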
