Deep Belief Networks — An Introduction

Himanshu Singh
Jul 30, 2018

In this article we will look at what DBNs are, what their components are, and a small application in Python that solves the handwriting recognition problem (MNIST dataset).

Before understanding what a DBN is, we will first look at RBMs, Restricted Boltzmann Machines.

Restricted Boltzmann Machines

If you know what factor analysis is, RBMs can be considered a binary version of it. Instead of many continuous factors deciding the output, the factors are binary variables that take the value 0 or 1.

For example: suppose you read a book and then judge it on a two-point scale, that is, either you like the book or you do not. In this kind of scenario we can use RBMs, which help us determine the reasons behind those choices.

RBMs take a probabilistic approach to neural networks, and hence they are also called stochastic neural networks.

If we decompose RBMs, they have three parts:

  1. One Input Layer aka Visible Unit
  2. One Hidden Layer aka Hidden Unit
  3. One Bias Unit

In the example that I gave above, the visible units are nothing but whether you like the book or not. The hidden units help to find what makes you like that particular book. The bias is added to account for the different kinds of properties that different books have.

Let us visualize the RBMs:

Figure: an RBM. Red is the visible unit, blue is the hidden unit.

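Let us look at the steps that an RBM takes to learn the decision-making process:

  1. Compute the activation energy of a unit (the weighted sum of the units connected to it, plus its bias)
  2. Calculate the sigmoid of the activation energy
  3. This gives us a probability. The unit is turned on with this probability and turned off otherwise. Hidden units are updated from the visible units in this way, and visible units are updated from the hidden units.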
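
The three steps above can be sketched for a single hidden unit in plain Python. This is only an illustration, not code from the article: the weights, bias, and visible values are made-up numbers, and the helper names (`activation_energy`, `sigmoid`, `sample_unit`) are hypothetical.

```python
import math
import random


def activation_energy(visible, weights, bias):
    """Weighted sum of the visible states plus the hidden unit's bias."""
    return bias + sum(v * w for v, w in zip(visible, weights))


def sigmoid(x):
    """Squash an activation energy into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_unit(probability, rng):
    """Turn the unit on (1) with the given probability, otherwise off (0)."""
    return 1 if rng.random() < probability else 0


# Made-up example: three visible units feeding one hidden unit.
visible = [1, 0, 1]
weights = [0.5, -0.3, 0.8]  # illustrative values, not learned
bias = -0.2

energy = activation_energy(visible, weights, bias)   # step 1
probability = sigmoid(energy)                        # step 2
state = sample_unit(probability, random.Random(0))   # step 3
print(energy, probability, state)
```

The same computation also runs in the other direction, from hidden states back to the visible units, which is how the network reconstructs its input.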

Now that we have a basic idea of Restricted Boltzmann Machines, let us move on to Deep Belief Networks.

Deep Belief Networks

DBNs have two phases:

  1. Pre-train Phase
  2. Fine-tune Phase

The pre-train phase is nothing but multiple stacked layers of RBMs, while the fine-tune phase is a feed-forward neural network. Let us visualize both phases:

credit: Codeburst

How do DBNs work?

  1. Find the features of the visible units using the Contrastive Divergence algorithm
  2. Find the hidden unit features, that is, the features of the features found in the step above
  3. When the hidden layer learning phase is over, we call the result a trained DBN
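
The steps above can be sketched as greedy layer-wise pre-training with one-step Contrastive Divergence (CD-1). This is a toy illustration in plain Python, not the implementation used by the dbn library: the class `TinyRBM`, the function `pretrain`, the learning rate, the epoch count, and the toy data are all assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """Hypothetical minimal RBM, trained one sample at a time with CD-1."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(b + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j, b in enumerate(self.b_h)]

    def visible_probs(self, h):
        return [sigmoid(b + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i, b in enumerate(self.b_v)]

    def sample(self, probs):
        return [1 if self.rng.random() < p else 0 for p in probs]

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden probabilities driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = self.sample(ph0)
        # Negative phase: one Gibbs step produces a reconstruction.
        v1 = self.sample(self.visible_probs(h0))
        ph1 = self.hidden_probs(v1)
        # Move toward the data statistics, away from the reconstruction's.
        for i in range(len(v0)):
            for j in range(len(ph0)):
                self.W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        for i in range(len(v0)):
            self.b_v[i] += lr * (v0[i] - v1[i])
        for j in range(len(ph0)):
            self.b_h[j] += lr * (ph0[j] - ph1[j])

    def fit(self, data, epochs=20):
        for _ in range(epochs):
            for v in data:
                self.cd1_step(v)


def pretrain(data, layer_sizes, rng):
    """Greedy layer-wise pre-training: each RBM learns from the layer below."""
    rbms, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = TinyRBM(len(layer_input[0]), n_hidden, rng)
        rbm.fit(layer_input)
        rbms.append(rbm)
        # Hidden activations become the "visible" data for the next RBM.
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return rbms, layer_input


# Made-up binary toy data with two repeating patterns.
toy_data = [[1, 1, 0, 0], [0, 0, 1, 1]] * 5
rbms, features = pretrain(toy_data, layer_sizes=[3, 2], rng=random.Random(0))
print(len(rbms), len(features), len(features[0]))  # 2 10 2
```

After pre-training, the learned weights would initialize a feed-forward network that is then fine-tuned with labels; that step is left out of this sketch.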

Practical Application on MNIST Dataset

Step 1 is to load the required libraries. dbn.tensorflow comes from a GitHub repository: you have to clone the repository and paste its dbn folder into the folder where your code file is present. Link to code repository is here.

from dbn.tensorflow import SupervisedDBNClassification
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

Step 2 is to read the CSV file, which you can download from Kaggle.

digits = pd.read_csv("train.csv")

Step 3, let’s define our independent variables, which are nothing but the pixel values, and store them in NumPy array format in the variable X. We’ll store the target variable, which is the actual digit, in the variable Y.

X = np.array(digits.drop(["label"], axis=1))

Y = np.array(digits["label"])

Step 4, let us use the StandardScaler class from sklearn's preprocessing module. It rescales each feature (here, each pixel column) to zero mean and unit variance; it does not change the shape of the distribution.

from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
X = ss.fit_transform(X)
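
To make the effect of this scaling concrete, the sketch below applies the formula z = (x - mean) / std by hand to one made-up column of pixel values. It uses only the standard library; the helper name `standardize` is hypothetical, and treating a zero standard deviation as 1 is an assumption made here to avoid dividing by zero on constant columns.

```python
import statistics


def standardize(column):
    """Rescale one feature column to zero mean and unit variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population standard deviation
    if std == 0:
        std = 1.0  # assumption: leave constant columns unscaled
    return [(x - mean) / std for x in column]


pixels = [0, 0, 128, 255, 255]  # made-up intensities for one pixel position
scaled = standardize(pixels)
print([round(z, 3) for z in scaled])
```

The scaled column has mean 0 and standard deviation 1, but its values keep the same ordering and relative spacing as the original intensities.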

Step 5, now that we have standardized the data, we can split it into train and test sets:

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)

Step 6, now we will initialize our supervised DBN classifier, which we will train on the data.

classifier = SupervisedDBNClassification(hidden_layers_structure=[256, 256],
                                         learning_rate_rbm=0.05,
                                         learning_rate=0.1,
                                         n_epochs_rbm=10,
                                         n_iter_backprop=100,
                                         batch_size=32,
                                         activation_function='relu',
                                         dropout_p=0.2)

Step 7, now we come to the training part, where we use the fit function to train the classifier:

classifier.fit(X_train, Y_train)

It may take from 10 minutes to one hour to train on the dataset. Once the training is done, we have to check for the accuracy:

Y_pred = classifier.predict(X_test)
print('Done.\nAccuracy: %f' % accuracy_score(Y_test, Y_pred))
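
For reference, accuracy here means the fraction of predictions that match the true labels. A hand-rolled equivalent on made-up labels might look like this (the function name `accuracy` is hypothetical and is not part of sklearn):

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up digits: four of the five predictions are right.
print(accuracy([3, 8, 1, 0, 7], [3, 8, 1, 6, 7]))  # 0.8
```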

The output that I got was:

Final Accuracy

So, in this article we saw a brief introduction to DBNs and RBMs, and then we looked at the code for a practical application. Hope it was helpful!

Himanshu Singh

ML Consultant, Researcher, Founder, Author, Trainer, Speaker, Story-teller Connect with me on LinkedIn: https://www.linkedin.com/in/himanshu-singh-2264a350/