# Detecting fake banknotes using TensorFlow

**Material sourced from:** Python for Data Science and Machine Learning Bootcamp

TensorFlow is an open source library built by Google, widely used in the field of machine learning and deep learning. The library is popular for its use of data-flow graphs to carry out numeric computations.

Today, let’s use TensorFlow to build an artificial neural network that detects fake banknotes.

**The dataset**

Our dataset is a CSV file that contains information extracted from (wavelet transformed) images of banknotes. There are 1,372 banknotes, each with the following attributes:

- **Image.Var** (variance of the wavelet-transformed image, WTI)
- **Image.Skew** (skewness of the WTI)
- **Image.Curt** (kurtosis of the WTI)
- **Entropy** (entropy of the image)
- **Class** (whether or not the banknote is authentic)

Let’s see how we can explore this data using Pandas and Seaborn, and make predictions from it using TensorFlow.

**Importing the dataset**

Firstly, let’s import the necessary Python libraries.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    import tensorflow as tf

    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report, confusion_matrix

Next, let’s import the banknotes CSV file and store it in a Pandas dataframe called `bank_notes`. We can get the dimensions of the dataset using `.shape`.

    bank_notes = pd.read_csv('bank_note_data.csv')
    bank_notes.shape

Output:

    (1372, 5)

We can also use the `.head()`, `.info()`, and `.describe()` methods to learn more about our data.

    bank_notes.head()
    bank_notes.info()
    bank_notes.describe()

**Exploring the dataset**

Now that we’ve imported our data, let’s plot some graphs to see how our data is distributed. First, we can use Seaborn’s Countplot to see how many fake and real banknotes there are in the dataset.

    sns.countplot(x='Class', data=bank_notes)

It seems like we have a lot more fake banknotes in our dataset.

Next, let’s try to find relationships between the other attributes in our dataset (in relation to our target class). We can use a Pairplot from Seaborn, with the hue set to the `Class` attribute. This way we can easily see how the relationships differ between real and fake banknotes.

    sns.pairplot(data=bank_notes, hue='Class')

**Data preparation**

When using neural networks and other deep learning-based systems, it’s usually a good idea to standardise our data. We don’t need to standardise the `Class` attribute, so let’s create a separate dataframe to store the other features.

    bank_notes_without_class = bank_notes.drop('Class', axis=1)

Next, let’s fit a StandardScaler object from the Scikit-learn library on the independent variables and store the transformed data in a new dataframe called `scaled_features`.

    scaler = StandardScaler()
    scaler.fit(bank_notes_without_class)

    scaled_features = pd.DataFrame(
        data=scaler.transform(bank_notes_without_class),
        columns=bank_notes_without_class.columns,
    )

We can take a look at the scaled features by calling `.head()` on `scaled_features`.
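
To make the scaling step concrete, here is a minimal NumPy sketch of what standardisation does to each column: subtract the column mean, then divide by the column standard deviation. The toy array and the helper name `standardise` are made up for illustration, and the sketch assumes the population standard deviation (`ddof=0`), which is what `StandardScaler` is documented to use. It does not handle constant columns, where the standard deviation is zero.

```python
import numpy as np

def standardise(features):
    """Scale each column to zero mean and unit variance (population std)."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)  # NumPy's default is ddof=0
    return (features - mean) / std

# Hypothetical toy data: 3 rows, 2 features on very different scales
toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
scaled = standardise(toy)

print(scaled.mean(axis=0))  # each column mean is (numerically) zero
print(scaled.std(axis=0))   # each column std is (numerically) one
```

After scaling, both columns live on a comparable scale, so no single feature dominates the weighted sums inside the network.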

Also, since we have 2 classes (authentic and forged) for our dependent variable, we can separate these into two different columns. Let’s rename `Class` to `Authentic`, and create a new `Forged` column.

    # Rename 'Class' to 'Authentic'
    bank_notes = bank_notes.rename(columns={'Class': 'Authentic'})

    # Add a 'Forged' column as the complement of 'Authentic'
    bank_notes.loc[bank_notes['Authentic'] == 0, 'Forged'] = 1
    bank_notes.loc[bank_notes['Authentic'] == 1, 'Forged'] = 0
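
The two columns together form a one-hot encoding of the label: exactly one of `Authentic` and `Forged` is 1 in every row, which suits a network with one output node per class. The sketch below shows the same idea on a hypothetical toy label vector, using plain NumPy rather than the dataframe above.

```python
import numpy as np

# Hypothetical toy labels: 1 = authentic, 0 = forged
authentic = np.array([1, 0, 0, 1])
forged = 1 - authentic  # complement of the 'Authentic' column

# One row per banknote, one column per class: [Authentic, Forged]
targets = np.column_stack([authentic, forged])
print(targets)
```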

**Independent and dependent variables**

Our `X` will be the scaled features, and our `y` will be both the `Authentic` and `Forged` attributes. Since NumPy arrays are compatible with TensorFlow, we can convert `X` and `y` into NumPy arrays using `.to_numpy()` (on Pandas versions before 0.24, `.values` or the since-removed `.as_matrix()` does the same job).

    # X and y
    X = scaled_features
    y = bank_notes[['Authentic', 'Forged']]

    # Convert X and y to NumPy arrays
    X = X.to_numpy()
    y = y.to_numpy()

**Training data and test data**

Now that we have our independent and dependent variables, let’s use Scikit-learn’s train_test_split to split our data into a training and a test set. We will use **20%** of the original dataset for testing.

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Let’s also print out the shapes of `X_train`, `X_test`, `y_train`, and `y_test`. It will help us when defining parameters for our neural network.

    TRAINING SET SHAPES
    X_train : (1097, 4)
    y_train : (1097, 2)

    TEST SET SHAPES
    X_test : (275, 4)
    y_test : (275, 2)
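
As a quick sanity check on these shapes, the arithmetic behind a 20% split of 1,372 rows can be reproduced by hand. The sketch assumes the test share is rounded up to a whole number of rows, which is consistent with the shapes printed above.

```python
import math

n_total = 1372   # rows in the dataset
test_size = 0.2  # share held out for testing

# Assumption: the test share is rounded up to a whole row
n_test = math.ceil(test_size * n_total)
n_train = n_total - n_test

print(n_train, n_test)  # 1097 275
```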

**Parameters**

Before setting up our neural network, it is important to define the parameters for our model. We may want to adjust these a bit later on, depending on how our model performs. Let’s first set the learning rate, the number of training epochs, and the batch size.

The **learning rate** of the model is typically a small value between 0 and 1 that scales the size of each weight update. It can be thought of as a measure of how quickly our model abandons old beliefs for new ones. A high rate means that the network changes its mind more quickly, and a lower rate means that it is reluctant to change. Here we will choose a learning rate of 0.01.
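
The effect of the learning rate is easiest to see on a toy problem. The sketch below minimises f(w) = w² with plain gradient descent at two different rates. This is only an illustration: the function, the helper name `gradient_descent`, and the rates are assumptions, and the network in this article is trained with the Adam optimiser rather than plain gradient descent.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimise f(w) = w**2 with plain gradient descent and return the final w."""
    w = start
    for _ in range(steps):
        grad = 2 * w                 # derivative of w**2
        w -= learning_rate * grad    # step against the gradient
    return w

slow = gradient_descent(learning_rate=0.01)  # small, cautious steps
fast = gradient_descent(learning_rate=0.4)   # large steps

print(slow, fast)
```

After the same number of steps, the higher rate ends up much closer to the minimum at 0. On harder problems a rate that is too high can overshoot instead, which is why the value usually needs some tuning.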

**One epoch means one pass over the training set.** We want our model to go through the training set more than once to improve accuracy. However, a very high number of epochs increases the risk of overfitting, which reduces the performance of our neural net on unseen data. Let’s set the number of training epochs to 100.

Finally, we can set the **batch size** to 100. We will be using batch learning, and a batch size of 100 means that we will update our weights using back-propagation after every 100 training examples.

    learning_rate = 0.01
    training_epochs = 100
    batch_size = 100
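
To see what these settings imply, here is a short sketch of the batch arithmetic for the 1,097 training rows reported above. It assumes the incomplete final batch is dropped, as the `int(n_samples / batch_size)` expression in the training loop later in this article does.

```python
n_samples = 1097       # training rows, from the shapes printed earlier
batch_size = 100
training_epochs = 100

# Assumption: only full batches are used, the remainder is skipped
batches_per_epoch = n_samples // batch_size
leftover = n_samples - batches_per_epoch * batch_size
total_updates = batches_per_epoch * training_epochs

# (start, stop) row indices of each batch within one epoch
batch_slices = [(b * batch_size, (b + 1) * batch_size)
                for b in range(batches_per_epoch)]

print(batches_per_epoch, leftover, total_updates)  # 10 97 1000
print(batch_slices[0], batch_slices[-1])           # (0, 100) (900, 1000)
```

So each epoch performs 10 weight updates, and the last 97 rows of the training set are never used unless the data is reshuffled or the final partial batch is included.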

It is also important to set the parameters for our network, and not just for the training. This includes the number of nodes for each layer in our model (namely the input layer, the hidden layer(s), and the output layer).

    n_hidden_1 = 4                # number of nodes in first hidden layer
    n_hidden_2 = 4                # number of nodes in second hidden layer
    n_input = 4                   # input shape
    n_classes = 2                 # total classes (authentic / forged)
    n_samples = X_train.shape[0]  # number of samples
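
These layer sizes make for a very small network. As a rough illustration, the number of trainable values (weights plus biases) can be counted directly from the sizes; the list `layer_sizes` is just a convenience name for this sketch.

```python
# Layer sizes: input, hidden 1, hidden 2, output
layer_sizes = [4, 4, 4, 2]

# Each layer has (inputs x outputs) weights plus one bias per output node
n_params = sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(n_params)  # 20 + 20 + 10 = 50
```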

**TensorFlow graph input**

Now that we’ve defined our parameters, let’s define the inputs we will feed into our TensorFlow graph. `x` and `y` can be defined as matrix (or tensor) placeholders. Placeholders, like the sessions used later on, belong to the TensorFlow 1.x API.

    x = tf.placeholder(tf.float32, [None, n_input])
    y = tf.placeholder(tf.float32, [None, n_classes])

Note: `None` means that the first dimension can be of any size.

**Weights and biases**

Next, we need to define the weights and biases for each layer in our network. We will create dictionaries of weights and biases using the parameters we’ve already defined.

    weights = {
        'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
        'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
        'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
    }

    biases = {
        'b1': tf.Variable(tf.random_normal([n_hidden_1])),
        'b2': tf.Variable(tf.random_normal([n_hidden_2])),
        'out': tf.Variable(tf.random_normal([n_classes]))
    }

Our network will have **3 layers** (2 hidden layers and an output layer, excluding the input layer).

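We can set the predictions to be a tensor called `preds`, which will contain the output from our neural network. Note that `multilayer_perceptron` is the function defined in the “Constructing our neural network” section below, so that definition has to be run before this line.

    preds = multilayer_perceptron(x, weights, biases)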

**Cost and optimisation**

Let’s use a softmax cross-entropy function for calculating the loss, and the Adam optimiser to minimise the cost.

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=preds))
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
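
For intuition about what the loss measures, here is a minimal NumPy sketch of a softmax cross-entropy averaged over a batch. It is an illustration of the formula, not TensorFlow’s implementation; the function name and the toy logits are assumptions.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between one-hot labels and softmax(logits)."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Subtract the row maximum for numerical stability (softmax is unchanged)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())

# The true class is the first column in all three toy cases
confident = softmax_cross_entropy([[5.0, -5.0]], [[1.0, 0.0]])
unsure = softmax_cross_entropy([[0.0, 0.0]], [[1.0, 0.0]])
wrong = softmax_cross_entropy([[-5.0, 5.0]], [[1.0, 0.0]])

print(confident, unsure, wrong)
```

The loss is close to zero for a confident correct prediction, equals ln 2 when the two classes are judged equally likely, and grows large for a confident wrong prediction.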

**Constructing our neural network**

We’re finally ready to set up our neural network! We will create a function that accepts the input `x`, a dictionary of weights, and a dictionary of biases. Let’s use the ReLU activation function for each hidden layer; the output layer stays linear, because the softmax is applied inside the loss function.

    def multilayer_perceptron(x, weights, biases):
        '''
        x: Placeholder for data input
        weights: Dictionary of weights
        biases: Dictionary of biases
        '''
        # First hidden layer with ReLU activation
        layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
        layer_1 = tf.nn.relu(layer_1)

        # Second hidden layer with ReLU activation
        layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
        layer_2 = tf.nn.relu(layer_2)

        # Output layer with linear activation (the bias is added after the matrix product)
        out_layer = tf.add(tf.matmul(layer_2, weights['out']), biases['out'])

        return out_layer
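
The same computation can be sketched in plain NumPy, which makes it easy to check the shapes without building a graph. This is an illustrative stand-in, not the TensorFlow function: the names `relu` and `forward`, the random seed, and the toy batch are assumptions, and each layer is taken to be a matrix product followed by a bias addition.

```python
import numpy as np

def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def forward(x, weights, biases):
    """Two ReLU hidden layers followed by a linear output layer."""
    layer_1 = relu(x @ weights['h1'] + biases['b1'])
    layer_2 = relu(layer_1 @ weights['h2'] + biases['b2'])
    return layer_2 @ weights['out'] + biases['out']

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
weights = {
    'h1': rng.normal(size=(4, 4)),
    'h2': rng.normal(size=(4, 4)),
    'out': rng.normal(size=(4, 2)),
}
biases = {
    'b1': rng.normal(size=4),
    'b2': rng.normal(size=4),
    'out': rng.normal(size=2),
}

batch = rng.normal(size=(3, 4))  # 3 hypothetical banknotes, 4 features each
logits = forward(batch, weights, biases)
print(logits.shape)  # (3, 2): one score per class for each banknote
```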

We have finally set up the data flow graph. It is now time to train our model!

**Training the network**

In TensorFlow, graphs aren’t executed unless a Session is created and run. The session allocates resources for the graph, and holds the actual values of intermediate results and variables.

Let’s have two loops:

- The **outer loop** runs the **epochs**, and
- The **inner loop** runs the **batches** for each epoch.

After each epoch, we can print out the cost and append it to a list of costs. That way we can plot a line graph after training to visualise how our cost has been minimised.

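    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())

    costs = []

    for epoch in range(training_epochs):
        avg_cost = 0.0
        total_batch = int(n_samples / batch_size)

        for batch in range(total_batch):
            batch_x = X_train[batch * batch_size : (1 + batch) * batch_size]
            batch_y = y_train[batch * batch_size : (1 + batch) * batch_size]

            # Run one optimisation step and fetch the cost for this batch
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})
            avg_cost += c / total_batch

        print("Epoch: {} cost={:.4f}".format(epoch + 1, avg_cost))
        costs.append(avg_cost)

    print("Model has completed {} epochs of training.".format(training_epochs))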

Here’s the output:

    Epoch: 1 cost=1.0476
    Epoch: 2 cost=0.6022
    Epoch: 3 cost=0.4642
    ...
    Epoch: 98 cost=0.0009
    Epoch: 99 cost=0.0008
    Epoch: 100 cost=0.0008
    Model has completed 100 epochs of training.

Below is a graph of the cost over time, created using the list of costs.

**Model evaluation**

Our model has now been trained! To see how well it performs, let’s count the number of correct predictions on the test set. We can then define the accuracy as the proportion of correct predictions.

    correct_predictions = tf.cast(tf.equal(tf.argmax(preds, 1), tf.argmax(y, 1)), tf.float32)

To get the accuracy, we have to use the `.eval()` method and pass in a dictionary for the placeholders `x` and `y`.

    accuracy = tf.reduce_mean(correct_predictions)
    print(accuracy.eval(feed_dict={x: X_test, y: y_test}))

Output:

    1.0
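
The accuracy calculation boils down to comparing the position of the largest output with the position of the 1 in the one-hot label. Here is a minimal NumPy sketch of that logic on hypothetical toy values (the function name and the numbers are made up for illustration):

```python
import numpy as np

def accuracy(logits, one_hot_labels):
    """Proportion of rows where the highest-scoring class matches the label."""
    predicted = np.argmax(logits, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))

# Toy network outputs for 4 banknotes, columns = [Authentic, Forged]
toy_logits = np.array([[2.0, -1.0], [0.3, 0.9], [1.5, 1.4], [-0.2, 0.1]])
toy_labels = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])

print(accuracy(toy_logits, toy_labels))  # 0.75 (the third row is misclassified)
```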

**Wow, it looks like our model has achieved 100% accuracy on the test set!** Maybe the dataset was a little easy for our model to classify. Although we performed well, it’s important to take a step back and think about what may have caused an accuracy this high.

**Comparing models**

Since our neural network was pretty much spot on with its predictions, it’s important that we compare it with another model for a reality check.

We will use a random forest classifier. Let’s train it on the same training set, and store its predictions for the test set in a separate array called `preds_rfc`.

    rfc = RandomForestClassifier(n_estimators=10)
    rfc.fit(X_train, y_train)

    preds_rfc = rfc.predict(X_test)

Next, let’s evaluate our predictions using a classification report and a confusion matrix.

    print(classification_report(y_test, preds_rfc))

    # Get only the 'Forged' column values from y_test and preds_rfc
    y_test_forged = [item[1] for item in y_test]
    preds_rfc_forged = [item[1] for item in preds_rfc]

    # Print confusion matrix
    print(confusion_matrix(y_test_forged, preds_rfc_forged))

Output:

    [[125   2]
     [  1 147]]

The random forest classifier misclassified only 3 of the 275 test banknotes, giving it an accuracy of about 99% (roughly one percentage point lower than our neural net), so it seems likely that our dataset was simply easy to classify.
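
The accuracy figure can be checked directly from the confusion matrix printed above: the diagonal holds the correct predictions, and the sum of all cells is the size of the test set.

```python
# Confusion matrix from the random forest output above
confusion = [[125, 2],
             [1, 147]]

correct = confusion[0][0] + confusion[1][1]  # diagonal entries
total = sum(sum(row) for row in confusion)   # all test banknotes
accuracy = correct / total

print(correct, total)      # 272 275
print(round(accuracy, 4))  # 0.9891
```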

**References**

The primary sources I used are listed below:

**Dataset obtained from:** UCI Banknote Authentication