A Simple Neural Network Classifier using PyTorch, from Scratch

Jeril Kuriakose
Published in Analytics Vidhya
4 min read · Jan 31, 2022
Neural network
Source: https://i0.wp.com/msalimali.com/wp-content/uploads/2019/06/connected-artificial-neural-network-nodes-msalimali.jpg?ssl=1

In this article we will build a simple neural network classifier using PyTorch. We will cover the following steps:

  • Step 1: Generate and split the data
  • Step 2: Processing generated data
  • Step 3: Build neural network classifier from scratch
  • Step 4: Training the neural network classifier
  • Step 5: Saving the trained model
  • Step 6: Loading the saved model
  • Step 7: Testing the trained model

Dependencies

  • PyTorch v1.10.0
  • Scikit-learn v1.0.2
  • NumPy v1.19.5

Step 1: Generate and split the data

Let's generate our classification dataset using Scikit-learn:

from sklearn.datasets import make_classification

X, Y = make_classification(
    n_samples=100, n_features=4, n_redundant=0,
    n_informative=3, n_clusters_per_class=2, n_classes=3
)

We generate only 100 samples here; this can be increased by changing the n_samples parameter.
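
Because make_classification draws random data, each run produces a different dataset. A minimal sketch, assuming we want repeatable results, passes a fixed random_state (the seed 0 is an arbitrary choice for illustration, not part of the original code) and checks the shapes and class balance:

```python
import numpy as np
from sklearn.datasets import make_classification

# random_state=0 is an arbitrary seed chosen for illustration
X, Y = make_classification(
    n_samples=100, n_features=4, n_redundant=0,
    n_informative=3, n_clusters_per_class=2, n_classes=3,
    random_state=0,
)

print(X.shape, X.dtype)  # features: one row per sample
print(Y.shape, Y.dtype)  # labels: one integer class id per sample
print(np.bincount(Y))    # number of samples in each class
```

With the seed fixed, the printed tensors and accuracy in the later steps stay the same from run to run, which makes the code easier to debug.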

Next, let's split the data into training and testing sets. 33% of the data is used for testing.

from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.33, random_state=42)
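
With 100 samples, test_size=0.33 yields 33 test samples and 67 training samples. On a dataset this small, a purely random split can leave the classes unevenly represented. As a sketch of one option (the stratify argument and the seed passed to make_classification are not used in the original code), train_test_split can keep the class proportions similar in both splits:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Same dataset settings as the article; random_state=0 is an assumed seed
X, Y = make_classification(
    n_samples=100, n_features=4, n_redundant=0,
    n_informative=3, n_clusters_per_class=2, n_classes=3,
    random_state=0,
)

# stratify=Y asks for similar class proportions in train and test
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.33, random_state=42, stratify=Y)

print(X_train.shape, X_test.shape)
```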

Step 2: Processing generated data

Once we have the training and testing datasets, we wrap the data using PyTorch's Dataset and DataLoader. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset to enable easy access to the samples.

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class Data(Dataset):
    def __init__(self, X_train, y_train):
        # convert the float64 features to float32 to match the dtype of
        # the model weights; otherwise we get the following error:
        # RuntimeError: expected scalar type Double but found Float
        self.X = torch.from_numpy(X_train.astype(np.float32))
        # the labels must be Long (int64) for CrossEntropyLoss;
        # otherwise we get the following error:
        # RuntimeError: expected scalar type Long but found Float
        self.y = torch.from_numpy(y_train).type(torch.LongTensor)
        self.len = self.X.shape[0]

    def __getitem__(self, index):
        return self.X[index], self.y[index]

    def __len__(self):
        return self.len

We created a class that inherits from torch.utils.data.Dataset. The training data is then passed in as follows:

traindata = Data(X_train, Y_train)

Now the training data can be accessed by index:

print(traindata[25])
'''
# Output
(tensor([-0.9528, 1.6890, -0.6810, 0.7165]), tensor(1))
'''

We can also slice the training data as follows:

print(traindata[25:34])
'''
# Output
(tensor([[-0.9528, 1.6890, -0.6810, 0.7165],
[ 0.6994, 4.5166, 0.5078, -2.0575],
[ 0.8508, 1.6109, 0.3014, 0.9455],
[ 1.1293, -0.8988, 1.6426, -0.0171],
[-0.2316, 1.9337, -0.9727, -0.1864],
[-1.0156, 1.1438, -0.0883, 0.6976],
[ 1.2509, -1.6992, 1.8562, -1.7159],
[ 1.1714, 0.9062, -1.5627, -0.5184],
[-1.1780, -2.7274, -1.0570, 1.9610]]),
tensor([1, 2, 0, 0, 1, 1, 0, 0, 1]))
'''

Next, we load the training data using the DataLoader, with batch_size set to 4.

batch_size = 4
trainloader = DataLoader(traindata, batch_size=batch_size,
shuffle=True, num_workers=2)
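
To see what the DataLoader actually yields, the sketch below pulls one mini-batch and prints its shapes. It uses random stand-in arrays (X_demo and y_demo are hypothetical names) with the same shape as the 67-sample training split, and num_workers=0 so that everything runs in the main process:

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class Data(Dataset):
    def __init__(self, X, y):
        self.X = torch.from_numpy(X.astype(np.float32))
        self.y = torch.from_numpy(y).type(torch.LongTensor)
        self.len = self.X.shape[0]

    def __getitem__(self, index):
        return self.X[index], self.y[index]

    def __len__(self):
        return self.len

# Stand-in data shaped like the training split (67 samples, 4 features)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(67, 4))
y_demo = rng.integers(0, 3, size=67)

# num_workers=0 loads batches in the main process
loader = DataLoader(Data(X_demo, y_demo), batch_size=4,
                    shuffle=True, num_workers=0)

print(len(loader))  # mini-batches per epoch: ceil(67 / 4) = 17
inputs, labels = next(iter(loader))
print(inputs.shape, inputs.dtype)  # a batch of 4 samples with 4 features
print(labels.shape, labels.dtype)  # one class id per sample
```

The last mini-batch of each epoch holds only 3 samples, because 67 is not a multiple of 4.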

Step 3: Build neural network classifier from scratch

Now let's build our neural network classifier:

import torch.nn as nn

# number of features (len of X cols)
input_dim = 4
# number of units in the single hidden layer
hidden_layers = 25
# number of classes (unique of y)
output_dim = 3

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.linear1 = nn.Linear(input_dim, hidden_layers)
        self.linear2 = nn.Linear(hidden_layers, output_dim)

    def forward(self, x):
        x = torch.sigmoid(self.linear1(x))
        # return raw scores (logits); no softmax is applied here
        x = self.linear2(x)
        return x

We can initialize the classifier by instantiating the class:

clf = Network()

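We can also inspect the network's layers. Printing clf.parameters without calling it shows the bound method, whose representation includes the layer structure: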

print(clf.parameters)
'''
# Output
<bound method Module.parameters of Network(
(linear1): Linear(in_features=4, out_features=25, bias=True)
(linear2): Linear(in_features=25, out_features=3, bias=True)
)>
'''
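
To list the learnable tensors themselves rather than the layer summary, named_parameters() can be iterated. The sketch below repeats the network definition so that it runs on its own, and counts the weights and biases:

```python
import torch
import torch.nn as nn

input_dim, hidden_layers, output_dim = 4, 25, 3

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(input_dim, hidden_layers)
        self.linear2 = nn.Linear(hidden_layers, output_dim)

    def forward(self, x):
        x = torch.sigmoid(self.linear1(x))
        return self.linear2(x)

clf = Network()

# Each Linear layer contributes a weight matrix and a bias vector
names = []
for name, param in clf.named_parameters():
    names.append(name)
    print(name, tuple(param.shape))

total = sum(p.numel() for p in clf.parameters())
print(total)  # 4*25 + 25 + 25*3 + 3 = 203
```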

Next, let's define our loss function and the optimizer:

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(clf.parameters(), lr=0.1)
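
The network returns raw scores without a softmax because nn.CrossEntropyLoss applies log-softmax internally. A small sketch with made-up logits checks that it matches log_softmax followed by the negative log-likelihood loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)           # raw scores: 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 0])  # class indices as a Long tensor

# Built-in loss applied directly to the logits
loss_ce = nn.CrossEntropyLoss()(logits, labels)

# The same quantity computed in two explicit steps
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(loss_ce.item(), loss_manual.item())
```

Adding an explicit softmax layer before this loss would therefore apply the normalization twice.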

Step 4: Training the neural network classifier

Now we are all set for training. Let's code the training loop:

epochs = 2
for epoch in range(epochs):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        # reset the gradients left over from the previous batch
        optimizer.zero_grad()
        # forward propagation
        outputs = clf(inputs)
        loss = criterion(outputs, labels)
        # backward propagation
        loss.backward()
        # optimize
        optimizer.step()
        running_loss += loss.item()
    # display statistics once per epoch; note that 2000 is a fixed scaling
    # constant, not the number of batches, so this is not the mean batch loss
    print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.5f}')

For demonstration purposes I am training for 2 epochs; this can be changed as required. The output will look like the following:

[1,    17] loss: 0.00522
[2,    17] loss: 0.00508
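
The value printed above is the summed loss divided by a fixed 2000, so it is hard to interpret on its own. A variant that may be easier to read divides by the number of mini-batches instead. The sketch below is self-contained and rests on some assumptions: it uses random stand-in data, a TensorDataset, and an nn.Sequential model with the same layer sizes in place of the article's Data and Network classes:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Stand-in data: 67 samples, 4 features, 3 classes
X_demo = torch.randn(67, 4)
y_demo = torch.randint(0, 3, (67,))
loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=4, shuffle=True)

# Same layer sizes as the article's Network
model = nn.Sequential(nn.Linear(4, 25), nn.Sigmoid(), nn.Linear(25, 3))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

epoch_losses = []
for epoch in range(2):
    running_loss = 0.0
    for inputs, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    # mean of the per-batch losses for this epoch
    mean_loss = running_loss / len(loader)
    epoch_losses.append(mean_loss)
    print(f'[epoch {epoch + 1}] mean batch loss: {mean_loss:.5f}')
```

For three classes, an untrained network typically starts near ln(3) ≈ 1.10, which gives a reference point for judging whether the loss is going down.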

Step 5: Saving the trained model

Now let's save our trained model:

# save the trained model
PATH = './mymodel.pth'
torch.save(clf.state_dict(), PATH)

Step 6: Loading the saved model

The locally saved model can then be loaded for inference using the following:

clf = Network()
clf.load_state_dict(torch.load(PATH))
'''
# Output
<All keys matched successfully>
'''
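
One way to confirm that saving and loading preserved the weights is to compare the outputs of the original and restored models on the same input. This sketch writes to a temporary directory rather than ./mymodel.pth and repeats the network definition so that it runs standalone:

```python
import os
import tempfile

import torch
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(4, 25)
        self.linear2 = nn.Linear(25, 3)

    def forward(self, x):
        x = torch.sigmoid(self.linear1(x))
        return self.linear2(x)

torch.manual_seed(0)
clf = Network()

# Save the weights to a temporary location
path = os.path.join(tempfile.mkdtemp(), 'mymodel.pth')
torch.save(clf.state_dict(), path)

# Load them into a fresh instance and switch to evaluation mode
restored = Network()
restored.load_state_dict(torch.load(path))
restored.eval()

x = torch.randn(4, 4)
with torch.no_grad():
    same = torch.allclose(clf(x), restored(x))
print(same)
```

Calling eval() changes nothing for this particular network, but it is a useful habit because layers such as dropout and batch normalization behave differently during training and inference.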

Step 7: Testing the trained model

Once the model is loaded, we can test our trained model. Let's test on a single mini-batch.

testdata = Data(X_test, Y_test)
testloader = DataLoader(testdata, batch_size=batch_size,
shuffle=True, num_workers=2)

Get a single mini-batch from the DataLoader:

dataiter = iter(testloader)
inputs, labels = next(dataiter)

The test inputs will look like the following:

print(inputs)
'''
# Output
tensor([[ 1.6876, -1.2382, 1.5971, -2.2628],
[ 0.7683, 0.3534, 0.0460, -1.2109],
[-1.0097, 1.1584, -0.0593, 0.7738],
[ 1.7332, 0.1764, 0.5259, -2.3073]])
'''

The test labels will look like the following:

print(labels)
'''
# Output
tensor([0, 2, 1, 0])
'''

Now let's run inference:

outputs = clf(inputs)
__, predicted = torch.max(outputs, 1)
print(predicted)
'''
# Output
tensor([0, 0, 1, 0])
'''
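
torch.max along dimension 1 returns both the largest score in each row and its index, and the index is the predicted class. The sketch below uses made-up logits to show that torch.argmax gives the same prediction and that torch.softmax turns the scores into class probabilities:

```python
import torch

# Hypothetical logits for 2 samples and 3 classes (made-up values)
outputs = torch.tensor([[2.0, 0.5, -1.0],
                        [0.1, 0.2, 3.0]])

# Largest score per row and the index where it occurs
values, predicted = torch.max(outputs, 1)
print(predicted)  # tensor([0, 2])

# argmax returns only the indices
print(torch.argmax(outputs, dim=1))

# softmax converts each row of scores into probabilities
probs = torch.softmax(outputs, dim=1)
print(probs.sum(dim=1))  # each row sums to 1
```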

Three of the four predictions match the labels, so our code is working as expected. Let's run inference on the entire test dataset.

correct, total = 0, 0
# no need to calculate gradients during inference
with torch.no_grad():
    for data in testloader:
        inputs, labels = data
        # calculate output by running through the network
        outputs = clf(inputs)
        # get the predictions
        __, predicted = torch.max(outputs, 1)
        # update results
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(f'Accuracy of the network on the {len(testdata)} test data: {100 * correct // total} %')
'''
# Output
Accuracy of the network on the 33 test data: 75 %
'''
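
A single accuracy figure does not show which classes get confused with each other. As a sketch, sklearn.metrics can build a confusion matrix. To keep the example small, it takes as input the four labels and predictions from the single mini-batch printed earlier rather than the full test set:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Labels and predictions from the single mini-batch shown earlier
y_true = [0, 2, 1, 0]
y_pred = [0, 0, 1, 0]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(cm)

acc = accuracy_score(y_true, y_pred)
print(acc)  # 0.75
```

In this mini-batch the only error is a class 2 sample predicted as class 0, which appears as the 1 in the bottom-left corner of the matrix.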

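The accuracy can likely be improved further, for example by training for more epochs, generating more samples, or tuning the number of hidden units and the learning rate.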

Happy Coding !!!
