Demystifying Neural Networks: ResNet

The Fast Lane for Deep Learning

Dagang Wei
4 min read · Mar 16, 2024

This article is part of the series Demystifying Neural Networks.

Introduction

Deep neural networks are remarkable. They power everything from smart image recognition software to the voice assistants on our phones. Yet, there was a time when creating really deep neural networks — those with many layers — was incredibly difficult. That all changed with the advent of ResNet.

What is ResNet?

ResNet, short for Residual Network, is a deep neural network architecture introduced by Kaiming He and colleagues at Microsoft Research in 2015. It solved a crucial problem that had prevented the creation of extremely deep networks, ushering in a new era in computer vision and other deep learning tasks.

Why Does ResNet Matter?

  • The Vanishing Gradient Problem: Picture a neural network as a complex assembly line: data is processed and passed down the line through multiple stages (layers). During training, an error signal (the gradient) has to travel back up that same line. Unfortunately, as networks got deeper, this signal could shrink at every stage until almost nothing reached the early layers, like a game of telephone gone wrong. This is the vanishing gradient problem, and it made training deep networks very difficult: adding more layers often led to worse performance, not better.
  • ResNet to the Rescue: ResNet introduced a clever solution: shortcut connections. Think of these as bypass routes within the assembly line. These shortcuts let information skip layers, flowing directly from earlier stages to later ones, and they give the gradient a clean path back as well. This preserves crucial details and combats the vanishing gradient problem (a toy numerical sketch follows this list).
  • Better Performance: ResNet’s innovation meant that deeper networks could finally be trained effectively. And, as it turns out, deeper networks often lead to significantly better results in complex tasks like image classification, object detection, and even language processing.
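
To make the shortcut idea concrete, here is a toy numerical sketch of the residual computation y = F(x) + x with a linear F. This is my own plain-NumPy illustration, not code from the paper: because the input is added back unchanged, the gradient of the output with respect to the input always contains an identity term contributed by the shortcut, so the training signal cannot shrink to nothing even when F itself passes back almost no gradient.

import numpy as np

def residual_forward(x, W):
    # One toy residual "block": y = F(x) + x, with a linear F(x) = W @ x
    return W @ x + x

x = np.array([1.0, 2.0])
W = 0.01 * np.eye(2)  # a "weak" layer whose own gradient is nearly zero

y = residual_forward(x, W)    # [1.01, 2.02]
grad_y_wrt_x = W + np.eye(2)  # dy/dx = W + I; the I term comes from the shortcut

print(y)
print(grad_y_wrt_x)  # close to the identity matrix: the signal still flows

Without the "+ x" term the gradient would be just W, and chaining many such weak layers would multiply these small matrices together until the signal vanished.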

How Does ResNet Work?

Let’s break down ResNet’s key components:

  • Building Blocks: ResNet uses ‘residual blocks.’ These are like mini neural networks within the larger ResNet. Each block has a couple of layers for processing and the shortcut connection that allows information to hop over it.
  • The Shortcut Advantage: Instead of expecting each block to learn everything, ResNet lets the block focus on what’s different compared to the information coming via the shortcut. It’s like learning only the changes on top of what you already know, making things more efficient.
  • Stacking it Up: ResNet lets you easily build very deep networks by stacking many residual blocks. More blocks often mean richer representations and the ability to capture more intricate patterns in data (a minimal Keras sketch follows this list).
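
As a bridge to the full example below, here is a minimal Keras sketch (my own simplification, assuming TensorFlow 2.x) of a residual block with a pure identity shortcut, plus two such blocks stacked. The complete example that follows uses a 1x1 convolution on the shortcut instead, which also lets the block change the number of channels and the spatial size.

from tensorflow import keras
from tensorflow.keras import layers

def identity_block(x, filters):
    # Residual block with an identity shortcut: output = F(x) + x.
    # The input must already have `filters` channels for the shapes to match.
    y = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, (3, 3), padding="same")(y)
    return layers.add([x, y])  # x hops over both convolution layers

# Stacking residual blocks is just repeated application
inputs = keras.Input(shape=(32, 32, 64))
x = identity_block(inputs, 64)
x = identity_block(x, 64)
model = keras.Model(inputs, x)
model.summary()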

Impact and Beyond

ResNet revolutionized the field of computer vision. It won the 2015 ImageNet (ILSVRC) classification challenge and powered landmark image recognition models, enabling breakthroughs in tasks like accurately identifying thousands of different object categories. Its influence extends to many other domains within deep learning where tackling complex problems demands very deep network architectures.

Example: Image Classification with ResNet

Here’s example Keras code implementing a basic residual block and a small ResNet for classifying CIFAR-10 images. This is a simplified example for demonstration purposes; real-world ResNets use more complex architectures with techniques like bottleneck blocks for efficiency. The code is available in this Colab notebook.

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import plot_model
import matplotlib.pyplot as plt

# Define the residual block
def residual_block(x, filters, kernel_size=(3, 3), strides=(1, 1)):
    y = layers.Conv2D(filters, kernel_size, strides=strides, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)

    y = layers.Conv2D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)

    # Shortcut connection: a 1x1 convolution projects the input so its
    # shape matches the block output before the two paths are added
    shortcut = layers.Conv2D(filters, kernel_size=(1, 1), strides=strides, padding="same")(x)
    y = layers.add([shortcut, y])
    return y

# Define the ResNet model
def ResNet(num_classes, input_shape=(32, 32, 3)):
    inputs = keras.Input(shape=input_shape)
    x = layers.Conv2D(64, (7, 7), strides=(2, 2), padding="same")(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.MaxPooling2D((3, 3), strides=(2, 2), padding="same")(x)

    x = residual_block(x, 32)
    x = residual_block(x, 64, strides=(2, 2))
    x = residual_block(x, 128, strides=(2, 2))

    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model

# Load and preprocess CIFAR-10 data
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
num_classes = 10
input_shape = (32, 32, 3)

X_train = X_train.astype("float32") / 255.0
X_test = X_test.astype("float32") / 255.0
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# Compile and Train
model = ResNet(num_classes, input_shape)
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))

# Evaluate the model
loss, accuracy = model.evaluate(X_test, y_test)
print("Test Accuracy:", accuracy)

# Plot the model
plot_model(model, show_shapes=True, show_layer_names=True)

# Visualization of test data
num_images = 5 # Number of images to display
plt.figure(figsize=(10, 5)) # Adjust figure size as needed

for i in range(num_images):
    index = i  # pick a random index here if you prefer
    image = X_test[index]
    true_label = y_test[index].argmax()  # Get the true label

    # Predict the class
    predictions = model.predict(image.reshape(1, 32, 32, 3))
    predicted_label = predictions.argmax()

    # Display the image and labels
    plt.subplot(1, num_images, i + 1)
    plt.imshow(image)
    plt.title(f"True: {true_label}, Pred: {predicted_label}")
    plt.axis("off")

plt.show()

Output:

Test Accuracy: 0.7358999848365784

Conclusion

ResNet is an elegant solution to a fundamental problem that was hindering deep learning progress. Its legacy is undeniable: it paved the way for today’s cutting-edge AI models and fueled remarkable technological advances.
