Demystifying Neural Networks: Self-Organizing Maps

The Art of Clustering

Dagang Wei
5 min read · Jan 27, 2024
Source: https://www.superdatascience.com/blogs/the-ultimate-guide-to-self-organizing-maps-soms

This article is part of the series Demystifying Neural Networks.

Introduction

In the ever-evolving world of Machine Learning (ML), Self-Organizing Maps (SOMs) emerge as a fascinating and unique approach to unsupervised learning. Often surrounded by a shroud of complexity, SOMs deserve a clearer explanation. This blog post aims to demystify them, explaining what they are, why they are used, and how they work, and providing a hands-on example of clustering MNIST images using NumPy.

What are Self-Organizing Maps?

Self-Organizing Maps, also known as Kohonen maps (after their inventor, Teuvo Kohonen), are a type of artificial neural network trained with unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples. SOMs differ from other artificial neural networks in that they use a neighborhood function to preserve the topological properties of the input space.
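To make this concrete, here is a minimal sketch of the underlying data structure, assuming an arbitrary 10x10 grid over 3-dimensional inputs: the map is simply a set of grid nodes, each holding a weight vector, and an input is mapped to the grid coordinates of the node whose weight vector is closest.

import numpy as np

# Minimal sketch (illustrative sizes): a 10x10 grid of neurons over 3-dimensional inputs
m, n, dim = 10, 10, 3
weights = np.random.random((m * n, dim))   # one weight vector per grid node

x = np.random.random(dim)                  # an arbitrary input vector
bmu_index = np.argmin(np.linalg.norm(weights - x, axis=1))  # closest weight vector
row, col = divmod(bmu_index, n)            # grid coordinates of the best matching unit
print(f"Input mapped to grid cell ({row}, {col})")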

Why Use Self-Organizing Maps?

The key reasons to use SOMs in ML include:

1. Dimensionality Reduction: SOMs can reduce the dimensions of complex data while preserving the most important topological and metric relationships of the primary data, making the data easier to visualize and interpret.
2. Feature Mapping: They help in mapping high-dimensional data into simpler geometric relationships in a lower-dimensional space.
3. Clustering and Data Segmentation: SOMs can be used for clustering complex datasets, revealing patterns that might not be immediately apparent.
4. Data Visualization: They provide a means to visualize high-dimensional data in lower-dimensional spaces.

How Do Self-Organizing Maps Work?

The basic steps involved in the working of a SOM are:

1. Initialization: Start by creating a grid of nodes, each representing a neuron. Each node is initialized with a weight vector, either assigned randomly or taken from a sample of the input data.
2. Competition: For each input vector, the distances between the input and all weight vectors are computed. The neuron whose weight vector is most similar to the input vector (usually in terms of Euclidean distance) is declared the winner, the best matching unit (BMU).
3. Cooperation: The neighborhood of the winning neuron is identified using a neighborhood function, typically a Gaussian or Mexican hat function. The size of the neighborhood decreases over time.
4. Adaptation: The weight vectors of the winning neuron and the neurons in its neighborhood are adjusted to make them more similar to the input vector. This process is repeated for each input vector over a number of iterations (a minimal sketch of one iteration follows this list).
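Concretely, steps 2-4 amount to a single weight-update rule applied repeatedly: the winning neuron and its neighbors are pulled toward the input, with the pull scaled by the neighborhood function and by a learning rate that decays over time. Below is a minimal sketch of one such iteration; the function name, grid size, and parameter values are illustrative, and unlike the full example later it weights every neuron by the Gaussian instead of cutting off at the neighborhood radius.

import numpy as np

# Minimal sketch of one SOM iteration (illustrative names and parameters),
# assuming a Gaussian neighborhood and exponentially decaying rate/radius.
def som_step(weights, grid_pos, x, t, epochs, alpha0=0.3, sigma0=10.0):
    alpha = alpha0 * np.exp(-t / epochs)                              # decayed learning rate
    sigma = sigma0 * np.exp(-t / epochs)                              # decayed neighborhood radius
    bmu = grid_pos[np.argmin(np.linalg.norm(weights - x, axis=1))]    # competition: best matching unit
    dist = np.linalg.norm(grid_pos - bmu, axis=1)                     # cooperation: distance on the grid
    influence = np.exp(-dist**2 / (2 * sigma**2))                     # Gaussian neighborhood weighting
    return weights + alpha * influence[:, None] * (x - weights)       # adaptation: pull weights toward x

# Toy usage: a 5x5 grid over 3-dimensional inputs
m, n, dim = 5, 5, 3
weights = np.random.random((m * n, dim))                              # initialization
grid_pos = np.array([[i, j] for i in range(m) for j in range(n)])     # (row, col) of each neuron
weights = som_step(weights, grid_pos, np.random.random(dim), t=0, epochs=100)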

Example: Clustering MNIST Images

Now, let’s dive into a practical example. We will use the MNIST dataset, which consists of 28x28 pixel images of handwritten digits (0–9). Our goal is to cluster these images using a SOM implemented with NumPy.

import numpy as np
from keras.datasets import mnist
import matplotlib.pyplot as plt

# Load MNIST data
print('Loading data...')
(x_train, _), (x_test, y_test) = mnist.load_data()
x_train = x_train[:1000]

# Normalize and flatten the data
x_train = x_train.astype('float32') / 255.
print('x_train shape:', x_train.shape)
x_train = x_train.reshape((len(x_train), -1)) # Flatten each 28x28 image into a 784-dimensional vector
print('x_train reshaped:', x_train.shape)

x_test = x_test.astype('float32') / 255.
x_test = x_test.reshape((len(x_test), -1)) # Flatten each 28x28 image into a 784-dimensional vector

# Define the model
class SOM:
    def __init__(self, m, n, dim, epochs, alpha, sigma):
        # Initialize the dimensions of the grid, input dimension, learning parameters
        self.m = m              # Number of rows in the SOM
        self.n = n              # Number of columns in the SOM
        self.dim = dim          # Dimension of the input vectors
        self.epochs = epochs    # Number of iterations for training
        self.alpha = alpha      # Initial learning rate
        self.sigma = sigma      # Initial neighborhood radius
        self.weights = np.random.random((m * n, dim))  # Initialize weights randomly

    def find_bmu(self, x):
        # Find the best matching unit (BMU) for a given vector, x
        bmu_index = np.argmin(np.linalg.norm(self.weights - x, axis=1))
        return np.array([bmu_index // self.n, bmu_index % self.n])

    def update_weights(self, x, bmu, t):
        # Update the weights of the SOM neurons
        learning_rate = self.alpha * np.exp(-t / self.epochs)  # Decay learning rate
        sigma_decay = self.sigma * np.exp(-t / self.epochs)    # Decay neighborhood radius
        for i in range(self.m):
            for j in range(self.n):
                neuron_pos = np.array([i, j])
                distance = np.linalg.norm(neuron_pos - bmu)  # Distance from the BMU on the grid
                if distance <= sigma_decay:
                    influence = np.exp(-distance**2 / (2 * sigma_decay**2))  # Calculate influence
                    self.weights[i*self.n+j] += learning_rate * influence * (x - self.weights[i*self.n+j])  # Update weights

    def train(self, data):
        # Train the SOM with the given data
        for epoch in range(self.epochs):
            if epoch % 10 == 0:
                print(f"Training epoch {epoch}/{self.epochs}")
            for x in data:
                bmu = self.find_bmu(x)              # Find the BMU for each sample
                self.update_weights(x, bmu, epoch)  # Update weights based on the BMU

    def map_vects(self, data):
        # Map each input vector to the closest neuron in the SOM grid
        bmu_list = []
        for x in data:
            bmu = self.find_bmu(x)
            bmu_list.append(bmu)
        return bmu_list

# SOM parameters
som = SOM(m=20, n=20, dim=784, epochs=100, alpha=0.3, sigma=10.0)

# Train SOM
print("Starting training...")
som.train(x_train)
print("Training completed.")

# Visualize the trained SOM
plt.imshow(som.weights.reshape(20, 20, 28, 28).transpose(0, 2, 1, 3).reshape(20*28, 20*28))
plt.title('Trained SOM on MNIST dataset')
plt.show()

# Define a function to visualize the test sample and its mapping on SOM
def visualize_test_sample(test_sample, label, mapped_position, som_weights):
    plt.figure(figsize=(10, 5))

    # Display the test sample image
    plt.subplot(1, 2, 1)
    plt.imshow(test_sample.reshape(28, 28), cmap='gray')
    plt.title(f"Test Sample (Digit: {label})")
    plt.axis('off')

    # Display the SOM and mark the position of the test sample
    plt.subplot(1, 2, 2)
    plt.imshow(som_weights.reshape(20, 20, 28, 28).transpose(0, 2, 1, 3).reshape(20*28, 20*28))
    plt.title('SOM with Test Sample Mapped')
    plt.scatter(mapped_position[1]*28 + 14, mapped_position[0]*28 + 14, color='red', s=50)  # Mark the BMU position
    plt.axis('off')

    plt.show()

# Evaluate and visualize a few test samples
test_samples = 10 # Number of test samples to evaluate
for i in range(test_samples):
    test_sample = x_test[i]
    test_label = y_test[i]
    mapped_position = som.map_vects([test_sample])[0]
    visualize_test_sample(test_sample, test_label, mapped_position, som.weights)

We can see that different regions of the trained SOM form clusters of similar digits, and the test sample for digit 7 is correctly mapped to the matching region of the map.

Conclusion

Self-Organizing Maps offer a unique approach to understanding and clustering high-dimensional data. They are particularly useful in scenarios where the preservation of the topological and metric structure of the data is crucial. With this guide, we hope you have gained a clearer understanding of SOMs and their practical application in a project like clustering MNIST images. Happy exploring!

References

https://www.superdatascience.com/blogs/the-ultimate-guide-to-self-organizing-maps-soms
