Self-Organizing Maps Explained

Becaye Baldé
5 min read · May 15, 2023



Self-organizing maps (SOMs), also known as Kohonen maps, are a type of artificial neural network used for clustering, dimensionality reduction, and more.

In the first part, I provide an overview of SOMs. In the second part, I explain how they learn. Finally, in the last part, I demonstrate how to train a SOM on the iris dataset. Feel free to skip ahead to the final section if you prefer.

TL;DR:

  • SOMs map high-dimensional input data onto a lower-dimensional grid.
  • SOMs can be used for dimensionality reduction, visualization, pattern recognition, anomaly detection, etc.
  • The code can be found here.

1. Overview

The goal of a SOM is to map high-dimensional input data onto a lower-dimensional grid while preserving the topological relationships between the input data points: neurons that are spatially close on the map also represent similar input data.

The output of a SOM is a 2D grid of neurons, where each neuron is like a kid who knows one piece of information. Kids that are close to each other in the grid know similar things. By looking at the big picture, we can see how all the kids relate to each other.

The following is a hexagonal SOM. Neurons that are close to each other have similar colors and represent similar countries.

Source: SEG Wiki

Notice how Canada and the USA are close to each other.

⚠️ Self-Organizing Maps are unsupervised: their neurons (nodes) are not connected to one another, and they do not use backpropagation.

Applications

We can use SOMs for:

  • Clustering: grouping similar entities.
  • Data Visualization: visualizing high dimensional space in 2D.
  • Anomaly detection: identifying uncommon behavior.
  • etc.

More concretely, we can use SOMs to:

  • Build recommender systems: grouping similar movies, music, product reviews, etc.
  • Identify trends and patterns: finding similar employees, customers, social media topics, stock sector performances, etc.
  • Detect outliers: flagging fraud, security breaches, and other abnormal behavior.

Limitations

SOMs have some limitations, such as:

  • Poor performance with categorical data.
  • Lack of interpretability.
  • Limited ability to handle high-dimensional data.
  • Limited generalization ability: not well-suited for unseen data.
  • Difficulty with non-linear relationships.

2. How do SOMs Learn?

A SOM is a grid of neurons. Each neuron has N weights, where N is the number of dimensions (features) in the data.
The training process can be summarized in seven steps:

  1. Assign random weights to the neurons.
  2. Select an input data point.
  3. Calculate the distance (typically Euclidean) between the data point and each neuron's weight vector.
  4. Identify the neuron with the smallest distance to the data point.

This neuron is referred to as the Best Matching Unit (BMU) or winning neuron. A small distance means that the neuron is very close to the data point.

5. Drag the winning neuron closer to the data point.

When you drag a neuron, its neighbors (those within the radius) are also dragged. The closer a neighbor is to the BMU, the more strongly it is dragged, resulting in a larger change in its weight vector. This influence is typically modeled as a Gaussian function of the distance to the BMU.

6. Decrease the radius gradually, so that neurons farther away from the BMU are less affected by the updates.

The radius around the BMU: initially too big, it is gradually reduced

A radius that is too small could lead to overfitting, while one that is too large could underfit the data.

7. Repeat steps 2 to 6 for a set number of iterations (or until the map stabilizes).
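To make these steps concrete, here is a minimal NumPy sketch of the training loop on random placeholder data. It only illustrates the algorithm (it is not MiniSom's actual implementation), and the exponential decay schedules, initial radius, and learning rate are arbitrary choices.

import numpy as np

rng = np.random.default_rng(123)
X = rng.random((150, 4))                    # placeholder data: 150 samples, 4 features

grid_h, grid_w = 8, 8
weights = rng.random((grid_h, grid_w, X.shape[1]))       # step 1: random weights
rows, cols = np.indices((grid_h, grid_w))

n_iter, radius0, lr0 = 1000, 3.0, 0.5
for t in range(n_iter):
    x = X[rng.integers(len(X))]                          # step 2: pick a data point
    dists = np.linalg.norm(weights - x, axis=2)          # step 3: distance to each neuron
    bmu = np.unravel_index(dists.argmin(), dists.shape)  # step 4: best matching unit

    radius = radius0 * np.exp(-t / n_iter)               # step 6: shrink the radius
    lr = lr0 * np.exp(-t / n_iter)
    grid_d2 = (rows - bmu[0]) ** 2 + (cols - bmu[1]) ** 2
    influence = np.exp(-grid_d2 / (2 * radius ** 2))     # Gaussian neighborhood
    weights += lr * influence[..., None] * (x - weights) # step 5: drag BMU and neighbors

Each pass through the loop is one iteration of step 7.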

In the resulting map, each winning neuron is assigned to a particular cluster based on its location and the distribution of the input data. Neurons that are close to each other in the SOM are likely to be assigned to the same cluster, reflecting similarities in the underlying data patterns.

Resulting SOM

3. Example on the Iris Dataset

Here is an example of using a SOM on the Iris dataset to cluster the different species based on their petal and sepal measurements.

You can check the notebook below, which explains the code further.

Train the SOM

The MiniSom package allows us to create SOMs easily:

  • (x, y) are the dimensions of the map.
  • input_len is the number of features in the data.
  • sigma is the initial radius of the neighborhood.
  • learning_rate controls how strongly the weights are updated at each step.

A common heuristic for the map size is 5 * sqrt(number of samples) neurons. With 150 samples, that gives about 61 neurons, which rounds up to an 8x8 grid (64 neurons).
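As a quick check of that arithmetic (rounding up to a square grid is a convenience, not part of the heuristic):

import numpy as np

n_samples = 150
n_neurons = 5 * np.sqrt(n_samples)        # ≈ 61.2 neurons
side = int(np.ceil(np.sqrt(n_neurons)))   # 8, hence an 8x8 = 64-neuron grid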

from minisom import MiniSom
from sklearn.datasets import load_iris
from sklearn.preprocessing import minmax_scale

features = minmax_scale(load_iris().data)  # scale each feature to [0, 1]
map_height, map_width, n_features = 8, 8, features.shape[1]

som = MiniSom(x=map_height, y=map_width, input_len=n_features, sigma=1.5,
              learning_rate=0.5, neighborhood_function='gaussian', random_seed=123)
som.pca_weights_init(features)  # initialize the weights with PCA instead of randomly
som.train(data=features, num_iteration=1000, verbose=True)

After training the SOM, we can use the distance map (also called the U-matrix) to visualize the results. The distance map is a 2D array with the same shape as the grid (here, 8x8), where each element represents the normalized average distance between a neuron's weights and those of its neighbors.

u_matrix = som.distance_map().T
print(u_matrix)
[[0.081933 0.165542 0.177204 0.587590 0.667328 0.431389 0.356502 0.183952]
[0.124169 0.240682 0.338728 0.826771 1.000000 0.640208 0.617496 0.432710]
[0.144016 0.252570 0.461191 0.948480 0.854452 0.647242 0.627267 0.458241]
[0.138695 0.365867 0.675865 0.845140 0.637664 0.536530 0.560507 0.291140]
[0.192298 0.572329 0.831626 0.578588 0.394775 0.467025 0.422079 0.244352]
[0.428055 0.803671 0.713910 0.407951 0.385777 0.431611 0.434741 0.235023]
[0.513925 0.745877 0.520288 0.347792 0.403915 0.440234 0.439856 0.243294]
[0.226807 0.369797 0.231559 0.262100 0.241235 0.302932 0.252746 0.163932]]

We can visualize the distance map, with light shades indicating clusters and darker shades representing the boundaries between them. Some may prefer to plot the map the other way around, with dark shades representing the clusters instead (for example, by using cmap='bone' rather than 'bone_r').

import matplotlib.pyplot as plt

plt.pcolor(u_matrix, cmap='bone_r')  # light = similar neurons, dark = cluster boundaries
plt.colorbar()
plt.show()

The red, green and orange dots represent the plant species “setosa”, “versicolor” and “virginica”

Notice the three different clusters.

Note that the code above only produces the first plot, the light-shaded one on the left.
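The species dots come from the full notebook; below is a sketch of how they might be drawn, using som.winner() to find each sample's BMU. The marker styling and the color-to-species mapping are assumptions based on the caption above, not necessarily what the notebook does.

target = load_iris().target           # 0 = setosa, 1 = versicolor, 2 = virginica
colors = ['red', 'green', 'orange']   # assumed mapping, matching the caption

plt.pcolor(u_matrix, cmap='bone_r')
plt.colorbar()
for x, label in zip(features, target):
    i, j = som.winner(x)              # grid coordinates of this sample's BMU
    # +0.5 centers the marker in its cell; the earlier .T on the distance map
    # makes winner's (i, j) line up with pcolor's (x, y) axes
    plt.plot(i + 0.5, j + 0.5, 'o', color=colors[label], markeredgecolor='k')
plt.show()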

--

Becaye Baldé

Becaye is a Junior Data Scientist with a Master's in AI. He loves discovering new things, be it in tech or everyday life :)