How Does the Self-Organizing Map Algorithm Work?
A simple step-by-step guide with a mathematical example
The Self-Organizing Map (SOM), proposed by Teuvo Kohonen, is a data visualization technique. It helps us understand high-dimensional data by reducing the dimensions of the data to a map, and it also showcases clustering by grouping similar data together. This clustering ability was even better received after a variant of SOM, the Growing Self-Organizing Map, was introduced.
We can summarize SOM as follows:
SOM is unique in that it deviates from the generic 'error-correction' approach of most neural networks and instead follows competitive learning.
What actually happens?
Data points compete for representation in the network. First, the network's weights are initialized. A sample from the training dataset is chosen at random, and the distance between each node's weight vector and the sample vector is calculated. The node with the shortest distance is the BMU. Next, the neighborhood of the winner node (BMU) is found, and the weights of the BMU and its neighbors are updated. The process repeats until the network has been trained on every training sample.
Best Matching Unit (BMU): the winner node (or weight) that has the shortest distance to the sample vector. There are numerous ways to measure this distance; the Euclidean distance is the most common metric for comparing weights and sample vectors.
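As a small sketch of the competition step, the BMU can be found by comparing squared Euclidean distances. The function name and array layout here are illustrative assumptions, not part of any standard API:

```python
import numpy as np

def find_bmu(weights, x):
    """Return the index of the Best Matching Unit: the node whose
    weight vector is closest to the sample x (squared Euclidean)."""
    dists = np.sum((weights - x) ** 2, axis=1)
    return int(np.argmin(dists))

# Example: four 2-D weight vectors and one sample
weights = np.array([[0.45, 0.89], [0.55, 0.83], [0.62, 0.78], [0.95, 0.32]])
x = np.array([3.0, 1.0])
print(find_bmu(weights, x))  # → 3 (the fourth node wins)
```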
Algorithmic Break Down
- Initiation (randomization)
- Competition (Select winner node)
- Cooperation (Identify neighborhood)
- Adaptation (Adapt weights of winners & neighbors)
- Smoothing (reduce the neighborhood radius as iterations increase, and/or introduce an optional smoothing phase)
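The five phases above can be sketched as a minimal training loop. This is an illustrative sketch, not a canonical implementation: the function and variable names, the linear decay schedules for the radius and learning rate, and the hard (0/1) neighborhood are all assumptions.

```python
import numpy as np

def train_som(inputs, positions, weights, max_iter=100,
              max_nr=1.0, alpha0=0.5, rng=None):
    """Minimal SOM sketch: competition, cooperation, adaptation,
    with the neighborhood radius shrinking as iterations increase.
    `weights` is updated in place and returned."""
    if rng is None:
        rng = np.random.default_rng(0)
    for t in range(max_iter):
        nr = max_nr * (1 - t / max_iter)       # smoothing: radius decays
        alpha = alpha0 * (1 - t / max_iter)    # learning rate decays
        x = inputs[rng.integers(len(inputs))]  # random training sample
        # Competition: node with the smallest squared distance wins
        bmu = int(np.argmin(np.sum((weights - x) ** 2, axis=1)))
        # Cooperation: nodes within nr of the winner (grid distance)
        grid_d = np.linalg.norm(positions - positions[bmu], axis=1)
        neighbors = grid_d <= nr
        # Adaptation: move winner and neighbors toward the sample
        weights[neighbors] += alpha * (x - weights[neighbors])
    return weights
```

Note that the neighborhood here is measured on the grid positions, while the worked example further down measures it between weight vectors; both variants appear in tutorials.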
Parameters
R/C — number of rows/columns
N — number of neurons
MaxIter — maximum number of iterations
MaxNR — maximum neighborhood radius of the grid
nr(t) — neighborhood radius at iteration t
α(t) — learning rate at iteration t
A — number of input dimensions
D(wi, wj) — distance between neurons i and j
All_inputs = {x1, x2, …} — the set of input vectors
wk — the weights (nodes)
Trying the SOM algorithm on a particular input
Let the initial weights be w1 = (0.45, 0.89), w2 = (0.55, 0.83), w3 = (0.62, 0.78) and w4 = (0.95, 0.32), and let the four neurons (N = 4) N1, N2, N3 and N4 sit on the grid at Cartesian positions (0, 0), (1, 0), (0, 1) and (1, 1) respectively. Suppose the network takes a 2-dimensional (A = 2) input vector (x1, x2), and let w_i = (w1_i, w2_i) be the weight of neuron i. Take nr = 0.6, input vector V = (3, 1), and a learning rate of 0.5, held constant throughout.
1st Iteration
1. Calculate the neighborhood radius => nr = 0.6 (since it is the first iteration)
2. Calculate the learning rate => α(t) = 0.5 (constant throughout)
3. Find the nearest node, i.e. the winner, for the given input vector (using the squared Euclidean distance):
d²(V,N1) = (3 − 0.45)² + (1 − 0.89)² = 6.515
d²(V,N2) = (3 − 0.55)² + (1 − 0.83)² = 6.031
d²(V,N3) = (3 − 0.62)² + (1 − 0.78)² = 5.713
d²(V,N4) = (3 − 0.95)² + (1 − 0.32)² = 4.665
4. Since d(V,N4) is the smallest distance, the winner node is N4.
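These four distances can be double-checked with a few lines of NumPy (squared Euclidean, exactly as in the calculation above):

```python
import numpy as np

V = np.array([3.0, 1.0])
W = np.array([[0.45, 0.89],   # N1
              [0.55, 0.83],   # N2
              [0.62, 0.78],   # N3
              [0.95, 0.32]])  # N4
d = np.sum((W - V) ** 2, axis=1)  # squared distances to each node
print(d.round(3))    # ≈ 6.515, 6.031, 5.713, 4.665
print(np.argmin(d))  # → 3, i.e. N4 is the winner
```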
5. Calculate the distance between the winner node and the other nodes (taken here between their weight vectors)
d(N4,N1) = sqrt( (0.45 − 0.95)² + (0.89 − 0.32)² ) = 0.758 (> nr)
d(N4,N2) = sqrt( (0.55 − 0.95)² + (0.83 − 0.32)² ) = 0.648 (> nr)
d(N4,N3) = sqrt( (0.62 − 0.95)² + (0.78 − 0.32)² ) = 0.566 (< nr)
d(N4,N4) = sqrt( (0.95 − 0.95)² + (0.32 − 0.32)² ) = 0 (< nr)
6. Choose the nodes whose weights will be updated
Since d(N4,N3) and d(N4,N4) are less than nr, N3 and N4 are chosen for updating.
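Steps 5 and 6 can be reproduced in NumPy as well (as in the example, distances are taken between weight vectors here):

```python
import numpy as np

W = np.array([[0.45, 0.89], [0.55, 0.83], [0.62, 0.78], [0.95, 0.32]])
nr = 0.6
bmu = 3  # N4 won in the previous step
d = np.linalg.norm(W - W[bmu], axis=1)  # distance of each node to the winner
print(d.round(3))           # ≈ 0.758, 0.648, 0.566, 0.0
print(np.where(d < nr)[0])  # → [2 3], i.e. N3 and N4 get updated
```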
7. Update their weights using
w(t+1) = w(t) + Θ(t) · α(t) · (x − w(t))
where Θ(t) is the neighborhood function (equal to 1 for nodes inside the radius in this example) and α(t) is the learning rate.
Updating the weight for N3:
w3(t+1) = (0.62, 0.78) + 1 × 0.5 × ((3, 1) − (0.62, 0.78)) = (1.81, 0.89)
Updating the weight for N4:
w4(t+1) = (0.95, 0.32) + 1 × 0.5 × ((3, 1) − (0.95, 0.32)) = (1.975, 0.66)
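Taking the neighborhood function as 1 inside the radius and the learning rate as 0.5, the two updates can be verified numerically:

```python
import numpy as np

alpha = 0.5                  # constant learning rate in this example
x = np.array([3.0, 1.0])     # the input vector V
w3 = np.array([0.62, 0.78])  # N3's weight before the update
w4 = np.array([0.95, 0.32])  # N4's (the winner's) weight before the update

w3_new = w3 + alpha * (x - w3)
w4_new = w4 + alpha * (x - w4)
print(w3_new)  # ≈ [1.81, 0.89]
print(w4_new)  # ≈ [1.975, 0.66]
```

Both updated weights have moved halfway toward the input, as expected with α = 0.5.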
In this blog we have walked through a small example of how the weights are updated for a single input. Carrying out the manual calculation for every input and every epoch is tedious, so it has been omitted here. Still, I am hopeful that almost anything can be explained through an apt example, and that this one gives a better understanding of the SOM algorithm and how it works. I would be happy to hear about any issues and to add more examples to this blog if possible.