Softmax Function Summary

LZP Data Science
Published in Geek Culture
2 min read · Aug 7, 2022

  • The softmax function is an activation function typically applied in the final (output) layer of a neural network.
  • It is the multi-category equivalent of the sigmoid function and is used whenever there are more than two possible outcomes (i.e. non-binary classification). The probabilities it assigns to the categories always sum to 1.
  • It allows us to predict a probability for each of n possible outcomes, rather than a single binary result.
  • In natural language processing (NLP), the softmax function lets us quantify, for each word in our corpus, the probability that it appears close to the input word, as seen below.
Example of a neural network with a softmax function application
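The relationship to the sigmoid function mentioned above can be checked numerically. The following is a minimal sketch in plain Python, not the code behind the figure; the helper names and all score values are hypothetical and chosen only for illustration. With two outcomes, applying softmax to the scores [z, 0] gives the same probability as sigmoid(z); with three or more outcomes, softmax still returns one probability per category.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: maps a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Softmax over a plain list of real-valued scores."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Two outcomes: softmax over [z, 0] matches sigmoid(z).
z = 1.5  # hypothetical score for the "positive" outcome
two_class = softmax([z, 0.0])
print(two_class[0], sigmoid(z))

# Three outcomes: one probability per category, summing to 1.
three_class = softmax([2.0, 1.0, 0.1])  # hypothetical scores
print(three_class, sum(three_class))
```

The two printed values on the first line should agree up to floating-point rounding, which is why softmax is often described as the generalization of sigmoid to n outcomes.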
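  • Softmax function equation: $\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$, where $x_i$ is the score for outcome $i$ out of $n$ possible outcomes.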
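  • A softmax function can be defined in Python as follows, assuming x is a 2-D PyTorch tensor (the dim and keepdim arguments follow PyTorch's sum):
from torch import exp
def softmax(x): return exp(x) / exp(x).sum(dim=1, keepdim=True)  # x: (batch, n); each row is normalized to sum to 1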
  • Taking the exponential ensures all the outcomes are positive, and dividing by the total sum ensures we get numbers that add up to 1.
  • It also amplifies the values exponentially. For example, if one of the probabilities is slightly…
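To make the last two points concrete, here is a small sketch in plain Python. It is an illustration under assumed inputs, not the author's example: the scores are hypothetical, and `linear_share` is a made-up helper included only to contrast softmax with dividing by the sum directly.

```python
import math

def softmax(scores):
    """Exponentiate each score, then divide by the sum of the exponentials."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear_share(scores):
    """Plain normalization without the exponential; for comparison only."""
    total = sum(scores)
    return [s / total for s in scores]

scores = [1.0, 2.0, 4.0]  # hypothetical raw scores
probs = softmax(scores)
shares = linear_share(scores)

print("softmax:", [round(p, 3) for p in probs])
print("linear: ", [round(s, 3) for s in shares])

# Negative scores still map to positive probabilities that sum to 1.
neg_probs = softmax([-2.0, 0.0, 1.0])
print(neg_probs, sum(neg_probs))
```

In this sketch the largest score receives a bigger share under softmax than under plain normalization, which is the amplification effect, and the negative scores show why the exponential is needed to keep every output positive.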
