Softmax Function Summary
2 min read · Aug 7, 2022
- The softmax function is an activation function typically used in the final layer of a neural network.
- It is the multi-category equivalent of the sigmoid function and is used whenever there are more than two possible outcomes (i.e. non-binary classification). The predicted probabilities across the categories must sum to 1.
- It allows us to predict probabilities over n possible outcomes, rather than just a binary result.
- In natural language processing (NLP), the softmax function lets us quantify the probability of proximity between each word in our corpus and the input word.
- The softmax function equation: softmax(x_i) = exp(x_i) / Σ_j exp(x_j)
- Defining a softmax function in Python (here on PyTorch tensors, where each row is one set of activations) is as follows:
import torch
def softmax(x): return torch.exp(x) / torch.exp(x).sum(dim=1, keepdim=True)
- Taking the exponential ensures all the outcomes are positive, and dividing by the total sum ensures we get numbers that add up to 1.
- The exponential amplifies differences between values: if one activation is slightly larger than the others, its exponential, and therefore its softmax probability, will be disproportionately larger, so softmax tends to strongly favor a single outcome.
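The points above can be checked with a minimal, self-contained sketch. It uses NumPy rather than the tensor-style code earlier (purely an assumption for a runnable one-vector demo) and subtracts the maximum before exponentiating, a standard trick for numerical stability that leaves the result unchanged:

```python
import numpy as np

def softmax(x):
    # Subtracting the max avoids overflow in exp(); the output is identical
    # because softmax is invariant to adding a constant to every input.
    e = np.exp(x - np.max(x))
    return e / e.sum()

scores = np.array([1.0, 2.0, 3.0])
probs = softmax(scores)

print(probs)        # all positive, e.g. roughly [0.09, 0.24, 0.67]
print(probs.sum())  # sums to 1

# Amplification: a score only 2 larger than another ends up with
# exp(2) ≈ 7.4x the probability mass.
print(probs[2] / probs[0])
```

Note how the largest score captures most of the probability mass even though the raw scores differ by only 2, illustrating why softmax outputs tend to concentrate on one category.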