Understand the Softmax Function in Minutes

Published in Data Science Bootcamp · 14 min read · Jan 30, 2018

Understanding Softmax in Minutes by Uniqtech

Learning machine learning? Specifically, trying out neural networks for deep learning? You have likely run into the Softmax function, an activation function that turns numbers (also known as logits) into probabilities that sum to one. The Softmax function outputs a vector that represents the probability distribution over a list of potential outcomes. It is also a core element of deep learning classification tasks. We will help you understand the Softmax function in a beginner-friendly manner by showing you exactly how it works: you will code your very own Softmax function in Python.
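As a first taste, here is a minimal sketch of that idea in plain Python. The function name `softmax`, the sample logits, and the choice to subtract the largest logit before exponentiating (a common trick to avoid overflow, which does not change the result) are our own illustrative assumptions, not a definitive implementation.

```python
import math

def softmax(logits):
    """Map a list of real-valued scores (logits) to probabilities that sum to one."""
    # Assumption for this sketch: shift by the largest logit so exp() cannot overflow.
    # The shift cancels out in the division, so the probabilities are unchanged.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three classes.
probs = softmax([2.0, 1.0, 0.1])
print(probs)       # roughly [0.659, 0.242, 0.099]
print(sum(probs))  # 1.0, up to floating-point rounding
```

Notice that the largest logit gets the largest probability, and every output is between zero and one.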

If you are implementing Softmax in PyTorch and you already know PyTorch well, scroll down to the Deep Dive section and grab the code. Prefer watching a YouTube video? Scroll down to the YouTube video.

This article has gotten really popular: 5800+ claps. It is updated constantly. The latest update, in Jan 2020, added a TL;DR section for busy souls. The Dec 2019 update added Softmax with NumPy, SciPy, and PyTorch functional, plus visuals indicating the location of the Softmax function in a neural network architecture. The full list of updates is below. Your feedback is welcome! You are welcome to translate this article and cite it. We would appreciate it if the English version is not reposted elsewhere. A link back is always appreciated. Comment below and share your links so that we can link to you in this article. Clap for us on…


Written by Uniqtech