Precision is defined as the number of true positives divided by the number of true positives plus the number of false positives. False positives are cases the model incorrectly labels as positive when they are actually negative, or in our example, individuals the model classifies as terrorists who are not. While recall expresses the ability to find all relevant instances in a dataset, precision expresses the proportion of the data points our model labeled as relevant that actually were relevant.
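The two definitions above can be sketched directly from counts of true positives, false positives, and false negatives. This is a minimal illustration, assuming binary labels where 1 means positive and 0 means negative; the example data is made up for demonstration.

```python
def precision(y_true, y_pred):
    """Precision = TP / (TP + FP): of everything we flagged, how much was right."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(y_true, y_pred):
    """Recall = TP / (TP + FN): of everything relevant, how much did we find."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical example: the model flags 3 of 5 individuals as positive,
# 2 of whom truly are positive (2 TP, 1 FP, 1 FN).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0]
print(precision(y_true, y_pred))  # 2 / (2 + 1) = 0.666...
print(recall(y_true, y_pred))     # 2 / (2 + 1) = 0.666...
```

Note that the two metrics use the same numerator but different denominators, which is why a model can score well on one while doing poorly on the other.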
As I mentioned in Part I, the core skill set required of a PM does not change whether or not you work in a machine learning–driven solution space. Product managers typically rely on five core skills: customer empathy/design chops, communication, collaboration, business strategy, and technical understanding. Working on ML will continue to leverage all of these. The one area that gets stretched further is technical understanding, specifically of the machine learning space. That’s not to say you cannot be an ML PM without deep technical chops. But you do need to understand how a machine learning system operates in order to make good product decisions. You can lean on your engineers or shore up your knowledge through books and courses, but without a solid understanding of the system, your product may lead to bad outcomes.
Back-Propagation — After forward propagation we get an output value, which is the predicted value. To calculate the error, we compare the predicted value with the actual output value, using a loss function (mentioned below) to compute the error value. Then we calculate the derivative of the error value with respect to each and every weight in the neural network. Back-propagation uses the chain rule of differential calculus. With the chain rule, we first calculate the derivatives of the error value with respect to the weight values of the last layer. We call these derivatives gradients, and we use them to calculate the gradients of the second-to-last layer. We repeat this process until we have gradients for each and every weight in the network. Then we subtract each gradient (scaled by a learning rate) from its weight to reduce the error value. In this way we move closer (descend) to a local minimum, i.e. a point of minimum loss.
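The steps above can be sketched with a tiny network. This is an illustrative toy, not the article's implementation: a 2-input network with one hidden layer of two sigmoid neurons, a sigmoid output, squared-error loss, and a made-up training example and learning rate. It shows the chain rule computing last-layer gradients first, reusing them for the layer before, then subtracting the scaled gradients from the weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.array([0.5, -0.2])       # single (hypothetical) training example
y = 1.0                         # actual output value
W1 = rng.normal(size=(2, 2))    # hidden-layer weights
W2 = rng.normal(size=2)         # output-layer weights
lr = 0.5                        # learning rate (illustrative choice)

losses = []
for step in range(200):
    # Forward propagation: input -> hidden -> predicted value.
    h = sigmoid(W1 @ x)
    pred = sigmoid(W2 @ h)

    # Loss function: squared error between predicted and actual value.
    losses.append(0.5 * (pred - y) ** 2)

    # Back-propagation with the chain rule.
    # 1) Gradients for the last layer's weights: dL/dW2 = dL/dz2 * dz2/dW2.
    delta2 = (pred - y) * pred * (1 - pred)   # dL/dz2
    grad_W2 = delta2 * h

    # 2) Reuse delta2 to get the second-to-last layer's gradients.
    delta1 = delta2 * W2 * h * (1 - h)        # dL/dz1, elementwise
    grad_W1 = np.outer(delta1, x)

    # Gradient descent: subtract the gradients, scaled by the learning
    # rate, so the error value shrinks on the next pass.
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

print(losses[0], losses[-1])  # the loss descends over the steps
```

In a real framework the derivative bookkeeping is automated (autodiff), but the mechanics are exactly these: compute gradients layer by layer from the output backwards, then step each weight against its gradient.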