Optimization algorithms: the Newton Method
A geometric interpretation with Python
Predictive statistics and machine learning aim to build models whose parameters make the final output/prediction as close as possible to the actual value. This amounts to optimizing an objective function, which might be either minimized (as with loss functions) or maximized (as with the likelihood function in Maximum Likelihood estimation).
The idea behind an optimization routine is to start from an initial (often random) value of the target function's variable and then update that value according to a given rule. The iteration stops when the distance between two successive values approaches zero, or after a maximum number of iterations fixed in advance.
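As a sketch of this generic routine, the loop below applies an arbitrary update rule until successive values are close enough or the iteration budget runs out. The `update` callable and the gradient-step example are illustrative stand-ins, not part of the method discussed in this article:

def iterate(x0, update, tol=1e-8, max_iter=100):
    """Repeatedly apply `update` until two successive values are
    closer than `tol`, or `max_iter` iterations are reached."""
    x = x0
    for _ in range(max_iter):
        x_new = update(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Toy stand-in rule: a small gradient step to minimize g(x) = (x - 3)**2.
step = lambda x: x - 0.1 * 2 * (x - 3)
print(iterate(0.0, step))  # converges to roughly 3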
In this article, I’m going to provide an intuitive explanation of the Newton method, proposing a geometric interpretation. The method assumes that the target function f(x), the one we want to optimize, is twice differentiable and that f''(x) is not equal to zero. Here, we will work with a smooth polynomial:
import numpy as np
import math
import matplotlib.pyplot as plt

def f(x):
    return x**3 - 6*x**2 + 4*x + 2

x = np.linspace(-1, 1)
fig, ax = plt.subplots()
ax.plot(x, f(x), label='target')
ax.legend()
ax.grid()
plt.show()
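To anticipate where this is heading, a minimal sketch of the Newton update for this polynomial is shown below. It applies the rule x_new = x − f'(x)/f''(x), with the derivatives of f(x) = x**3 − 6x**2 + 4x + 2 computed by hand; the starting point and tolerance are arbitrary choices for illustration:

def f_prime(x):
    return 3*x**2 - 12*x + 4   # first derivative of f

def f_second(x):
    return 6*x - 12            # second derivative of f

def newton(x0, tol=1e-8, max_iter=50):
    """Iterate x - f'(x)/f''(x) until successive values nearly coincide."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f_prime(x) / f_second(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Starting near 0, where f''(x) < 0, the iteration settles on the
# stationary point of f closest to the starting value.
x_star = newton(0.0)
print(x_star)

Note that this update finds a stationary point (where f'(x) = 0); whether it is a maximum or a minimum depends on the sign of f''(x) at the starting region, which is why the twice-differentiability assumption above matters.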