From Singular Value decomposition (SVD) in recommender systems for Non-math-statistics-programming… by Maher Malaeb

… From a high-level perspective, SGD starts by assigning initial values to the parameters of the equation we are trying to minimize. It then iterates to reduce the error between the predicted and the actual value, each time correcting the previous value by a small amount. The algorithm uses a factor called the learning rate γ, which determines how much of the newly computed correction is applied to the old value at each iteration. In practice, a high γ may overshoot the optimal solution, whereas a low γ needs many iterations to reach it.
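
To make the role of γ concrete, here is a minimal sketch of the update loop described above. It is a toy one-parameter problem (fitting y ≈ w·x), not the recommender model from the article; the function name `sgd_fit`, the sample data, and the three γ values are assumptions chosen only for illustration.

```python
def sgd_fit(samples, gamma, epochs=20, w=0.0):
    """Fit y ~ w * x by stochastic gradient descent, one sample at a time.

    Hypothetical helper for illustration: `gamma` is the learning rate,
    `w` is the single parameter, started at an arbitrary initial value.
    """
    for _ in range(epochs):
        for x, y in samples:
            error = y - w * x        # actual value minus predicted value
            w += gamma * error * x   # correct w by a step scaled by gamma
    return w


# Toy data generated from y = 2x, so the optimal parameter is w = 2.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w_good = sgd_fit(samples, gamma=0.05)   # moderate rate: settles very close to 2
w_high = sgd_fit(samples, gamma=0.5)    # rate too high: each step overshoots and w diverges
w_low = sgd_fit(samples, gamma=0.001)   # rate too low: still far from 2 after 20 passes

print(w_good, w_high, w_low)
```

With the same number of passes over the data, only the moderate rate ends near the optimum: the high rate jumps past it by a growing margin, and the low rate has covered only part of the distance.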