PageRank itself isn’t machine learning. It is a feature (a particularly predictive one) in the broader machine learning system that Google used, and in a different form continues to use, to power its search. The point is that Google’s privileged access to AOL’s and Yahoo’s data sets was one of the key factors that enabled it to become better at search than anyone else: that access gave Google the highest-volume closed-loop learning system, one in which clicks could be used to verify search relevance and thus measure the quality of its output. Google was super-aggressive in buying similar privileged access wherever it could (with Myspace, in embedded search distributions, etc.) because it wanted the highest possible volume of data to learn from. That is what one would expect of any machine learning-driven system, and it wouldn’t have been necessary if all there was to Google search was PageRank.

