Goodhart’s Law — and how Google breaks it
The historic PageRank white paper opens with:
The importance of a Web page is an inherently subjective matter, which depends on the readers interests, knowledge and attitudes. But there is still much that can be said objectively about the relative importance of Web pages.
Indeed, much can be said, and is, in the PageRank white paper. Much can be done too, and was, by Sergey Brin and Larry Page, who built the patented algorithm into Google. And much (money) can be made, and continues to be, after another algorithm, AdWords, enabled the monetization of PageRank. And not just by Brin and Page: an entire industry, SEO, sprang up around gaming PageRank. All thanks to Goodhart’s Law.
Named for economist Charles Goodhart, this “law” isn’t new, but has renewed and ever-increasing importance in an algorithmically controlled world. It says:
Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.
Written differently: “When a measure becomes a target, it ceases to be a good measure.”
Things go even further sideways when quantitative replaces qualitative. For example, running a mile in the least time makes someone the fastest miler — not the best runner. Yet humans make that spurious correlation regularly.
Google’s objective, quantitative measure of a page’s importance was not only better than that of prior search engines like AltaVista and Lycos; in users’ eyes, its ranking of inherently subjective material proved extremely useful. Context is incredibly valuable (pre-Web, the most profitable publications were TV Guide and major Yellow Pages). Because PageRank provided more useful context than other search engines, Google earned trust that others couldn’t.
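For readers who want to see what an "objective, quantitative measure" of importance can look like, here is a minimal sketch of the power-iteration idea described in the PageRank paper. It is an illustration, not Google's production ranking: the `pagerank` function, the `toy_web` graph and its page names are all hypothetical, and the damping factor of 0.85 is simply the value the paper suggests.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the pages it links to."""
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random surfer" jump).
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: everyone links to "home", nobody links to "blog".
toy_web = {
    "home": ["news", "shop"],
    "news": ["home"],
    "shop": ["home", "news"],
    "blog": ["home"],
}
scores = pagerank(toy_web)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph the most linked-to page rises to the top and the page nobody links to sinks to the bottom. That is also exactly what makes the score gameable: anyone who can manufacture inbound links can manufacture "importance."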
Even so, while PageRank was good, it wasn’t good enough to build a sustainable business around. “For the first two years of Google we were cold-calling people, trying to get them to buy keywords,” recalls an early Google salesperson. Google’s cash bleed was worrisome, but before it became critical, Google found the missing element. AdWords, inspired by Google rival GoTo.com, enabled the ad-auction model that reinforced PageRank and cemented Google as one of history’s most important companies. With that combination, Google transcended trust, defying Goodhart’s Law to achieve the ultimate status: Authority.
Authority is rarely earned. It’s usually granted (sometimes usurped) and leads to absurd proofs of Goodhart’s Law. My personal favorite is a Soviet-era factory that produced useless, twenty-ton tractors: the result of a production quota based on total weight. I heard the story over vodka in a Moscow pub, and I’ve retold it many times while defining or assessing KPIs. Maybe it’s apocryphal; while researching Goodhart, I read a similar tale about a Russian factory manufacturing giant nails. Even if apocryphal, it’s still plausible, and the same dynamic is baked into any sales quota. As one example, a tiny tweak to customer acquisition bonuses unleashes the creativity of ACN’s 250,000 sales reps to achieve the new target by any means possible.
Traditionally, directory listings follow Goodhart’s Law in their own strange way (at its foundation, Google is a directory). Yellow Pages transform an irrelevant descriptor, the company name, into a quantitative ranking metric (alphabetical order), leading to the absurd glut of business names starting with AAA: AAA Printing, AAA Vacuum Repair, AAA TV and Radio. As 3IR directories fade away, so does this artifact of gaming alphabetical ranking.
PageRank’s initial magic was sorting by relevance, algorithmically rather than alphabetically, which returned far better results than sorting by name. Google’s extraordinary brilliance is enhancing both monetization and ranking quality by linking them together. To earn high quality scores, AdWords ads must drive clicks. High quality scores earn higher placement, and higher placement drives clicks. Fueling this self-reinforcing loop are continual lockstep improvements in search quality and ad targeting precision. The hyper-growth of the Web combined with inherent network effects has created awesome growth for many firms, but Google’s unique flywheel effect compounded growth of traffic and profit to an unprecedented level.
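The link between quality score and placement is easier to see with numbers. The sketch below assumes the simplified rule of thumb that an ad's rank is its bid multiplied by its quality score; the real auction weighs more factors, and the `Ad` class, the `rank_ads` function and all the figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    bid: float      # hypothetical maximum cost-per-click, in dollars
    quality: float  # hypothetical quality score in [0, 1], e.g. expected click rate


def rank_ads(ads):
    """Order ads by bid * quality (a simplified, assumed ad-rank formula)."""
    return sorted(ads, key=lambda ad: ad.bid * ad.quality, reverse=True)


ads = [
    Ad("big_spender", bid=4.00, quality=0.02),  # pays most, rarely clicked
    Ad("middling", bid=2.00, quality=0.05),
    Ad("relevant", bid=1.50, quality=0.08),     # pays least, clicked most
]
order = [ad.name for ad in rank_ads(ads)]
print(order)
```

Under this assumption the cheapest bid wins the top slot because users actually click it, which is the point of the loop: money alone cannot buy placement, so ad spending ends up rewarding relevance instead of eroding it.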
The monetary value of a high PageRank puts it in the crosshairs of Goodhart’s Law, since the market over-allocates resources to pursue good PageRank scores. In Google’s case, this created an entire industry: SEO, which is a two-sided coin. Good SEO combines with the purest measure of a market, spending, to improve search results. With a focus on quality over profit, Google used revenue from legitimate ads to fund R&D spending. The flip side is “Black Hat” SEO: for a time, spammy mediation via SPLOGs and other devious advertising and e-commerce arbitrage threatened Google’s virtuous cycle. Google’s response, the Panda update, restored trust and created sufficient momentum to defy Goodhart. It propelled Google to Authority status for web search. As a result, for search purposes, PageRank has become synonymous with web-page quality.
AdWords generated over 90% of Alphabet’s $9 billion 1Q18 profit.
PageRank defies Goodhart because of the web’s astronomical growth, its enormous scale, and the speed and sophistication of Google’s machine learning. An example: a month ago I Googled “aircraft coffee maker cost,” and the third-ranked page was a cheesy 2012-ish SPLOG with links to home coffee makers. A week later the SPLOG was gone. And the results now include this informative article on how coffee-maker malfunctions are causing commercial flight delays.
A different side of Goodhart’s Law is Yelp, whose algorithmic methods are less transparent and more controversial.
Most people know (I think) that a Yelp search returns a list that is default-ordered by Yelp’s “Best Match” — the order Yelp wants you to see, with no representation of quality rank, or any transparent ranking method. Advertising is probably a factor in high Yelp search ranking. Many believe that advertising also plays a role in aggregated star-rating, and also in which reviews are hidden. Yelp says otherwise, and so far lawsuits have failed to prove this alleged link. Whatever the motivation, Yelp illustrates the moral peril of algorithmic decision-making in a barbell-shaped world: a single bad Yelp review can shut a small business down.
The Shed at Dulwich illustrates the peril of algorithmically gaming Goodhart by taking it to an absurd degree. Over six months, the establishment climbed to become the top restaurant, on the world’s top travel review site, in one of the world’s top cities: number 1 of 18,092 London restaurants ranked by TripAdvisor.
Yet The Shed at Dulwich doesn’t exist — never did.
This fun joke and publicity stunt highlights the “danger zone” created by Goodhart’s Law and algorithmic amplification — ready to be exploited when the right skills combine with the wrong incentives:
Increasing reliance on ranking scores (and the underlying content) creates increasing incentive to game the system, destroying the original correlation between ratings and quality. Enormous scale and momentum have allowed web giants to survive this quality breakdown. So far anyway.
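That breakdown can be mimicked in a few lines. The following is a toy simulation, not a model of any real review site: it assumes ratings start out as true quality plus a little noise, then adds a "gaming" boost that is unrelated to quality. Every name and number here is made up for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical "true quality" of 2,000 businesses.
quality = [rng.gauss(0, 1) for _ in range(2000)]

# Before anyone targets the score: rating = quality + small honest noise.
honest = [q + rng.gauss(0, 0.5) for q in quality]

# After the score becomes a target: each business buys a boost that has
# nothing to do with its quality (fake reviews, link farms, and so on).
gamed = [h + rng.uniform(0, 10) for h in honest]

r_before = pearson(quality, honest)
r_after = pearson(quality, gamed)
print(f"before gaming: {r_before:.2f}, after gaming: {r_after:.2f}")
```

With these assumed parameters the correlation between rating and quality starts close to 0.9 and falls to roughly 0.3 once the boost is added. Nothing about the businesses changed; only the pressure on the measure did.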
Google’s motto is “don’t be evil.” When it comes to web search, they’re a (relatively) benign dictator. If it’s a difficult position, it’s made far easier by the mind-boggling cashflow generated from successfully harnessing Goodhart’s Law.