Google's secret algorithm exposed via leak to GitHub…

The Secret Behind Google’s Search Ranking Algorithm Revealed

AurelCode
4 min read · Jun 1, 2024

One of the most tightly held secrets in all technology is how the Google search ranking algorithm actually works. If the secret ever got out, Google would implode because SEO experts would get every keyword to link to a landing page for fake Viagra pills.

Unfortunately, Google accidentally pushed thousands of documents to GitHub, a website owned by their Bing rival Microsoft, that provide an unprecedented look behind the curtain of Google search. As a bit of an SEO Guru myself, I was left shocked and utterly devastated when I found out that Google has not been totally honest about the algorithm.

The Anatomy of a Large-Scale Hypertextual Web Search Engine

When Google was founded in the late ’90s by Larry and Sergey at Stanford, it was built on the idea that a search engine could be run entirely by an algorithm. At the time, that was a radical departure from search engines like AltaVista and Yahoo, which relied on unscalable human curation.

They wrote a legendary paper called “The Anatomy of a Large-Scale Hypertextual Web Search Engine” that detailed something called the PageRank algorithm. Every web page starts with an initial rank, and that rank grows based on the number of high-quality incoming backlinks. This worked pretty well at first, but eventually SEO gurus realized that all you had to do was spam a bunch of backlinks with the anchor text of your keyword to dominate the extremely valuable top search result placement.
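To make the backlink idea concrete, here is a minimal sketch of the textbook PageRank iteration in Python. It is not Google's production code: the `pagerank` function, the 0.85 damping factor, the iteration count, and the toy four-page web are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank sketch. `links` maps a page to the pages it links to."""
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    # Every page starts with the same initial rank.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Baseline share every page gets regardless of backlinks.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "wiki" collects the most backlinks.
toy_web = {
    "blog": ["wiki"],
    "forum": ["wiki", "blog"],
    "wiki": ["blog"],
    "spam": ["wiki"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web, "forum" and "spam" have no incoming links, so they sit at the bottom, while a link from a well-ranked page is worth more than a link from an obscure one. That is also why spamming backlinks worked so well against the simple version.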

The Algorithm Has Become More Complex

However, over the years, the algorithm has become more complex. Nowadays, you actually have to make really good content to get the top ranking. But that’s too hard, and SEO gurus still need to put food on their families. Sadly, many of the statements Google has made about how the algorithm works appear to be lies.

The Leaked Code

It’s important to point out that although Google has confirmed that these documents are real, we still don’t really know exactly what they are. They could be internal training documents, they could be old and outdated, or it could be a false flag in Google’s 5D chess game to protect the algorithm.

Officially

Google has implied that these documents are out of context, outdated, and incomplete. Another interesting point is that the leaked code uses the Elixir programming language, which is not a language Google would normally use internally.

The True Lies in the Past

Google has denied the use of domain authority for ranking. However, in these documents, there’s a siteAuthority metric that seems to contradict that claim. Another highly suspicious thing Google has said in the past is that clicks are not a direct ranking factor. Well, we actually learned a while ago that that’s a fib.

NavBoost or Glue

That came out during Google’s antitrust lawsuit, which revealed a system called NavBoost or Glue that aggregates a bunch of different interactions like clicks, hovers, scrolls, and swipes. (And what’s a unicorn click?) NavBoost was confirmed once again in the leaked documents, which define it as “click and impression signals for Craps.” So, it looks like clicks are actually important.
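How NavBoost actually combines these signals is not spelled out here, so the following is only a rough sketch of what aggregating click and impression signals could look like in Python. The event names, the `aggregate_interactions` and `click_through_rate` helpers, and the sample data are all hypothetical and are not taken from the leaked documents.

```python
from collections import Counter, defaultdict


def aggregate_interactions(events):
    """Tally interaction types per (query, url) pair.

    `events` is an iterable of (query, url, kind) tuples, where `kind`
    is a hypothetical label such as "impression", "click", or "scroll".
    """
    totals = defaultdict(Counter)
    for query, url, kind in events:
        totals[(query, url)][kind] += 1
    return totals


def click_through_rate(counts):
    """Clicks divided by impressions, or 0.0 if the result was never shown."""
    impressions = counts["impression"]
    return counts["click"] / impressions if impressions else 0.0


# Made-up interaction log for a single query.
events = [
    ("cheap pills", "example.com/a", "impression"),
    ("cheap pills", "example.com/a", "click"),
    ("cheap pills", "example.com/b", "impression"),
    ("cheap pills", "example.com/a", "impression"),
    ("cheap pills", "example.com/a", "scroll"),
]
totals = aggregate_interactions(events)
for (query, url), counts in totals.items():
    print(query, url, dict(counts), click_through_rate(counts))
```

A ranking system could treat a result that gets clicked far more often than its neighbors as a hint that users prefer it, which is the general idea behind using clicks as a signal.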

Not Surprising

Another potential fib: based on these documents, it looks like data collected from users in the Chrome browser affects search rankings. Not surprised.

Backlinks Still Matter

It’s not the simple PageRank algorithm that it used to be, but getting those high-quality backlinks is still important.

Finally

The most shockingly unsurprising thing is that actual humans are used for rating and whitelisting critical content; fields like COVID authority or election authority are used for this. And through my investigation, I also found one called “Web ref compact flat property value” that appears to be hiding the true shape of the Earth.

The Real Tragedy

I’m no urologist, but overall, this leak looks pretty bad. I can’t believe a big corporation would lie to us. But the real tragedy here is the web itself. In the early days, Google was the best way to find interesting websites and forums created by random weirdos.

But Nowadays

The top rankings are almost entirely dominated by authoritative sites like Wikipedia and Reddit, in addition to paid advertisers. And what is even the point of a website nowadays? AI is just going to summarize your website anyway.

And Never Get You a Click

Though SEO has been dead for a long time, that’s not the end of the story. The real tragedy here is the web itself.

AurelCode

This account writes posts about the content of the best YouTube videos in the tech game.