Medium’s Got a Large Problem

Ziemek Bućko
Published in Onely
4 min read · Dec 4, 2019

Medium is a fashionable choice for content creators who are looking for an easy and accessible way to publish articles.

Although Medium is #152 in terms of popularity among content management systems (CMSs), it is used by 2.4% of the most popular websites, which indicates that it’s the medium of choice for large brands in particular.

At Onely, we too use Medium to publish long-form content, and it has served its purpose well. However, if we had to rely solely on Medium to get our content out there, we would probably choose a different CMS.

Why?

Because of the massive issues that Medium has with getting indexed by Google.

Our data speaks for itself: from a random sample of 1,011 articles extracted from Medium’s sitemap, 166 aren’t indexed.

That’s over 16%!
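For readers who want to check the arithmetic, here is a minimal sketch that recomputes the share from the two figures quoted above. The variable names are just illustrative.

```python
# Figures taken from the sample described in this article.
sampled = 1011      # articles drawn at random from Medium's sitemap
not_indexed = 166   # sampled articles that were not indexed by Google

# Share of the sample that is missing from the index.
share = not_indexed / sampled
print(f"{share:.1%} of the sampled articles are not indexed")
```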

Unindexed URLs are the uncharted territory of your website. Exciting for some, but not very useful.

Managing the Sitemap

The sitemap is one of the primary ways for Google to find new links belonging to a website.

It’s a simple XML file that every website should have, and it contains all (as decided by the webmaster) of the important URLs on that website, as well as some basic additional information.

Googlebot, the crawler Google uses to map out the web, sets out every day to visit as many pages as it can. In addition to following the links it finds in every crawled HTML file, it consults the sitemap to learn which URLs the website owner considers important to crawl and index.
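To make the format concrete, here is a minimal sketch of reading a sitemap with Python's standard library. The sitemap content and the example.com URLs are made up for illustration; only the XML namespace comes from the sitemaps.org protocol. This is not how Googlebot itself processes sitemaps, just a way to see what the file contains.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap; the example.com URLs are placeholders.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/articles/first-post</loc>
    <lastmod>2019-12-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/articles/second-post</loc>
  </url>
</urlset>
"""


def sitemap_entries(xml_text):
    """Return (url, lastmod) pairs; lastmod is None when the tag is absent."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if loc:
            entries.append((loc.strip(), lastmod))
    return entries


for loc, lastmod in sitemap_entries(SAMPLE_SITEMAP):
    print(loc, lastmod)
```

The optional `lastmod` value is an example of the "basic additional information" a sitemap can carry alongside each URL.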

When it comes to Medium, their sitemap is… well, it’s a hot mess.

For some reason, every comment made under every article that’s published on Medium exists under a separate URL, and all of those URLs are in their sitemap.

To put it mildly, including all those additional links in the sitemap is a waste of Googlebot’s time.

A significant part of the resources that Google assigns to crawling Medium is spent on pages that offer little value. As a result, many articles that should be prioritized for indexing are waiting in line behind a series of worthless comments.
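One way to picture the fix is to filter low-value URLs out before the sitemap is generated. The sketch below assumes a purely hypothetical URL scheme in which comment pages contain a `/comments/` path segment. Medium's actual URL structure is not described here, so `is_comment_url` would need to be replaced with a rule that matches the real site.

```python
from urllib.parse import urlparse


def is_comment_url(url):
    # ASSUMPTION: comment pages live under a "/comments/" path segment.
    # This pattern is hypothetical and is not Medium's real URL scheme.
    return "/comments/" in urlparse(url).path


def split_sitemap_urls(urls, is_low_value=is_comment_url):
    """Separate URLs worth listing in the sitemap from low-value ones."""
    keep, drop = [], []
    for url in urls:
        (drop if is_low_value(url) else keep).append(url)
    return keep, drop


# Placeholder URLs for illustration only.
urls = [
    "https://example.com/articles/first-post",
    "https://example.com/articles/first-post/comments/1",
    "https://example.com/articles/first-post/comments/2",
    "https://example.com/articles/second-post",
]

keep, drop = split_sitemap_urls(urls)
print(f"{len(drop)} of {len(urls)} sitemap URLs are comment pages")
```

Only the URLs in `keep` would then be written to the sitemap, so the crawler's attention goes to the articles first.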

One of the basic elements of technical SEO is helping website owners prioritize which of their pages should be indexed, and making sure that web crawlers reach those pages in a timely manner.

Watch Out if You’re Using JavaScript

Another problem that Medium has is with the “More from Medium” section at the very bottom of every article page.

This section contains links to related articles that are generated with JavaScript.

For readers, this is a convenient feature: they can see the images and simply click the link to the article they want to read next.

Googlebot, however, must perform an extra step and render the JavaScript in order to discover those links.

On the left is a Medium article with JS disabled and on the right is the same Medium article with JS enabled. You can see the “More from Medium” section is missing on the left, which is how Googlebot initially “sees” the page. Want to try this with your website? Check out Onely’s free tool — What Would JavaScript Do.

Unfortunately, this is a step Googlebot is not always willing to take. When there are thousands of other URLs waiting to be visited, the crawler prefers all of the links to be visible in a plain HTML file.
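The gap between the two views can be measured by comparing the links in the initial HTML with the links present after rendering. The sketch below uses only Python's standard library and two hypothetical HTML snippets. In practice the rendered version would come from a headless browser, which is outside the scope of this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.add(href)


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical markup: what the server sends before any JavaScript runs...
RAW_HTML = '<article><a href="/about">About</a></article>'
# ...and the same page after JavaScript has injected a related-articles block.
RENDERED_HTML = RAW_HTML + (
    '<section>'
    '<a href="/related-1">Related 1</a>'
    '<a href="/related-2">Related 2</a>'
    '</section>'
)

# Links a crawler can only discover if it renders the page.
js_only_links = extract_links(RENDERED_HTML) - extract_links(RAW_HTML)
print(sorted(js_only_links))
```

Any link that appears in `js_only_links` is invisible to a crawler that stops at the plain HTML.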

This is one more issue that we can correlate with Medium’s difficulties in getting all of their content indexed by Google.

And it’s not just Medium — most large domains that we’ve tested struggle with similar issues to a varying degree.

You can read more about this topic in Bartosz Goralewicz’s “How Much Content is NOT Indexed in Google in 2019?” where you can watch the video, read the transcript and browse the deck.

Wrapping Up

This article isn’t meant to single out Medium.com.

Because the web is so large, it’s very hard for search engines to keep up with all the content that’s perpetually being released.

Factor in other issues like JavaScript and crawl budget, and it’s safe to say that even a search engine as ubiquitous as Google has its work cut out for it.

That’s why any website as big and dynamic as Medium needs to take technical SEO into consideration.
