Finding the Most Popular Medium Posts of 2017 — or — what I’ve learned from scraping Medium

While browsing https://topauthors.xyz and https://toppub.xyz, I thought there must be a place to quickly check out the most popular Medium posts of 2017. I found none, so I set out to build that list myself.
“The future belongs to those who learn more skills and combine them in creative ways.” ― Robert Greene, Mastery
Here are a few things I learned along the way.
Google’s importxml() formula quickly reaches its limit.
The importxml() formula in Google Spreadsheets is handy for testing your XPath selections, but you won’t be able to build a whole scraper inside a spreadsheet because the formula quickly reaches its limit.
(Completely new to Web Scraping? Go here)
Stumbling upon Nokogiri
In my search for a way around Google’s importxml quota, I stumbled upon this article, which introduced me to the Ruby gem “Nokogiri”. I had never worked with Ruby, nor with a so-called gem, but the article intrigued me.
Finding the most “copyable” code
As I’m no programmer, I usually google until I find the code (or code fragments) that fits my needs best. I applied this “technique” here and found this “gem” (pun intended).
Putting it all together
After hours of headaches with XPath, basically reliving this comic, I finally got it to work. I scraped 234 Top Posts pages (e.g. https://medium.com/browse/top/january-01-2017) and extracted 20 headlines, hearts/claps, dates, and links from each. After cleaning up the data, I now have a beautiful dataset of 1988 posts ranked by popularity.
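For anyone who wants to retrace the mechanical parts, here is a minimal Ruby sketch, not the original script. It builds the daily page URLs from the pattern in the example link, then cleans and ranks a list of already extracted posts. The helper names (top_page_url, top_page_urls, rank_posts), the hash keys, the sample rows, and the idea that "cleaning up" means keeping one entry per link are all assumptions made for illustration. Fetching the pages and extracting fields with XPath (the Nokogiri part) is left out.

```ruby
require "date"

# Hypothetical helper: URL of the daily top-posts page for one date,
# following the pattern of https://medium.com/browse/top/january-01-2017.
def top_page_url(date)
  "https://medium.com/browse/top/#{date.strftime('%B-%d-%Y').downcase}"
end

# One URL per day in the inclusive date range.
def top_page_urls(first_day, last_day)
  (first_day..last_day).map { |day| top_page_url(day) }
end

# Assumed cleanup: a post may show up on several daily pages, so keep one
# entry per link (the copy with the most claps), then sort by claps, descending.
def rank_posts(posts)
  posts
    .group_by { |post| post[:link] }
    .map { |_link, copies| copies.max_by { |post| post[:claps] } }
    .sort_by { |post| -post[:claps] }
end

urls = top_page_urls(Date.new(2017, 1, 1), Date.new(2017, 1, 3))
puts urls

# Made-up sample rows, only to show the assumed shape of the data.
sample = [
  { title: "Post A", claps: 120, link: "https://example.com/a" },
  { title: "Post B", claps: 900, link: "https://example.com/b" },
  { title: "Post A", claps: 150, link: "https://example.com/a" }
]
rank_posts(sample).each { |post| puts "#{post[:claps]}  #{post[:title]}" }
```

Keeping the URL builder separate from the ranking step means each piece can be checked on its own before any page is requested.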
Now what?
The fun is over and my project is done. It was never really about the project itself, though, but about what I learned along the way. Now it’s time to find another skill to acquire!
One last thing… If you liked this article, click (and hold) the 👏 below so other people will see it here on Medium.
