Mining Benedict Evans’ Newsletter

Reverse engineering the sources used for 1,215 articles across ~70 issues.

Dana Majid
4 min read · Sep 1, 2015

Benedict Evans’ weekly newsletter is a goldmine of knowledge. It’s a great digest of relevant industry updates, interesting tech-related blog posts, data-driven analyses and his own weekly blog posts. According to one of his recent tweets, it has over 39k subscribers, including some of the industry’s biggest names.

I’m curious to know how Benedict Evans curates his newsletter. Where does he get his information? What sources does he frequently use? Is he finding content on those sources directly, by visiting the websites themselves? Or indirectly, through people he follows on Twitter?

Time to have fun with the data

I extracted links to 1,215 articles and websites from almost 70 issues of his newsletter. The categories, as consistently used by Benedict, are: industry news, interesting blog posts, and statistics. I ranked sources based on how frequently they’re used per category, and used Twitter data to take an educated guess as to whether he found these updates via people in his network before publishing the newsletter, and to identify who those accounts are.
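To give an idea of the mechanics, here’s a minimal sketch of the extraction step in Python, assuming the web versions of the issues were saved as local HTML files. The “issues” folder and the choice to key sources by domain are my own assumptions, not necessarily how the real pipeline works, and splitting links per category would additionally depend on the section headings in each issue’s markup:

```python
from collections import Counter
from pathlib import Path
from urllib.parse import urlparse

from bs4 import BeautifulSoup  # pip install beautifulsoup4

domain_counts = Counter()
all_links = []

# Hypothetical folder containing the saved web versions of the issues.
for issue_file in Path("issues").glob("*.html"):
    soup = BeautifulSoup(issue_file.read_text(encoding="utf-8"), "html.parser")
    for a in soup.find_all("a", href=True):
        href = a["href"]
        if href.startswith("http"):
            all_links.append(href)
            # Treat the domain (e.g. techcrunch.com) as the "source".
            domain_counts[urlparse(href).netloc.replace("www.", "")] += 1

print(domain_counts.most_common(20))
```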

Frequently used sources

He’s using roughly 400 unique sources, with the top 20 sources accounting for 46% of all the links shared in the issues analyzed (see notes below). Here’s an overview of sources you might want to subscribe to — if you haven’t already.
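For reference, the 46% figure is simply the cumulative share of the 20 most frequent domains, which falls out of the tally from the sketch above in a couple of lines:

```python
# Cumulative share of the 20 most frequent domains, reusing domain_counts from above.
total_links = sum(domain_counts.values())
top20_share = sum(count for _, count in domain_counts.most_common(20)) / total_links
print(f"Top 20 sources account for {top20_share:.0%} of all links shared")
```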

Not many surprising names. Apple’s website ranks #7 and is used directly for industry news — most likely because both the ’14 and ’15 WWDC conferences took place within this period. But it might also have something to do with Benedict having a thing for Apple products :)

But is he subscribed to all of those unique sources directly, or is he finding interesting content through accounts he follows on Twitter? Approximately 20% of the shared links carried a query string (‘?’) with sharing or tracking markers such as ‘tw’, ‘twitter’, ‘newsletter’, ‘digest’, ‘fb’, ‘facebook’, ‘buffer’, ‘share’ and ‘utm’. Even if he turns out to be a robot, he’s certainly not going through every source individually.
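That check is nothing more than looking at each link’s query string, roughly like this (reusing the all_links list from the earlier sketch; the marker list is a crude approximation of the one above):

```python
from urllib.parse import urlparse

# Markers suggesting a link was picked up via Twitter, Facebook, a newsletter,
# Buffer, etc., rather than copied straight from the source itself.
SHARE_MARKERS = ("tw", "twitter", "newsletter", "digest",
                 "fb", "facebook", "buffer", "share", "utm")

def looks_shared(url: str) -> bool:
    """True if the URL carries a query string containing one of the share markers."""
    query = urlparse(url).query.lower()
    return bool(query) and any(marker in query for marker in SHARE_MARKERS)

shared = [url for url in all_links if looks_shared(url)]
print(f"{len(shared) / len(all_links):.0%} of links look like they were found via shares")
```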

I searched and scraped Twitter for every mention of each newsletter’s links in the week before that issue went out (only including scrapable retweets; see notes) and cross-referenced the results with his Twitter network of ~950 accounts.
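The cross-referencing itself boils down to set intersections. Here’s a sketch, assuming tweeters_of maps each newsletter link to the set of handles that tweeted it in the preceding week (built from the scraped search results) and following holds the ~950 handles he follows; both names are my own:

```python
from collections import Counter

# Assumed inputs (hypothetical names):
#   tweeters_of: dict mapping each newsletter link -> set of handles that tweeted it
#                in the week before the issue went out (from the scraped search data)
#   following:   set of the ~950 handles Benedict follows

links_seen_in_network = {
    link: handles & following
    for link, handles in tweeters_of.items()
    if handles & following
}

coverage = len(links_seen_in_network) / len(tweeters_of)
print(f"{coverage:.0%} of links were tweeted by at least one account he follows")

# Which followed accounts surface newsletter links most often?
likely_sources = Counter(
    handle for handles in links_seen_in_network.values() for handle in handles
)
print(likely_sources.most_common(10))
```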

About 30% of the links from his newsletters were shared by one or more people he follows in the week before. Who are some of the people within Benedict Evans’ network he might consistently be using, or could use, to find interesting content?

What about accounts he doesn’t follow that consistently share the same content in the week before the newsletter goes out? Here are more names you might want to follow.

Some of these are bots that tweet everything the popular tech blogs post. Also, while this was all very fun, please do keep in mind that diversity in the sources of your news and updates is important :)

Interested in more?

This is the first of a series of analyses I will be publishing.
Follow me on Twitter for more: @dnmjd

Shout out to @SamuelBeek, @KayVink and Andrea Nerep for proofreading and giving feedback.

Notes

Since his Mailchimp archive overview is limited and I haven’t been subscribed since day one, I had to rely on Twitter data to get links to (web versions of) as many missing issues as possible. Issues no. 58–126 were consistently shared, so this analysis was limited to newsletters sent between 13 April 2014 and 24 August 2015, with the exception of issues no. 79 and 88.

Getting Twitter data (who shared what link in a specific week, who retweeted, etc.) turned out to be quite a hassle due to API restrictions and rate limits. Twitter’s Search API looked promising but is currently limited to tweets from roughly the past week. Full Archive Search, as recently announced and offered through Gnip, looked great but also looked expensive for a quick analysis like this. So I ended up scraping the hell out of their (public) search functionality (60k tweets), as it does offer the unlimited data and history I needed. As a result of this limitation, getting the retweeters of a tweet and attributing them to the list of folks he follows (to also catch cases where a tweet posted by someone he doesn’t follow was retweeted by someone he is following) was limited to the first list of retweeters Twitter allows users to publicly see.
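For what it’s worth, the search queries themselves were nothing exotic: just the link plus Twitter’s public since:/until: date operators. Here’s a sketch of how such a query could be built; the actual fetching and parsing depended on the search page’s markup at the time, so that part is only hinted at via a hypothetical helper:

```python
from datetime import date, timedelta
from urllib.parse import quote

def search_url(link: str, issue_date: date) -> str:
    """Twitter search URL for tweets mentioning `link` in the week
    before `issue_date`, using the public since:/until: operators."""
    since = issue_date - timedelta(days=7)
    query = f"{link} since:{since:%Y-%m-%d} until:{issue_date:%Y-%m-%d}"
    return "https://twitter.com/search?q=" + quote(query)

# fetch_tweeters() is hypothetical: in practice it paginated through the public
# search results and pulled the handle behind each matching tweet.
# for link in newsletter_links:
#     tweeters_of[link] = fetch_tweeters(search_url(link, issue_date))
```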
