Culture Dive into Silicon Valley - the TV show vs. the best books on startups

Aurel Pasztor
Mar 30, 2018

The next young Elon Musk may be out there wanting to know more about startups and Silicon Valley, but the wealth of information available can be daunting. What is the best way to get a feel for entrepreneurship and get inspired? Watching TV shows is probably the easiest way to start, and picking up a few paperbacks wouldn’t hurt either! But which book is best to start with?

Let’s find out by analyzing the texts of the most popular books and shows with natural language processing (NLP), using tidy text mining in R. The inspiration and some of the code came from Tamas Szilagyi’s excellent analysis of Rick and Morty.

Startup culture has been the topic of several TV shows. The Inc. website listed HBO’s Silicon Valley as one of the most popular among the shows entrepreneurs should watch.

Good news: Season 5 started this week!

Many great books have also been written about technology startups by entrepreneurs, VC investors and, most importantly, entrepreneurs turned VC investors.

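Based on Goodreads, some of the most popular books about entrepreneurship are The Hard Thing About Hard Things by Ben Horowitz, The Lean Startup by Eric Ries and Zero to One by Peter Thiel.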

If you like the show you will probably like one of these books too. But which one?

Let’s find out which of the three famous books has the most in common with the Silicon Valley show.

To do this, let’s analyze the most frequent and representative words and phrases of the books and compare them to the text of the show. The books are all available in eBook format, and subtitle files for the show can be downloaded from a host of sites. Let’s look at the first three seasons. After cleaning the data and dropping stop words, we can start looking at what goes on in Silicon Valley.
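The cleaning step can be illustrated with a short sketch. The original analysis was done in R with tidy text mining; the Python below is only an illustration of the same idea, and the function names, the sample sentence and the tiny stop-word list are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real analysis
# would use a full stop-word lexicon.
STOP_WORDS = {"the", "a", "an", "and", "is", "to", "of", "it", "in", "this"}


def tidy_tokens(text):
    """Lowercase the text, split it into word tokens and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def top_words(text, n=10):
    """Return the n most frequent non-stop words with their counts."""
    return Counter(tidy_tokens(text)).most_common(n)


# Made-up sample line, not taken from the actual subtitles.
sample = "Richard, this is the company. The company is the product, Richard!"
print(top_words(sample, 3))
```

The same two functions can be applied to a whole subtitle file or book chapter once it has been read into a single string.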

Frequencies

First, let’s have a look at the most frequently used words in the series.

We find that the list mostly has names of the leading characters (Richard, Peter, Jared, Monica etc.), basic action verbs and nouns (people, time, wait, head, call, talk) and quite a few swear words. Last but not least, there are also a few phrases that are actually related to startups and business.

Comparing the different seasons, there is no noticeable difference in the types or the meaning of the words, besides some personal names showing up in certain seasons but not in others. Let’s have a look at the books now.

The frequency list of the books shows a very different set of words from the series. It comes as little surprise that most of the top words are business related in all three books (e.g. company, product, people, business). Interestingly, there are almost no personal names or colloquial words in the top 30.

Bi-grams

A bi-gram is a contiguous sequence of two items in a text. We can create a network of words in which the ones that most frequently appear in recurring pairs have a central location.
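As a rough sketch of the idea (in Python rather than the R used for the analysis), the snippet below counts adjacent word pairs and then sums, for each word, the counts of the pairs it takes part in. This weighted degree is a simple stand-in for "central location" in the network, chosen here as an assumption; the plotted network may rely on a different layout or centrality measure. The function names and the token list are made up for illustration.

```python
from collections import Counter


def bigram_counts(tokens):
    """Count each pair of adjacent tokens."""
    return Counter(zip(tokens, tokens[1:]))


def weighted_degree(pair_counts, min_count=1):
    """For each word, sum the counts of the frequent pairs it belongs to."""
    degree = Counter()
    for (first, second), count in pair_counts.items():
        if count >= min_count:
            degree[first] += count
            degree[second] += count
    return degree


# Made-up token stream, assumed to be already cleaned of stop words.
tokens = ("product management team product management "
          "startup business product launch").split()

pairs = bigram_counts(tokens)
print(pairs.most_common(1))                    # most frequent bi-gram
print(weighted_degree(pairs).most_common(1))   # most central word
```

Raising `min_count` keeps only the pairs that recur, which is what makes a plotted network readable.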

In the show, the most central node is the name of the lead character: Richard. As most of the text consists of conversations, colloquial phrases have central locations. While startup-related technical and legal expressions do appear, they are less dominant.

Just as with standalone words, business phrases dominate the largest bi-gram network of the books as well. Product, management, startup, business and company have the highest centrality.

TF-IDF

Tf-idf stands for term frequency-inverse document frequency. A word’s tf-idf is the product of two factors: tf, how frequent the word is in a document, and idf, how rare the word is across the entire corpus of documents. Words that are common in one document but uncommon elsewhere score highest.
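A minimal Python sketch of the calculation follows, assuming the common definition in which tf is a word's share of the tokens in a document and idf is the natural log of the number of documents divided by the number of documents containing the word. The function name and the toy token lists are assumptions for illustration, not the code or data behind the charts.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps a document name to its token list.

    Returns {document: {word: tf-idf score}}.
    """
    n_docs = len(docs)
    # Number of documents in which each word appears at least once.
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores


# Toy corpus with made-up tokens: "richard" appears in both documents.
docs = {
    "doc_a": ["richard", "company", "algorithm"],
    "doc_b": ["richard", "company", "lawsuit"],
}
scores = tf_idf(docs)
print(scores["doc_a"])
```

A word that appears in every document gets an idf of zero, so only the words unique to a document stand out.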

Often, TF-IDF is a very effective tool in information retrieval. Here, however, the selected words do not give us strong clues about the content or the differences between the seasons. The lists are dominated by character names, followed by a few technical expressions. Having watched the show, this result is in line with what one may expect. The settings and the topics did not change much during the first three seasons, although new characters emerged and some disappeared.

When we look at the books, the power of TF-IDF analysis becomes apparent. These sets of words now allow us to form assumptions about the nature of the three books and how they differ:

The Hard Thing About Hard Things is mostly a self-narrated story by Ben Horowitz, the entrepreneur who guides the reader through his experiences at the various companies he founded. Here we find names of actual people and companies, and also some less professional expressions.

The Lean Startup is primarily a “how-to” manual for entrepreneurs, describing theoretical concepts of management with some practical examples.

Zero to One is a more philosophical book about the modern-day economy and the role of startups in it. It has many real-world references, but in a professional rather than a storytelling style.

Final Question

So which book has the most in common with the Silicon Valley show? To find out the answer, I have created a metric based on the TF-IDF product concept:

  1. I have ranked the words by frequency for both the individual books and the three seasons combined.
  2. Then, I merged the separate sets by word and took the product of the ranking positions of each word in the book and in the show, then arranged the words in ascending order of that product.
  3. Finally, for each of the three datasets, I counted how many words had product values below a set threshold (1000).
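The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the R code behind the result: the function names and toy token lists are made up, ties in frequency are broken by first appearance, and "below the threshold" is read as strictly less than.

```python
from collections import Counter


def frequency_ranks(tokens):
    """Step 1: map each word to its 1-based rank by descending frequency."""
    ordered = [word for word, _ in Counter(tokens).most_common()]
    return {word: rank for rank, word in enumerate(ordered, start=1)}


def rank_products(book_tokens, show_tokens):
    """Step 2: join on shared words, multiply the ranks, sort ascending."""
    book_ranks = frequency_ranks(book_tokens)
    show_ranks = frequency_ranks(show_tokens)
    shared = book_ranks.keys() & show_ranks.keys()
    products = {word: book_ranks[word] * show_ranks[word] for word in shared}
    return dict(sorted(products.items(), key=lambda item: item[1]))


def match_score(book_tokens, show_tokens, threshold=1000):
    """Step 3: count the shared words whose rank product is below the threshold."""
    products = rank_products(book_tokens, show_tokens)
    return sum(1 for value in products.values() if value < threshold)


# Made-up token lists standing in for the show and two books.
show = "company company company people people time".split()
book_a = "company company people product".split()
book_b = "market market monopoly company".split()

print(rank_products(book_a, show))
print(match_score(book_a, show), match_score(book_b, show))
```

A low product means a word ranks near the top in both texts, so the book with the most words under the threshold shares the most high-frequency vocabulary with the show.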

And the winner is…

Ben Horowitz’s The Hard Thing About Hard Things had the best match of top-frequency words with the TV show!

When looking at the actual words, we see that Ben’s book has all the business words that the two other books have. It also has the colloquial lingo that the show builds on heavily but the others lack.

After reading the books and seeing the show, this should be no surprise.

The show is the ever-evolving drama of human characters who fight for their own goals while forming business alliances and entering and breaking up relationships, all against the backdrop of the California tech scene.

Similarly, The Hard Thing… is a non-fiction, autobiographical account by a serial entrepreneur turned VC investor who went through the real-life roller coasters that the show’s characters experience.

Take-away

So, if you like the show and want to read a book too, you will probably like The Hard Thing About Hard Things.

Go and pick up a copy if you haven’t got one yet!

If you liked the analysis, you can check out the code here and do a lot more after reading the book Tidy Text Mining online.

Recommend or comment below! Tweet @aurlup and follow me on LinkedIn or here!
