Using NLP to get inside Warren Buffett's mind, part I

Jair Neto
Analytics Vidhya


Warren Buffett is an American investor and philanthropist, currently number 6 on the Forbes billionaires list. From 1965 to 2020 he achieved an overall gain of 2,810,526%, while the S&P 500 (an index of the 500 largest U.S. publicly traded companies) rose 23,454%. He is considered the best investor of all time and an inspiration to many people, myself included.


To combine my passion for technology with my passion for the financial market, I will write a series of posts exploring whether Artificial Intelligence (AI) can help me understand Buffett's mind by answering the three questions below.

  1. Can a machine learning model answer questions about finance and the economy?

  2. What are Buffett's most used terms, and have they changed over time?

  3. How did he feel about the economy and the stock market over the years?

To answer them I will use Natural Language Processing (NLP) techniques on the text of the letters that Buffett writes every year to the shareholders of Berkshire Hathaway, the company of which he is CEO. In this post, I will answer the first question.

Can a machine learning model answer questions about finance and the economy?

To answer this question I will use the Transformer architecture that is revolutionizing the AI landscape.

The paper “Attention Is All You Need” from Google introduced the Transformer. It uses an encoder to create an embedding for each word, taking into account its importance in the sentence, and a decoder to transform the embeddings back into text.

Transformer architecture

The encoder and decoder architectures are composed of stacked modules with Multi-Head attention and Feed Forward layers.

The Transformer — model architecture.

Unlike other NLP approaches, the Transformer does not use recurrent loops; instead, it uses stacked attention layers. In each attention layer, the model looks at different parts of the sentence and tries to learn more about its words.

Multi-Head Attention consists of several attention layers running in parallel.

Taking a closer look at the attention mechanism, you can see that it is composed of two parts: Scaled Dot-Product Attention, where a softmax is used to compute the attention weights, and Multi-Head Attention, which runs the dot-product operation several times in parallel.
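The scaled dot-product part can be sketched in a few lines of NumPy. The function below is my own toy illustration of the formula softmax(QKᵀ/√d)·V, not the paper's reference code:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Toy single-head attention: softmax(Q @ K.T / sqrt(d_k)) @ V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how much each query "attends" to each key
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# A sentence of 3 tokens, each with a 4-dimensional embedding
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
```

Multi-head attention simply runs several such heads in parallel on learned projections of Q, K, and V, then concatenates the results.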

At the end, there is a Feed-Forward layer that applies a linear transformation to each element of the given sentence. The attention layers look at different parts of the sentence to discover semantic or syntactic information about the words.

If you want to know more about transformers you can check this post by Cathal Horan.

Question Answering

Question Answering (QA) is a task in which the model receives a context and a question, and outputs an answer (a span that lies within the context) together with a confidence score. It is one of the tasks that uses the Transformer architecture.

Using a pre-trained Transformer for QA is easy, as we can see in the example below from Hugging Face:
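The original post shows the Hugging Face snippet as an image; a minimal sketch of that usage with the `pipeline` API is below. The two-sentence context is my own stand-in paraphrasing the 2008 letter, not a quote:

```python
from transformers import pipeline

# Stand-in context (illustrative paraphrase of the 2008 letter)
context = (
    "In 2008 Berkshire's book value fell 9.6 percent. "
    "The credit crisis, coupled with tumbling home and stock prices, "
    "produced a paralyzing fear that engulfed the country."
)

# Downloads a default SQuAD-fine-tuned model on first use
qa = pipeline("question-answering")

result = qa(question="What produced a paralyzing fear?", context=context)
print(result["answer"], result["score"])
```

The pipeline returns a dict with the extracted answer span, its confidence score, and its start/end character positions in the context.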

So I used this pre-trained model with Buffett's 2008 letter as the context, after just removing the special characters, to see if the model could make correct predictions.
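The post doesn't show the exact cleaning step; a minimal sketch of the kind of special-character removal described, using a hypothetical `clean_letter` helper of my own:

```python
import re

def clean_letter(text: str) -> str:
    """Keep letters, digits, basic punctuation and whitespace; drop the rest."""
    text = re.sub(r"[^A-Za-z0-9.,;:%$'\"\-\s]", " ", text)
    # Collapse the runs of whitespace left behind by removed characters
    return re.sub(r"\s+", " ", text).strip()

print(clean_letter("Gain* of 20%!!"))  # → Gain of 20%
```

Keeping characters like `%` and `$` matters here, since the letters are full of financial figures the model may need to extract.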

Questions, answers, and snippets showing the correct answers

As you can see, the model correctly understood all the questions and gave a satisfactory answer in every scenario. For the first question the answer should have been something like “the credit crisis, coupled with tumbling home and stock prices”, but the model was partially correct, answering “the credit crisis”.

I asked the second question to check whether the model would give the correct answer even if I replaced a word with a synonym, so I changed the word “produced” to “generated”, and the answer was the same.

In the third question, I used ‘Who’ instead of ‘What’ to see how the model behaves with different question pronouns, and it gave the correct answer with a confidence of 0.98. In the last question, despite the low confidence score, the response was also correct.
Then I moved on to questions whose answers were not in the text, to see how the model performed.

Questions and answers given by the model

The model gave a nonsensical answer to question 4, but this was expected since there is no mention of a bear market in the letter. However, it gave interesting answers to questions 5 and 6.

The model said that when the market crashes you should invest in derivatives transactions, and this can indeed be a good position to hold in a crash: in the March 2020 market crash, the Universa Tail Fund, which holds derivatives in its portfolio, returned 3,600%.

For question 6, although the model did not name a specific stock to buy, the output was ‘underpriced securities’, which was a good response, since Buffett is famous for buying underpriced securities, and underpriced securities are almost always a good investment.


We can see that a pre-trained Transformer model was able to give correct answers on a new context, and even gave sensible answers to questions whose answers were not explicitly in the context. Of course, the model is not an oracle yet, but I am looking forward to seeing the evolution of NLP and where it will lead us in the future.

You can check the code used to write this post in my GitHub repository. Feel free to reach out with any comments on my LinkedIn account, and thank you a million for reading this post.

If you liked what you read, be sure to 👏 it below, share it with your friends, and follow me so you don't miss the next posts of this series. In the next post I will try to answer the questions:

  • What are Buffett's most used terms, and have they changed over time?
  • How did he feel about the economy and the stock market over the years?

Stay tuned!





ML engineer / Analytics engineer | UCI & UFCG Alumni