Unearth Latent Information from Quarterly Earnings Calls

Paul Lashmet
Published in Product AI
2 min read · Apr 27, 2021

Challenge:

Analyze corporate earnings calls for key information that will give a trader or researcher deeper insight into the financial outlook of a corporation.

Solution:

Traders and researchers are always looking for nuggets of information that nobody else has in order to gain an edge. Earnings calls can be particularly valuable because speakers convey subtle signals through the complexity of human language. Fold in voice analytics and, if possible, facial expressions, and you can get a far more nuanced understanding of what someone is saying. Are they exaggerating? Are they telling the truth? Try as they might, speakers find it virtually impossible to remove all emotion from their language. This article focuses purely on text analytics, which is rich with information on its own.

Apache Spark ingests the financial information and is used to assess how earnings calls from various companies affect different markets and segments. Its tasks are to parse and tokenize each call transcript, derive sentiment, and identify key individuals. Several algorithms, running on a Spark cluster, evaluate the transcripts using SparkML, NLTK, and PySpark. The resulting sentiment scores are persisted for all companies and segmented by industry and other reference data. The scores for a particular call are then combined with other market data to infer whether the call had any impact on prices, market sentiment toward the corporation, or researcher sentiment. Traditional sentiment models are good at evaluating sentiment, but they are only one of the approaches used.
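To make the tokenize, score, and segment steps concrete, here is a minimal sketch in plain Python. It is not the production pipeline: the tiny word lists, the field names (ticker, industry, transcript), and the sample calls are all made up for illustration, and a simple lexicon count stands in for the NLTK and SparkML models. In a Spark job, a function like sentiment_score could be applied to each transcript row, for example as a PySpark UDF.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon for illustration; a real job would rely on a
# full sentiment lexicon or a trained model, not these eight words.
POSITIVE = {"growth", "strong", "record", "improved"}
NEGATIVE = {"decline", "weak", "loss", "headwinds"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return (positive - negative) / total tokens, a value in [-1, 1]."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def average_by_segment(calls):
    """Average call-level scores per industry segment.

    `calls` is a list of dicts with the assumed keys
    'ticker', 'industry', and 'transcript'.
    """
    buckets = defaultdict(list)
    for call in calls:
        buckets[call["industry"]].append(sentiment_score(call["transcript"]))
    return {industry: sum(s) / len(s) for industry, s in buckets.items()}


# Made-up example data, for illustration only.
calls = [
    {"ticker": "AAA", "industry": "retail",
     "transcript": "Record growth and strong demand this quarter."},
    {"ticker": "BBB", "industry": "retail",
     "transcript": "We saw a decline in margins."},
    {"ticker": "CCC", "industry": "energy",
     "transcript": "Headwinds led to a loss."},
]
segment_sentiment = average_by_segment(calls)
```

Normalizing by token count keeps long and short calls comparable. The per-segment averages are the kind of figure that could then be joined with price or other market data.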

Concurrently, TensorFlow and BERT are used to assess sentiment separately from the jobs running on Spark. BERT is a pre-trained language model that has achieved very high accuracy on NLP tasks, which makes it well suited to analyzing sentiment in earnings calls.

Using these tools, we derive other information from the text, including individual sentence scores, who is speaking, and specific entities (e.g., a city, a location, a new factory). All of this information is persisted and then cross-referenced with other information for greater insight.
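As a rough illustration of sentence-level scoring with speaker attribution, the sketch below assumes a transcript in which every turn is a single line of the form "Speaker Name: text". That layout, the naive punctuation-based sentence splitter, and the toy_scorer function are assumptions for the example, not the models described above. Any scoring function that maps a sentence to a number could be passed in. Entity extraction would need a named-entity model and is left out.

```python
import re

# Assumed transcript layout: each turn is one line, "Speaker Name: text".
TURN = re.compile(r"^(?P<speaker>[A-Z][\w .'-]*?):\s*(?P<text>.+)$")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_records(transcript, scorer):
    """Return one {'speaker', 'sentence', 'score'} dict per sentence."""
    records = []
    for line in transcript.splitlines():
        match = TURN.match(line.strip())
        if not match:
            continue  # skip lines that do not look like a speaker turn
        for sentence in split_sentences(match.group("text")):
            records.append({
                "speaker": match.group("speaker"),
                "sentence": sentence,
                "score": scorer(sentence),
            })
    return records


def toy_scorer(sentence):
    """Hypothetical stand-in for a real model: 1.0 if 'growth' appears."""
    return 1.0 if "growth" in sentence.lower() else 0.0


# Made-up transcript, for illustration only.
transcript = """Operator: Welcome to the call.
Jane Doe: Revenue showed growth. Costs were flat.
John Roe: We expect growth in Ohio."""

records = sentence_records(transcript, toy_scorer)
```

Keeping one record per sentence, tagged with its speaker, makes it straightforward to persist the results and cross-reference them later, for example by comparing scores across speakers.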

Technologies Utilized:

NLP, NLTK, Python, Apache Spark, TensorFlow, NVIDIA GPU

Paul Lashmet is a business integration architect and financial services subject matter expert.