Part II: Alpha Vertex Launches Alta to Create Investment Signals From Our Speech, Chaos and Language

Mutisya Ndunda
Jun 3, 2019


By Mutisya Ndunda, CEO of Alpha Vertex

Part II of our series on Alta and using alternative data to gain an advantage in the markets.

In my previous post, I discussed that while quantitative investing is usually all about numbers, the fact is that most of the world’s information is in written and spoken language and not encoded in numbers. It’s one of the reasons why we launched Alta, a product built from advanced natural language processing (NLP) and machine learning tools to extract unique, high-value information — such as investment signals — from unstructured, text-based datasets with broad coverage.

In this post, I’m going to dive a bit deeper into the technology and tools underpinning Alta.

ALPHA FACTORS AND DATA FEATURES

We use proprietary NLP and machine learning models to parse the content of hundreds of thousands of conference call transcripts and cross-reference this information with our internal knowledge graph of financial data, executives’ biographical data and company data in order to find valuable insights that can be used by investors to generate profitable trading opportunities.

Features engineered from the transcripts fall into three broad categories:

  • Language Features
  • Sentiment Features
  • Behavioral / Psychological Features

LANGUAGE FEATURES

Language features are developed by studying word choice, language complexity and the communication style of executives and analysts to uncover nuanced communication styles which are difficult to discern from numerical data.

Language Features — Contrastive Word Factor

We analyze the use of contrastive phrases, which mark points in a conversation where a speaker, having mentioned one thing, goes on to talk about something that contrasts with, and is often in opposition to, the first thing. Commonly used contrastive words and phrases include ‘but’, ‘however’, ‘instead’ and ‘despite’.

Contrastive phrases are commonly used by company executives in a corrective manner to clarify unfavorable revelations and to reverse investors’ perceptions of seemingly poor financial performance. The following is a simple example of contrastive word usage:

“Revenues were 4% from the same quarter a year ago but resales to broad-based customers were weak.”

Contrastive phrases used by analysts can convey new and valuable information that challenges management’s explanation of the state of affairs. For example, the following excerpt is from an analyst asking Tim Cook about sales in India.

“I believe, Tim, in your prepared comments you mentioned India was growing double digit, which is great. But I believe if you look at geographic information, India is really underpenetrated from an Apple reception perceptive.”

We also find that sectors with the highest incidence of contrastive usage generally have the most exposure to global macro-economic and trade-related risks.
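As a rough illustration of how a contrastive word factor could be computed, the sketch below counts contrastive words per 100 tokens. It is a minimal example under stated assumptions, not Alta’s actual model: the lexicon contains only the four words named above, and the function name `contrastive_rate` and the per-100-token normalization are hypothetical choices.

```python
import re

# Illustrative lexicon only: the four words named in the text.
# A production list would be longer and would include multi-word phrases.
CONTRASTIVE_WORDS = {"but", "however", "instead", "despite"}


def contrastive_rate(text: str) -> float:
    """Return the number of contrastive words per 100 tokens (0.0 if empty)."""
    # Crude tokenizer: runs of lowercase letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in CONTRASTIVE_WORDS)
    return 100.0 * hits / len(tokens)


sample = ("Revenues were 4% from the same quarter a year ago but resales "
          "to broad-based customers were weak.")
print(round(contrastive_rate(sample), 2))
```

Normalizing by token count keeps long and short calls comparable; the same rate could be computed separately for executives and analysts.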

Language Features — Explanations and Clarifications

Alta users will also benefit from another unique feature, which captures how often management explains the cause of a business or financial outcome. This feature is generated by analyzing each word’s part of speech and its context among adjacent and related words in a phrase to find connective phrases such as ‘a result of’, ‘because of’, ‘attributed to’ and ‘due to’, which explain the cause or effect of some event.

The following excerpts illustrate how explanations for a miss can be detected by analyzing connective phrases.

“Without getting overly detailed, dollar weakness late in the quarter and accounting rules required us to accelerate and recognize a large amount of foreign exchange premium expense related to future quarter hedges in the current quarter”

“We’re seeing what we believe to be a pause in purchases on iPhone, which we believe are due to the earlier and much more frequent reports about future iPhones. And so that part is clearly going on, and it could be what’s behind the data. I don’t know, but we are seeing that in full transparency”

The number of explanations a company provides on its earnings call is an indicator of the level of transparency it offers investors and analysts. An interesting and potentially troubling finding is that the average number of explanations offered on earnings calls has declined since 2008.
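A simplified way to picture the explanation feature is plain phrase matching over sentences. The sketch below is an assumption-laden stand-in: it uses only the four connective phrases listed above and a regular expression, whereas the feature described here also relies on part-of-speech and context analysis. The name `find_explanations` is hypothetical.

```python
import re
from typing import List, Tuple

# The four connective phrases named in the text; an incomplete, illustrative list.
CONNECTIVES = ["a result of", "because of", "attributed to", "due to"]
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in CONNECTIVES) + r")\b",
    re.IGNORECASE,
)


def find_explanations(text: str) -> List[Tuple[str, str]]:
    """Return (connective, sentence) pairs for every connective match."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    found = []
    for sentence in sentences:
        for match in _PATTERN.finditer(sentence):
            found.append((match.group(1).lower(), sentence))
    return found


sample = ("We’re seeing what we believe to be a pause in purchases on iPhone, "
          "which we believe are due to the earlier and much more frequent "
          "reports about future iPhones. And so that part is clearly going on.")
for connective, sentence in find_explanations(sample):
    print(connective, "->", sentence)
```

The length of the returned list gives a per-call explanation count. Pure phrase matching would miss explanations phrased without one of these connectives, such as the foreign exchange excerpt above, which is one reason a context-aware approach is needed.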

Language Features — Use of Euphemisms

Euphemisms are mild, vague, or periphrastic expressions that are used as substitutes for blunt or disagreeable expressions. For example, the expression “open a can of worms” is a euphemistic expression that means to inadvertently create numerous problems while trying to solve one.

The use of euphemisms in conference calls is often associated with executives’ intent to present information about their companies in a more favorable light or obscure the details of bad news. Examples of how executives commonly use euphemisms include the use of the phrase: “we hit some speed bumps” when talking about the failure to meet financial targets. Another example is the use of the phrase “we intend to right-size our business” when discussing personnel layoffs.

A high usage of euphemisms is associated with lower stock returns.

Language Features — Language Complexity

We define features that measure a call participant’s ability to communicate effectively in easy-to-understand language. The language complexity factor combines several descriptors, including the average sentence length, the count of polysyllabic words, the number of characters per word, the Gunning fog readability score and the ratio of numerical values to plain text.

Historically, firms whose executives use easier-to-understand language have outperformed firms whose executives use more complex language. Additionally, firms whose executives used more numbers when presenting results during the prepared remarks outperformed firms whose executives used fewer numbers.
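The complexity descriptors listed above can be sketched in a few lines. This is a simplified illustration, not the production feature set: syllables are estimated by counting vowel groups (which over-counts some words), the Gunning fog index uses its standard formula of 0.4 × (words per sentence + 100 × share of words with three or more syllables) without the usual exclusions for proper nouns and common suffixes, and the function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def complexity_features(text: str) -> dict:
    """Compute simple language complexity descriptors for a block of text."""
    # Split after ., ! or ? followed by whitespace, so decimals stay intact.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    # Tokens are either alphabetic words or numerals such as 12%, 4.5 or 1,000.
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d[\d,.]*%?", text)
    words = [t for t in tokens if t[0].isalpha()]
    numerals = [t for t in tokens if t[0].isdigit()]
    if not sentences or not words:
        return {}
    polysyllabic = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_length = len(words) / len(sentences)
    return {
        "avg_sentence_length": avg_sentence_length,
        "polysyllabic_count": len(polysyllabic),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "gunning_fog": 0.4 * (avg_sentence_length
                              + 100 * len(polysyllabic) / len(words)),
        "numeral_ratio": len(numerals) / len(words),
    }


features = complexity_features(
    "Revenue grew 12% to 4.5 billion. Profitability improved."
)
print(features)
```

Computing the same features separately for the prepared remarks and the Q&A would allow the two sections to be compared.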

SENTIMENT FEATURES

At Alpha Vertex, we measure the positive, negative and aggregate polarity of the management presentations and Q&A sections of conference calls. Additional sentiment scores related to specific topics are calculated. These include:

  • Earnings and earnings guidance
  • Macro-economic and trade related sentiment
  • Regulation
  • Partnerships and mergers
  • Natural disasters and weather

The raw sentiment analytics can be combined or transformed into indicators that track the change in sentiment over time, the discordance in sentiment between the CEO and CFO, or the change in a key analyst’s tone.
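To make these derived indicators concrete, the sketch below computes a lexicon-based polarity score, its change between consecutive calls, and a CEO/CFO discordance measure. Everything here is an assumption for illustration: the word lists are tiny placeholders, polarity is defined as (positive − negative) / (positive + negative), and discordance is taken as the absolute gap between the two speakers’ scores. The models behind Alta are not described at this level of detail.

```python
import re
from typing import List

# Tiny placeholder lexicons; hypothetical, not the production word lists.
POSITIVE = {"strong", "growth", "improved", "record", "confident"}
NEGATIVE = {"weak", "decline", "headwinds", "loss", "uncertain"}


def polarity(text: str) -> float:
    """Aggregate polarity in [-1, 1]; 0.0 when no sentiment words appear."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def sentiment_change(scores: List[float]) -> List[float]:
    """Call-over-call change for a chronologically ordered list of scores."""
    return [curr - prev for prev, curr in zip(scores, scores[1:])]


def discordance(ceo_text: str, cfo_text: str) -> float:
    """Absolute gap between CEO and CFO polarity on the same call."""
    return abs(polarity(ceo_text) - polarity(cfo_text))


print(sentiment_change([0.5, 0.25, 0.75]))
print(discordance("We delivered record growth.", "Margins were weak."))
```

The same pattern extends to topic-level scores by restricting the input text to sentences that mention a given topic.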

Companies with the Best Change in Sentiment — as of 2019–04–29

Companies with the Worst Change in Sentiment — as of 2019–04–29

Additionally, the company-level sentiment can be aggregated to create custom indices which track analyst or executive sentiment for specific topics, an industry or a company’s competitors.

Our research indicates that the effect of earnings call sentiment is nonlinear, with negative sentiment having a larger and more predictive effect than positive sentiment.

A — Financial Crisis, B — Eurozone Crisis, C — OPEC oil disagreement

BEHAVIORAL FEATURES

The linguistic features of a call not only provide valuable insights into the operations of a particular firm; they can also shed light on the personality traits of its leaders.

Behavioral Features — Hyperbole

Our hyperbole feature measures executive overconfidence and use of hyperbole in both the prepared remarks and Q&A sections. Hyperbole is a figure of speech that uses extreme exaggeration to make a point or show emphasis and is the opposite of understatement.

Companies with the highest percentage of hyperbole in the Q&A section generally outperform companies with less exposure to this factor.

Behavioral Features — Vagueness

Vague words and phrases hedge a statement by dampening its directness or explicitness. Less vague communication from executives should enable investors and analysts to incorporate newly communicated information more easily and effectively into their expectations for the company’s future prospects.

When communicating information that may be based on estimates or potentially unreliable sources, people will often hedge through the use of vague quantifiers such as “few”, “some” or “many” in place of precise numbers.

For example, a CFO pressed about gross margins could respond:

“as I said in our guidance, we expect gross margin to be largely in-line with the long-term objective”

Our measure of vagueness is computed from the percentage of uncertain words each type of participant used during the presentation and the Q&A sessions of the call.
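The calculation described above can be sketched as a grouped word count. The example assumes a transcript already split into speaker turns labelled with a participant type and a call section; the turn format, the label values and the `UNCERTAIN_WORDS` list are all hypothetical (only “few”, “some” and “many” come from the text).

```python
import re
from collections import defaultdict

# Hypothetical lexicon: "few", "some" and "many" come from the text,
# the remaining entries are assumed for illustration.
UNCERTAIN_WORDS = {"few", "some", "many", "largely", "roughly",
                   "approximately", "maybe"}


def vagueness_by_group(turns):
    """Compute the percentage of uncertain words per group.

    turns: iterable of (participant_type, section, text) tuples.
    Returns {(participant_type, section): percent of uncertain words}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for participant_type, section, text in turns:
        tokens = re.findall(r"[a-z]+", text.lower())
        key = (participant_type, section)
        totals[key] += len(tokens)
        hits[key] += sum(t in UNCERTAIN_WORDS for t in tokens)
    # Skip groups with no tokens to avoid dividing by zero.
    return {k: 100.0 * hits[k] / totals[k] for k in totals if totals[k]}


turns = [
    ("executive", "qa", "we expect gross margin to be largely in line"),
    ("analyst", "qa", "how many units"),
]
print(vagueness_by_group(turns))
```

Grouping by both participant type and section keeps executive hedging in the Q&A separate from hedging in the prepared remarks.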

The next post in this series will demonstrate a systematic investing use case for Alta.
