Constructing Robust Financial Models Utilizing Deep Learning (Attention-based LSTMs for Aspect-Level Sentiment Classification)

Tejeswar Tadi
Intel Student Ambassadors
May 24, 2019


Proposed Application:

This project seeks to improve the robustness of financial modeling by using deep learning to develop sentiment metrics for financial performance measures. The sentiment metrics will be extracted from several sources, including but not limited to the “Management Discussion and Analysis” and “Risk Factors” portions of 10-K/10-Q (annual/quarterly financial report) documents, quarterly and annual earnings call discussions, and other public sources such as interviews or social media in which management discusses the financial performance of the firm. The 10-K/10-Q documents and other commentary provided by management are often treated as guidance for predicting a firm’s future financial performance.

The sentiment metrics generated can be used by financial modelers as an alternative data point when building more accurate financial models. Improving the accuracy of financial modeling can lead to better strategic decision-making within a firm, better insight when considering equity and credit-based investments in companies, more accurate pricing of securities, and more insight for M&A decision-making.

Background Information:

In the financial field, Artificial Intelligence has been used to improve pricing models, determine the volatility of stocks (Source 3), understand sentiment in news articles, detect fraud that may have been committed in the preparation of financial data, determine risk factors for a firm, and predict future financial performance.

Within finance, Deep Learning and Machine Learning models have previously been applied to financial documents and management discussions across various mediums to extract sentiment data. Several studies also support a correlation between the sentiment in financial discussions and financial performance. However, despite the positive correlation demonstrated by these studies, the limitations of earlier neural network models restricted how much the extracted sentiment data could realistically be used. Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs) have primarily been used to extract sentiment data in the past. These networks have various limitations, including the fact that they are mainly able to extract sentiment only at the document or sentence level, depending on the inputs to the model. This enables high-level sentiment analysis, but it is ineffective for understanding the relevance and complexity of a specific topic in the context of the entire document.

There is a need for such a solution because financial modeling is a complex task. Modelers must digest large amounts of information and develop comprehensive financial models for effective decision-making. The task is more challenging because both numerical (quantitative) and text-based (qualitative) data feed into the projections within a model. Studies have shown that equities, for example, are at times mispriced because of inaccurate financial projections (Source 4). Providing accurate sentiment metrics for textual data could help improve the quality of models and reduce error.

Solution:

This paper proposes the use of Attention-based LSTMs for Aspect-Level Sentiment Classification. By focusing attention on the aspects, also known as targets, of a given text input, sentiment can be extracted at the aspect level in the context of the entire input text. In constructing a financial model, a modeler considers both numerical (quantitative) and text-based (qualitative) data to project financial performance measures such as net sales, cost of goods sold, advertising cost, depreciation, interest income/expenses, income tax expenses, net income, assets (cash, accounts receivable, inventories, etc.), liabilities (accounts payable, accrued expenses), shareholders’ equity (common stock, retained earnings, treasury stock), and cash flow metrics. These financial performance measures are the aspects for which the model will compute sentiment values from the qualitative text data provided as input.

Attention-based LSTMs for Aspect-Level Sentiment Classification are well suited to this project because sentiment is determined at the aspect level rather than at the sentence or document level. Aspect-based sentiment classification enables the model to understand the nuances within phrases and extract sentiment based on the object or topic being discussed. Aspects are often domain-sensitive, and understanding the nuances between different aspects enables better sentiment interpretation for a given piece of text. When analyzing sentiment in financial documents, the aspects will be financial performance measures such as net sales, cost of goods sold, depreciation, and accounts receivable.

Analysts project the financial performance measures by first considering the quantitative data, which includes historical trends such as average growth rates and percentages relative to revenue, depending on the metric. The second data element modelers consider is text-based: the “Management Discussion and Analysis” portion of 10-K and 10-Q documents, quarterly and annual earnings call discussions, and other public sources such as interviews or social media in which management discusses financial performance. These sources are difficult to decipher. The modeler must use intuition to infer management’s assumptions about future financial performance and to interpret the sentiment of this data in the context of financial projections. Using both the historical quantitative data and their subjective interpretation of the text-based data, modelers are tasked with calculating accurate financial projections.

Deep learning can be used to improve the interpretation of the qualitative data. Using Attention-based LSTM models, aspect-level sentiment can be identified within this textual data. The sentiment scores generated for each financial performance measure (aspect) can be a supplementary factor for the modeler to consider. If the sentiment scores agree with a modeler’s projection, then the modeler can be more confident in that projection. If a sentiment score disagrees with a modeler’s projection, the modeler can take a second look at any data that might have been missed. This use case is comparable to how radiologists analyzing x-rays for signs of cancer may in the future use artificial intelligence to complement their diagnosis.
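
The cross-check described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the project: the function name review_flags, the sentiment scale of -1 to 1, the neutral_band threshold, and every number in the example are hypothetical.

```python
def review_flags(projected_growth, sentiment_scores, neutral_band=0.1):
    """Return the aspects whose sentiment sign disagrees with the projection.

    projected_growth: aspect -> projected growth rate (0.05 means +5%).
    sentiment_scores: aspect -> sentiment in [-1, 1] (assumed scale).
    neutral_band: scores with |score| <= neutral_band count as neutral.
    """
    flagged = []
    for aspect, growth in projected_growth.items():
        score = sentiment_scores.get(aspect)
        if score is None or abs(score) <= neutral_band:
            continue  # no usable sentiment signal for this aspect
        # Disagreement: positive sentiment with a non-positive projection,
        # or negative sentiment with a positive projection.
        if (score > 0) != (growth > 0):
            flagged.append(aspect)
    return flagged


# Hypothetical inputs, for illustration only.
projection = {"net sales": 0.30, "leasing revenue": 0.10, "interest expense": -0.02}
sentiment = {"net sales": 0.8, "leasing revenue": -0.4, "interest expense": 0.05}
print(review_flags(projection, sentiment))
```

With these made-up inputs, only the leasing revenue projection is returned for a second look, because its sentiment is negative while the projection is positive.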

Author Note (04/04/2019): To the author’s knowledge, this project and article are the first recorded proposal for improving financial modeling and sentiment analysis of financial documents/data using Deep Learning, specifically through Attention-based LSTM models for Aspect-Level Sentiment Classification of financial performance measures.

Technology Utilized:

This project will leverage Intel AI DevCloud, the Intel Distribution for Python, the Intel Optimization for TensorFlow, and pWord2Vec (a C++ implementation of Word2Vec optimized for Intel Xeon and Xeon Phi processors).

Most notably, this project will leverage NLP Architect, an open source library of AI models, notebooks, and frameworks purpose-built for a range of Natural Language Processing (NLP) tasks. NLP Architect also recently added Aspect-Based Sentiment Analysis (ABSA) models in version 0.4, released 05/02/2019. The ABSA feature is a good fit for this project because it is a lightly supervised model that can ingest unlabeled text and output opinion and aspect lexicons after domain-specific lexicons are defined.

This project also leverages other frameworks, libraries, APIs, and technologies not mentioned in this paper.

Financial Modeling:

Important financial models include the Three Statement model, the Discounted Cash Flow (DCF) model, the Leveraged Buyout (LBO) model, and the Mergers & Acquisitions (M&A) model. Other financial models exist, but within the context of this paper these are the models most likely to benefit from the sentiment metrics generated by this project. At the heart of all of these models is the Three Statement model; without it, DCF, LBO, and M&A models cannot be constructed.

A Discounted Cash Flow (DCF) model is a valuation method used to estimate the value of an investment based on its expected future cash flows. It accounts for inflation and the time value of money, and it uses the cost of equity and the cost of debt, typically combined into the weighted average cost of capital (WACC), to discount those cash flows to a present value. The present value obtained is then used to evaluate the potential investment.
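
As a rough sketch of the arithmetic behind a DCF, the present value of a series of cash flows is PV = sum over t of CF_t / (1 + r)^t, and WACC is commonly computed as (E/V) * Re + (D/V) * Rd * (1 - T). The Python below implements these two textbook formulas. The function names and all input numbers are assumptions chosen for illustration, not figures from any filing.

```python
def wacc(equity, debt, cost_of_equity, cost_of_debt, tax_rate):
    """Weighted average cost of capital with an after-tax cost of debt."""
    total = equity + debt
    return (equity / total) * cost_of_equity + (debt / total) * cost_of_debt * (1 - tax_rate)


def present_value(cash_flows, discount_rate, terminal_value=0.0):
    """Discount year-end cash flows (year 1, 2, ...) back to today.

    terminal_value, if given, is discounted from the final forecast year.
    """
    pv = sum(cf / (1 + discount_rate) ** t
             for t, cf in enumerate(cash_flows, start=1))
    pv += terminal_value / (1 + discount_rate) ** len(cash_flows)
    return pv


# Hypothetical capital structure and cash flows, for illustration only.
rate = wacc(equity=600.0, debt=400.0, cost_of_equity=0.10,
            cost_of_debt=0.05, tax_rate=0.20)
print(f"WACC: {rate:.3%}")
print(f"Present value: {present_value([110.0, 121.0, 133.1], rate):.2f}")
```

A higher discount rate lowers the present value of the same cash flows, which is why the estimates of the cost of equity and the cost of debt matter so much to the valuation.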

A Leveraged Buyout (LBO) model is used by private equity firms to evaluate what happens when a company is purchased using a combination of equity and debt. A financial buyer such as a private equity firm normally invests a small amount of equity relative to the total purchase price and uses leverage (debt and other non-equity sources of financing) to fund the remainder of the consideration paid to the seller. The private equity firm makes the purchase with the objective of revamping and restructuring business operations in order to sell the acquired firm again in the future.

A Mergers and Acquisitions (M&A) model is used when two firms are looking to merge into a single entity or when a larger entity is purchasing a smaller one. The purpose of a merger or acquisition is to enable the firms involved to benefit from a joint operation, gain a competitive edge, or operate more efficiently by taking advantage of the synergies between the two businesses.

As mentioned, the DCF, LBO, and M&A models all use a 3-Statement model as their basis. The 3-Statement model consists of the income statement, the balance sheet, and the cash flow statement. The income statement is a profit and loss statement for a specific period of time. The balance sheet is a snapshot at a specific point in time which gives insight into the company’s assets, liabilities, and capital structure. The cash flow statement gives insight into the liquidity of the business as measured by the cash flowing in and out of the firm as a result of its various business activities.

Financial Modelling Procedure:

Image made by author

The following images walk through the steps of constructing a historical statement and then projecting it for the first part of a 3-Statement model, the income statement. The goal of the examples is to help the reader understand how Attention-Based LSTM networks can enhance the financial modeling process.

Tesla Example Income Statement:

Within the Income Statement shown, the following would be considered aspects for which our model would compute sentiment: Automotive Sales (Model S, Model 3, Model X, Model Y), Automotive Leasing (Model S, Model 3, Model X, Model Y), Energy Generation and Storage, Services and Other, Net Sales/Revenue, Cost of Automotive Sales, Cost of Automotive Leasing, Cost of Generation and Storage, Cost of Services and Other, Gross Profit, Research and Development, Selling General and Administrative, Restructuring and Other, Interest Income, Interest Expense, Other Expense/Income, Income Tax, Net Income.

Revenue Calculations (Growth Rates/Average):

Tesla 10-K Risk Factors Model 3:

Image from Source 2

*Excerpt from the 10-K for Model 3, which is a revenue-generating product for Tesla

Tesla 10-K Management Discussion and Analysis Of Financial Condition And Results Of Operations:

Image from Source 2

Tesla 10-K Management Discussion and Analysis of Financial Condition and Results Of Operations:

Image from Source 2

*Excerpts from 10-K documents for Model 3 from Management Section

Based on the previous images and data, the financial modeler is tasked with constructing an accurate financial model on which to base a financial decision. Tesla’s annual financial performance filing, its 10-K, is alone 173 pages long. For one person to digest and dissect this information is an arduous task. By using sentiment analysis, this project hopes to supplement the financial modeler’s work and reduce a portion of it. Within our analysis, the following is a partial income statement projection based on the data above:

Image created by author

Based on the positive sentiment around the increase in Model 3 sales indicated in the 10-K excerpts, as well as the expectation of continued demand for Tesla’s other models, we anticipate a year-over-year growth rate of 130%. With the construction of a new Gigafactory and production process in Shanghai, China, we anticipate increasing production capabilities and sales from the end of Q4 2019 through 2020 and 2021. Tesla’s low-cost model should help the firm continue to boost its core revenue and automotive sales.

This estimated growth rate was drawn not only from the management team’s sentiment about their future financial endeavors but also from historical calculations, which showed a 53% increase in sales from 2016 to 2017 and a 106.59% increase in sales from 2017 to 2018.
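
The historical part of this calculation can be sketched as follows. The helper names and the index values are placeholders chosen for easy arithmetic (they are not Tesla’s reported sales); only the 53%, 106.59%, and 130% rates come from the text above.

```python
def yoy_growth(values):
    """Year-over-year growth rates for a chronological series of values."""
    return [(curr - prev) / prev for prev, curr in zip(values, values[1:])]


def average_growth(rates):
    """Simple arithmetic mean of a list of growth rates."""
    return sum(rates) / len(rates)


def project_next(values, growth_rate):
    """Project the next period by applying a growth rate to the last value."""
    return values[-1] * (1 + growth_rate)


# Placeholder sales index (hypothetical, NOT reported figures).
sales_index = [100.0, 150.0, 300.0]
rates = yoy_growth(sales_index)
print("historical growth rates:", rates)
print("projection at the historical average:",
      project_next(sales_index, average_growth(rates)))

# Growth rates quoted in the article, compared with the 130% projection.
article_rates = [0.53, 1.0659]
print(f"average of the article's historical rates: {average_growth(article_rates):.2%}")
print("projection used in the article: 130.00%")
```

Comparing the two printed percentages makes visible how much of the projection rests on the qualitative reading of the 10-K rather than on the historical average alone.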

By making Revenue, Sales, Model 3 Sales, Model 3 Purchases, Model 3 Revenue, Model S Sales, Model S Purchases, Model S Revenue, Model Y Sales, Model Y Purchases, Model Y Revenue, Model X Sales, Model X Purchases, and Model X Revenue the aspects, the sentiment scores generated could help us arrive at the same conclusion based on the positive sentiment contained within the 10-K.

Sentiment Analysis Techniques:

Sentiment models so far have been accurate in interpreting sentiment at the document-wide and sentence-specific level. Machine Learning models have been used for this purpose for decades. Many of these models start by tokenizing text and counting the frequency of the words used to create a bag-of-words representation. The model then refers to a pre-labeled sentiment lexicon to label this bag-of-words representation and determine the overall sentiment polarity of the document.
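
A minimal sketch of this bag-of-words approach is shown below. The tiny lexicon, the tokenizing regular expression, and the function names are assumptions made for the example; a real system would use a much larger pre-labeled lexicon.

```python
import re
from collections import Counter

# Hand-made toy lexicon (hypothetical): +1 positive, -1 negative.
LEXICON = {
    "optimistic": 1, "growth": 1, "strong": 1,
    "stagnant": -1, "decline": -1, "risk": -1,
}


def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count them."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def document_polarity(text, lexicon=LEXICON):
    """Sum lexicon scores weighted by word frequency and return a label."""
    counts = bag_of_words(text)
    score = sum(lexicon.get(word, 0) * n for word, n in counts.items())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


sentence = ("We are very optimistic about our Model 3 sales, "
            "however leasing revenue may remain stagnant.")
print(document_polarity(sentence))
```

Because the positive and negative words cancel out, the sentence as a whole is labeled neutral, even though it expresses a different sentiment about each of the two topics it mentions. This is the limitation that aspect-level classification is meant to address.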

Within the past decade, neural networks have also begun to be used for sentiment analysis tasks. The type of model primarily used for sentiment analysis is the Recurrent Neural Network (RNN). RNNs can recall information from the past because the hidden state computed at each time step is fed back in as an input at the next time step. Thus, they perform well on sequential data. This enables them to recall sentiment from a previous sentence and carry it over to the next. Their biggest pitfall is that, during backpropagation through the sequence, they suffer from the vanishing gradient problem: gradients shrink as they are propagated back over many time steps. Because of this, RNNs have difficulty learning long-term dependencies.
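
The shrinking of gradients over long sequences can be seen in a toy example. The sketch below uses a scalar RNN, h_t = tanh(w * h_(t-1) + u * x_t), and tracks how strongly the latest hidden state still depends on the initial one. The weights, inputs, and function name are assumptions picked for illustration.

```python
import math


def rnn_gradient_through_time(inputs, w=0.5, u=1.0):
    """Return |d h_t / d h_0| after each step of a scalar tanh RNN."""
    h, grad, history = 0.0, 1.0, []
    for x in inputs:
        h = math.tanh(w * h + u * x)
        # Chain rule: d h_t / d h_(t-1) = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
        history.append(abs(grad))
    return history


history = rnn_gradient_through_time([0.1] * 20)
for step in (1, 5, 10, 20):
    print(f"step {step:2d}: gradient magnitude {history[step - 1]:.2e}")
```

Each step multiplies the gradient by a factor no larger than |w| in magnitude, so with |w| < 1 the influence of early inputs fades roughly exponentially with sequence length.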

To address this, Long Short-Term Memory (LSTM) networks have been used as a solution. LSTMs are essentially Recurrent Neural Networks, but their units have a different, gated architecture which allows them to retain and use information more effectively and largely avoid the vanishing gradient problem. Because of this architecture and its memory capabilities, LSTMs are also able to capture long-term dependencies better than plain RNNs.

Image from Source 1
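
To make the gated architecture concrete, here is a sketch of one step of a standard LSTM cell, reduced to scalars for readability. The parameter layout (a dictionary of (w_x, w_h, bias) tuples) and the parameter values are assumptions for this example; real implementations use weight matrices and learn the values during training.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps a gate name to a (w_x, w_h, bias) tuple.
    Returns the new hidden state h and cell state c.
    """
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    i = sigmoid(pre_activation("input"))        # how much new information to write
    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # candidate new information
    c = f * c_prev + i * g                      # additive cell-state update
    h = o * math.tanh(c)
    return h, c


# Hypothetical parameters: forget gate almost open, input gate almost closed.
params = {
    "input": (0.0, 0.0, -20.0),
    "forget": (0.0, 0.0, 20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = 0.0, 1.0
for _ in range(50):
    h, c = lstm_step(0.3, h, c, params)
print(f"cell state after 50 steps: {c:.6f}")
```

With the forget gate close to 1 and the input gate close to 0, the cell state is carried across 50 steps almost unchanged, which illustrates how an LSTM can preserve information over long spans when its gates learn to do so.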

Both LSTMs and RNNs operate on words that have been embedded as word vectors. Word vectors are used for a plethora of natural language processing tasks, including sentiment analysis. The advantage of word vectors is that they capture contextual information about how words are used. As a result, models built on them can classify sentiment based on contextual information and provide better results.

However, LSTMs and RNNs on their own are still only able to provide sentiment at the document-wide or sentence-specific level. To generate greater insight from the sentiment data produced by neural networks, this project proposes Attention-based LSTMs for Aspect-level Sentiment Classification.

Attention-Based LSTMs for Financial Aspect-level Sentiment Classification:

Attention-Based LSTMs for Financial Aspect-level Sentiment Classification can help generate sentiment metrics at a more detailed level for specific financial performance measures described in financial documents such as the 10-K. By extracting sentiment at the aspect level, better financial models can be constructed, as previously discussed in this paper.

Attention-Based LSTMs for Aspect-level Sentiment Classification were first described in Source 1 (see Sources). Attention has become an effective mechanism over the past few years for tasks such as image recognition, machine translation, and sentence summarization. Similarly, attention mechanisms can be used to focus on aspects within a text to improve performance on sentiment classification tasks. One method proposed in the study is to concatenate the aspect vector to the sentence’s hidden representations when computing attention weights.

Image from Source 1

Through aspect embedding, the attention weights can be conditioned on the aspect. Thus, when given an aspect, sentiment can be computed by placing greater attention on the parts of the sentence relevant to that aspect. Additionally, the study proposes appending the aspect vector to the input word vectors to obtain better results.

Image from Source 1
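
The idea of conditioning attention weights on an aspect can be sketched as follows. This is a simplified additive-attention variant, not the exact formulation from Source 1: each hidden state is combined with a projected aspect vector (equivalent to concatenating the two and applying one linear map), scored, and the scores are normalized with a softmax. The function names, dimensions, and all values are assumptions for illustration; in a trained model the hidden states would come from an LSTM and the weights would be learned.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aspect_attention(hidden_states, aspect_vec, W_h, W_v, w):
    """Return attention weights and the attention-weighted sentence vector."""
    projected_aspect = matvec(W_v, aspect_vec)
    scores = []
    for h in hidden_states:
        mixed = [math.tanh(a + b)
                 for a, b in zip(matvec(W_h, h), projected_aspect)]
        scores.append(sum(wi * mi for wi, mi in zip(w, mixed)))
    alpha = softmax(scores)
    dim = len(hidden_states[0])
    sentence_vec = [sum(a * h[k] for a, h in zip(alpha, hidden_states))
                    for k in range(dim)]
    return alpha, sentence_vec


# Hypothetical 2-dimensional hidden states for a three-word sentence.
hidden = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = [1.0, 1.0]

alpha_a, _ = aspect_attention(hidden, [2.0, 0.0], identity, identity, weights)
alpha_b, _ = aspect_attention(hidden, [0.0, 2.0], identity, identity, weights)
print("attention for aspect A:", [round(a, 3) for a in alpha_a])
print("attention for aspect B:", [round(a, 3) for a in alpha_b])
```

The same sentence receives different attention weights depending on which aspect vector is supplied, which is what allows a single sentence to yield a separate sentiment for each aspect.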

An example of a sentence that we could see in 10-K documents for Tesla is:

“We are very optimistic about our Model 3 sales in 2019; however, we anticipate that our leasing revenue may remain stagnant as a result of the cheaper Model 3 price, which may push consumers to purchase the lower-end model over leasing Tesla cars.”

If aspect-based sentiment classification is applied to this sentence, it can detect that ‘Model 3 sales’ has a positive sentiment, while ‘leasing revenue’ has a negative sentiment. An analyst reading this sentence can easily understand this; however, with sentiment metrics generated for these aspects, analysts will be able to check whether their deduction was accurate.

Conclusion:

It is evident that sentiment metrics generated for financial performance measures mentioned in financial documents such as 10-Ks can provide real value to financial modelers. The sentiment data can provide insight into the future performance of the firm and thus help project future financial performance. Presented as an alternative data point, these metrics can enable financial modelers to construct more robust and comprehensive models, which can ultimately lead to better financial decision-making.

By improving the 3-Statement model, the remaining models, such as the Discounted Cash Flow (DCF) model, Leveraged Buyout (LBO) model, and Mergers and Acquisitions (M&A) model, can ultimately all be improved. DCF, LBO, and M&A models all build on the projections made in the 3-Statement model.

Attention-Based LSTMs for Financial Performance Aspect-Classification can yield direct sentiment insight for each financial performance metric. Such a fine-grained analysis was previously difficult because of the complex nature and dynamics of the language used within financial documents. If there is a strong correlation between future financial performance and the sentiment metrics extracted, it may in the future be possible to automate part of the financial modeling process, as a regression can be run on the sentiment metrics along with the other metrics that feed into financial projections.
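
As a sketch of what such a regression could look like, the snippet below fits a one-variable ordinary least squares line relating a sentiment score to a subsequent growth rate. The data points are invented purely to demonstrate the mechanics and do not represent any measured relationship; the function name is also an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: sentiment score per period and the growth that followed.
sentiment_scores = [-0.5, 0.0, 0.5, 1.0]
observed_growth = [0.0, 0.1, 0.2, 0.3]

slope, intercept = fit_line(sentiment_scores, observed_growth)
predicted = slope * 0.8 + intercept
print(f"slope={slope:.3f}, intercept={intercept:.3f}, "
      f"predicted growth at sentiment 0.8: {predicted:.3f}")
```

In practice the sentiment score would be one feature among several, alongside the historical quantitative metrics, and the strength of the relationship would need to be established on real data before any part of the process is automated.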

Sources:

1) Wang, Huang, Zhao, Zhu (2016), “Attention-based LSTM for Aspect-Level Sentiment Classification”.

2) Tesla, Inc. (2018) 10-K Annual Report.

3) Rekabsaz, Lupu, Baklanov, Hanbury (2017), “Volatility Prediction using Financial Disclosure Sentiments with Word-Embedding-based IR Models”.

4) Cohen, Malloy, Nguyen (2018), “Lazy Prices”.

5) He, Lee, Ng, Dahlmeier (2018), “Effective Attention Modeling for Aspect-Level Sentiment Classification”.
