Discovering alternative ESG insights using NLP

Emmanuel Vallod & Trey Heiskell, CFA (guest blogger), July 2019

In previous articles we outlined the degree to which the use of unstructured data is growing, giving examples of how this can be harnessed by professionals of all types through the use of text analytics. This rapidly growing field within Artificial Intelligence blends Natural Language Processing (NLP), Computational Linguistics and Machine Learning and enables users to rapidly process large amounts of information. We now turn our eye to the topic of Environmental, Social and Governance (ESG) investing and the use of text analytics to develop unique insights.

We have witnessed rapid growth in ESG interest on the part of professional and individual investors. Evidence of this is seen in the increasing number of ESG data providers and users, in adoption by institutional investors, and in fund flows and investment vehicles (sources provided at end of article). The clear trend is that ESG-related services and products are likely to experience continued wide-scale growth.

What makes this particularly interesting is the subjective nature of ESG investing, wherein variability of definitions and implementation can make it difficult for consumers to differentiate between offerings. Text analytics can provide greater insight into this burgeoning area. Below we specifically address how investors seeking to identify non-traditional ESG exposures and build unique investment insights can readily harness the power of SumUp’s text analytics platform.

Case study

To illustrate how readily this can be accomplished, we have created a case study where we develop ESG scores for ten large stocks in the Information Technology sector using the corporate filings available through the SumUp platform. We have used the period from January 1, 2015 to June 1, 2019 for this analysis. The ten companies are: Accenture, Cisco, Intel, Microsoft, PayPal, Apple, IBM, Mastercard, Oracle, and Salesforce.

We have also chosen to use an open-source Jupyter notebook for this example to illustrate how easy it is for portfolio managers to reproduce and adapt our code. A detailed version of this work is available here. We provide an abbreviated version of this process below.

In the example, we have extracted a set of key topics for each of the ESG pillars, based on what these companies discuss in their corporate filings. For the environmental pillar, for instance, we define a series of keywords associated with the topic, including: Biodiversity, Carbon, Cleantech, Clean, Climate, Coal, Conservation, Ecosystem, Emission, Energy, Fuel, Green, Land, Natural, Pollution, Raw materials, Renewable, Resources, Sustainability, Sustainable, Toxic, Waste, Water.
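As a rough illustration of the keyword step, the sketch below flags a topic as environmental when any of its key phrases mentions one of the keywords listed above. The helper name `is_environmental` and the simple lowercase matching rule are assumptions made for illustration only; they are not the SumUp platform's API or the notebook's actual code.

```python
# Environmental keywords taken from the list above (lowercased).
ENVIRONMENTAL_KEYWORDS = {
    "biodiversity", "carbon", "cleantech", "clean", "climate", "coal",
    "conservation", "ecosystem", "emission", "energy", "fuel", "green",
    "land", "natural", "pollution", "raw materials", "renewable",
    "resources", "sustainability", "sustainable", "toxic", "waste", "water",
}


def is_environmental(topic_phrases):
    """Return True if any phrase mentions an environmental keyword.

    Hypothetical helper: single-word keywords are matched against whole
    words, multi-word keywords (e.g. "raw materials") as substrings.
    """
    for phrase in topic_phrases:
        text = phrase.lower()
        words = set(text.split())
        for keyword in ENVIRONMENTAL_KEYWORDS:
            if " " in keyword:
                if keyword in text:
                    return True
            elif keyword in words:
                return True
    return False


# Key phrases in the semicolon-separated form used in this article.
climate_topic = "climate change;global climate;effects climate".split(";")
print(is_environmental(climate_topic))
print(is_environmental(["operating segments", "retail consumer"]))
```

Whole-word matching is deliberately strict, so a plural such as "emissions" would not match "emission"; a production pipeline would more likely stem or lemmatize the text first.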

We then take Accenture’s corporate filings and compute the company’s exposure to its top six topics, along with the sentiment score associated with each:

Keywords: amd products;amd business;adversely affect;material adverse;natural disasters;adverse amd;materially adversely;timely basis; Exposure: 0.019; Sentiment: -0.48

Keywords: natural disaster;power loss;loss telecommunications;disaster power;telecommunications failure;financial condition;unauthorized entry;damage interruption; Exposure: 0.032; Sentiment: -0.71

Keywords: climate change;global climate;effects climate;change regulations;posed climate;challenges posed;change water;occurring frequently;disasters occurring;change result; Exposure: 0.12; Sentiment: -0.16

Keywords: operating segments;energy utilities;manufacturing logistics;travel hospitality;retail consumer;consumer manufacturing;products resources;logistics energy; Exposure: 0.013; Sentiment: -0.12

Keywords: electronic products;energy efficiency;hazardous substances;recycling electronic;products accessories;laws focused;focused energy;efficiency electronic;accessories recycling; Exposure: 0.05; Sentiment: -0.04

Keywords: public health;health issues;war terrorism;disasters public;political events;industrial accidents;trade disputes;international trade; Exposure: 0.07; Sentiment: -0.55

We measured sentiment for each of the topics and classified them as having “good” or “bad” connotations, then identified the exposure each company has to each topic. Note that the sentiment for every one of Accenture’s top six topics is negative. Next, we aggregated each company’s exposures across topics to rank the companies within each ESG pillar.
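One simple way to turn topic-level numbers into a single pillar score is to weight each topic's sentiment by the company's exposure to it and sum the results. The sketch below applies that rule to the six Accenture topics listed above. The exposure-weighted sum and the function name `pillar_score` are assumptions for illustration; the article does not spell out the exact aggregation used.

```python
# (exposure, sentiment) pairs for the six Accenture topics listed above.
accenture_topics = [
    (0.019, -0.48),
    (0.032, -0.71),
    (0.12, -0.16),
    (0.013, -0.12),
    (0.05, -0.04),
    (0.07, -0.55),
]


def pillar_score(topics):
    """Exposure-weighted sum of topic sentiment (an assumed aggregation rule)."""
    return sum(exposure * sentiment for exposure, sentiment in topics)


score = pillar_score(accenture_topics)
print(round(score, 4))  # -0.0931
```

Because every topic carries negative sentiment, the aggregate is negative as well; under this rule a company improves its score by having more exposure to "good" topics or less exposure to "bad" ones.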

Below we plot the results of the three ESG-related pillars plus a composite score, which provides some very interesting insights. Not surprisingly, the top company is the one that displays the highest exposure to the “good” topics and/or the least exposure to the “bad” topics. We begin with the Social or “S” pillar, where we see very little dispersion in our rankings and persistently negative scores for the companies on the whole.

Information Technology — Social Pillar

Relatively low dispersion across companies with mainly “negative” scores

It would be easy to focus on the negative scores shown in the above chart, but what really matters is the lack of dispersion across all ten stocks. There could be a number of reasons for this such as the use of conservative language in the filings. We will not dig into this question in this article but think it worth mentioning that this does not take away from the potential use of negative scores. In practice, the rankings of the companies may be more important than the raw scores themselves as systematic managers often normalize the results prior to implementation.
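The normalization mentioned above can be sketched as a cross-sectional z-score plus a rank. The company labels and raw scores below are invented purely for illustration and are not results from the case study; the function names are likewise hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical raw Social-pillar scores, invented for illustration only.
raw_scores = {
    "Company A": -0.12,
    "Company B": -0.10,
    "Company C": -0.15,
    "Company D": -0.11,
}


def zscores(scores):
    """Standardize scores across companies (mean 0, unit population std)."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        # No dispersion at all: every company gets a neutral score.
        return {name: 0.0 for name in scores}
    return {name: (value - mu) / sigma for name, value in scores.items()}


def ranks(scores):
    """Rank companies from best (1) to worst by raw score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


print(zscores(raw_scores))
print(ranks(raw_scores))
```

Even when every raw score is negative, the ranks and z-scores still separate the companies, which is why uniformly negative sentiment does not by itself make the signal unusable.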

In the Governance or “G” pillar we see very little dispersion in our raw scores again, suggesting a lack of differentiation, with the exception of the very recent period. This may be a reflection of these technology companies having fairly standardized language when it comes to describing their corporate governance. In this exercise, the G pillar is of limited use to compare Information Technology companies.

Information Technology — Governance Pillar

Low dispersion in rankings

It is the high dispersion in the Environmental category (below) that commands our attention. The results for the companies in the “E” pillar also exhibit relatively stable sentiment for our holdings. This suggests there is opportunity for differentiation within the Info Tech sector, to the degree to which companies (and managers) focus on the Environmental pillar.
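Dispersion can be made concrete as the cross-sectional standard deviation of scores within a pillar at a given date. The sketch below assumes that definition; the numbers are invented for illustration only and merely mimic the qualitative pattern described in this article (wide "E", narrow "S" and "G").

```python
from statistics import pstdev

# Hypothetical pillar scores for four companies, invented for illustration only.
pillar_scores = {
    "E": [0.30, -0.25, 0.05, -0.40],
    "S": [-0.12, -0.10, -0.15, -0.11],
    "G": [-0.05, -0.04, -0.06, -0.05],
}


def dispersion(scores):
    """Cross-sectional population standard deviation of one pillar's scores."""
    return pstdev(scores)


for pillar, scores in pillar_scores.items():
    print(pillar, round(dispersion(scores), 3))
```

A pillar whose dispersion is close to zero gives a manager little basis for tilting between companies, whatever the level of the scores.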

Information Technology — Environmental Pillar

High dispersion with relatively stable sentiment by company

In aggregate (reflecting a basic equal weighting across the three pillars) the combined ESG scores indicate there is sufficient dispersion to potentially take active tilts within the sector. If we were an ESG equity portfolio manager seeking to make active calls in portfolio construction, we would view the dispersion in the “E” pillar as the primary driver of our overall relative positioning. As mentioned, we have equally weighted the pillars in this example. In practice we would likely overweight or underweight individual pillars depending on the relevance and effectiveness of each.
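The equal weighting described above, and the alternative of tilting toward a pillar, can be sketched as a weighted average. The function name `composite_score`, the example pillar scores, and the tilted weights are all hypothetical and chosen only to show the arithmetic.

```python
def composite_score(pillars, weights=None):
    """Weighted average of pillar scores; equal weights when none are given."""
    if weights is None:
        weights = {name: 1.0 for name in pillars}
    total_weight = sum(weights[name] for name in pillars)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(pillars[name] * weights[name] for name in pillars)
    return weighted_sum / total_weight


# Hypothetical pillar scores for one company, invented for illustration only.
company = {"E": 0.30, "S": -0.12, "G": -0.06}

equal = composite_score(company)                                   # (0.30 - 0.12 - 0.06) / 3
tilted = composite_score(company, {"E": 2.0, "S": 1.0, "G": 1.0})  # E counts double

print(round(equal, 3))   # 0.04
print(round(tilted, 3))  # 0.105
```

Dividing by the total weight keeps the composite on the same scale as the pillar scores, so equal-weighted and tilted results remain directly comparable.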

Information Technology — Aggregate Score

Reasonable dispersion and stability in scores, dispersion driven by “E” scores

This is a relatively simple case study, but one designed to illustrate the potential of the APIs. By comparison, searching for an ESG data vendor can be time-consuming and still requires a lengthy onboarding and integration process. It is hard to find both the right data and the right people to get usable ESG information at a reasonable cost. We are not saying that vendor information is not worthwhile (in our view it very often is), but rather that putting this information to practical use can be a major hurdle to success.

SumUp can be a powerful alternative, and complement, to ESG data vendors without the commensurate implementation headaches. There is also the potential to be less reliant on a third party’s methodology in developing unique insights. In our opinion the flexibility and transparency baked into our web application and APIs allow users a greater degree of control and creativity over the development of proprietary insights.

The platform can also be applied to non-standard areas (e.g., investment-grade or high-yield credit) where there is less practical ESG implementation and data. Expanding into credit is a logical area for asset allocators to look for additional ESG exposure.

Learn more about what Nucleus’ text-analytics platform can do for your business. Visit our website at

Sources: 1) Douglas, Elyse, Tracy Van Holt, and Tensie Whelan. “Data Providers and Relevant Trends.” Journal of Environmental Investing 8.1 (2017); 2) 2019 Bloomberg Impact Report; 3) 2018 Callan ESG Survey; 4) 2018 Morningstar Sustainable Funds Landscape.
