Machine Learning for ESG Stock Trading: PCA and Clustering

Hugh Donnelly
Published in Analytics Vidhya
16 min read · Jun 30, 2021


Introduction

A Pairs Trading strategy depends on finding eligible pairs that exhibit unconditional mean-reverting behavior. We walk through an example implementation of the pair search and then backtest a selected pair. Along the way, we show how popular Machine Learning algorithms can help us navigate a very high-dimensional search space to find tradable pairs.

Jupyter Notebooks are available on Google Colab and GitHub.

For this project, we use the Python scientific computing libraries listed below.

import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans, DBSCAN
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn import preprocessing
from statsmodels.tsa.stattools import coint
from scipy import stats
import requests
from bs4 import BeautifulSoup
import time
import pymc3 as pm
import theano as th
import seaborn as sns

1. Define the Stock Universe

We start by specifying that we will constrain our search for pairs to a large and liquid single stock universe. To achieve this, we create a function that scrapes the…

Hugh founded AlphaWave Data in 2020 and is responsible for risk, attribution, portfolio construction, and investment solutions.