Machine Learning for ESG Stock Trading: PCA and Clustering
Introduction
In developing a pairs trading strategy, it is critical to find eligible pairs that exhibit unconditional mean-reverting behavior. We walk through an example implementation of finding eligible pairs and then backtest a selected pair. Along the way, we show how popular Machine Learning algorithms can help us navigate a very high-dimensional search space to find tradable pairs.
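To make "mean-reverting" concrete, here is a minimal sketch on synthetic data, not the method used later in this article. It assumes two hypothetical price series driven by one shared random walk and compares the lag-1 autoregressive coefficient of a single price with that of the spread between the two. The helper `ar1_coefficient`, the series names, and the noise parameters are all illustrative assumptions. A coefficient close to 1 is consistent with a random walk, while a coefficient well below 1 suggests the series is pulled back toward its mean. This is only a rough diagnostic; a formal check would use a cointegration test such as `coint` from statsmodels.

```python
import random


def ar1_coefficient(series):
    """OLS slope from regressing series[t] on series[t-1], with an intercept."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical synthetic data: one shared random walk drives both prices.
common = [100.0]
for _ in range(999):
    common.append(common[-1] + rng.gauss(0, 1))

# Each price is the shared trend plus its own independent noise.
price_a = [c + rng.gauss(0, 0.5) for c in common]
price_b = [c + rng.gauss(0, 0.5) for c in common]

# The shared trend cancels in the spread, leaving only stationary noise.
spread = [a - b for a, b in zip(price_a, price_b)]

phi_price = ar1_coefficient(price_a)   # expected to be close to 1
phi_spread = ar1_coefficient(spread)   # expected to be far below 1

print(f"AR(1) coefficient, single price: {phi_price:.3f}")
print(f"AR(1) coefficient, spread:       {phi_spread:.3f}")
```

Neither price is predictable on its own, but their difference keeps returning to a stable level, which is the property a pairs trade relies on.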
Jupyter Notebooks are available on Google Colab and GitHub.
For this project, we use the Python scientific computing libraries listed below.
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans, DBSCAN
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn import preprocessing
from statsmodels.tsa.stattools import coint
from scipy import stats
import requests
from bs4 import BeautifulSoup
import time
import pymc3 as pm
import theano as th
import seaborn as sns
1. Define the Stock Universe
We start by constraining our search for pairs to a large, liquid single-stock universe. To achieve this, we create a function that scrapes the…