Are you still diversifying using the Markowitz model? Welcome to the 21st century.

Diogo Seca

Published in

Analytics Vidhya

4 min read · Jul 14, 2020

This article is Part 1 of a series of articles on Diversification.

Most retail traders, a.k.a. “dumb money”, don’t diversify.

In contrast, professional portfolio managers diversify:

across markets;
across sectors;
across asset types;
across investment strategies.

Professional portfolio managers are more afraid of the occasional downswings and crashes than retail traders are. And rightly so. The best managers monitor their exposure and seek to rebalance it according to their perceived optimum portfolio weights. They understand that diversification is key to minimizing risk.

The most popular and most widely used model for calculating the optimal portfolio weights is the Markowitz model, together with its Efficient Frontier.
This is now considered ancient technology.

There are two main issues with the Markowitz Model:

Markowitz is always overfitting the training data.
Markowitz’s result is an optimization process for the weights that would have been profitable in the past. The Markowitz model does not learn a set of rules for predicting the future. Instead, this model can be summed up as “Has this portfolio worked well in the past 10 years? Then it is sure to work well for the next 10!”. You can swap the 10 years for whatever time window you like; you’ll still be overfitting.
It finds the weights that maximize the Sharpe ratio. The Sharpe ratio sucks.
“What?! But everybody is using the Sharpe ratio!”
Using Sharpe for showcasing portfolio metrics to clients is ok; using it for daily quant work is not ok. The Sharpe ratio’s denominator is the standard deviation of returns, which is a good measure of variability but not a good measure of risk. The Sharpe ratio penalizes large positive swings, penalizes accelerating returns, and fails to penalize decelerating returns. For experimental evidence, see: https://www.crystalbull.com/sharpe-ratio-better-with-log-returns/
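
Both issues can be illustrated with a few lines of NumPy. This is only a sketch on synthetic returns, not the article's data: the helper names `sharpe` and `max_sharpe_weights` are hypothetical, the max-Sharpe weights use the unconstrained closed form (proportional to the inverse covariance times the mean), and the risk-free rate is assumed to be zero.

```python
import numpy as np

def sharpe(returns, risk_free=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample standard deviation."""
    excess = np.asarray(returns, dtype=float) - risk_free
    return excess.mean() / excess.std(ddof=1)

def max_sharpe_weights(returns):
    """Unconstrained in-sample max-Sharpe weights, proportional to inv(cov) @ mean.

    Rescaled so that the absolute weights sum to 1 (short positions are allowed).
    """
    mu = returns.mean(axis=0)
    sigma = np.cov(returns, rowvar=False)
    raw = np.linalg.solve(sigma, mu)
    return raw / np.abs(raw).sum()

# Issue 1: weights fitted on one window are only guaranteed optimal on that window.
rng = np.random.default_rng(0)
# Synthetic assumption: 5 independent assets with identical mean and volatility,
# so equal weights are the true optimum and any deviation is fitted noise.
returns = rng.normal(loc=0.005, scale=0.04, size=(240, 5))
train, test = returns[:120], returns[120:]

w_fit = max_sharpe_weights(train)
w_equal = np.full(5, 0.2)

in_sample_fit = sharpe(train @ w_fit)      # never below in_sample_equal, by construction
in_sample_equal = sharpe(train @ w_equal)
out_sample_fit = sharpe(test @ w_fit)      # no such guarantee on unseen data
out_sample_equal = sharpe(test @ w_equal)

# Issue 2: a large positive swing lowers the Sharpe ratio.
steady = np.array([0.01, 0.02] * 6)        # 12 months of 1% or 2% returns
with_windfall = steady.copy()
with_windfall[-1] = 0.30                   # same months, but the last one returns 30%

sharpe_steady = sharpe(steady)
sharpe_windfall = sharpe(with_windfall)    # lower, although no month is worse
```

Comparing `in_sample_fit` with `out_sample_fit` shows how much of the fitted edge survives on the held-out window. The second pair shows that a return series that is at least as good in every month can still score a lower Sharpe ratio, because the windfall inflates the standard deviation.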

Enter Machine Learning.

There are several ways we can frame the problem of portfolio optimization as a Machine Learning problem:

Reinforcement Learning: learning the optimal increase/decrease in portfolio weights.
Supervised Learning: learning the optimal portfolio weights for the next N days/months/years.
Unsupervised Learning: learning clusters of assets, according to their price and fundamentals similarity.

Today, I will focus on the last of these framings, Unsupervised Learning. Why? Because learning groups of similar assets from historical data tells us how to diversify across them.

We will be starting with the following dataset:

Each line describes an instance. Each instance contains quantitative information about a given stock for a given year, as well as its GICS sector, industry_group, industry, and sub_industry.

We select the top 50 industries that contain the most instances:

Machinery                                         1379
Oil, Gas & Consumable Fuels                       1312
Chemicals                                          916
Specialty Retail                                   899
Energy Equipment & Services                        779
Aerospace & Defense                                634
Health Care Equipment & Supplies                   582
Electronic Equipment, Instruments & Components     485
Hotels, Restaurants & Leisure                      480
Food Products                                      463
Metals & Mining                                    452
Commercial Services & Supplies                     424
Health Care Providers & Services                   387
Containers & Packaging                             362
Textiles, Apparel & Luxury Goods                   302
IT Services                                        301
Construction & Engineering                         272
Professional Services                              270
Household Durables                                 255
Building Products                                  250
Auto Components                                    250
Media                                              241
Life Sciences Tools & Services                     226
Multiline Retail                                   223
Pharmaceuticals                                    221
Household Products                                 203
Electrical Equipment                               203
Diversified Consumer Services                      197
Trading Companies & Distributors                   194
Equity Real Estate Investment Trusts (REITs)       170
Food & Staples Retailing                           167
Diversified Telecommunication Services             166
Technology Hardware, Storage & Peripherals         160
Road & Rail                                        157
Leisure Products                                   149
Capital Markets                                    137
Personal Products                                  136
Automobiles                                        134
Paper & Forest Products                            125
Industrial Conglomerates                           125
Entertainment                                      118
Software                                           115
Marine                                             106
Tobacco                                            103
Beverages                                          101
Airlines                                            90
Construction Materials                              81
Real Estate Management & Development                80
Gas Utilities                                       65
Air Freight & Logistics                             63
Name: gics_industry, dtype: int64

We then filter our data so that it contains only those GICS industries.
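
The selection and filtering steps can be sketched in pandas as follows. The `gics_industry` column name comes from the output above; the toy DataFrame, the `revenue` column, and `TOP_N = 2` are stand-ins for illustration only (the article uses the real dataset and the top 50).

```python
import pandas as pd

# Toy stand-in for the article's dataset: one row per (stock, year) instance.
df = pd.DataFrame({
    "gics_industry": ["Machinery", "Machinery", "Machinery",
                      "Chemicals", "Chemicals", "Airlines"],
    "revenue": [10.0, 12.0, 11.0, 7.0, 8.0, 3.0],  # hypothetical numeric column
})

TOP_N = 2  # the article uses 50

# value_counts() sorts industries by number of instances, largest first.
counts = df["gics_industry"].value_counts()
top_industries = counts.head(TOP_N).index

# Keep only the rows whose industry is among the most populated ones.
filtered = df[df["gics_industry"].isin(top_industries)].reset_index(drop=True)
```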

We then standardize the numerical information:
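
One common way to standardize is a z-score per column: subtract the column mean and divide by the column standard deviation. The sketch below assumes that approach; the article does not say which scaler was used, and the column names `revenue` and `net_margin` are hypothetical.

```python
import pandas as pd

# Toy stand-in for the filtered dataset.
df = pd.DataFrame({
    "gics_industry": ["Machinery", "Machinery", "Chemicals", "Chemicals"],
    "revenue": [10.0, 14.0, 6.0, 10.0],       # hypothetical numeric column
    "net_margin": [0.10, 0.20, 0.05, 0.25],   # hypothetical numeric column
})

# Only numeric columns are standardized; categorical GICS labels are left as-is.
numeric_cols = df.select_dtypes(include="number").columns

standardized = df.copy()
standardized[numeric_cols] = (
    df[numeric_cols] - df[numeric_cols].mean()
) / df[numeric_cols].std(ddof=0)
```

After this step every numeric column has mean 0 and unit variance, so no single feature dominates a distance or kernel computation just because of its scale.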

We can now calculate and visualize the dissimilarity between industries, measured by the Maximum Mean Discrepancy (MMD) between the samples of the different industries.
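
As a rough illustration of how such a dissimilarity matrix can be computed, here is a minimal NumPy sketch of the biased squared-MMD estimator. The Gaussian (RBF) kernel, the bandwidth `gamma=1.0`, and the synthetic `industry_*` samples are all assumptions; the article does not state which kernel or estimator was used.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix: exp(-gamma * squared Euclidean distance)."""
    sq_dists = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2 * a @ b.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd2(x, y, gamma=1.0):
    """Biased estimate of squared MMD between samples x and y (rows are instances)."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2 * rbf_kernel(x, y, gamma).mean())

# Synthetic stand-ins for standardized per-industry samples (3 features each).
rng = np.random.default_rng(0)
samples = {
    "industry_a": rng.normal(0.0, 1.0, size=(200, 3)),
    "industry_b": rng.normal(0.0, 1.0, size=(200, 3)),  # same distribution as a
    "industry_c": rng.normal(2.0, 1.0, size=(200, 3)),  # shifted distribution
}

names = list(samples)
dissimilarity = np.array(
    [[mmd2(samples[i], samples[j]) for j in names] for i in names]
)
```

The resulting matrix is symmetric with zeros on the diagonal, and industries drawn from the same distribution end up much closer to each other than to the shifted one. It can be passed to any heatmap function for visualization.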

We can also frame this as a Hierarchical Clustering problem and use MMD as a linkage metric between industries and clusters of industries:
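
One way to read "MMD as a linkage" is a greedy agglomerative procedure: repeatedly merge the two clusters whose pooled samples have the smallest MMD, then recompute MMD against the merged cluster. The sketch below follows that reading in plain NumPy. It is an assumption about the procedure, not the notebook's code; `mmd_agglomerate`, the RBF kernel, and the synthetic samples are hypothetical.

```python
import numpy as np
from itertools import combinations

def mmd2(x, y, gamma=1.0):
    """Biased squared MMD with a Gaussian kernel (kernel choice is an assumption)."""
    def k(a, b):
        sq = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2 * a @ b.T
        return np.exp(-gamma * np.maximum(sq, 0.0))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def mmd_agglomerate(samples):
    """Greedy bottom-up clustering; returns a list of (cluster_a, cluster_b, distance)."""
    # Each cluster is keyed by the tuple of industry names it contains.
    clusters = {(name,): data for name, data in samples.items()}
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest MMD between pooled samples.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: mmd2(clusters[pair[0]], clusters[pair[1]]),
        )
        distance = mmd2(clusters[a], clusters[b])
        # Pool the two clusters' samples into one merged cluster.
        clusters[a + b] = np.vstack([clusters.pop(a), clusters.pop(b)])
        merges.append((a, b, distance))
    return merges

# Synthetic stand-ins: a and b share one distribution, c and d share another.
rng = np.random.default_rng(0)
samples = {
    "industry_a": rng.normal(0.0, 1.0, size=(100, 2)),
    "industry_b": rng.normal(0.0, 1.0, size=(100, 2)),
    "industry_c": rng.normal(3.0, 1.0, size=(100, 2)),
    "industry_d": rng.normal(3.0, 1.0, size=(100, 2)),
}

merges = mmd_agglomerate(samples)
for left, right, distance in merges:
    print(left, "+", right, f"MMD^2 = {distance:.3f}")
```

The merge list plays the role of a dendrogram: early merges join similar industries at a small distance, and the final merge joins the two dissimilar groups at a much larger one.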

Based on the chart above, we can see that some industries are being clustered with industries of different sectors. Therefore, the data indicates that we’re better off diversifying by industries than diversifying by sectors.

This concludes Part 1.
The original Jupyter Notebook for this experiment can be found here.

In Part 2, we will look at ML-based diversification strategies and compare their forward-testing / out-of-sample results. This will include some methods from the MlFinLab Python package.
