Strategic Asset Allocation with Python

Mariano Scandizzo, CFA, CQF
8 min read · Jun 14, 2018

Introduction

The following analysis compares the risk-adjusted returns of two diversified portfolios.

Research Objectives:

  • Compare the marginal contribution to risk-adjusted return of adding EM Debt to a portfolio versus adding Gold.
  • Create a basic framework to analyze and compare portfolios with N assets using Python.
  • Create easy-to-deploy visualizations and simulations based on descriptive statistics and Monte Carlo concepts.

Note: Because the full code is long, only the code related to the statistical analysis is included here; the visualization code is skipped (the full code file and GitHub link are provided at the end of the article).

Portfolio components & key statistics

Dataset:
Source: Yahoo Finance
Assets:

  1. SPY : SPDR S&P 500 ETF
  2. QQQ : PowerShares QQQ ETF
  3. AGG : iShares Core US Aggregate Bond ETF
  4. GLD : SPDR Gold Shares
  5. EMB : iShares JP Morgan USD Em Mkts Bd ETF

Period:

  • Start: 4/29/2013
  • End: 4/30/2018
  • Periodicity: Weekly
  • Data points: 262
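
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
plt.style.use('ggplot')
semana = 52  # weeks per year, used to annualize the weekly statistics
# Weekly prices: one column per asset, indexed by date
datos = pd.read_excel('MasterAllocation.xlsx', sheet_name='Summary', index_col='Date')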

Historical evolution
Data visualization is an important step to develop an initial intuition.
Below, the historical price evolution for the full observation period:

Historical Prices

Relative performance by Asset Class
Each asset class has a different initial value, which makes it difficult to compare their relative performance.
What would have happened if you had invested one dollar in each asset at the beginning of the observation period?
Let us normalize the data to compare the performance of that initial one-dollar investment in each asset.

normalized_series = (datos/datos.iloc[0])
Normalized Prices
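
To make the normalization step concrete, here is a minimal sketch on a tiny made-up price table. The tickers "AAA" and "BBB" and their prices are hypothetical and are not part of the article's dataset; only the division by the first row mirrors the line above.

```python
import pandas as pd

# Hypothetical weekly closing prices for two made-up tickers (illustrative values only).
prices = pd.DataFrame(
    {"AAA": [50.0, 55.0, 60.0], "BBB": [200.0, 190.0, 210.0]},
    index=pd.to_datetime(["2018-01-05", "2018-01-12", "2018-01-19"]),
)

# Divide every row by the first row so that each series starts at 1.0,
# i.e. the value of one dollar invested on the first date.
normalized = prices / prices.iloc[0]

print(normalized)
```

After normalization the last row reads directly as growth of one dollar: the cheaper asset and the more expensive one become comparable on the same scale.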

Descriptive Statistics
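It is time to get some numbers to help us describe the behavior of each asset class.
Annualized returns and measures of dispersion are a good place to start building an intuition about the return of each asset and the risk embedded in it.
Note: Because we are using weekly data, the convention is to assume 52 weeks per year to annualize the data.
Formulas (as implemented in the code below):

  • Annualized return = mean weekly log return × 52
  • Annualized volatility = standard deviation of weekly log returns × √52
  • Sharpe ratio = annualized return / annualized volatility (the risk-free rate is taken as zero)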

# Weekly log returns; the first row is NaN and is dropped
datos_returns = np.log(datos/datos.shift(1))
datos_returns.dropna(inplace=True)
stats = pd.DataFrame()
stats['Annualized Returns(%)'] = datos_returns.mean() * semana * 100
stats['Annualized Volatility(%)'] = datos_returns.std() * np.sqrt(semana) * 100
# Sharpe ratio computed with a zero risk-free rate
stats['Sharpe Ratio'] = stats['Annualized Returns(%)'] / stats['Annualized Volatility(%)']
print(82*'-')
print('Asset Classes Annualized Statistics - full observation period')
stats.style.bar(color=['red','green'], align='zero')
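
One reason log returns are convenient here is that they add up across periods, which is what justifies scaling the weekly mean by 52. The sketch below checks this on a short made-up price series; the prices and the name WEEKS_PER_YEAR are illustrative assumptions, not taken from the article's data.

```python
import numpy as np
import pandas as pd

WEEKS_PER_YEAR = 52  # same convention as `semana` in the article

# Hypothetical weekly prices (illustrative values only).
prices = pd.Series([100.0, 101.0, 99.5, 102.0, 103.0])

# Weekly log returns; the first observation has no predecessor and is dropped.
log_returns = np.log(prices / prices.shift(1)).dropna()

# Annualize: the mean scales with time, the standard deviation with its square root.
annual_return = log_returns.mean() * WEEKS_PER_YEAR
annual_vol = log_returns.std() * np.sqrt(WEEKS_PER_YEAR)
sharpe = annual_return / annual_vol  # zero risk-free rate assumed

# Log returns are additive: their sum equals the log of end price over start price.
total_log_return = np.log(prices.iloc[-1] / prices.iloc[0])

print(f"annualized return: {annual_return:.2%}")
print(f"annualized volatility: {annual_vol:.2%}")
print(f"Sharpe ratio: {sharpe:.2f}")
```

The square-root rule for volatility relies on weekly returns being roughly uncorrelated over time, which is an assumption rather than something the data guarantees.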

Dispersion of returns
The next layer of analysis is driven by the third and fourth moments of the data, i.e. Skewness and Kurtosis.
As part of portfolio risk management we need to understand whether returns lean towards positive or negative values more often than a normal distribution would (Skewness).
In the same manner, it is critical to know whether the assets tend to have extreme events, either positive or negative (Fat tails), more frequently than normally distributed assets.
The charts below present the histogram of returns for each asset.
To facilitate the comparison, a bell-shaped normal curve with mean and standard deviation equal to those of the asset under consideration is drawn.
Values outside the bell shape indicate asset behavior that cannot be fully described by normality assumptions.
Skewness and Kurtosis values are included in the charts for ease of reference.
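
As a rough illustration of how these two moments can be read, the sketch below compares a simulated normal sample with a simulated fat-tailed one (Student's t with 5 degrees of freedom). Both samples are synthetic and the seed is arbitrary. pandas' kurt() reports excess kurtosis, so a normal distribution scores close to zero.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 100_000

# Synthetic samples (not market data): one normal, one with fat tails.
samples = pd.DataFrame({
    "normal": rng.standard_normal(n),
    "fat_tailed": rng.standard_t(5, size=n),
})

# Third moment: asymmetry of the distribution (0 for a symmetric one).
skewness = samples.skew()
# Fourth moment: pandas returns *excess* kurtosis (0 for a normal distribution).
excess_kurtosis = samples.kurt()

print(pd.DataFrame({"skewness": skewness, "excess kurtosis": excess_kurtosis}))
```

Applied to the article's data, the same two calls on datos_returns would give one skewness and one excess-kurtosis figure per asset.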

Returns Histogram

U.S. Investment grade Bonds versus EM Government Bonds
Let us specifically compare the distribution of returns of the two Fixed Income assets in the portfolio.
It is clear that Emerging Markets Bonds (EMB) show a greater dispersion of returns than US Investment Grade bonds (AGG). Over the observation period, the higher volatility is compensated by higher returns.

Fixed Income Assets comparison

Portfolio Analysis
So far we have only considered assets individually.
The more assets a portfolio has, the more the behavior of each asset relative to the others matters in determining the risk of the portfolio as a whole.
Simulation
Portfolio 1 allocation:
1. US Investment Grade Fixed Income: 30% (AGG)
2. US Equities : 50% (SPY & QQQ)
3. Gold: 20% (GLD)
Portfolio 2 allocation:
1. US Investment Grade Fixed Income: 30% (AGG)
2. US Equities : 50% (SPY & QQQ)
3. EM Government Bonds: 20% (EMB)

Asset Allocation
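
The article's code later refers to a weights table through allocation.No_EM and allocation.EM, but the construction of that table is in the skipped code. The sketch below shows one way such a table could look. The even 25% / 25% split of the equity sleeve between SPY and QQQ is an assumption made for illustration, since the article only gives the 50% total.

```python
import pandas as pd

# ASSUMPTION: the 50% equity sleeve is split evenly between SPY and QQQ.
# The article states only the combined 50%; the actual split may differ.
allocation = pd.DataFrame(
    {
        "No_EM": [0.30, 0.25, 0.25, 0.20, 0.00],  # Portfolio 1: Gold, no EM bonds
        "EM":    [0.30, 0.25, 0.25, 0.00, 0.20],  # Portfolio 2: EM bonds, no Gold
    },
    index=["AGG", "SPY", "QQQ", "GLD", "EMB"],
)

# Each portfolio must be fully invested: weights sum to 100%.
print(allocation.sum())
```

When such a table is combined with a covariance matrix through np.dot, no index alignment takes place, so the rows should be in the same order as the asset columns of the returns table.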

Let us compare their relative performance.
Portfolio 2 offers a better risk-adjusted return than Portfolio 1, indicating that Emerging Market Bonds were a better choice than Gold during the observation period.

Portfolio volatility
The risk profile of each portfolio is described by its volatility.

datos_returns.corr('pearson')
Full period correlation matrix

Using the full-period statistics (the covariance matrix that underlies the Pearson correlations above), let us calculate the annualized return and volatility of each portfolio.

# `allocation` comes from the full code: it is assumed to be a table with one row
# per asset (same order as the columns of `datos`) and weight columns No_EM and EM.
cov_anual = datos_returns.cov() * semana  # annualized covariance matrix

# Portfolio 1 (Gold, no EM bonds): return = w'mu, volatility = sqrt(w' Sigma w)
Expected_Return_noEM = np.sum(datos_returns.mean() * allocation.No_EM) * semana
Expected_Std_noEM = np.sqrt(np.dot(allocation.No_EM.T, np.dot(cov_anual, allocation.No_EM)))
Sharpe_noEM = Expected_Return_noEM / Expected_Std_noEM

# Portfolio 2 (EM bonds, no Gold)
Expected_Return_EM = np.sum(datos_returns.mean() * allocation.EM) * semana
Expected_Std_EM = np.sqrt(np.dot(allocation.EM.T, np.dot(cov_anual, allocation.EM)))
Sharpe_EM = Expected_Return_EM / Expected_Std_EM

print('Key Stats: Portfolio with no EM Securities')
print(82*'=')
print('Annualized Returns: {:.3%}'.format(Expected_Return_noEM))
print('Annualized Volatility: {:.3%}'.format(Expected_Std_noEM))
print('Sharpe Ratio: {:.4}'.format(Sharpe_noEM))
print(82*'-')
print('Key Stats: Portfolio with EM Securities')
print(82*'=')
print('Annualized Returns: {:.3%}'.format(Expected_Return_EM))
print('Annualized Volatility: {:.3%}'.format(Expected_Std_EM))
print('Sharpe Ratio: {:.4}'.format(Sharpe_EM))
print(82*'-')
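
The nested np.dot calls above evaluate the quadratic form w' Σ w. A small two-asset sketch may help to see what that matrix product contains; all volatilities, the correlation and the weights below are made-up numbers. It checks the matrix result against the textbook two-asset formula and shows that, with a correlation below 1, portfolio volatility falls below the weighted average of the individual volatilities.

```python
import numpy as np

# Hypothetical annualized inputs (illustrative values only).
sigma1, sigma2 = 0.10, 0.20   # individual volatilities
rho = 0.25                    # correlation between the two assets
w = np.array([0.6, 0.4])      # portfolio weights

# Covariance matrix built from volatilities and correlation.
cov = np.array([
    [sigma1**2,             rho * sigma1 * sigma2],
    [rho * sigma1 * sigma2, sigma2**2],
])

# Matrix form: sqrt(w' Sigma w)
vol_matrix = np.sqrt(w @ cov @ w)

# Closed form for two assets:
# sqrt(w1^2 s1^2 + w2^2 s2^2 + 2 w1 w2 rho s1 s2)
vol_closed = np.sqrt(
    w[0]**2 * sigma1**2 + w[1]**2 * sigma2**2
    + 2 * w[0] * w[1] * rho * sigma1 * sigma2
)

# Weighted average of volatilities, i.e. what a correlation of 1 would give.
vol_no_diversification = w[0] * sigma1 + w[1] * sigma2

print(f"portfolio volatility: {vol_matrix:.4f}")
print(f"weighted average volatility: {vol_no_diversification:.4f}")
```

The gap between the two printed numbers is the diversification benefit, and it is driven entirely by the off-diagonal (correlation) terms.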

Following the same approach used to visualize the dispersion of returns per asset class, let us observe how close to normally distributed the returns of the two selected portfolios are.

Monte Carlo Simulation
We are getting close to the end, so let us have some fun!
Let us simulate the Markowitz efficient frontier.
We are going to calculate not only the frontier portfolios but also the internal (sub-optimal) portfolios.
To calculate each portfolio we are going to randomly alter the asset weights while keeping the asset classes of each portfolio constant.
The exercise will run 2,500 simulations for each portfolio.
Furthermore, the color scale is based on the Sharpe ratio of each simulated portfolio, to visually separate the portfolios by their degree of risk-adjusted efficiency.

# Assets held in each of the two portfolios
assets_EM = ['AGG','SPY','QQQ','EMB']
assets_noEM = ['AGG','SPY','QQQ','GLD']
pretsEM = []
pvolsEM = []
prets_noEM = []
pvols_noEM = []
# Portfolio with EM: 2,500 random long-only weight vectors that sum to 1
for p in range(2500):
    weights = np.random.random(len(assets_EM))
    weights /= np.sum(weights)
    pretsEM.append(np.sum(datos_returns[assets_EM].mean() * weights) * semana)
    pvolsEM.append(np.sqrt(np.dot(weights.T, np.dot(datos_returns[assets_EM].cov() * semana, weights))))
pretsEM = np.array(pretsEM)
pvolsEM = np.array(pvolsEM)
# Portfolio with no EM (Gold instead): same procedure
for p in range(2500):
    weights = np.random.random(len(assets_noEM))
    weights /= np.sum(weights)
    prets_noEM.append(np.sum(datos_returns[assets_noEM].mean() * weights) * semana)
    pvols_noEM.append(np.sqrt(np.dot(weights.T, np.dot(datos_returns[assets_noEM].cov() * semana, weights))))
prets_noEM = np.array(prets_noEM)
pvols_noEM = np.array(pvols_noEM)
# the charts
fig8 = plt.figure(figsize=(12,16))
plt.subplots_adjust(wspace=.5)
plt.subplot(211)
plt.scatter(pvolsEM, pretsEM, c=pretsEM / pvolsEM, marker='o', cmap='coolwarm')
plt.grid(True)
plt.xlabel('expected volatility')
plt.ylabel('expected return')
plt.colorbar(label='Sharpe Ratio')
plt.title('Monte Carlo Simulation Efficient Frontier with EM')
plt.subplot(212)
plt.scatter(pvols_noEM, prets_noEM, c=prets_noEM / pvols_noEM, marker='o', cmap='viridis')
plt.grid(True)
plt.xlabel('expected volatility')
plt.ylabel('expected return')
plt.colorbar(label='Sharpe Ratio')
plt.title('Monte Carlo Simulation Efficient Frontier with no EM')
# Save before plt.show(): outside a notebook the figure can be empty after show()
fig8.savefig('frontiers.png', dpi=fig8.dpi)
plt.show()
Monte Carlo Simulation
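
For readers who want to experiment without the spreadsheet, here is a self-contained, vectorized sketch of the same random-weights idea. The expected returns, volatilities and the flat 0.2 correlation are made-up inputs, not estimates from the article's data, and picking the highest-Sharpe draw is an addition for illustration rather than a step in the original code.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable

# Hypothetical annualized inputs for four assets (illustrative values only).
mu = np.array([0.02, 0.10, 0.14, 0.04])
vols = np.array([0.03, 0.12, 0.15, 0.07])
corr = np.full((4, 4), 0.2) + 0.8 * np.eye(4)  # 1.0 on the diagonal, 0.2 elsewhere
cov = np.outer(vols, vols) * corr

# Draw 2,500 random long-only weight vectors and rescale each row to sum to 1.
n_sims = 2500
weights = rng.random((n_sims, 4))
weights /= weights.sum(axis=1, keepdims=True)

# Portfolio return w'mu and volatility sqrt(w' Sigma w) for all draws at once.
port_ret = weights @ mu
port_vol = np.sqrt(np.einsum("ij,jk,ik->i", weights, cov, weights))
sharpe = port_ret / port_vol  # zero risk-free rate assumed

# Index of the simulated portfolio with the highest Sharpe ratio.
best = sharpe.argmax()
print("best simulated weights:", np.round(weights[best], 3))
print(f"return {port_ret[best]:.2%}, volatility {port_vol[best]:.2%}, Sharpe {sharpe[best]:.2f}")
```

Note that rescaling uniform draws concentrates the samples around equal weights, so the cloud of points only approximates the frontier; it does not trace it exactly.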

Return & Volatility: going beyond the surface
So far we have used the same returns and correlation matrix for the whole period.
Using start-to-end calculations naturally smooths the results by making the time series look less volatile.
To better portray asset behavior, let us calculate the 3-month trailing statistics (return, volatility and Sharpe ratio, built from the rolling covariance matrix) for each portfolio.
You will notice that return and volatility mean-revert around the historical average; hence the overall conclusion would not change if a shorter time frame were used for the calculation. It would only increase the degree of stochasticity.
Geek Note: for those of you who started working in Finance when Excel was probably the only tool of choice and tried to calculate the efficient frontier, you might have calculated a Matrix Multiplication of the following shape:
[1x5] * [5x5] * [5x1] = Portfolio variance (its square root is the portfolio volatility)
It is actually not that difficult to do it only once, but if you have to run it 250 times, as in our case, well... you will pull out a few hairs trying to do that in Excel!
I want to highlight the power of NumPy's tensordot, which can run the 250 multiplications in a single step. It is just mind-blowing!
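
To make the Geek Note concrete, the sketch below builds a stack of 250 made-up 5x5 covariance matrices and evaluates the [1x5] * [5x5] * [5x1] product for all of them, first with an explicit loop and then with np.tensordot and np.einsum. The matrices are random and the equal weights are an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_windows, n_assets = 250, 5

# Stack of synthetic covariance matrices: A @ A.T is always a valid
# (positive semi-definite) covariance matrix.
a = rng.normal(size=(n_windows, n_assets, n_assets))
cov_stack = a @ a.transpose(0, 2, 1)          # shape (250, 5, 5)

w = np.full(n_assets, 1 / n_assets)           # equal weights (assumption)

# The "Excel way": one matrix product per window.
var_loop = np.array([w @ c @ w for c in cov_stack])

# tensordot: contract the asset axes of the whole stack in two steps.
step1 = np.tensordot(cov_stack, w, axes=([2], [0]))   # shape (250, 5)
var_td = np.tensordot(step1, w, axes=([1], [0]))      # shape (250,)

# einsum expresses the same contraction in a single call.
var_es = np.einsum("i,tij,j->t", w, cov_stack, w)

vol = np.sqrt(var_td)  # portfolio volatility per window
print(vol[:3])
```

All three routes give the same 250 portfolio variances; the batched versions simply avoid the Python-level loop.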

def trailing_ret(retornos, window, weights, annualization = 52):
    # Rolling mean return per asset, weighted and annualized
    roll_ret = retornos.rolling(window=window).mean()
    roll_ret = roll_ret.dropna()
    roll_ret = (roll_ret * weights) * annualization
    roll_ret = roll_ret.sum(axis=1)
    roll_ret = roll_ret.to_frame()
    roll_ret.rename(columns={0: 'returns'}, inplace=True)
    return roll_ret

def trailing_cov2(retornos, window, weights, annualization = 52):
    retornos_length = len(retornos)
    retornos_width = len(retornos.columns)
    # One covariance matrix per date, stacked in a (date, asset) MultiIndex
    roll_cov = retornos.rolling(window=window).cov()
    roll_cov_dates = np.unique(roll_cov.index.get_level_values(0).values)
    roll_cov_dates = roll_cov_dates[window-1:]
    # Reshape to (dates, assets, assets) and drop the incomplete (NaN) windows
    roll_cov = roll_cov.values.reshape(retornos_length, retornos_width, retornos_width)
    roll_cov = roll_cov[window-1:] * annualization
    weights = weights.values.reshape(len(weights), 1)
    # w' Sigma w for every window at once
    step1 = np.tensordot(roll_cov, weights, axes=[1,0])
    step2 = np.tensordot(weights, step1, axes=[0,1])

    volatility = np.sqrt(step2)
    # Flatten to one value per window
    volatility = volatility.reshape(retornos_length - (window-1))

    trailing_vol = pd.DataFrame()
    trailing_vol['date'] = roll_cov_dates
    trailing_vol['volatility'] = volatility
    trailing_vol.set_index('date', inplace=True)

    return trailing_vol

def full_analysis(retornos, window, weights, annualization = 52):
    # Trailing volatility, return and Sharpe ratio (zero risk-free rate), joined by date
    volatilidad = trailing_cov2(retornos = retornos, window = window, weights = weights, annualization = annualization)
    retornos = trailing_ret(retornos = retornos, window = window, weights = weights, annualization = annualization)
    fusion = pd.merge(volatilidad, retornos, left_index=True, right_index=True)
    fusion['sharpe'] = fusion['returns'] / fusion['volatility']
    return fusion
3-Months trailing Analysis
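
The reshape inside trailing_cov2 relies on how pandas lays out a rolling covariance: one block of rows per date. The sketch below illustrates that layout on a small synthetic returns table; the three assets "A", "B", "C", the 10 observations and the 4-week window are arbitrary choices, not the article's settings.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_obs, n_assets, window = 10, 3, 4

# Synthetic weekly returns for three made-up assets (illustrative only).
returns = pd.DataFrame(
    rng.normal(scale=0.01, size=(n_obs, n_assets)),
    columns=["A", "B", "C"],
    index=pd.date_range("2018-01-05", periods=n_obs, freq="W-FRI"),
)

# rolling().cov() returns n_obs * n_assets rows: one 3x3 block per date.
roll_cov = returns.rolling(window=window).cov()

# Reshape the stacked blocks into a cube of shape (dates, assets, assets).
cube = roll_cov.values.reshape(n_obs, n_assets, n_assets)

# The first window-1 dates have too few observations and are all NaN.
valid = cube[window - 1:]

# Cross-check: the first valid block equals the covariance of the first 4 rows.
first_direct = returns.iloc[:window].cov().values

print(roll_cov.shape, cube.shape, valid.shape)
```

This is why the function slices with [window-1:] both the dates and the covariance cube: the two stay aligned, one valid matrix per remaining date.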

Conclusions:

  1. EM Bonds improve portfolio risk-adjusted returns versus Gold over the observation period.
  2. We have created a flexible model to compare assets, whether individually or as a portfolio, and to visualize the risk and return implications while conducting strategic asset allocation.
  3. The initial analysis indicates that the time frame used (full period versus 3-month rolling window) does not change the strategic conclusions.
  4. The framework serves as an initial base for expanding the portfolio analysis (VaR calculation, stress testing, etc.); it is left to the reader to leverage the existing model and keep expanding this quantitative approach.


Mariano Scandizzo, CFA, CQF

Mariano carries more than 20 years of experience working in Investment management, corporate strategy, private equity, and business consulting.