The Efficient Frontier In Python

Simon Zeng
Sep 10, 2020


The most fundamental goal of portfolio management is to maximize returns while minimizing risk. In 1952, Harry Markowitz introduced Modern Portfolio Theory (MPT) in his Journal of Finance paper, “Portfolio Selection.” According to Investopedia, MPT “…argues that an investment’s risk and return characteristics should not be viewed alone, but should be evaluated by how the investment affects the overall portfolio’s risk and return.” In simpler terms, MPT says that investors can increase their returns, with little or no additional risk, by investing in different asset classes instead of just one. In other words, put your eggs in multiple baskets instead of just one.

Markowitz’s theory rests on the idea that investors are risk averse, meaning they prefer a lower return with a known risk to a higher return with an unknown risk. In his paper, Markowitz highlights the importance of viewing investments in different asset classes as a single portfolio, rather than as separate holdings, in order to see the relationship between them. Holding a combination of securities that are not highly correlated with each other allows investors to increase or optimize their returns without increasing the risk of their portfolio.

Expected Return of a Portfolio of 2 Assets

Investopedia describes the expected return of a portfolio as “a weighted sum of the individual assets’ returns.” This means that if you have 2 equally weighted assets in your portfolio, with expected returns of 5% and 8%, the expected return of the portfolio would be:

(5% x 50%) + (8% x 50%) = 6.5%
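The same arithmetic can be sketched in NumPy. The variable names below are illustrative and the inputs are simply the 5% and 8% figures from the example above.

```python
import numpy as np

# Inputs from the example above: two equally weighted assets
weights = np.array([0.5, 0.5])             # 50% in each asset
expected_returns = np.array([0.05, 0.08])  # 5% and 8%

# The portfolio's expected return is the weighted sum of the asset returns
portfolio_return = float(np.dot(weights, expected_returns))
print(f"{portfolio_return:.1%}")
```

Changing the entries of `weights` (as long as they still sum to 1) shows how the portfolio's expected return moves between the two assets' returns.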

To calculate the risk of this portfolio, you would need the weighting of each asset, the standard deviation of each asset’s return, and the correlation between the assets. I have attached below a picture of how to calculate the standard deviation of a portfolio consisting of 2 assets.
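For reference, the standard textbook formula for the standard deviation of a 2-asset portfolio can be written as follows. The notation is chosen here for illustration: w_1 and w_2 are the asset weights, \sigma_1 and \sigma_2 are the standard deviations of each asset's return, and \rho_{1,2} is the correlation between the two assets.

```latex
\sigma_p = \sqrt{\, w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2\, w_1 w_2\, \rho_{1,2}\, \sigma_1 \sigma_2 \,}
```

The last term is where diversification enters: the lower the correlation \rho_{1,2}, the smaller the portfolio's standard deviation for the same weights.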

Efficient Frontier

[Figure: Efficient frontier graph showing the power of diversification. The red dot shows the power of diversification.]

In the graphs above, one can see that, in theory, diversifying a portfolio can lead to a higher rate of return at the same level of risk. In the 2nd graph, the portfolios that lie on the black portion of the curve are considered suboptimal because other portfolios carry the same amount of risk but offer higher returns. The minimum variance portfolio (MVP) is the portfolio on the efficient frontier that has the lowest variance among all possible portfolios. The red dot on the graph above shows that, with diversification, investors can increase the rate of return of their portfolio while maintaining the same amount of risk.
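To make the minimum variance portfolio concrete, here is a minimal sketch for the 2-asset case using the standard closed-form weights. The function names and the volatility and correlation inputs are hypothetical, chosen only for illustration, and the formula places no restriction on short selling.

```python
import numpy as np

def min_variance_weights(sigma_1, sigma_2, rho):
    """Closed-form minimum variance weights for two assets (hypothetical helper)."""
    cov_12 = rho * sigma_1 * sigma_2
    w_1 = (sigma_2**2 - cov_12) / (sigma_1**2 + sigma_2**2 - 2 * cov_12)
    return np.array([w_1, 1.0 - w_1])

def portfolio_volatility(weights, sigma_1, sigma_2, rho):
    """Standard deviation of a two-asset portfolio (hypothetical helper)."""
    cov_12 = rho * sigma_1 * sigma_2
    cov_matrix = np.array([[sigma_1**2, cov_12],
                           [cov_12, sigma_2**2]])
    return float(np.sqrt(weights @ cov_matrix @ weights))

# Hypothetical inputs: 20% and 30% volatility, correlation of 0.2
sigma_1, sigma_2, rho = 0.20, 0.30, 0.2

mvp_weights = min_variance_weights(sigma_1, sigma_2, rho)
mvp_vol = portfolio_volatility(mvp_weights, sigma_1, sigma_2, rho)
print(mvp_weights, mvp_vol)
```

With these inputs the MVP's volatility comes out below that of either asset held alone, which is the diversification effect described above.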

Calculating the Efficient Frontier in Python

I will be demonstrating how to find this data in Jupyter Notebook. Here is a good guide by Jupyter to install their notebook:

Here is documentation on how to use Jupyter Notebook:

To get started, we need to import these libraries: NumPy for arrays, pandas to manipulate the data, pandas_datareader to get the stock data that we need, and matplotlib.pyplot to make a visual representation of our efficient frontier.

import numpy as np
import pandas as pd
from pandas_datareader import data as wb
import matplotlib.pyplot as plt
%matplotlib inline

We will be comparing Twitter & AMD. The code snippet below will grab the prices of the 2 assets from January 1st, 2015 until now.

assets = ['TWTR', 'AMD']
pf_data = pd.DataFrame()
for x in assets:
    pf_data[x] = wb.DataReader(x, data_source = 'yahoo', start = '2015-1-1')['Adj Close']

We now want to graph the 2 assets and see how they have performed over this time period. The code below rescales each price series to start at 100 so the two can be compared on one chart.

(pf_data / pf_data.iloc[0]*100).plot(figsize=(10,5))

To get the logarithmic return of the assets:

log_returns = np.log(pf_data/pf_data.shift(1))
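As a small illustration of what this line does, the sketch below applies the same expression to a toy price table. The prices and column names are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for two made-up assets over three days
toy_prices = pd.DataFrame({'A': [100.0, 110.0, 99.0],
                           'B': [50.0, 50.0, 55.0]})

# shift(1) moves each price down one row, so every row is divided
# by the previous day's price; the first row has no predecessor (NaN)
toy_log_returns = np.log(toy_prices / toy_prices.shift(1))

# Log returns are additive: their sum equals log(last price / first price)
total_log_return = toy_log_returns.sum()
direct_log_return = np.log(toy_prices.iloc[-1] / toy_prices.iloc[0])
print(toy_log_returns)
print(total_log_return, direct_log_return)
```

The additivity shown at the end is one reason log returns are convenient here: averaging daily log returns and scaling by the number of trading days gives an annual figure directly.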

To find the annualized average log return of each asset over the time period (assuming 250 trading days in a year):

log_returns.mean() * 250

To get the covariance between the 2 assets:

log_returns.cov() * 250

To get the correlation between Twitter and AMD:

log_returns.corr()

We can see that Twitter and AMD are not highly correlated.
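Covariance and correlation are closely related: the correlation is the covariance divided by the product of the two standard deviations. The sketch below checks this on randomly generated returns, which are synthetic stand-ins and not the Twitter and AMD data.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns for two made-up assets (not real market data)
rng = np.random.default_rng(0)
sample_returns = pd.DataFrame(rng.normal(0.0, 0.02, size=(500, 2)),
                              columns=['X', 'Y'])

# Annualized covariance matrix, assuming 250 trading days
annual_cov = sample_returns.cov() * 250

# Annualized standard deviations are the square roots of the diagonal
annual_std = np.sqrt(np.diag(annual_cov))

# Dividing each covariance by the product of the two standard deviations
# recovers the correlation matrix; the factor of 250 cancels out
corr_from_cov = annual_cov / np.outer(annual_std, annual_std)
print(corr_from_cov)
print(sample_returns.corr())
```

This is also why the correlation matrix does not need the factor of 250: scaling the returns changes the covariance but not the correlation.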

Here we create 1,000 different portfolios from the same 2 assets, AMD & Twitter, each with randomly chosen weights that sum to 1.

portfolio_returns = []
portfolio_volatilities = []
for x in range(1000):
    weights = np.random.random(len(assets))
    weights /= np.sum(weights)

    portfolio_returns.append(np.sum(weights * log_returns.mean()) * 250)
    portfolio_volatilities.append(np.sqrt(np.dot(weights.T, np.dot(log_returns.cov() * 250, weights))))

portfolio_returns = np.array(portfolio_returns)
portfolio_volatilities = np.array(portfolio_volatilities)

portfolio_returns, portfolio_volatilities

We calculate the expected portfolio return by:

np.sum(weights * log_returns.mean()) * 250

We calculate the expected portfolio volatility by:

np.sqrt(np.dot(weights.T, np.dot(log_returns.cov() * 250, weights)))
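For 2 assets, the matrix expression above gives the same number as writing out the variance term by term (the two weighted variances plus twice the weighted covariance). The sketch below checks this with hypothetical volatilities, correlation, and weights chosen only for illustration.

```python
import numpy as np

# Hypothetical annualized inputs (not taken from the real data)
sigma = np.array([0.40, 0.55])   # standard deviations of the two assets
rho = 0.3                        # correlation between them
weights = np.array([0.6, 0.4])   # portfolio weights, summing to 1

# Covariance matrix built from the standard deviations and correlation
cov_12 = rho * sigma[0] * sigma[1]
cov_matrix = np.array([[sigma[0]**2, cov_12],
                       [cov_12, sigma[1]**2]])

# Matrix form, as used in the simulation loop
vol_matrix = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))

# Term-by-term form for two assets
vol_formula = np.sqrt(weights[0]**2 * sigma[0]**2
                      + weights[1]**2 * sigma[1]**2
                      + 2 * weights[0] * weights[1] * cov_12)

print(vol_matrix, vol_formula)
```

The matrix form is used in the loop because it works unchanged for any number of assets.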

Here is the matplotlib code that plots the 1,000 portfolios so we can see how they trace out an efficient frontier.

portfolios = pd.DataFrame({'Return': portfolio_returns, 'Volatility': portfolio_volatilities})
portfolios.plot(x='Volatility', y='Return', kind='scatter', figsize=(15,10));
plt.xlabel('Expected Volatility')
plt.ylabel('Expected Return')
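Once the portfolios are in a DataFrame, one possible next step is to look up the simulated portfolio with the lowest volatility, which approximates the minimum variance portfolio. The sketch below is self-contained and uses synthetic returns in place of the downloaded prices; the names `demo_log_returns`, `demo_portfolios`, and the asset labels are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Synthetic daily log returns standing in for the real data (assumption)
rng = np.random.default_rng(42)
demo_log_returns = pd.DataFrame(rng.normal(0.0005, 0.02, size=(1000, 2)),
                                columns=['Asset1', 'Asset2'])

mean_returns = demo_log_returns.mean().values * 250   # annualized means
cov_matrix = demo_log_returns.cov().values * 250      # annualized covariance

rows = []
for _ in range(1000):
    w = rng.random(2)
    w /= w.sum()                                      # weights sum to 1
    rows.append({'Return': w @ mean_returns,
                 'Volatility': np.sqrt(w @ cov_matrix @ w),
                 'Weight1': w[0],
                 'Weight2': w[1]})

demo_portfolios = pd.DataFrame(rows)

# Row of the simulated portfolio with the lowest volatility
min_vol = demo_portfolios.loc[demo_portfolios['Volatility'].idxmin()]
print(min_vol)
```

Storing the weights next to each return and volatility makes it possible to read off which allocation produced any point on the scatter plot.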
