Portfolio Optimization: The Markowitz Mean-Variance Model

Luís Fernando Torres
Published in LatinXinAI · 8 min read · Apr 6, 2023

This article is the third part of a series on the use of Data Science for Stock Markets. I highly suggest you read the first part, “Introduction to Quant Investing with Python”, and also the second part, “The Science of Smart Investing: Portfolio Evaluation with Python”.

I also recommend you read the notebook on Kaggle, 🤑 Data Science for Financial Markets 📈💰 to see the code and interact with the images.

Introduction

The Markowitz Mean-Variance Optimization Model is a mathematical framework first introduced by the economist Harry Markowitz in 1952. It's based on the idea that investors are risk-averse, and will only accept more risk if compensated by higher expected returns.

This model works on the foundation that an investor would have no reason to invest in a given portfolio if another option offered a more favorable risk-expected return profile. The idea is then to determine the optimal combination of securities by balancing their expected returns against their associated risks.

Beyond that, it works on the belief that an investor can efficiently reduce risk by holding a combination of assets that are not perfectly positively correlated with each other.
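To make that intuition concrete, here is a small numpy sketch (with made-up volatilities and weights, not tied to any real data) showing how a two-asset portfolio's volatility falls as the correlation between the assets decreases:

```python
import numpy as np

def portfolio_vol(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weights w1 and (1 - w1)."""
    w2 = 1 - w1
    var = (w1 * sigma1) ** 2 + (w2 * sigma2) ** 2 + 2 * w1 * w2 * sigma1 * sigma2 * rho
    return np.sqrt(var)

# Two hypothetical assets, each with 20% annual volatility, held 50/50
for rho in [1.0, 0.5, 0.0]:
    print(f"correlation {rho:+.1f} -> portfolio volatility {portfolio_vol(0.5, 0.2, 0.2, rho):.2%}")
```

With perfect positive correlation, the 50/50 mix is exactly as volatile as its components (20%); at zero correlation, the same mix is noticeably calmer, even though neither asset changed.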

In this article, we are going to use the PyPortfolioOpt library to find the optimal allocations for the stocks we’ve been working with during the past two articles to obtain the best risk-return relationship possible.

PyPortfolioOpt

PyPortfolioOpt is a Python library that simplifies the implementation of the Markowitz Mean-Variance Model to optimize portfolios. It allows investors to find the optimal allocation weights according to a range of goals and risk tolerances. In this case, we are going to optimize a portfolio to obtain the highest Sharpe ratio possible.

There are two key requirements for a mean-variance optimization:

First, we need to obtain the expected returns for each asset in the portfolio. With the expected_returns module, we can compute the expected returns for the assets using the mean of their daily returns. This module takes daily closing prices as input and outputs annualized expected returns. You can obtain more information by reading the Expected Returns section of the library's documentation.
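To illustrate the annualization, here is a minimal sketch with made-up prices; by default, `mean_historical_return` compounds the daily returns in a similar fashion, assuming roughly 252 trading days per year:

```python
import pandas as pd

# Made-up daily closing prices for a single hypothetical asset
prices = pd.Series([100, 101, 100.5, 102, 103.5, 103, 104.8])

daily_returns = prices.pct_change().dropna()

# Annualize by compounding the observed daily growth over ~252 trading days
annual_return = (1 + daily_returns).prod() ** (252 / daily_returns.count()) - 1
print(f"annualized expected return: {annual_return:.2%}")
```

The figure looks enormous here only because a handful of good days is being extrapolated over a full year; on a decade of real data, the estimate is far tamer.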

After obtaining the annual expected returns, we need to choose a risk model that quantifies the level of risk for each security. One of the most widely used risk models is the covariance matrix, which is useful to describe the volatility of the assets and the degree to which they are co-dependent.

It’s crucial to select an appropriate risk model, because that’s precisely what is going to be used to help us make uncorrelated “bets”, which is essential for risk reduction.

With PyPortfolioOpt, we have a wide range of risk models to pick from, such as the annualized sample covariance matrix of daily returns, semicovariance matrix, and exponentially-weighted covariance matrix. You can find further information on risk models in this chapter of the library’s documentation.
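As an illustration of what the annualized sample covariance risk model computes, here is a sketch using made-up random-walk prices (purely hypothetical data); `risk_models.sample_cov` performs an equivalent annualization internally, assuming roughly 252 trading days per year:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Made-up daily closing prices for three hypothetical assets
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0.0005, 0.02, size=(500, 3)), axis=0)),
    columns=["asset_a", "asset_b", "asset_c"],
)

daily_returns = prices.pct_change().dropna()

# Annualized sample covariance matrix: daily covariance scaled by 252
S = daily_returns.cov() * 252
print(S)
```

The diagonal entries are each asset's annualized variance; the off-diagonal entries capture how the assets move together, which is exactly the information the optimizer uses to construct uncorrelated "bets".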

You can paste the code below into any Python environment to install PyPortfolioOpt.

# installing PyPortfolioOpt
!pip install pyportfolioopt

Optimizing Portfolio

First, we are going to load the historical data for each asset using the yfinance library, from July 1st, 2010 to February 11th, 2023.

# Getting dataframes for stocks using yfinance
import yfinance as yf

aapl_df = yf.download('AAPL', start = '2010-07-01', end = '2023-02-11')
tsla_df = yf.download('TSLA', start = '2010-07-01', end = '2023-02-11')
dis_df = yf.download('DIS', start = '2010-07-01', end = '2023-02-11')
amd_df = yf.download('AMD', start = '2010-07-01', end = '2023-02-11')

Remember that PyPortfolioOpt expects historical closing prices to compute the expected returns. For that reason, we extract only the "Adj Close" column from each of the stocks above. The advantage of "Adj Close" is that it reflects the closing price adjusted for dividends and stock splits, making it a better representation of price changes over longer periods of time.

# Extracting Adjusted Close for each stock
aapl_df = aapl_df['Adj Close']
tsla_df = tsla_df['Adj Close']
dis_df = dis_df['Adj Close']
amd_df = amd_df['Adj Close']

We then concatenate these separate dataframes together to form a new dataframe containing the adjusted closing prices for each stock during the time period.

# Merging and creating an Adj Close dataframe for stocks
import pandas as pd

df = pd.concat([aapl_df, tsla_df, dis_df, amd_df], join = 'outer', axis = 1)
df.columns = ['aapl', 'tsla', 'dis', 'amd']
df # Visualizing dataframe for input
Figure 1. Dataframe

Now that we have our data, we import the modules we need from the PyPortfolioOpt library.

# Importing libraries for portfolio optimization
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt import risk_models
from pypfopt import expected_returns

In the code below, we obtain the expected returns mu of the stocks in our dataframe by getting their mean historical return, while also obtaining the covariance matrix S by applying the sample_cov method from the risk_models module to the dataframe.

# Calculating the annualized expected returns and the annualized sample covariance matrix
mu = expected_returns.mean_historical_return(df) #expected returns
S = risk_models.sample_cov(df) #Covariance matrix
# Visualizing the annualized expected returns
mu
aapl    0.268385
tsla    0.475549
dis     0.114966
amd     0.209862
dtype: float64
# Visualizing the covariance matrix
S
Figure 2. Covariance Matrix

With the estimated expected returns and the covariance matrix in hand, we can now find the optimal allocation weights for the stocks above.

We use the EfficientFrontier class from PyPortfolioOpt, which receives the mu and S values as input. With these values, we can optimize the allocation weights according to our goal. The library offers many optimization objectives to pick from, and you can see them in the Mean-Variance Optimization chapter of the documentation.

We are going to use the max_sharpe method for the maximization of the Sharpe ratio.

# Optimizing for maximal Sharpe ratio
ef = EfficientFrontier(mu, S) # Providing expected returns and covariance matrix as input
weights = ef.max_sharpe() # Optimizing weights for Sharpe ratio maximization

clean_weights = ef.clean_weights() # clean_weights rounds the weights and clips near-zeros

# Printing optimized weights and expected performance for portfolio
clean_weights

The code above gives us the following output:

OrderedDict([('aapl', 0.70828),
             ('tsla', 0.29172),
             ('dis', 0.0),
             ('amd', 0.0)])

For the maximum Sharpe ratio possible with the stocks we’ve chosen, we have a portfolio with 70.83% of its value allocated into Apple stocks, and the remaining 29.17% allocated into Tesla stocks, while no allocation was made into Disney or AMD.
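As a quick sanity check, we can verify these weights with nothing but the mu values and weights printed above (a sketch in plain Python, no PyPortfolioOpt required): the weights of a fully invested, long-only portfolio must sum to 1, and the portfolio's expected annual return is the weighted average of the assets' expected returns.

```python
# Optimized weights from the output above
weights = {"aapl": 0.70828, "tsla": 0.29172, "dis": 0.0, "amd": 0.0}

# A fully invested, long-only portfolio must have weights summing to 1
assert abs(sum(weights.values()) - 1.0) < 1e-6

# Annualized expected returns (mu) computed earlier
mu = {"aapl": 0.268385, "tsla": 0.475549, "dis": 0.114966, "amd": 0.209862}

# Expected portfolio return: weighted average of the assets' expected returns
port_return = sum(weights[t] * mu[t] for t in weights)
print(f"expected annual return: {port_return:.2%}")  # → expected annual return: 32.88%
```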

Evaluating Optimized Portfolio

With the optimal weights in hand, let’s go and construct a new portfolio and use Quantstats to compare this optimized portfolio to the one we’ve previously built in the article The Science of Smart Investing: Portfolio Evaluation with Python.

# Creating a new portfolio with the optimized weights
new_weights = [0.70828, 0.29172]

# aapl and tsla were built as daily-return series in the previous article;
# here we recompute the daily returns directly from df
returns = df.pct_change().dropna()
optimized_portfolio = returns['aapl']*new_weights[0] + returns['tsla']*new_weights[1]
optimized_portfolio # Visualizing daily returns
Date
2010-07-01 04:00:00 -0.031481
2010-07-02 04:00:00 -0.041054
2010-07-06 04:00:00 -0.042102
2010-07-07 04:00:00 0.022988
2010-07-08 04:00:00 0.029061
...
2023-02-06 05:00:00 -0.005359
2023-02-07 05:00:00 0.016701
2023-02-08 05:00:00 -0.005863
2023-02-09 05:00:00 0.003844
2023-02-10 05:00:00 -0.012936
Name: Close, Length: 3176, dtype: float64

We can now use Quantstats' reports.full method to compare the optimized portfolio to a portfolio that allocated 25% of its investment value into each of the four stocks we've applied the mean-variance model to.

# Displaying new reports comparing the optimized portfolio to the first portfolio constructed
import quantstats as qs

# 'portfolio' is the equal-weighted portfolio built in the previous article
qs.reports.full(optimized_portfolio, benchmark = portfolio)
Figure 3. Cumulative Returns (Benchmark = The Portfolio Built in the Previous Article)

The report is, once again, too extensive to post here on Medium. That’s why I encourage you to click on the 🤑 Data Science for Financial Markets 📈💰 link to see the full work on Kaggle and obtain a deeper understanding of the results we’ve achieved.

Overall, as can be seen in Figure 3 and Figure 4, the optimized portfolio achieved a higher cumulative return of 4,830.76% against 3,429.90% of the original portfolio.

Figure 4. Optimized Portfolio x Original Portfolio

The Sharpe ratio and the Sortino ratio also show that the optimized portfolio has higher risk-adjusted returns than the original portfolio, meaning it has a more favorable risk-return relationship.

Figure 5. Optimized Portfolio x Original Portfolio

The maximum drawdown of the optimized portfolio is lower than that of the original portfolio. Its annual volatility is slightly lower, and its kurtosis is closer to 3, implying lower tail risk for the optimized portfolio.

The Daily Value-at-Risk, a metric that measures the expected loss for the overall portfolio on a particularly bad day, is also lower for the optimized portfolio, although the difference between the two is small. The expected daily, monthly, and yearly returns are also much higher for the optimized portfolio.
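For intuition, here is a sketch (with made-up daily returns) of how a historical daily Value-at-Risk can be estimated as a quantile of the return distribution; QuantStats computes its own variant of this metric from the portfolio's actual return history.

```python
import numpy as np

rng = np.random.default_rng(42)
# Made-up daily portfolio returns: 0.1% mean, 2% daily volatility
daily_returns = rng.normal(0.001, 0.02, size=2000)

# Historical 95% daily VaR: the loss exceeded on only the worst 5% of days
var_95 = -np.percentile(daily_returns, 5)
print(f"95% daily VaR: {var_95:.2%}")
```

Reading it back: on 95% of days, this hypothetical portfolio is expected to lose less than the printed figure; a lower VaR therefore means milder bad days.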

Figure 6. Optimized Portfolio x Original Portfolio

The average drawdown is also lower for the optimized portfolio, as is the average number of days in drawdown, although the difference is small. The most exciting metric in Figure 6, however, is the Recovery Factor, which shows that the optimized portfolio recovers from drawdowns much more quickly than the first portfolio we built.

The Serenity Index in Figure 6 is also relevant, since it measures — as its name suggests — how serene an investor will feel about their portfolio. The optimized portfolio provides much more serenity to the investor than the original portfolio.

Conclusion

Overall, the first optimization model we applied to our portfolio, the Markowitz Mean-Variance Model, resulted in a portfolio with higher returns and lower overall risk than the one we previously built in the article The Science of Smart Investing: Portfolio Evaluation with Python, which already outperformed the S&P 500 index, the American benchmark.

These results are exciting, and PyPortfolioOpt proved to be an easy and efficient tool that quants can use to determine the best allocation weights for their securities, whether building a new portfolio or optimizing an existing one.

However, the Mean-Variance Model isn’t the only option when it comes to finding optimal allocation weights for different assets. In the next article, we are going to explore the Black-Litterman Allocation Model, which was first introduced in 1992 and is also an extremely popular choice for portfolio optimization.

Thank you for reading!

Luís Fernando Torres

LinkedIn

Kaggle

🤑 Data Science for Financial Markets 📈💰 (Source for this article)

Introduction to Quant Investing with Python (Part 1)

The Science of Smart Investing: Portfolio Evaluation with Python (Part 2)

Like my content? Feel free to Buy Me a Coffee ☕ !


