POP! — Portfolio Optimization in Python

Published in ACM VIT, Jun 29, 2022

Introduction

When we invest our money, an age-old idiom comes to mind: “Do not put all your eggs in one basket.” Therefore, we must own a portfolio. A portfolio is a collection of investments such as stocks, commodities, bonds, etc. Portfolio management is very important while investing, as it can offset a certain amount of risk through diversification, i.e., buying stocks from different sectors or industries so that their ups and downs even each other out. We can also reshuffle and reallocate funds within the portfolio to achieve the maximum returns it can generate.

What is Portfolio Optimization?

After reading the definition of a portfolio, we now have questions such as “How do I know which stocks to buy?”, “How many shares of each stock do I buy to get maximum profits with minimum or calculated risk?” among others.

The answer to these questions is Portfolio Optimization. It is the process of building the best portfolio by distributing your money across stocks in such a way as to maximize expected returns and minimize risk. For this we will use the Markowitz Model by Harry Markowitz.

Markowitz Model or Modern Portfolio Theory (MPT):

Also called the Mean-Variance model, the Markowitz Model gets this name because it relies on the mean of the expected returns and on the standard deviation and its square (the variance) of the portfolio. It is a very good measure of risk vs. return and gives an investor the data needed to weigh the risk against the returns.

Some assumptions are made when applying the Markowitz Model:

1. Risk is based on the diversification of the portfolio of stocks.

2. The investor is risk-averse, i.e., wants to avoid risk.

3. The investor only seeks to increase returns for a given level of risk, or to minimize risk for a given return.

4. The analysis is based on a single period of investment.

To understand the mathematics behind the Markowitz model, we will use the following example.

E[R] represents the expected return of A or B respectively. Var[R] can be interpreted as the risk associated with A or B respectively: it measures how far the returns can vary from their average.

Corr[Ra,Rb] is the correlation coefficient between the returns of A and B, which always lies between -1 and 1.

▪ If it is 0, there is no linear relationship between A and B.

▪ If it is between 0 and 1, there is a positive relationship, i.e., when A increases, B tends to increase as well.

▪ If it is between -1 and 0, there is a negative relationship, i.e., when A increases, B tends to decrease.

First we compute the expected return of the portfolio, E[Rp]. With Wa and Wb as the weights of A and B in the portfolio, the formula is,

E[Rp] = Wa × E[Ra] + Wb × E[Rb]   ... (1)

Next we need the variance of the portfolio, Var[Rp]. The formula for it is,

Var[Rp] = Wa² × Var[Ra] + Wb² × Var[Rb] + 2 × Wa × Wb × Std[Ra] × Std[Rb] × Corr[Ra,Rb]   ... (2)

▪ Std[R] is the standard deviation of A or B, i.e., the square root of Var[Ra] or Var[Rb].
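The expected return and variance of a two-stock portfolio can be sketched as a small Python function. This is only an illustration: the function name and all the input numbers below are hypothetical and are not the figures from the example above.

```python
import math

def two_asset_portfolio(w_a, er_a, er_b, var_a, var_b, corr_ab):
    """Return (E[Rp], Var[Rp]) for a portfolio of two stocks A and B."""
    w_b = 1.0 - w_a  # the two weights must add up to 100%
    std_a, std_b = math.sqrt(var_a), math.sqrt(var_b)

    # E[Rp] = Wa*E[Ra] + Wb*E[Rb]
    exp_ret = w_a * er_a + w_b * er_b

    # Var[Rp] = Wa^2*Var[Ra] + Wb^2*Var[Rb] + 2*Wa*Wb*Std[Ra]*Std[Rb]*Corr[Ra,Rb]
    variance = (w_a ** 2 * var_a + w_b ** 2 * var_b
                + 2 * w_a * w_b * std_a * std_b * corr_ab)
    return exp_ret, variance

# Hypothetical inputs: 60% in A, 40% in B
er_p, var_p = two_asset_portfolio(0.6, er_a=0.10, er_b=0.06,
                                  var_a=0.04, var_b=0.01, corr_ab=-0.5)
print(f"E[Rp] = {er_p:.4f}, Var[Rp] = {var_p:.4f}")
```

Because the correlation in this made-up case is negative, the portfolio variance comes out lower than the weighted sum of the individual variances, which is the equalizing effect of diversification.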

Now we will make a table of the possible weights of the portfolio. A weight is the proportion of the portfolio given to a stock, and the weights must add up to 100%. Suppose we take the first case of the table,

We will need to find the E[Rp] and Var[Rp] for this portfolio. Using eq. (1) and (2) we get,

We will continue to fill values for all possible portfolios, which will look like this,

Here we can clearly see that the optimum portfolio is the one that offers the maximum return for the minimum risk. It comprises 60% stocks of A and 40% stocks of B.

Plotting Risk vs Return we get this graph,

This parabola is the Efficient Frontier. Portfolios that lie below the Efficient Frontier are suboptimal, as they do not provide enough return for the risk taken. Portfolios to the right of it are suboptimal too, as they carry a higher level of risk for the same return.
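Filling the weight table and picking out the efficient portfolios can be sketched as a simple sweep over the weights. The inputs are hypothetical (they are not the article's figures), and the 10% step size is an arbitrary choice.

```python
import math

# Hypothetical inputs for two stocks A and B (not the article's figures)
ER_A, ER_B = 0.10, 0.06
VAR_A, VAR_B = 0.04, 0.01
CORR_AB = -0.5

def portfolio_stats(w_a):
    """Expected return and variance for weight w_a in A and 1 - w_a in B."""
    w_b = 1.0 - w_a
    exp_ret = w_a * ER_A + w_b * ER_B
    variance = (w_a ** 2 * VAR_A + w_b ** 2 * VAR_B
                + 2 * w_a * w_b * math.sqrt(VAR_A) * math.sqrt(VAR_B) * CORR_AB)
    return exp_ret, variance

# One row per candidate portfolio: 0% A / 100% B, 10% A / 90% B, ...
rows = [(pct / 100, *portfolio_stats(pct / 100)) for pct in range(0, 101, 10)]

# The minimum-variance portfolio marks the tip of the curve
min_var_row = min(rows, key=lambda row: row[2])

# Portfolios returning at least as much as the minimum-variance one
# lie on the upper (efficient) part of the curve
efficient = [row for row in rows if row[1] >= min_var_row[1]]

for w_a, exp_ret, variance in rows:
    print(f"A={w_a:.0%} B={1 - w_a:.0%} E[Rp]={exp_ret:.4f} Var[Rp]={variance:.4f}")
```

Plotting the variance column against the expected return column of `rows` would give a curve of the kind described above.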

We can also find the volatility, which is the standard deviation of the returns. Unlike variance, it is expressed in the same units as the returns themselves.

Volatility = √(Var[Rop]), where Var[Rop] is the variance of the optimal portfolio.

Finally, we will calculate the Sharpe Ratio. It compares a portfolio's past performance (or expected performance) against the risk being taken. Usually, a Sharpe Ratio above 1 is considered good and investable. The formula is,

Sharpe Ratio = (E[Rp] - Rf) / Std[Rp], where Rf is the risk-free rate of return and Std[Rp] is the volatility of the portfolio.

By this measure, the optimal portfolio above is an extremely good one to invest in, considering its volatility and variance.
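As a rough sketch, the Sharpe Ratio can be computed from the expected return and variance of a portfolio as shown below. The function name, the sample numbers and the 2% risk-free rate are assumptions made for illustration only.

```python
import math

def sharpe_ratio(expected_return, variance, risk_free_rate=0.0):
    """Excess return earned per unit of volatility (standard deviation)."""
    volatility = math.sqrt(variance)
    return (expected_return - risk_free_rate) / volatility

# Hypothetical portfolio: 12% expected return, variance 0.04, 2% risk-free rate
example_sharpe = sharpe_ratio(0.12, 0.04, risk_free_rate=0.02)
print(f"Sharpe Ratio = {example_sharpe:.2f}")
```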

Implementation of MPT using Python:

Prerequisites:

▪ cvxpy

▪ cvxopt

Run “pip install cvxpy cvxopt” in your terminal to install both packages on your system. PyPortfolioOpt itself is installed the same way, with “pip install PyPortfolioOpt”.

▪ Dataset (CSV): https://drive.google.com/file/d/1bu1TKnI7Zj437FZVJCeRkL6O-foMMS5x/view?usp=sharing

▪ Code: https://gist.github.com/ManavMuthanna/81012396b2323c2e47229a6a12d819c1

Libraries loaded in are:

▪ pypfopt (PyPortfolioOpt) is a library that provides the functions needed to calculate an optimal portfolio and report its results.

▪ pandas is used to read CSV files efficiently and create data frames for easy data handling.

▪ matplotlib is required to plot a graph of a given data set.

▪ numpy is used in this program to make calculations easier.

First, we will read the daily closing prices of 6 companies for the period between 12-04-2012 and 12-04-2022.

Next, we set the date as the index for easier calculations in the program, and then remove the original Date column, as otherwise the dates would appear twice.
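A minimal sketch of these two steps is shown below. To keep it runnable without the dataset, a tiny in-memory CSV stands in for the real file; its ticker names and prices are hypothetical, and only the "Date" column name is taken from the description above. The linked gist may do this differently.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file (made-up tickers and prices)
csv_text = """Date,STOCK_A,STOCK_B
2012-04-12,100.0,50.0
2012-04-13,102.0,49.5
2012-04-16,101.0,51.0
"""

# With the real dataset this would be pd.read_csv("<path to the CSV file>")
df = pd.read_csv(io.StringIO(csv_text))

# Use the dates as the index ...
df = df.set_index(pd.DatetimeIndex(df["Date"].values))

# ... and drop the original column so the dates are not stored twice
df = df.drop(columns=["Date"])

print(df)
```

The same result can be had in one step with df.set_index("Date") after converting the column to datetimes; the two-step form simply mirrors the description above.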

Now we will plot the data to better understand which stocks have performed well in the 10-year window and which ones haven't.

As shown mathematically, the first prerequisite for any calculation in MPT is the daily percentage return. The method is simple. Suppose we have these values:

Here t is the row currently under consideration, and the last row holds the latest values. The percentage change is,

Return(t) = (Price(t) - Price(t-1)) / Price(t-1)

Notice that no value is possible in row 1: when t = 1 there is no previous row t-1, so the return is undefined.
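In pandas, the daily percentage return is available through the pct_change method of a data frame. The prices below are hypothetical and only serve to show the undefined first row.

```python
import pandas as pd

# Hypothetical closing prices; the last row is the latest value
prices = pd.DataFrame({
    "STOCK_A": [100.0, 110.0, 99.0],
    "STOCK_B": [50.0, 50.0, 55.0],
})

# (price at t - price at t-1) / price at t-1, computed column by column
returns = prices.pct_change()

# The first row has no previous price, so pandas fills it with NaN
print(returns)
```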

We then move on to calculate the mean of the returns and the covariance matrix.

The covariance matrix is also a measure of risk. It holds, for every pair of variables, the covariance that describes how the two vary together, and its diagonal holds the variance of each variable. To explain the covariance matrix, I will use the following example.

Suppose we have the following data:

M: 1, 3, -1

N: 1, 0, -1

The means of this data are: Mean of M = 1 ; Mean of N = 0

The covariance matrix would look like this:

[ covar[M,M]  covar[M,N] ]

[ covar[N,M]  covar[N,N] ]

Here, covar[M,M] = var[M] and similarly covar[N,N] = var[N].

To calculate covar[M,M], we use the formula:

covar[M,M] = E[M²] - (E[M])²

▪ E is the expected value.

▪ E[M] and E[N] are equal to their respective mean values.

To find E[M²], we square the values of M, add them up, and then divide by the number of values given: E[M²] = (1+9+1)/3 = 11/3. The mean of M is E[M] = (1+3-1)/3 = 3/3 = 1.

Therefore, covar[M,M] = 11/3 - 1² = 8/3. You can repeat the process for covar[N,N], which will result in 2/3.

To calculate covar[M,N], we will use the formula:

covar[M,N] = E[MN] - E[M] × E[N]

To find E[MN], we multiply the values of M and N in each row of the table, add the products, and divide by the number of rows. Here, covar[M,N] = ([1+0+1]/3) - [1][0] = 2/3.

covar[M,N] = covar[N,M], as the order doesn't matter. Therefore, the covariance matrix will be:

[ 8/3  2/3 ]

[ 2/3  2/3 ]
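The worked example can be checked with NumPy. The M values (1, 3, -1) come from the calculation above; the N values (1, 0, -1) are an assumption, chosen because they reproduce the mean of 0, the variance of 2/3 and the row products 1, 0, 1 quoted in the text.

```python
import numpy as np

M = np.array([1.0, 3.0, -1.0])
N = np.array([1.0, 0.0, -1.0])  # assumed values, consistent with the text

# covar[X,Y] = E[XY] - E[X]*E[Y], dividing by the number of values
cov_mm = np.mean(M * M) - np.mean(M) ** 2
cov_nn = np.mean(N * N) - np.mean(N) ** 2
cov_mn = np.mean(M * N) - np.mean(M) * np.mean(N)

manual = np.array([[cov_mm, cov_mn],
                   [cov_mn, cov_nn]])

# bias=True makes np.cov divide by n, matching the hand calculation
builtin = np.cov(M, N, bias=True)

print(manual)
print(builtin)
```

Note that np.cov without bias=True, and the cov method of a pandas data frame, divide by n - 1 instead of n, so their numbers differ slightly from this hand calculation on small samples.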

Moving on, we now optimize our portfolio to get the maximum Sharpe ratio possible. For this we use the EfficientFrontier class, which lets us calculate the expected annual return, annual volatility and Sharpe ratio of the portfolio. Its max_sharpe method distributes the weights so that the Sharpe ratio is maximized. The clean_weights method then sets tiny weights to zero and rounds the rest:

Now we will calculate the variance of the portfolio. For the sake of simplicity, we will use matrix multiplication, for which we can use another formula:

Var[Rp] = Wᵀ · (Cov · W), where W is the column matrix of weights and Cov is the covariance matrix.

The first matrix is the transpose of the weights of the portfolio. The second matrix is the dot product of the covariance matrix and the weights matrix. Finally, we take the dot product of the first and second matrices to get our portfolio's variance, which we then convert to a percentage.
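The matrix form of the portfolio variance can be sketched with NumPy as below. The weights and the covariance matrix are hypothetical values for a two-stock portfolio.

```python
import numpy as np

# Hypothetical weights and covariance matrix for two stocks
weights = np.array([0.6, 0.4])
cov_matrix = np.array([[0.04, -0.01],
                       [-0.01, 0.01]])

# Second matrix: covariance matrix dot weights
second = np.dot(cov_matrix, weights)

# First matrix (transposed weights) dot second matrix gives the variance
port_variance = np.dot(weights.T, second)
port_volatility = np.sqrt(port_variance)

# Convert to percentages for printing
variance_pct = f"{port_variance:.2%}"
print("Variance:", variance_pct)
print("Volatility:", f"{port_volatility:.2%}")
```

For a one-dimensional NumPy array the transpose has no effect, so weights.T is kept only to mirror the formula.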

Now we will take the portfolio optimisation one step further. Suppose we invest X amount of money: how many shares of each company in our portfolio should we buy, and how much money will be left over? If the budget is small, the money should automatically go to the stocks that give more returns and cost less. The DiscreteAllocation class does these calculations. For now, we will take a value of Rs. 3,00,000 (3 lakh INR).

Here, we will just store the discrete allocation values in a list.

We will create a dataframe of the portfolio now.

Lastly, we will print the dataframe.

Output:

First, the contents of our CSV file are printed.

Next, we have a graph to visually compare which stocks have increased in value over time.

Following that, a table of returns is printed.

Next, the covariance matrix is printed.

Then come the weights we have to assign to get the maximum Sharpe Ratio.

Note that BPCL has been given a weight of ‘0.0’.

Next, information about the portfolio, such as its returns and volatility, is printed.

Finally, the remaining funds and the allocation of the portfolio are printed.

Here, since BPCL has a weight of ‘0.0’, no funds were allocated to BPCL shares.

Closing Remarks:

Modern Portfolio Theory is an efficient way to calculate an optimal portfolio, but it is not universal. If we want a better return-to-risk ratio, we need to diversify even more than in the example above. That said, portfolio optimization is a key part of portfolio management for generating profits, as we've seen above. This blog aims to be a one-stop read for understanding portfolio optimization through MPT and its implementation in Python. I have explained the math in the way I understood it. Hope you enjoyed the read!