Building a Stock Option Valuation Model with Python: Part II

Part II: Generating probability distributions of payoffs

Jacob Linger
Aug 17, 2020

In Part I, I went over how we could use the yahoo_fin module to access stock and option pricing data, and we generated payoffs under different scenarios.

What we aim to do now is figure out an expected payoff.

In our case, the expected payoff of an option contract is the sum of each payoff times the probability of reaching that payoff.

For example, in Part I we showed the various payoffs for an AT&T option under scenarios ranging from a 50% price drop to a 50% price gain:

If we now generate the probability of each % price change, we can multiply the payoff and probability vectors element by element and sum the results (a dot product) to get our expected payoff.
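To make the arithmetic concrete, here is a minimal sketch with three made-up scenarios. The payoffs and probabilities are hypothetical numbers chosen for illustration, not values from the AT&T example.

```python
import numpy as np

# Hypothetical payoffs (in dollars) for three price-change scenarios,
# with a hypothetical probability for each. Not real market data.
payoffs = np.array([-100.0, 50.0, 300.0])
probabilities = np.array([0.5, 0.3, 0.2])

# Expected payoff = sum over scenarios of payoff * probability (a dot product).
expected_payoff = float(np.dot(payoffs, probabilities))
print(expected_payoff)  # about 25: -50 + 15 + 60
```

The real model does the same thing, only with 101 scenarios instead of three.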

Generating a Probability Distribution of Price Changes

Stocks can move up and down. Predicting the price of a stock in a week is theoretically impossible (without inside information). Yet, we can have some idea about the relative probabilities of how a stock might move.

For example, if Procter & Gamble’s stock is trading at $135, and rarely ever fluctuates below $110 or above $150, then it is very unlikely that it will be $160 in a week.

While we might not be able to use a stock’s historical price data to predict its price, we can use it to generate a distribution of potential price changes based on past experience.

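#Importing required modules:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import random
import datetime
from yahoo_fin.stock_info import get_data  #price-history function from yahoo_fin (see Part I)

def delta_dist(ticker, duration, sample_size):
    #Daily closing prices, one row per trading day
    stock = get_data(ticker).close
    duration = int(duration)
    sample_size = int(sample_size)
    deltas = []
    for s in range(sample_size):
        try:
            #Pick a random day from the full price history
            #(the original drew from the first sample_size rows only)
            x = random.randint(0, len(stock) - 1)
            start = stock.iloc[x]
            stop = stock.iloc[x + duration]  #price `duration` rows (trading days) later
            difference_percent = (stop - start)/start
            deltas.append(difference_percent)
        except IndexError:
            #Sampled day is too close to the end of the data; skip it
            pass
    return deltas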

In the code above, delta_dist() will take in a stock ticker, duration (time period to estimate price change over), and sample_size.

Suppose we ran delta_dist('AAPL', 14, 1000). This function would choose 1000 random days to pull price info on Apple’s stock. It will then find the percentage change between the stock price on that day and the price 14 days ahead for each day sampled. Note that the price data only contains trading days, so "14 days ahead" really means 14 trading days ahead. I use a try and except within the loop in case one of the days sampled happened to be within the last 14 days (in which case it could not calculate the difference, since 14 days ahead of it would be in the future).
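Random sampling is not the only option. As a rough alternative sketch (not the method this article uses), we could compute the change over every overlapping window at once with array slicing. The function name all_window_changes and the synthetic price series below are my own stand-ins for illustration; with real data you would pass in the closing prices instead.

```python
import numpy as np

def all_window_changes(prices, duration):
    """Return the % change over every overlapping `duration`-row window."""
    prices = np.asarray(prices, dtype=float)
    if duration < 1 or duration >= len(prices):
        raise ValueError("duration must be between 1 and len(prices) - 1")
    start = prices[:-duration]   # price at the beginning of each window
    stop = prices[duration:]     # price `duration` rows later
    return (stop - start) / start

# Synthetic closing prices (a seeded random walk), standing in for real data.
rng = np.random.default_rng(0)
synthetic_close = 100.0 * np.cumprod(1.0 + rng.normal(0.0, 0.01, size=500))

changes = all_window_changes(synthetic_close, duration=14)
print(len(changes))  # 486 windows: 500 rows minus the 14-row offset
```

This uses all of the history and needs no try and except, at the cost of windows that overlap heavily with one another.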

As shown above, we want to find probabilities over each 1% interval. Therefore, we want to bin the results that delta_dist() returns:

def binned(diffs):
    bins = []
    for b in range(101): #needs to be 101 to count -50% and +50%
        begin = (b - 50)*0.01
        def between_bins(k):
            #True if k lies in the interval (begin, begin + 1%]
            return (k <= begin + .01) and (k > begin)
        count = list(filter(between_bins, diffs))
        amount = len(count)/len(diffs) #amount is the fraction of all sampled changes
        bins.append(amount)
    return bins

Using binned(), we can count up the frequency of times a price change falls within each 1% interval.
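For larger samples, the same kind of binning can be done in one pass with NumPy. This is a sketch under my own assumptions rather than a drop-in replacement: the name binned_np and its parameters are hypothetical, and np.histogram closes each bin on the left, [a, b), whereas binned() closes it on the right, (a, b], so values landing exactly on a bin edge are assigned differently.

```python
import numpy as np

def binned_np(diffs, low=-0.50, high=0.51, width=0.01):
    """Share of all changes falling in each 1% bin between low and high."""
    diffs = np.asarray(diffs, dtype=float)
    n_bins = int(round((high - low) / width))   # 101 bins by default
    edges = np.linspace(low, high, n_bins + 1)
    counts, _ = np.histogram(diffs, bins=edges)
    # Divide by the full sample size so out-of-range changes still count
    # toward the denominator, as they do in binned().
    return counts / len(diffs)

# Made-up changes for illustration; 0.90 lies outside the binned range.
sample = [-0.105, 0.004, 0.006, 0.255, 0.90]
probs = binned_np(sample)
print(len(probs), probs.sum())
```

Here two of the five changes fall in the 0% to 1% bin, and the one change at +90% is left out of every bin.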

Let’s use Apple as an example:

#EXAMPLE : SAMPLE SIZE 200
changes = delta_dist('AAPL', 14, 200)
binned_changes = binned(changes)
x_axis = np.arange(-50, 51, 1)
plt.bar(x_axis, binned_changes)
plt.xlabel('Stock Price % Change')
plt.ylabel('Probability')
plt.title('AAPL: Probability Distribution- 14 day price change(Sample Size=200)');
plt.savefig('AAPL_prices.png', dpi = 800)

This will generate the probability distribution of a 14 day price change using a sample of 200 random days in which Apple was on the market:

Here is what it looks like when we adjust the sample size to 9000 days:

And here is what it looks like when we look at price changes over 3 years, rather than two weeks:

As you might expect, the probability of larger price swings is much greater over longer time periods.
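One consequence worth checking: binned() only counts changes that land between -50% and +51%, so over long horizons some of the sampled changes fall outside every bin and the bars sum to less than 1. A minimal sketch of that check, using a hypothetical helper name and made-up changes:

```python
import numpy as np

def share_outside_window(diffs, low=-0.50, high=0.51):
    """Fraction of sampled changes that fall outside the (low, high] window."""
    diffs = np.asarray(diffs, dtype=float)
    outside = (diffs <= low) | (diffs > high)
    return float(outside.mean())

# Made-up long-horizon changes: two of the five lie beyond the window.
long_horizon_changes = [0.12, -0.30, 0.85, 1.40, 0.45]
print(share_outside_window(long_horizon_changes))
```

If this share is large, any expected payoff computed from the binned distribution ignores a meaningful part of the outcomes.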

Applying Price Change Probabilities to our Options Dataset

The options_df dataset from Part I contains options expiring on September 18th. Therefore, it would be useful to know the amount of time between that expiration date and today.

To do this, we use the datetime module and subtract today’s date from the expiration date. We then create a dictionary with each ticker as a key and, as its value, a list containing that ticker’s probability distribution over [-50%, +50%].

#Generate time lapse between today and expiration of options contract:
#(expiration is the expiration date string, defined outside this snippet)
timedate_until_exp = datetime.datetime.strptime(expiration, '%B %d, %Y') - datetime.datetime.today()
#convert day number to integer:
time_until_exp = int(timedate_until_exp.days)
scenario_tickers = list(set(options_df.Ticker))
number_of_tickers = len(scenario_tickers)
simulations = 2000
#dict_of_probs will contain each ticker key corresponding to the 101 values of its distribution of price changes [-50%, 50%]
dict_of_probs = dict()
for stock in scenario_tickers:
    changes = delta_dist(stock, time_until_exp, simulations)
    distribution_list = binned(changes)
    dict_of_probs.update({stock : distribution_list})
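One caveat: time_until_exp counts calendar days, while delta_dist() steps forward by rows of price data, which are trading days. A possible refinement, sketched here as an assumption rather than something the model above does, is to count weekdays with np.busday_count. The helper name and the dates are illustrative, and market holidays are ignored unless you pass them in.

```python
import numpy as np
from datetime import date

def trading_days_between(start, end):
    """Approximate trading days from start (inclusive) to end (exclusive).

    Counts weekdays only; market holidays are ignored unless they are
    passed to np.busday_count through its `holidays` argument.
    """
    return int(np.busday_count(start.strftime('%Y-%m-%d'),
                               end.strftime('%Y-%m-%d')))

# Illustrative dates: 32 calendar days apart.
print(trading_days_between(date(2020, 8, 17), date(2020, 9, 18)))  # 24 weekdays
```

Passing 24 rather than 32 as the duration would keep the historical windows closer to the actual time left on the contract.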

To test that our code worked, let’s see a random example taken from dict_of_probs.

#Verify it works with example
x_axis = np.arange(-50, 51, 1)
example = random.randint(0, number_of_tickers - 1)

plt.bar(x_axis, dict_of_probs[scenario_tickers[example]])
plt.xlabel('Stock Price % Change')
plt.ylabel('Probability')
plt.title(str(time_until_exp) + " day average % price change for " + str(scenario_tickers[example]))
plt.savefig('price_example.png', dpi = 800)

Success! At this point, we have both a straightforward method to generate our payoff distribution and a dictionary containing the distribution of % price changes for each stock ticker.

Calculating Expected Payoffs


Suppose that the price of Apple stock has a 10% probability of rising by between 0% and 1% in two weeks. Let’s say you see an options contract with a 2-week expiration that generates $20 in profit if Apple ends up in that range. Then the expected payoff of that contract on the [0, 1%] interval would be $20 * 0.1 = $2.

If you did this for every 1% interval of that contract (which we will assume is from -50% to +50%), and sum up the results, then you would have the overall expected payoff of the contract.

This is what we will do for every contract within our options_df.

#Generating expected payoffs
x_axis = np.arange(-50, 51, 1)
ExpectedPay = []
for i in range(len(options_df)):
    payoffs = []
    ticker = options_df.iloc[i].Ticker
    for p in range(len(x_axis)):
        percent = (p - 50)*0.01
        #price_percent_payoff() is the payoff function, defined outside this snippet
        payoff = price_percent_payoff(percent, options_df.iloc[i])
        payoffs.append(payoff)
    probs = dict_of_probs[ticker]
    #Expected payoff = sum of payoff * probability over all 101 intervals
    expected_value = sum( np.array(probs) * np.array(payoffs) )
    ExpectedPay.append(expected_value)
options_df['ExpectedPay'] = ExpectedPay

At this point, we now have our expected payoff for each options contract. Let’s see how the payoffs look:

plt.hist(ExpectedPay, bins = 40, range=[-2000, 2000])
plt.xlabel('Total Expected Gain/Loss of Contract in $')
plt.ylabel('Frequency')
plt.title("Frequency of Expected Gains and Losses");
plt.savefig('payoffs.png', dpi = 800)

You will notice that most options have an expected payoff around $0. This makes sense since the contract is hedging risk among buyer and seller.

Remember that an option is a zero-sum game. If I purchase a call option (providing me the right to buy at the strike price), then someone else has sold (written) that call option (they must sell me the stock at that strike price if I exercise it).

Therefore, an efficient market would bring the expected value of most contracts to zero (as buyers and sellers are not going to get ripped off).

Still, we can notice that there are contracts on the right tail that could possibly provide profitable investments.

In practice, this dataset can provide some insight into options worth looking more into. The payoffs calculated should not be used to make investment decisions, and those looking to trade options should do further research into why those expected payoffs are as high/low as they are.

One other significant point not yet discussed is that many of these ‘profitable’ contracts might appear so due to low liquidity. That is, the Volume and Open Interest could both be very low, suggesting no one is actually trading at the Bid and Ask prices that we use.

With that being said, I plan to further tweak this model to provide more insight into valuing options with different expirations and to account for low liquidity issues.
