Building a Stock Option Valuation Model with Python: Part I

Part I: Accessing the data and creating the payoff distribution

Jacob Linger
7 min readAug 16, 2020

This is Part I of a two-part post. Part II can be accessed here.

In this article, I want to go into a basic model I developed to evaluate options. The general steps involved are to (1) identify the payoff distribution based on stock price changes, (2) identify the probability distribution of the underlying stock’s price changes, (3) calculate the expected value of the option contract based on weighting the payoffs by their respective probabilities.

What is an Option?

An option is a contract giving you the option to purchase shares of a stock at a specific price in the future.

Let’s say that AT&T’s stock is currently trading for $30. Perhaps you have some hunch that the share price will go up to $35 in the next two weeks.

One way you could bet on this hunch is to buy stocks in AT&T. Let’s suppose you decide to purchase 100 shares at $30 ($3,000 investment). If the stock does go up to $35 in 2 weeks, your total investment is now worth $35 * 100 = $3,500. You’ve made a $500 return on $3,000 invested.

Another way you could bet on this hunch is by purchasing an options contract on AT&T. This contract would expire in some time period — in this case, you could buy a contract that expires in 2 weeks. Along with this expiration date, the contract would specify a strike price. The strike price is a price at which you can buy shares of AT&T at or before 2 weeks from now. More specifically, the contract usually entitles you to purchase up to 100 shares at the strike price.

Suppose you purchase an option entitling you to purchase 100 shares of AT&T at a strike price of $20 within 2 weeks, and that the contract costs $5 per share.

If the stock goes up to $35, your total investment is now worth $35 * 100 (the amount of selling 100 shares in AT&T) minus $20 * 100 (the amount of the purchase of 100 shares at the strike price) minus $5 * 100 (the cost of the option). That means you’ve made a return of $1,000 on a $500 investment.

Options can amplify gains, but they also can amplify losses. We will show this in more depth through the model.

Developing the Dataset

To begin with, I import the following modules:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
!pip install dfply
from dfply import *
!pip install yfinance
import yfinance as yf
!pip install ordered_set
from ordered_set import OrderedSet
import random
from copy import copy
!pip install requests_html
!pip install yahoo_fin
from yahoo_fin.stock_info import *
from yahoo_fin.options import *
import math
from datetime import date
from datetime import datetime
%load_ext google.colab.data_table

I use the yahoo_fin module to access options and historic stock price information. More info on its uses and examples can be found here. You might also notice %load_ext google.colab.data_table. This provides a great way to view DataFrames when using Google Colab, which is what I used for creating this.

To begin building our dataset, we first pull stock data on the S&P 500 using the tickers_sp500() function:

#Generating S&P 500 List of Stocks
sp_list = tickers_sp500()

The resulting sp_list is a list containing each ticker symbol within the S&P 500. One issue I found later on is that yahoo_fin has trouble pulling info for tickers with a period in their name. Since this is only two of the stocks, I remove them for simplicity:

#these two make errors due to the period in them
sp_list.remove('BF.B')
sp_list.remove('BRK.B')

At this point, we can use the 498 tickers we now have (since we’ve gotten rid of the two stocks prone to errors) to build our options DataFrame.

To keep things simple, I want the options to all have the same expiration date. Some stocks have options expiring every Friday, while others might have contracts on one Friday, but not another. Let’s look at the expiration dates for the first ticker in our sp_list:

get_expiration_dates(sp_list[0])
>>>['August 21, 2020', 'September 18, 2020', 'November 20, 2020', 'January 15, 2021', 'February 19, 2021', 'January 21, 2022']

Let’s go for September 18th (any of the dates will do). The following for loop will assign September 18th to expiration and then use a for-loop to generate our DataFrame:

#Creating options datasetexpiration = get_expiration_dates(sp_list[0])[1]   #September date
options_df = pd.DataFrame()
for stock in sp_list:
try:
price = get_live_price(stock)
option = get_calls(stock, expiration)
option['CurrentPrice'] = price
option['Ticker'] = stock
option['DateExp'] = expiration
options_df = options_df.append(option)
print(price)
print(stock)
except:
print('Option contract not available on '+ str(expiration))
pass

I use a try and except statement within the for-loop due to the fact that some of tickers within sp_list might not have options contracts available on the date specified. To give you an idea of how often this occurs, there are the print() quotes which will display with each ticker iterated over.

Here is a snapshot of how the resulting DataFrame options_df appears:

What pieces of information do we care about here?

Firstly, it is useful to keep the Contract Name, Ticker, and DateExp columns for clarity. We also want to keep Current Price, which is the price per share of the underlying stock, along with the Strike, which tells us what each share would cost to purchase if exercising the option. I also keep both Volume and Open Interest, in order to see whether there is any liquidity in the contract we are viewing:

Let’s clear up a little space:

options_df = options_df[['Contract Name', 'Strike','Bid', 'Ask', 'CurrentPrice', 'Volume', 'Open Interest', 'Ticker', 'DateExp']]

One piece of information that we need, yet remains unclear, is the price of the option itself. I use an average of the Bid and Ask as the most likely price for the option. In order to do so, I remove rows in which either contains a ‘-’ and use the pd.to_numeric function in order to operate on them.

options_df['AvgPrice'] = (pd.to_numeric(options_df.Bid) + pd.to_numeric(options_df.Ask)) / 2

Evaluating Costs and Payoff Scenarios

Photo by Pepi Stojanovski on Unsplash

To keep things simple, we want to first assume that we will not exercise the option until the expiration date. If not, we would need to analyze the probability of specific price changes for each day, and we would likely add unnecessary complexity to the model.

Depending on how the price of the underlying security changes up to the expiration date, the option might pay off or might not.

For example, if the share price at expiration is lower or equal to the strike price, we would not exercise the option, since we would lose money. In this case, our loss would be the cost of the option contract purchased (which is the option price per share * 100 shares, since contracts are most often for 100 shares at a time).

If the share price at expiration is higher than the strike price, then we have the following payoff (where P is share price at expiration, S is strike price, Option is the price to purchase the contract):

Using this payoff equation, we can find a breakeven point where the payoff would be equal to 0. In order to do so, I use the following:

Where x is the percent change in the share price of the stock from the start date to the expiration date. We can factor out the 100s and solve for x to get the breakeven percent change from the starting share price:

#Creating columns for total option cost and breakeven percentageoptions_df['OptionCost'] = options_df['AvgPrice'] * 100options_df['BE_percent'] = ((options_df.Strike + options_df.AvgPrice)/(options_df.CurrentPrice)) - 1

We can see a basic relationship between the breakeven point, strike price, and current share price. The further the strike price is below the current price, the lower the breakeven percent needs to be. The greater the strike price is, the higher we need the price to move up.

We can now create visualizations of the option payoffs under different price change scenarios. Let’s create a function that will generate a payoff based on the equations above:

def price_percent_payoff(x, row):   #x is in decimal format
new_price = row.CurrentPrice * (1 + x)
if (100*new_price - 100*row.Strike - row.OptionCost) >= 0:
payoff = 100*new_price - 100*row.Strike - row.OptionCost
else:
payoff = -(row.OptionCost)
return(payoff)

Let’s now use this function to generate payoffs for some AT&T (T) option. We will first generate a set of scenarios for price changes (in this case, we will look at each 1% change from a 50% drop in the price to a 50% gain).

x_axis = np.arange(-50, 51, 1)
payoffs = []
for p in range(len(x_axis)):
percent = (p - 50)*0.01
payoff = price_percent_payoff(
percent,options_df[options_df.Ticker == 'T'].iloc[5])
payoffs.append(payoff)
plt.bar(x_axis, payoffs)
plt.xlabel('Price % Change from start date to expiration')
plt.ylabel('Payoff ($)')
plt.title('Payoffs for AT&T under various price changes');
plt.savefig('T_prices.png', dpi = 800)

Any price under the breakeven percent change of 0.00049 (can be accessed through options_df[options_df.Ticker == ‘T’].iloc[5][‘BE_percent’] ) results in a loss equal to the full price of the option contract. In other words, if AT&T stock is $30.01 per share, we need the price to be $30.02 (1.00049 * 30.01) to breakeven.

Now that we have generated the options dataset and created a way to find payoffs at different price changes percentages, we will move onto the probability distribution section in Part II.

Stock Options Model: Part I Github (Created using Colab)

--

--

Jacob Linger

Young professional interested in economics, data science, and mathematics