Exploring Fin Data on Quandl
I’ve decided to do some data mining using Quandl. Quandl makes it easier to mine financial data. We are going to see if there is a relationship between Apple’s historical stock prices and option call volume from the CBOE. Make sure to have created a Plotly and Quandl account.
First we need to import a few libraries.
import quandl
import plotly
import pandas as pd
import plotly.plotly as py
import plotly.graph_objs as go
Importing Quandl will allow us to access the financial database of our choice. If you don’t know what Quandl is or haven’t used it before, its a great resource for accessing financial data from all type of vendors. You can find out more about the service here.
https://www.quandl.com/data/WIKI/AAPL-Apple-Inc-AAPL-Prices-Dividends-Splits-and-Trading-Volume
Next we will need to include our username and api keys. You can find these in your account settings when you sign up for a free account on Plotly.
plotly.tools.set_credentials_file(username=‘username’, api_key=’key’)
Now we will need to actually extract the data. We will create a python object called stock_data that will hold the historical prices data we have on Apple.
stock_data = quandl.get(“WIKI/AAPL”, start_date=“2007–01–01”, end_date=“2015–01–01”)
Second we will create another object that holds the option volume data provided by the CBOE.
option_data = quandl.get(“CBOE/TOTAL_PC”, authtoken=“authtoken”, start_date=“2007–01–01”, end_date=“2015–01–01”)
Next we will create a dataframe using the pandas library. The “import pandas as pd” line above will come in handy. After this is complete we will concatenate the two dataframes into one. This will concat the two different datasets by timestamp and make it easier for us to do some simple timeseries analysis.
stock_df = pd.DataFrame(stock_data)
option_df = pd.DataFrame(option_data)concat_df = pd.concat([stock_df, option_df], axis=1)
Now we will use the Plotly libraries we imported earlier to create a scatter plot of the data points we are interested in. As you can see below we can really pull any data point of our choice. If you want to see other data points of interests you can run..
print(concat_df.head())
to see other elements, such as “Total Put Volume”, “Open”, “High”, “Low” prices etc. Below we will just look at the relationship between Total Call Volume and Apple’s opening prices.
x_data = concat_df[“Total Call Volume”]
y_data = concat_df[“Open”]trace = go.Scatter(
x = x_data,
y = y_data,
mode = “markers”
)
data = [trace]
Below is where your plotly account will be useful. The line below will create a unique url for the plot we just created using the above data.
plot_url = py.plot(data, filename=’Apple vs. Options Volume’)
Cheers!