Historical Stock Data and Basic Analysis for Python Users

Christopher Riggio
4 min read · Jun 5, 2019


Python is quickly becoming the go-to programming language for all things finance-related. In this blog post I’m going to demonstrate how to pull historical stock data into a Python environment with just a few lines of code, and then walk through some basic analysis and visualizations.

In order to get stock data into your Python environment, it’s not necessary to use web scraping or pre-downloaded CSV files. The first step is importing the appropriate libraries.

The Pandas library is a powerful software library that is used mainly for data manipulation and analysis. Pandas Datareader is a companion package to Pandas (installed separately) that allows you to create data frames from various sources on the internet, including Yahoo Finance. Finally, the Datetime module provides functionality for manipulating dates and times in Python.
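The import and download steps might look roughly like the sketch below. The helper name fetch_history and the ^GSPC ticker are assumptions made for illustration, not taken from the original code. The 'yahoo' source may also no longer respond the way it did in 2019, so the network call is left commented out.

```python
import datetime as dt

import pandas as pd


def fetch_history(ticker, start, end, source="yahoo"):
    """Hypothetical helper that wraps pandas_datareader's DataReader."""
    # Imported lazily so the rest of the sketch runs even if the
    # pandas-datareader package is not installed.
    import pandas_datareader.data as web

    return web.DataReader(ticker, source, start, end)


# Date range used in this post.
start = dt.datetime(2019, 1, 1)
end = dt.datetime(2019, 5, 31)

# Uncomment to hit the network ("^GSPC" is assumed to be the S&P 500 ticker):
# sp500 = fetch_history("^GSPC", start, end)
# print(sp500.tail())
```

Keeping the download behind a small function makes it easy to swap in another ticker or data source later without touching the rest of the analysis.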

With just a few lines of code we were able to create a data frame of the S&P 500’s high, low, open, close, volume and adjusted close for every trading day between January 1, 2019 and May 31, 2019, by passing web.DataReader a stock ticker, ‘yahoo’, and the start and end date variables we defined. Note that the data jumps from Friday, May 24 to Tuesday, May 28, because the market was closed over the weekend and on Memorial Day (Monday, May 27).
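To see where a gap like this comes from, one option is to compare the dates in the frame against every weekday in the same span. This is only a sketch: the four dates are typed in by hand rather than downloaded, and pandas' US federal holiday calendar is used as a stand-in for the actual exchange calendar, which it does not match exactly.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

# Trading dates around the gap mentioned above (entered by hand).
dates = pd.DatetimeIndex(["2019-05-23", "2019-05-24", "2019-05-28", "2019-05-29"])

# Every weekday (Monday to Friday) in the same span.
weekdays = pd.bdate_range(dates.min(), dates.max())

# Weekdays that have no row in the data.
missing = weekdays.difference(dates)

# Federal holidays in the span; only an approximation of market holidays.
holidays = USFederalHolidayCalendar().holidays(start=dates.min(), end=dates.max())

print(missing)
print(missing.isin(holidays))
```

Here the only missing weekday is May 27, 2019, and it shows up in the holiday calendar as well.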

This is not the case if we use the same technique to look at the data for Bitcoin. Cryptocurrency trades 24 hours a day, 365 days a year, so there are no missing days, and each day’s opening price is equal to the closing price of the day before.

Basic Analysis

Once you have a data frame of your desired stock and dates, I would highly recommend resetting the index to numbers. When this method is used to pull data from Yahoo Finance, it automatically sets the date as the index, which can make the frame harder to work with.
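Resetting the index is a one-liner in Pandas. The sketch below uses a tiny frame with placeholder prices (not real S&P 500 values) shaped like the downloaded data, with the dates in the index.

```python
import pandas as pd

# Placeholder rows shaped like the downloaded frame: dates live in the index.
df = pd.DataFrame(
    {"Close": [100.0, 101.0, 99.5]},
    index=pd.DatetimeIndex(["2019-05-24", "2019-05-28", "2019-05-29"], name="Date"),
)

# Move the dates into a regular "Date" column and number the rows 0, 1, 2, ...
df = df.reset_index()

print(df.columns.tolist())
print(df.index.tolist())
```

After the reset, the dates are still available as an ordinary column, and rows can be addressed by position.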

Now that we have a usable data frame, the indicator I want to take a look at is average true range (ATR). Average true range is a technical analysis indicator that measures market volatility. It is built from the true range, which is the maximum of today’s high and yesterday’s close minus the minimum of today’s low and yesterday’s close.

https://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:average_true_range_atr

The true range values are then averaged over a certain number of trading days. A 14-day moving average seems to be the standard, but for the purpose of this blog post I calculated values for 3-, 7- and 14-day periods.
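A minimal sketch of that calculation is shown below. The function name add_atr, the output column names and the sample prices are made up for illustration. It also assumes a plain rolling mean of the true range; Wilder's original ATR uses a smoothed average instead, so charting packages may show slightly different numbers.

```python
import pandas as pd


def add_atr(df, windows=(3, 7, 14)):
    """Return a copy of df with a true range column and one ATR column per window."""
    out = df.copy()
    prev_close = out["Close"].shift(1)

    # True range: max(high, previous close) - min(low, previous close).
    # The first row has no previous close, so it falls back to high - low.
    true_high = pd.concat([out["High"], prev_close], axis=1).max(axis=1)
    true_low = pd.concat([out["Low"], prev_close], axis=1).min(axis=1)
    out["TR"] = true_high - true_low

    for n in windows:
        # Simple n-day mean; NaN until n true range values are available.
        out[f"ATR_{n}"] = out["TR"].rolling(n).mean()
    return out


# Made-up prices, just to exercise the function.
prices = pd.DataFrame({
    "High": [10.0, 11.0, 12.0, 11.5, 13.0],
    "Low": [9.0, 10.0, 10.5, 10.0, 11.0],
    "Close": [9.5, 10.5, 11.0, 10.5, 12.5],
})

result = add_atr(prices, windows=(3,))
print(result)
```

With a real frame, calling add_atr(df) with the default windows would add the 3-, 7- and 14-day columns. The first n - 1 rows of each ATR column stay empty because there is not yet enough history to average.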

Visualizations

While ATR is a very popular tool for analysis, it does have its limitations. It can’t predict future stock prices (no metric or analysis tool can). Furthermore, its readings are subjective and should always be compared to past readings in order to recognize any trends. Let’s take a look at a candlestick graph of the S&P 500’s open, high, low and close compared to the 3-, 7- and 14-day average true range.
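A chart along these lines could be drawn in several ways. The sketch below is a simplified stand-in, not the original figure: it assumes matplotlib, plots the closing price as a line rather than candlesticks, and puts the ATR lines on a second panel. The helper name plot_price_and_atr and the column names are hypothetical.

```python
import pandas as pd


def plot_price_and_atr(df, atr_columns):
    """Hypothetical helper: closing price on top, ATR lines underneath."""
    # Imported lazily; matplotlib is only needed when the chart is drawn.
    import matplotlib.pyplot as plt

    fig, (ax_price, ax_atr) = plt.subplots(2, 1, sharex=True, figsize=(10, 6))

    # Top panel: closing price as a simple line (not candlesticks).
    ax_price.plot(df["Date"], df["Close"], color="black")
    ax_price.set_ylabel("Close")

    # Bottom panel: one line per ATR column.
    for col in atr_columns:
        ax_atr.plot(df["Date"], df[col], label=col)
    ax_atr.set_ylabel("ATR")
    ax_atr.legend()
    return fig


# Example call, assuming df has Date, Close and the three ATR columns:
# fig = plot_price_and_atr(df, ["ATR_3", "ATR_7", "ATR_14"])
# fig.savefig("atr.png")
```

Splitting price and ATR into two panels that share the date axis keeps the very different scales readable.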

As you can see, the ATR was relatively low and steady from the middle of April until the beginning of May, when the price of the S&P 500 started to decrease, thus increasing the average true range. It should also be noted that, because my data starts on January 1, I was not able to calculate the ATR for the beginning of January. To do this I would have had to pull data from the end of 2018.

Obviously this graph is not the most intuitive visualization for stock analysis; however, I wanted to put them all on the same graph to demonstrate their differences. As you can see, the 14-day ATR is a lot smoother, while the 3-day line is a lot more “up and down” and reactive to changes in price. This doesn’t always mean that shorter ATRs are better. A shorter ATR is going to identify price changes quicker, but it is also more susceptible to false positives.

Going forward I would like to further examine the relationships between these moving averages to better predict the correct time to execute trades.
