Time Series and Autocorrelation — An Exploration

Manali Shinde
One Datum At A Time
7 min readMar 18, 2018

--

Hello Readers!

Welcome to another exploratory article on some of the most important tools you can use in Python when conducting statistical analysis and beginning to build a predictive model. In this article, I want to highlight one of the biggest reasons people turn to programming, and to Python in particular: time series. I also want to show a practical use of time series, and how they come in handy when dealing with popular data such as stock prices for tech companies.

First, I want to explain what exactly time series are, and the different functions you can use to convert parts of your data frame into a time series. Then, I want to explore the concepts of correlation and autocorrelation, and when you can use each of these statistical tools. Throughout this article, I will use stock market data for Apple as an example of how these tools can be used to make observations about the company’s month-end stock price trend and its month-end returns. This data has been downloaded in CSV format for easier access.

Let us start this exploration!

1. What Are Time Series?

A time series is a set of observations of a variable, or of several variables, ordered by time. We can look at years, months, days, or even seconds. Anything that can be observed at successive points in time can constitute a time series. Time series are used in finance, economics, the life sciences, and the physical sciences, and knowing how to manipulate them helps us observe trends or make predictions about how the dependent variable changes with the passage of time. How you mark and refer to time will depend on how you want to present your data set. Depending on the observation you want to make, you can choose to format your dataframe according to:

  • Timestamps: specific instants in time
  • Fixed periods: a particular month or year
  • Intervals of time: indicated by a start and an end timestamp. Examples: five years, one year, six months
  • Experiment or elapsed time: each timestamp is a measure of time relative to a particular start time. Example: observing the growth of a bacterial colony on an agar plate.
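To make these four flavours concrete, here is a minimal sketch of how each one might be represented in pandas. The dates and the three-hour experiment are made-up illustrations, not values taken from the Apple data.

```python
import pandas as pd

# Timestamp: one specific instant in time
stamp = pd.Timestamp("2018-02-28")

# Fixed period: a whole month, treated as a single unit
period = pd.Period("2018-02", freq="M")

# Interval: defined by a start and an end timestamp (both included here)
interval = pd.Interval(
    pd.Timestamp("2013-02-28"), pd.Timestamp("2018-02-28"), closed="both"
)

# Elapsed time: measured relative to a chosen start time (hypothetical experiment)
experiment_start = pd.Timestamp("2018-02-28 09:00")
elapsed = pd.Timestamp("2018-02-28 12:00") - experiment_start

print(stamp, period, interval, elapsed, sep="\n")
```

Which representation you pick mostly depends on the question: a daily closing price fits a timestamp, while a monthly summary is often easier to reason about as a period.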

In this article, we are looking at a five-year interval of time for the stocks of Apple. Specifically, we are looking at Feb 28, 2013 to Feb 28, 2018 as the start and end timestamps.

In the raw data, the index is not the Date, but rather just the number of the row. When manipulating time series data such as this, you have to take care of a couple of things before you start. First, I want to ensure that my Date column isn’t a column of strings, but uses the actual datetime format in pandas. Second, I want to ensure that the dataframe is easy to work with; in this case, I’m going to make the Date column my index, since the stock market data is tied to time.
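Here is a minimal sketch of those two manipulations. Since the original CSV isn't reproduced here, I'm assuming columns named "Date" and "Adj Close" (the layout many stock data downloads use), and the three rows of prices are invented purely for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV; the prices are made up.
csv_text = """Date,Adj Close
2013-02-28,57.10
2013-03-01,55.70
2013-03-04,54.30
"""

appl = pd.read_csv(io.StringIO(csv_text))

# Step 1: turn the Date column from strings into real pandas datetimes
appl["Date"] = pd.to_datetime(appl["Date"])

# Step 2: use the dates as the index, sorted in time order
appl = appl.set_index("Date").sort_index()

print(appl.index)
```

With a real file you could probably get the same result in one step by passing parse_dates=["Date"] and index_col="Date" to pd.read_csv.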

2. Using Resampling

Resampling is the process of converting a time series from one frequency to another. When manipulating time series data, we may want to do one of two things: downsampling or upsampling (to be very honest, we may even want to do both, depending on the questions we are trying to answer).

  • Downsampling: aggregating data from a higher frequency into a lower frequency. For example, going from days to months, or months to years. You can also aggregate by quarters or by intervals of a couple of years.
  • Upsampling: converting data from a lower frequency to a higher one, for example from years to months. Because this creates time points that have no observations, nothing is aggregated; the new values have to be filled in or interpolated. One may want to go from days to hours to minutes and seconds to see a particular trend.
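As a small sketch of both directions, the example below downsamples two weeks of invented daily values to weekly means, then upsamples the weekly series back to daily. The numbers are placeholders, and forward-filling is just one possible way to fill the gaps that upsampling creates.

```python
import pandas as pd

# Fourteen made-up daily values, Monday 2018-01-01 through Sunday 2018-01-14
days = pd.date_range("2018-01-01", periods=14, freq="D")
daily = pd.Series(range(14), index=days, dtype="float64")

# Downsampling: aggregate each week (ending on Sunday) into its mean
weekly = daily.resample("W").mean()

# Upsampling: go back to daily; the new days start out empty (NaN)
daily_gaps = weekly.resample("D").asfreq()

# One way to fill them: carry the last known value forward
daily_filled = weekly.resample("D").ffill()

print(weekly)
print(daily_filled)
```

Notice that upsampling cannot recover the original daily values: the information lost in the weekly average stays lost.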

For the purpose of this analysis, I used downsampling in order to observe the month-end stock prices for Apple. The business month end was used, since that is the last day of the month on which the stock actually trades. For business month end, you would tell pandas that you want: .resample('BM').mean() (newer pandas releases deprecate the 'BM' alias in favor of 'BME'). When aggregating the data, you can choose the mean or the sum, among others; for this article, I chose the mean. Note that .mean() averages the adjusted close over all trading days in each month and labels the result with the business month end; to pick out the price on the last trading day itself, you would use .last() instead.
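The sketch below shows the difference between the monthly mean and the last price of the month on two months of invented business-day data. I pass pd.offsets.BMonthEnd() instead of the 'BM' string because the offset object should behave the same way across pandas versions; the "Adj Close" column name and the prices are assumptions.

```python
import pandas as pd

# Made-up prices on every weekday of January and February 2018
days = pd.bdate_range("2018-01-01", "2018-02-28")
appl = pd.DataFrame({"Adj Close": range(len(days))}, index=days, dtype="float64")

business_month_end = pd.offsets.BMonthEnd()

# Average adjusted close over each month, labelled with the business month end
appl_monthend_mean = appl["Adj Close"].resample(business_month_end).mean()

# Adjusted close on the last trading day of each month
appl_monthend_last = appl["Adj Close"].resample(business_month_end).last()

print(appl_monthend_mean)
print(appl_monthend_last)
```

Both results carry the same month-end labels, so it is worth being explicit about which aggregation produced a series before interpreting it as a "month-end price".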

3. The Adjusted Close Price

Another manipulation I made was to slice for the adjusted close price in the Apple stock price data. The adjusted close is the closing price for a day of trading, amended to include any distributions and corporate actions that occurred at any time prior to the next day’s open. When looking at business month-end data, you want to make sure that you slice for the adjusted close, as this gives you the amended, final stock price for each period.

4. Autocorrelation

Now, for the nitty-gritty statistical part: autocorrelation. An autocorrelation plot is very useful for time series analysis because autocorrelation measures the internal association between observations in a time series: it is the correlation between the series and a copy of itself shifted by a given number of periods (the lag). We can check how strong this internal correlation is at any given lag. If the correlation is perfectly positive, it will be +1; if it is perfectly negative, it will be -1; and if there is no correlation, it will be 0.
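To see the two extremes in action, here is a small sketch using pandas' Series.autocorr on two toy series that I made up: one that rises steadily and one that flips sign at every step.

```python
import pandas as pd

# A steadily rising series: each value predicts the next one perfectly
trend = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# A series that flips sign every step: each value predicts the opposite
flip = pd.Series([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])

trend_lag1 = trend.autocorr(lag=1)  # close to +1
flip_lag1 = flip.autocorr(lag=1)    # close to -1

# The same idea written out: correlate the series with a shifted copy of itself
by_hand = trend.corr(trend.shift(1))

print(trend_lag1, flip_lag1, by_hand)
```

Real price data will usually land somewhere between these extremes, and the value will change with the lag you choose.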

To more clearly explain autocorrelation, let’s take a look at the Apple stock price data:

a) X (Lag): the time shift between the two copies of the series being compared, read here in years

b) Y (Correlation): the correlation of the adjusted close price with its own earlier values at that lag

c) The horizontal lines: these mark confidence bands around zero, which indicate the significance of the correlation. (If the plot is drawn with pandas’ autocorrelation_plot, the solid lines are the 95% band and the dashed lines are the 99% band.) If the curve lies above or below the band, not inside it, we can say that the correlation at that lag is significant, and that the adjusted close value is correlated with its own past. In this plot, we can see that in the first year, the correlation is significant. Then, from year 1 to around year 4.3, we cannot say that any of the values show a significant correlation, and finally, between year 4.3 and 5.3, we can again see some significant values, and it would be beneficial to dive deeper into these observations.
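If you want to check significance without reading it off a plot, the sketch below computes the autocorrelation at each lag and compares it with an approximate 95% band of 1.96 divided by the square root of the number of observations. This mirrors my understanding of what pandas' autocorrelation_plot draws, so treat the formula as an assumption; acf is a hypothetical helper name, and the 60 "monthly prices" are an invented upward trend.

```python
import numpy as np
import pandas as pd

def acf(series, lag):
    """Sample autocorrelation at a given lag (hypothetical helper)."""
    x = series.to_numpy(dtype="float64")
    n = len(x)
    mean = x.mean()
    variance = np.sum((x - mean) ** 2) / n
    covariance = np.sum((x[: n - lag] - mean) * (x[lag:] - mean)) / n
    return covariance / variance

# Made-up stand-in for 60 month-end prices: a steady upward trend
prices = pd.Series(np.arange(60, dtype="float64"))

# Approximate 95% band around zero for a series of this length
band_95 = 1.96 / np.sqrt(len(prices))

# Lags (in months) whose autocorrelation falls outside the band
significant_lags = [
    lag for lag in range(1, 13) if abs(acf(prices, lag)) > band_95
]

print(round(band_95, 3), significant_lags)
# With matplotlib installed, pd.plotting.autocorrelation_plot(prices)
# should draw a comparable picture.
```

A trending series like this one stays outside the band at short lags, which is the same pattern described for the first year of the price data.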

5. Month-End returns

Expression in question: appl_monthend_mean/appl_monthend_mean.shift(1)-1

What do I mean by returns? A return is the gain or loss on an investment over a period; for a stock, it combines dividends with the change in stock price. Using the pandas .shift function, you can shift the column down by one row, so that each month end lines up with the price from the previous month end. Then, using the appl_monthend_mean/appl_monthend_mean.shift(1)-1 expression, you can calculate the returns: each month’s price divided by the previous month’s price, minus one. Calculating the returns and then plotting their autocorrelation lets us observe whether the returns are significantly related to their own past values. As we can see in the plot, there is no significant trend in the month-end returns: the values appear randomly distributed, and therefore we can conclude that the return in one month tells us little about the returns in later months.
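Here is the returns expression worked through on three invented month-end prices, so you can follow the arithmetic by hand. The values 100, 110, and 99 are made up; only the variable name appl_monthend_mean comes from the expression above.

```python
import pandas as pd

# Three made-up month-end prices
appl_monthend_mean = pd.Series(
    [100.0, 110.0, 99.0],
    index=pd.to_datetime(["2018-01-31", "2018-02-28", "2018-03-30"]),
)

# shift(1) moves every value down one row, so each month sees last month's price
previous_month = appl_monthend_mean.shift(1)

# Return = this month's price / last month's price - 1
returns = appl_monthend_mean / previous_month - 1

# pandas also has a built-in that should give the same numbers
returns_builtin = appl_monthend_mean.pct_change()

print(returns)
```

The first month has no previous price, so its return is NaN; the second month is a gain of about 10% and the third a loss of about 10%. Dropping that first NaN with .dropna() before computing any autocorrelation keeps the missing value out of the way.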

Conclusions

Autocorrelations are a great tool to observe whether there is any significant change or trend that can be explained by internal association. In the case of the month-end data, we can see that there is some significant trend, and that the values are not randomly distributed. When looking at the returns, however, there is no such significant trend (all the values lie within the 95% confidence interval). Therefore, in the future, it would be beneficial to observe how stock price data is affected by other companies’ stock prices, and whether different time intervals change how the stock price and its returns behave.

For more information, you can check out the book Python for Data Analysis by Wes McKinney. This is a great textbook for Python beginners and for learning more about tools such as time series and resampling.

Hope you found this article insightful! Be sure to check out my other articles and let me know if you found this one useful.

Happy coding, and happy analyzing!


A health informatician and aspiring health data analyst. I am a photographer, writer, dancer, and public health advocate. Join me on my journey!