Published in The Startup

Working with Time Series in Pandas

A guide on how to work with time series.

Photo by NordWood Themes on Unsplash

A time series is a series of data points indexed in time order. Time series datasets are common in fields such as finance and economics, and pandas makes them easy to manipulate. In this post, I’ll cover the following topics:

  • How to use the to_datetime method?
  • Changing the date format
  • Convert Unix/epoch time to a regular timestamp
  • Working with the Period and PeriodIndex
  • Frequency and date offsets

Let’s dive in!

How to use the to_datetime method?

To work with time series, let’s import pandas and NumPy first.

You can convert a date string into a Timestamp with the to_datetime method. Let me show you this.
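As a minimal sketch of the import and a single conversion (the date string is an arbitrary example I picked for illustration):

```python
import numpy as np
import pandas as pd

# Parse one date string into a pandas Timestamp.
# The date itself is an illustrative value, not taken from a dataset.
ts = pd.to_datetime("2020-08-15")

print(ts)                                # 2020-08-15 00:00:00
print(isinstance(ts, pd.Timestamp))      # True
```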

If you set the dates as the index, date-based analysis becomes much easier. Let’s create a variable named date which includes dates.

Let’s convert this data to a DatetimeIndex object.
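Here is a small sketch of these two steps. The contents of date and the sales numbers are made-up values, used only to show the idea:

```python
import pandas as pd

# Hypothetical list of date strings (illustrative values only).
date = ["2020-01-05", "2020-01-12", "2020-01-19"]

# Passing a list to to_datetime returns a DatetimeIndex.
idx = pd.to_datetime(date)
print(type(idx).__name__)  # DatetimeIndex

# With the dates as the index, you can select rows by date range.
sales = pd.Series([10, 20, 30], index=idx)
later = sales.loc["2020-01-10":"2020-01-31"]
print(later)
```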

Changing the date format

You can change the date format. Let’s take a look at the default format.

To have the day read before the month when a date is parsed, you can use the dayfirst parameter.
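A quick sketch of the difference, assuming an ambiguous date string chosen for illustration:

```python
import pandas as pd

# By default, an ambiguous string like this is read month-first.
default = pd.to_datetime("11/10/2020")
print(default)    # 2020-11-10 00:00:00

# With dayfirst=True, the same string is read day-first.
day_first = pd.to_datetime("11/10/2020", dayfirst=True)
print(day_first)  # 2020-10-11 00:00:00
```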

You can use the format parameter when the dates are written in a different format. For example, suppose the parts of the date are separated by an asterisk instead of a slash.

You can also use any other symbol in place of the asterisk.
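For instance, a minimal sketch with a made-up date string, where the separator in the format string simply mirrors the separator in the data:

```python
import pandas as pd

# The format string spells out day, month and year, separated by "*".
star = pd.to_datetime("15*08*2020", format="%d*%m*%Y")
print(star)  # 2020-08-15 00:00:00

# Any other separator works the same way, as long as the format matches.
pipe = pd.to_datetime("15|08|2020", format="%d|%m|%Y")
print(pipe)  # 2020-08-15 00:00:00
```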

There may be strings in the index that do not represent a date. By default, pandas cannot parse them and raises an error. For example:

Here, xyz does not specify a date. Let’s try to convert this variable into datetime.

As you can see, we got an error message saying “Unknown string format: xyz”. To avoid this error, you can use the errors parameter. This parameter takes three values: ignore, raise and coerce (ignore is deprecated in recent pandas releases). If you set this parameter to coerce, strings that cannot be parsed are replaced with NaT. Let me show you this:
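A minimal sketch of both behaviours. The list below is a stand-in I made up, and the exact wording of the error message can differ between pandas versions:

```python
import pandas as pd

# Illustrative data: "xyz" is not a date.
mixed = ["2020-01-01", "xyz", "2020-01-03"]

# Default behaviour (errors="raise"): parsing fails with a ValueError.
try:
    pd.to_datetime(mixed)
except ValueError as exc:
    print("Could not parse:", exc)

# errors="coerce": unparseable entries become NaT instead of raising.
cleaned = pd.to_datetime(mixed, errors="coerce")
print(cleaned.isna().tolist())  # [False, True, False]
```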

Convert Unix/epoch time to a regular timestamp

You can also convert an epoch time into a timestamp with the to_datetime method. Unix time counts from January 1, 1970 (UTC), a moment known as the epoch. An epoch time is the number of seconds that have passed since then. To show this, let’s take a value.

Let’s convert this epoch value into a timestamp with the to_datetime method. You can set the unit parameter to s to read the number as seconds.
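As a sketch, with an epoch value I chose for illustration:

```python
import pandas as pd

# An illustrative epoch value, in seconds since 1970-01-01 00:00:00 UTC.
epoch = 1500000000

# unit="s" tells pandas to interpret the number as seconds.
stamp = pd.to_datetime(epoch, unit="s")
print(stamp)  # 2017-07-14 02:40:00
```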

Here you go.

Frequency and Date Offsets

Date offsets are the standard kind of date increment used to build a date range in pandas. You pass them to the freq parameter. Let’s create timestamps spaced four hours apart between two dates with the freq parameter.

Now let’s create dates that fall on the Sunday in the last week of each month.
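A minimal sketch with two arbitrary dates. I pass an offset object here rather than a string alias, because the spelling of the hourly alias has changed between pandas versions:

```python
import pandas as pd

# One timestamp every four hours between the two (illustrative) dates.
rng = pd.date_range("2020-01-01", "2020-01-02", freq=pd.offsets.Hour(4))

print(len(rng))         # 7
print(rng[1] - rng[0])  # 0 days 04:00:00
```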
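One way to sketch this is with the LastWeekOfMonth offset. This is my assumption about how to express the rule, and the start date and number of periods are illustrative:

```python
import pandas as pd

# weekday=6 means Sunday (Monday is 0).
last_sunday = pd.offsets.LastWeekOfMonth(weekday=6)

# The last Sunday of each of the first three months of 2020.
sundays = pd.date_range("2020-01-01", periods=3, freq=last_sunday)
print(sundays)

# Sanity check: every date is a Sunday.
print(all(d.dayofweek == 6 for d in sundays))  # True
```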

Period and PeriodIndex

A Period represents a span of time such as a day, a month, or a year. For example, let’s create a variable with the Period class.
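A minimal sketch, assuming an annual period for the year 2020 (the year is an arbitrary choice):

```python
import pandas as pd

# An annual period covering the whole of 2020.
p = pd.Period("2020", freq="Y")

print(p)       # 2020
print(p.year)  # 2020
```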

You can see the methods that can be used for this variable with the dir function as follows:

Let’s see the start date of this variable.

Let’s see the end date.
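These steps can be sketched as follows, again assuming an annual period for 2020:

```python
import pandas as pd

p = pd.Period("2020", freq="Y")

# List the public attributes and methods of the Period object.
public = [name for name in dir(p) if not name.startswith("_")]
print(public[:5])

# First and last instants covered by the period.
print(p.start_time)  # 2020-01-01 00:00:00
print(p.end_time)    # the last nanosecond of 2020-12-31
```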

You can perform some operations such as addition and subtraction with the period variable. To show this, let me create a variable in a monthly period.

Now let’s add 5 to this date as follows:

Let’s subtract 3 from this date as follows:

If two periods have the same frequency, you can subtract one from the other to get the difference between them.
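A small sketch of this arithmetic, using monthly periods I picked for illustration. In recent pandas versions the difference comes back as an offset object, and its n attribute holds the number of months:

```python
import pandas as pd

m = pd.Period("2020-01", freq="M")

# Adding or subtracting an integer shifts by that many months.
print(m + 5)  # 2020-06
print(m - 3)  # 2019-10

# Difference between two periods with the same frequency.
later = pd.Period("2021-01", freq="M")
diff = later - m
print(diff.n)  # 12
```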

How to use the period_range function?

Regular ranges of periods can be generated with the period_range function.

Notice that the result is a PeriodIndex object. You can set this PeriodIndex object as an index.
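As a sketch, with an illustrative six-month range and placeholder values:

```python
import numpy as np
import pandas as pd

# Monthly periods from January to June 2020 (both ends included).
months = pd.period_range("2020-01", "2020-06", freq="M")
print(type(months).__name__)  # PeriodIndex

# Use the PeriodIndex as the index of a Series of placeholder values.
s = pd.Series(np.arange(len(months)), index=months)
print(s)
```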

Period and PeriodIndex objects can be converted to another frequency with the asfreq method. To show this, let’s create a variable named p.

Let’s convert this annual period to a monthly period that covers its first month.

Now let’s convert it to a monthly period that covers its last month.
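A minimal sketch, assuming p is an annual period for 2020:

```python
import pandas as pd

p = pd.Period("2020", freq="Y")

# how="start" picks the first month of the year.
first = p.asfreq("M", how="start")
print(first)  # 2020-01

# how="end" picks the last month of the year.
last = p.asfreq("M", how="end")
print(last)   # 2020-12
```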

Quarterly data are standard in areas such as finance. Quarterly figures are usually reported relative to the end of the fiscal year. The fiscal year usually ends in the last month of the calendar year, but it can end in a different month. For example, the data below indicates that the 4th quarter of the year ends in December.

The data below indicates that the 4th quarter of the year ends in February.

Let’s check it out.

Let’s convert this period to a daily frequency.
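To sketch the quarterly examples above, here are two fourth-quarter periods for 2020, one for a fiscal year ending in December and one for a fiscal year ending in February (the year is an arbitrary choice):

```python
import pandas as pd

# Fiscal year ends in December: Q4 runs from October to December 2020.
q_dec = pd.Period("2020Q4", freq="Q-DEC")
print(q_dec.start_time.date(), q_dec.end_time.date())

# Fiscal year ends in February: Q4 runs from December 2019 to February 2020.
q_feb = pd.Period("2020Q4", freq="Q-FEB")
print(q_feb.start_time.date(), q_feb.end_time.date())

# Convert the quarter to daily frequency at its start and at its end.
first_day = q_feb.asfreq("D", how="start")
last_day = q_feb.asfreq("D", how="end")
print(first_day, last_day)  # 2019-12-01 2020-02-29
```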

Quarterly dates can be generated with period_range.

Let’s create a time series using these dates.
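A short sketch, with an illustrative two-year range and placeholder values:

```python
import numpy as np
import pandas as pd

# Eight quarters, from 2019Q1 to 2020Q4, with the year ending in December.
quarters = pd.period_range("2019Q1", "2020Q4", freq="Q-DEC")

# A time series of placeholder values indexed by those quarters.
ts_q = pd.Series(np.arange(len(quarters)), index=quarters)
print(ts_q)
```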

We can convert Series and DataFrame objects indexed by timestamps to periods with the to_period method. To illustrate this, let’s create a date range.

Let’s create a time series with this date range.

Now let’s convert this time series to period type.

Let’s check the index of this data.
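These steps can be sketched as follows. The dates and values are made up, and I use a month-end offset object to build the date range:

```python
import pandas as pd

# Three month-end timestamps starting in January 2020.
rng = pd.date_range("2020-01-31", periods=3, freq=pd.offsets.MonthEnd())

# A time series indexed by those timestamps (placeholder values).
ts = pd.Series([1.0, 2.0, 3.0], index=rng)

# Convert the timestamp index to monthly periods.
pts = ts.to_period("M")
print(pts)
print(type(pts.index).__name__)  # PeriodIndex
```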

As you can see, the index is now a PeriodIndex.

Conclusion

A time series is a series of data points in which each data point is associated with a timestamp. In this post, I talked about how to work with time series in pandas. That’s it. I hope you enjoyed it. Thank you for reading. You can find this notebook here.

