Working with Time Series in Pandas
A guide on how to work with time series.
A time series is a series of data points indexed in time order. Time series datasets are common in fields such as finance and economics, and pandas makes them easy to manipulate. In this post, I’ll cover the following topics:
- How to use the to_datetime method?
- Changing the date format
- Convert unix/epoch time to a regular time stamp
- Frequency and date offsets
- Working with the Period and PeriodIndex
Let’s dive in!
How to use the to_datetime method?
To work with time series, let’s import pandas and NumPy first.
You can convert a date string into a Timestamp with the to_datetime method. Let me show you this.
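Here is a minimal sketch of the import and the conversion. The date string and the variable name ts are illustrative assumptions, not values from the original notebook.

```python
import numpy as np   # used in later examples to build sample values
import pandas as pd

# Parse a single date string into a pandas Timestamp
ts = pd.to_datetime("2021-05-20")
print(ts)
print(type(ts))
```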
If you set the dates as an index, you can easily perform the analysis. Let’s create a variable named date which includes dates.
Let’s convert this data to a DatetimeIndex object.
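The two steps above could look something like this. The dates in the list and the sample Series are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# A plain Python list of date strings (sample values)
date = ["2021-01-05", "2021-02-10", "2021-03-15"]

# Passing a list to to_datetime returns a DatetimeIndex
idx = pd.to_datetime(date)
print(idx)

# The DatetimeIndex can then serve as the index of a Series
data = pd.Series(np.arange(len(idx)), index=idx)
print(data)
```

With a DatetimeIndex in place, label-based selection such as data.loc["2021-02"] selects by date.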
Changing the date format
You can control how pandas reads a date string. Let’s take a look at the default behavior first.
To have the day read before the month, you can use the dayfirst parameter.
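As a small sketch, the ambiguous string below (an assumed sample value) is read month first by default and day first when dayfirst=True is passed.

```python
import pandas as pd

# Default parsing reads the month first
default = pd.to_datetime("10/11/2021")
print(default)   # 2021-10-11 00:00:00

# dayfirst=True reads the same string as 10 November
day_first = pd.to_datetime("10/11/2021", dayfirst=True)
print(day_first)   # 2021-11-10 00:00:00
```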
You can use the format parameter when the dates are written in a different format. For example, suppose the parts of the date are separated by an asterisk instead of a slash.
You can also use any other symbol instead of the asterisk.
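A minimal sketch of the format parameter follows. The strings are assumed sample values, and the format codes are the standard strftime directives.

```python
import pandas as pd

# Day, month and year separated by asterisks
star = pd.to_datetime("10*11*2021", format="%d*%m*%Y")
print(star)   # 2021-11-10 00:00:00

# Any other separator works as long as the format string matches it
dollar = pd.to_datetime("10$11$2021", format="%d$%m$%Y")
print(dollar)   # 2021-11-10 00:00:00
```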
There may be strings in the data that do not represent a date. By default, pandas cannot parse them and raises an error. For example:
Here, xyz does not specify a date. Let’s try to convert this variable into datetime.
As you can see, we got an error message saying “Unknown string format: xyz” (the exact wording depends on the pandas version). To avoid this error, you can use the errors parameter. This parameter takes three values: ignore, raise and coerce. The default is raise, and ignore is deprecated in recent pandas releases. If you set it to coerce, strings that cannot be parsed are replaced with NaT. Let me show you this:
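The behavior described above can be sketched as follows. The list contents are assumed sample values.

```python
import pandas as pd

# The last entry is not a date
date = ["2021-01-05", "2021-02-10", "xyz"]

# With the default errors="raise", parsing fails
try:
    pd.to_datetime(date)
except ValueError as exc:
    print("Parsing failed:", exc)

# errors="coerce" replaces unparseable entries with NaT
idx = pd.to_datetime(date, errors="coerce")
print(idx)
```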
Convert unix/epoch time to a regular time stamp
You can also convert an epoch time into a timestamp with the to_datetime method. Unix time is counted from January 1, 1970 (00:00:00 UTC), a moment known as the epoch. An epoch value is the number of seconds that have passed since then. To show this, let’s take a value.
Let’s convert this epoch value into a timestamp with the to_datetime method. You can set the unit parameter to s to read the number in seconds.
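For instance, a sketch with an assumed epoch value (not the one from the original notebook):

```python
import pandas as pd

# Seconds elapsed since 1970-01-01 00:00:00 UTC (sample value)
epoch = 1609459200

# unit="s" tells pandas the number is in seconds
ts = pd.to_datetime(epoch, unit="s")
print(ts)   # 2021-01-01 00:00:00
```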
Here you go.
Frequency and Date Offsets
Date offsets are the standard date increments that pandas uses to build date ranges. You pass them, usually as a short string alias, to the freq parameter. Let’s create values at 4-hour intervals between two dates with the freq parameter.
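A minimal sketch with date_range, assuming two sample dates one day apart. Recent pandas releases spell the hourly alias in lowercase ("4h"); older tutorials often write "4H".

```python
import pandas as pd

# One value every 4 hours, both endpoints included
rng = pd.date_range("2021-01-01", "2021-01-02", freq="4h")
print(rng)
```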
Now let’s create dates that fall on the Sunday of the last week of each month.
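One way to do this, as a sketch, is to pass a LastWeekOfMonth offset object as the frequency. This is an assumption about the approach, and the date range is a sample value. In this offset, weekday=6 means Sunday.

```python
import pandas as pd

# Last Sunday of each month (weekday: 0 = Monday ... 6 = Sunday)
last_sunday = pd.offsets.LastWeekOfMonth(weekday=6)

rng = pd.date_range("2021-01-01", "2021-03-31", freq=last_sunday)
print(rng)
```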
Period and PeriodIndex
A Period represents a span of time such as a day, a month, or a year. For example, let’s create a variable with the Period method.
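A small sketch of creating an annual period. The year and the variable name p are assumptions for illustration.

```python
import pandas as pd

# An annual period covering the whole of 2021
p = pd.Period("2021", freq="Y")
print(p)
```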
You can see the methods that can be used for this variable with the dir function as follows:
Let’s see the start date of this variable.
Let’s see the finish date.
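The three steps above can be sketched like this, again with an assumed annual period. The start_time and end_time attributes return the first and last instant of the period.

```python
import pandas as pd

p = pd.Period("2021", freq="Y")

# Public attributes and methods available on a Period
public_names = [name for name in dir(p) if not name.startswith("_")]
print(public_names)

# First and last instant covered by the period
start = p.start_time
end = p.end_time
print(start)
print(end)
```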
You can perform some operations such as addition and subtraction with the period variable. To show this, let me create a variable in a monthly period.
Now let’s add 5 to this date as follows:
Let’s subtract 3 from this date as follows:
If two periods have the same frequency, you can subtract one from the other to see the number of periods between them.
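A sketch of this arithmetic with an assumed monthly period. Adding or subtracting an integer shifts the period by that many units of its frequency.

```python
import pandas as pd

p = pd.Period("2021-01", freq="M")

later = p + 5     # five months later
earlier = p - 3   # three months earlier
print(later)      # 2021-06
print(earlier)    # 2020-10

# Subtracting two periods of the same frequency gives an offset;
# its .n attribute holds the number of months between them
diff = pd.Period("2021-06", freq="M") - p
print(diff.n)     # 5
```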
How to use the period_range function?
Regular ranges of periods can be generated with the period_range function.
Notice that the result is a PeriodIndex object. You can use this PeriodIndex as the index of a Series or DataFrame.
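For example, a minimal sketch with assumed start and end months:

```python
import numpy as np
import pandas as pd

# Six consecutive monthly periods
periods = pd.period_range("2021-01", "2021-06", freq="M")
print(periods)

# Use the PeriodIndex as the index of a Series
data = pd.Series(np.arange(len(periods)), index=periods)
print(data)
```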
Period and PeriodIndex objects can be converted to another frequency with the asfreq method. To show this, let’s create a variable named p.
Let’s convert this annual period into a monthly period that covers its first month.
Let’s do the same for its last month.
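A sketch of asfreq on an assumed annual period. The how argument chooses whether the result is taken from the start or the end of the original span.

```python
import pandas as pd

p = pd.Period("2021", freq="Y")

# Monthly period at the start of the year
first_month = p.asfreq("M", how="start")
print(first_month)   # 2021-01

# Monthly period at the end of the year
last_month = p.asfreq("M", how="end")
print(last_month)    # 2021-12
```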
Quarterly data are standard in areas such as finance. Quarters are defined relative to the end of the fiscal year. The fiscal year usually ends in the last month of the calendar year, but it can end in a different month. For example, the data below indicates that the 4th quarter of the year ends in December (DEC).
The data below indicates that the 4th quarter of the year ends in February (FEB).
Let’s check it out.
Let’s convert this period to a daily frequency.
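The quarterly examples above could be sketched as follows. The year is an assumed sample value. With Q-FEB, the fiscal year 2021 ends in February 2021, so its 4th quarter runs from December 2020 through February 2021.

```python
import pandas as pd

# 4th quarter of a fiscal year ending in December
q_dec = pd.Period("2021Q4", freq="Q-DEC")
print(q_dec.start_time.date(), q_dec.end_time.date())   # 2021-10-01 2021-12-31

# 4th quarter of a fiscal year ending in February
q_feb = pd.Period("2021Q4", freq="Q-FEB")
print(q_feb.start_time.date(), q_feb.end_time.date())   # 2020-12-01 2021-02-28

# Convert the quarter to daily periods at its start and end
first_day = q_feb.asfreq("D", how="start")
last_day = q_feb.asfreq("D", how="end")
print(first_day, last_day)
```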
Quarterly dates can be generated with period_range.
Let’s create a time series using these dates.
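A minimal sketch of both steps, assuming a two-year range and a calendar-year fiscal year (Q-DEC):

```python
import numpy as np
import pandas as pd

# Eight consecutive quarters
quarters = pd.period_range("2020Q1", "2021Q4", freq="Q-DEC")
print(quarters)

# A time series indexed by those quarters (sample values)
ts = pd.Series(np.arange(len(quarters)), index=quarters)
print(ts)
```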
We can convert Series and DataFrame objects indexed by timestamps to periods with the to_period method. To illustrate this, let’s create a date range.
Let’s create a time series with this date range.
Now let’s convert this time series to period type.
Let’s check the index of this data.
As you can see, this variable is a PeriodIndex.
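The steps above can be sketched like this. The month-start date range and the values are assumptions for illustration, and the target frequency is passed to to_period explicitly.

```python
import numpy as np
import pandas as pd

# Three month-start timestamps (sample values)
rng = pd.date_range("2021-01-01", periods=3, freq="MS")
ts = pd.Series(np.arange(len(rng)), index=rng)

# Convert the timestamp index to monthly periods
pts = ts.to_period("M")
print(pts)
print(pts.index)
```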
A time series is a series of data points in which each data point is associated with a timestamp. In this post, I talked about how to work with time series in pandas. That’s it. I hope you enjoyed it. Thank you for reading. You can find this notebook here.