Interactive Exploratory Data Analysis (EDA) of Sensor Data With Pandas: Univariate Time Series Data

Visualizing univariate time series data with the pandas plotting API

Florian Kromer
Jan 7 · 6 min read

This post shows the basic look and feel of the pandas plotting API applied to typical univariate sensor data represented as a time series. Feel free to visit the source code repository, press the “Binder” button to open the repository in a Binder environment, and explore the interactivity of each plot type in the notebook time_series_univariate.ipynb.

Of course, not every plot type makes sense for visualizing univariate time series data. However, for the sake of completeness, and to make it clear that some plot types make no sense, I’ve added GIFs for all of them. If a plotting backend does not support a plot type, I skipped the GIF in the corresponding section. In the Binder environment I’ve tried to plot with all plot types to force the output of exceptions. These exceptions do not stem from wrong usage of the pandas plotting API, but they can help to figure out that a plot type is simply not supported (yet).

The example uses linear fake data from a temperature sensor. The fake data is constructed as follows:

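import pandas as pd
d = [i for i in range(20, 20+10, 1)]
# one timestamp per second; "s" is the lowercase alias for seconds (the uppercase "S" is deprecated in recent pandas versions)
dti = pd.date_range("2020-01-01 12:00:00.000001", periods=10, freq="s").tz_localize("Europe/Berlin")
temperature_series = pd.Series(data=d, index=dti, name="Temperature")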

This data is rather boring. I recommend adjusting the corresponding Jupyter Notebook cell and replacing d = [i for i in range(20, 20+10, 1)] with

import random
random.seed(42)
d = random.sample(range(20, 20+10), 10)

to generate random, almost always non-linear fake data. The following sections show what the plot types look like and how they behave with the different plotting backends in their default configuration.
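The "try every plot type and see which exceptions come up" approach mentioned in the introduction can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: the helper name try_plot_kinds and the list PLOT_KINDS are hypothetical, and the backend name "altair" assumes that the altair_pandas package is installed and registered under that name. If the backend is missing, every plot type simply reports an error.

```python
import pandas as pd

# Plot types covered in this post (hypothetical constant, for illustration only).
PLOT_KINDS = ["area", "bar", "barh", "box", "density", "hist", "kde", "line", "pie"]


def try_plot_kinds(series, backend, kinds=PLOT_KINDS):
    """Return {kind: None or error message} for one plotting backend."""
    results = {}
    for kind in kinds:
        try:
            # The `backend` keyword overrides the global plotting.backend option.
            series.plot(kind=kind, backend=backend)
            results[kind] = None  # a plot object was created
        except Exception as exc:
            # Unsupported plot types and missing backends both end up here.
            results[kind] = f"{type(exc).__name__}: {exc}"
    return results


dti = pd.date_range("2020-01-01 12:00:00", periods=10, freq="s", tz="Europe/Berlin")
temperature_series = pd.Series(range(20, 30), index=dti, name="Temperature")

# Assumption: "altair" is the name under which the altair backend is registered.
report = try_plot_kinds(temperature_series, backend="altair")
for kind, error in report.items():
    print(kind, "ok" if error is None else error)
```

Collecting the messages in a dictionary instead of letting the first exception abort the cell makes it easy to compare backends side by side.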

Besides the power of pandas’ built-in capabilities for visualizing time series data, another huge advantage for visualization lies in the data structure itself. It’s comparatively easy to get raw data into the pandas Series representation, as well as to preprocess, combine, and separate data using Series. This topic is huge and beyond the scope of this post. However, it’s important to note because it’s one of the reasons why interactive exploratory data analysis is so powerful when using pandas Series.
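To give a rough idea of what preprocessing, separating, and combining Series can look like, here is a small sketch. It assumes the linear fake data from above (without the time zone, to keep it short); the variable names and the chosen window and bin sizes are purely illustrative.

```python
import pandas as pd

dti = pd.date_range("2020-01-01 12:00:00", periods=10, freq="s")
temperature_series = pd.Series(range(20, 30), index=dti, name="Temperature")

# Separate: split the series into two halves by position.
first_half = temperature_series.iloc[:5]
second_half = temperature_series.iloc[5:]

# Preprocess: smooth with a rolling mean over three samples
# (the first two results are NaN because the window is not full yet).
smoothed = temperature_series.rolling(window=3).mean()

# Preprocess: downsample from one value per second to one mean value per five seconds.
resampled = temperature_series.resample("5s").mean()

# Combine: concatenate the two halves back into one series.
recombined = pd.concat([first_half, second_half])

print(resampled)
```

Each of these results is again a Series with a datetime index, so it can be passed to the plotting API in exactly the same way as the raw data.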

area plot (altair)

In the default configuration the altair backend creates non-optimal axis labels as well as formatting (y-axis: datetime formatting).

area plot (pandas_bokeh)

In the default configuration the pandas_bokeh backend has the best axis labeling and datetime formatting.

area plot (hvplot/holoviews)

In the default configuration the hvplot backend creates non-optimal axis labeling and relative time axis information.

Bar plots are not suitable for visualizing time series data. The comments for this plot type have been skipped.

bar plot (altair)
bar plot (pandas_bokeh)
bar plot (hvplot/holoviews)

Horizontal bar plots are not suitable for visualizing time series data. The comments for this plot type have been skipped.

barh plot (altair)
barh plot (pandas_bokeh)
barh plot (hvplot/holoviews)

box plot (altair)

The altair backend is the only backend that shows detailed statistical metrics.
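For reference, the statistical metrics a box plot typically encodes can also be computed directly from the Series. The following is a minimal sketch, assuming the linear fake data from above and the common convention of placing the whisker limits at 1.5 times the interquartile range; the convention a particular backend uses may differ.

```python
import pandas as pd

temperature_series = pd.Series(range(20, 30), name="Temperature")

# Quartiles and median: the box and the line inside it.
q1 = temperature_series.quantile(0.25)
median = temperature_series.median()
q3 = temperature_series.quantile(0.75)

# Interquartile range and whisker limits (1.5 * IQR is an assumed convention).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences would be drawn as individual outlier points.
is_outlier = (temperature_series < lower_fence) | (temperature_series > upper_fence)
outliers = temperature_series[is_outlier]

# describe() gives count, mean, std, min, quartiles and max in one call.
print(temperature_series.describe())
```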

box plot (hvplot/holoviews)

The hvplot backend does not show statistical metrics and is not really usable.

density plot (hvplot/holoviews)

The hvplot backend is the only backend capable of visualizing density plots and creates them with labeled axes out of the box.

Hist plots are not suitable for visualizing time series data. The comments for this plot type have been skipped.

hist plot (altair)
hist plot (pandas_bokeh)
hist plot (hvplot/holoviews)

kde plot (hvplot/holoviews)

The hvplot backend is the only backend capable of visualizing KDE plots and creates them with labeled axes out of the box.
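To make clear what a KDE (or density) plot shows, here is a hand-rolled Gaussian kernel density estimate. It is a sketch only: the function name gaussian_kde, the fixed bandwidth of 1.0, and the evaluation grid are assumptions for illustration, and plotting backends typically choose the bandwidth automatically. Note that the time index plays no role here; only the distribution of the values is estimated.

```python
import numpy as np
import pandas as pd


def gaussian_kde(values, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at each grid point."""
    values = np.asarray(values, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance between every grid point and every sample.
    z = (grid[:, None] - values[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return bumps.mean(axis=1)


temperature_series = pd.Series(range(20, 30), name="Temperature")

# Evaluate the density on a grid that extends well beyond the data range.
grid = np.linspace(10.0, 40.0, 601)
density = gaussian_kde(temperature_series, grid, bandwidth=1.0)

# The area under a density curve should be close to 1.
area = float((density * (grid[1] - grid[0])).sum())
print(round(area, 3))
```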

line plot (altair)

In the default configuration the altair backend creates non-optimal axis labels as well as formatting (y-axis: datetime formatting). Line plots are usually used to visualize data that potentially spans long time ranges, so being able to zoom into the data is essential. This backend does not support zooming at all.

line plot (pandas_bokeh)

In the default configuration the pandas_bokeh backend has the best axis labeling and datetime formatting.

line plot (hvplot/holoviews)

In the default configuration the hvplot backend creates non-optimal axis labeling and relative time axis information.

Pie plots are not suitable for visualizing time series data. The comments for this plot type have been skipped.

pie plot (pandas_bokeh)

Scatter plots require two related data sets. For visualizing univariate time series data, a scatter plot does not make sense.

The look and interactive feel vary significantly depending on the plotting backend in use.

The altair backend is the least interactive one. It is not possible to zoom via the mouse wheel or to select a plot area via drag and drop. Compared to the other plotting backends, little information is shown on mouse hover. For plots with the datetime index on the x-axis, the formatting differs from the other backends and even between plot types plotted with altair. Exporting the visualizations to image formats is supported. The only plot type I recommend the altair backend for is the box plot.

The pandas_bokeh and hvplot/holoviews backends both use bokeh under the hood, which results in a quite similar look. In my opinion the pandas_bokeh backend uses slightly better defaults for datetime index formatting and axis labeling. In addition, it’s much easier to hit data points to display their data via mouse hover. I’d recommend using pandas_bokeh for all plot types relevant to time series data except the ones recommended for altair or hvplot (box, density, and KDE plots). The most important plot types are hist plot and line plot.

The hvplot backend is the only plotting backend which supports the density plot and KDE plot.
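The recommendations above can be condensed into a small lookup. This is only a sketch of my summary, not part of any library: the names RECOMMENDED_BACKEND, DEFAULT_BACKEND, and recommended_backend are hypothetical, and the backend strings are assumptions about how the packages register themselves with pandas ("holoviews" for hvplot in particular may differ between versions).

```python
# Plot types with a specific recommendation; everything else uses the default.
# The backend strings are assumptions, check them against your installed versions.
RECOMMENDED_BACKEND = {
    "box": "altair",
    "density": "holoviews",  # hvplot
    "kde": "holoviews",  # hvplot
}
DEFAULT_BACKEND = "pandas_bokeh"


def recommended_backend(kind):
    """Return the backend name recommended in this post for a plot type."""
    return RECOMMENDED_BACKEND.get(kind, DEFAULT_BACKEND)


for kind in ["area", "box", "density", "kde", "line"]:
    print(kind, "->", recommended_backend(kind))
```

With the corresponding packages installed, the result could be passed on as temperature_series.plot(kind=kind, backend=recommended_backend(kind)).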

Written by Florian Kromer, Software Developer for rapid prototype or high quality software with interest in distributed systems and high performance on premise server applications. Published in The Startup.
