# Interactive Exploratory Data Analysis (EDA) of Sensor Data With Pandas: The Plotting API and Plotting Backends

## About Altair, hvPlot, Pandas Bokeh and Plotly

## The pandas plotting API

Pandas lets you plot *DataFrames* and *Series* with an almost identical API and set of plot types. *DataFrames* have `pandas.DataFrame.boxplot()` in addition to `pandas.DataFrame.plot.box()`, but that is the only exception, and it has little practical relevance because, as far as I know, the latter offers the same functionality as the former. In the sensor data context, one usually plots a single *DataFrame* column (univariate data) or several columns (sensor fusion of several univariate signals, or multivariate data). With *Series*, one usually plots time series, with datetime timestamps as the index and univariate data as the values. Depending on what data the *DataFrames*/*Series* contain and how that data is represented, some plot types do not really make sense. We’ll dive deeper into this in a later post. For a deep dive, you’ll probably want to have a look at the plotting API source code.

## The pandas DataFrame plotting API

Data from Pandas DataFrames may be plotted with

- pandas.DataFrame.plot: Make plots of DataFrame. The default type is a line plot.
- pandas.DataFrame.plot.area: Draw a stacked area plot.
- pandas.DataFrame.plot.bar: Vertical bar plot.
- pandas.DataFrame.plot.barh: Make a horizontal bar plot.
- pandas.DataFrame.plot.box: Make a box plot of the DataFrame columns.
- pandas.DataFrame.plot.density: Generate Kernel Density Estimate plot using Gaussian kernels.
- pandas.DataFrame.plot.hexbin: Generate a hexagonal binning plot.
- pandas.DataFrame.plot.hist: Draw one histogram of the DataFrame’s columns.
- pandas.DataFrame.plot.kde: Generate Kernel Density Estimate plot using Gaussian kernels.
- pandas.DataFrame.plot.line: Plot DataFrame as lines.
- pandas.DataFrame.plot.pie: Generate a pie plot.
- pandas.DataFrame.plot.scatter: Create a scatter plot with varying marker point size and color.
- pandas.DataFrame.boxplot: Make a box plot from DataFrame columns.
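To make the list above a bit more concrete, here is a minimal sketch using the default matplotlib backend. The sensor readings, the column names `temperature_c` and `humidity_pct`, and the sampling interval are made up for illustration. The sketch shows that the accessor methods and the `kind` argument select the same plot type, and it draws `plot.box()` and `boxplot()` side by side for comparison.

```python
import matplotlib

matplotlib.use("Agg")  # headless matplotlib backend, so no display is needed

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sensor readings: two channels sampled once per minute.
rng = np.random.default_rng(seed=0)
index = pd.date_range("2021-01-01", periods=120, freq="min")
df = pd.DataFrame(
    {
        "temperature_c": 21.0 + rng.normal(0.0, 0.3, size=len(index)),
        "humidity_pct": 45.0 + rng.normal(0.0, 1.5, size=len(index)),
    },
    index=index,
)

# The accessor method and the `kind` argument select the same plot type.
ax_line = df.plot.line()
ax_kind = df.plot(kind="line")

# Draw both box plot entry points next to each other on explicit axes.
fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))
ax_box = df.plot.box(ax=left)
ax_boxplot = df.boxplot(ax=right)
fig.savefig("boxplots.png")
```

Passing `ax=` explicitly keeps the two box plots on separate axes, which makes it easier to compare their default styling.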

## The pandas Series plotting API

Data from Pandas Series may be plotted similarly to DataFrames with

- pandas.Series.plot: Make plots of Series. The default type is a line plot.
- pandas.Series.plot.area: Draw a stacked area plot.
- pandas.Series.plot.bar: Vertical bar plot.
- pandas.Series.plot.barh: Make a horizontal bar plot.
- pandas.Series.plot.box: Make a box plot of the Series.
- pandas.Series.plot.density: Generate Kernel Density Estimate plot using Gaussian kernels.
- pandas.Series.plot.hist: Draw one histogram of the Series.
- pandas.Series.plot.kde: Generate Kernel Density Estimate plot using Gaussian kernels.
- pandas.Series.plot.line: Plot Series as line.
- pandas.Series.plot.pie: Generate a pie plot.
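As a small illustration of the Series API, the following sketch plots a made-up temperature time series (a datetime index with univariate values) as a line and as a histogram with the default matplotlib backend. The data and the name `temperature_c` are assumptions for this example only.

```python
import matplotlib

matplotlib.use("Agg")  # headless matplotlib backend, so no display is needed

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical univariate sensor signal with datetime timestamps as index.
rng = np.random.default_rng(seed=1)
index = pd.date_range("2021-01-01", periods=200, freq="min")
series = pd.Series(
    21.0 + rng.normal(0.0, 0.3, size=len(index)),
    index=index,
    name="temperature_c",
)

# Use explicit axes so that each plot lands on its own subplot.
fig, (ax_line, ax_hist) = plt.subplots(2, 1, figsize=(6, 6))
series.plot(ax=ax_line)                # line plot is the default type
series.plot.hist(ax=ax_hist, bins=20)  # distribution of the readings
fig.savefig("series_plots.png")
```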

## The backends

Before Pandas version 0.25, the built-in plotting functionality for *DataFrames* and *Series* used matplotlib as its backend, with support for static, non-interactive plots. Beginning with Pandas version 0.25, it’s possible to use other, potentially interactive plotting frameworks. To change the plotting backend, put either

    import pandas as pd
    pd.options.plotting.backend = '<BACKEND-NAME>'

or

    import pandas as pd
    pd.set_option('plotting.backend', '<BACKEND-NAME>')

into a Notebook cell and execute the cell.
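Pandas checks the backend at the moment the option is set, so a missing package shows up right there rather than at the first plot call. The helper below is a minimal sketch of a defensive variant; the function name `use_plotting_backend` and the fallback behaviour are my own choices for illustration, not part of the pandas API.

```python
import pandas as pd


def use_plotting_backend(name: str, fallback: str = "matplotlib") -> str:
    """Try to activate `name`; fall back if pandas cannot load it."""
    try:
        pd.set_option("plotting.backend", name)
    except (ImportError, ValueError):
        # pandas validates the backend when the option is set and raises
        # if the package providing it cannot be found.
        pd.set_option("plotting.backend", fallback)
    return pd.get_option("plotting.backend")


active = use_plotting_backend("hvplot")
print(f"active plotting backend: {active}")
```

`pd.get_option("plotting.backend")` is also handy on its own for checking which backend a notebook is currently using.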

Some of the supported visualization backends are listed on the visualization docs page. Others can be found on Stack Overflow (Change pandas plotting backend to get interactive plots instead of matplotlib static plots). One thing to point out is that the plot types supported by the backends do not necessarily all work with pandas DataFrames or Series. Which plot types are supported, and to what degree, depends on the extent to which each backend implements the Pandas plotting API.

- `altair_pandas` via Altair (backend name: `altair`): Supported interactive plot types.
- `hvplot` via Bokeh (backend name: `hvplot` or `holoviews`, beginning with version 0.5.1): Supported interactive plot types.
- `pandas-bokeh` via Bokeh (backend name: `pandas_bokeh`): Supported interactive plot types.
- `plotly` (backend name: `plotly`, beginning with version 4.8): Supported interactive plot types.

We’ll be using the following versions of the visualization integration packages (taken from `requirements.txt`):

    # plotting backends
    git+https://github.com/altair-viz/altair_pandas
    pandas-bokeh==0.5.2
    hvplot==0.7.0
    plotly==4.14.1 # requires additional setup

When using *Altair*, the Pandas backend `altair_pandas` has to be installed as a dependency directly from GitHub and, at the moment, cannot be pinned to a specific version tag. It is installed as part of `pip install -r requirements.txt` or via `pip install git+https://github.com/altair-viz/altair_pandas`, which can cause trouble w.r.t. the reproducibility of plot styling (in case the implementation changes over time).
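Since one of the packages is not pinned, it can help to record which versions actually ended up in the environment. The following is a minimal sketch using the standard library; the distribution names are taken from the requirements list above, and I am assuming that `altair_pandas` is registered under that name.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the requirements list above
# (assumption: altair_pandas is registered under exactly this name).
BACKEND_PACKAGES = ["altair_pandas", "pandas-bokeh", "hvplot", "plotly"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(BACKEND_PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

Printing this at the top of a notebook makes it easier to trace styling differences back to a changed package version later on.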

When using *altair*, *pandas-bokeh* or *hvplot*, interactive plots work out of the box. With *plotly*, you have to consider additional setup steps, described here for plain Jupyter Notebooks and here for JupyterLab. Because I host the examples on Binder and because of this additional setup step, I have not included *plotly* plots at the time of writing.

## Plot type compatibility

`hvplot` depends on `scipy` to be able to use `pandas.DataFrame.plot.density()` and `pandas.DataFrame.plot.kde()`. If the `scipy` dependency is missing, you’ll get an import error:

`ImportError: univariate_kde operation requires SciPy to be installed.`
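To catch the missing dependency before a density or KDE plot fails, one option is to check for SciPy up front. This is only a sketch; the helper name `scipy_available` is made up for this example.

```python
from importlib.util import find_spec


def scipy_available() -> bool:
    """Return True if SciPy can be found in the current environment."""
    return find_spec("scipy") is not None


if scipy_available():
    print("SciPy found: density and KDE plots should be usable with hvplot.")
else:
    print("SciPy missing: install it (e.g. `pip install scipy`) before plotting densities.")
```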

The following table summarizes which backend supports which plot types.

The `pandas.DataFrame.plot.density()`, `pandas.DataFrame.plot.hexbin()` and `pandas.DataFrame.plot.kde()` methods are supported by `hvplot` only. The `pandas.DataFrame.plot.pie()` method is supported by `pandas_bokeh` only. In the context of sensor data, pie plots are rather unimportant. Density and KDE plots, however, can be important. I’d recommend starting with `hvplot`. The interactive features, such as the information shown on mouse hover, may differ significantly between backends. If you miss some information or do not like the look and feel of `hvplot`, you can try the same plot with one of the other frameworks. In any case, there is a high probability that you’ll have to switch the backend in your notebooks depending on which plot type you want to use.
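Because the backend often has to change with the plot type, a small helper can keep that switching in one place. The sketch below hard-codes the compatibility notes from the paragraph above (it does not query the packages) and uses `pd.option_context` so that the previous backend is restored afterwards. The names `ONLY_SUPPORTED_BY`, `backend_for` and `plot_with_matching_backend` are hypothetical, and the chosen backend has to be installed for the plot call to work.

```python
import pandas as pd

# Plot kinds that the text above attributes to a single interactive backend.
# This mapping mirrors the text; it is not derived from the packages.
ONLY_SUPPORTED_BY = {
    "density": "hvplot",
    "hexbin": "hvplot",
    "kde": "hvplot",
    "pie": "pandas_bokeh",
}


def backend_for(kind: str, default: str = "hvplot") -> str:
    """Pick a backend name for a plot kind, defaulting to hvplot."""
    return ONLY_SUPPORTED_BY.get(kind, default)


def plot_with_matching_backend(data, kind: str, **kwargs):
    """Plot `data` with the backend picked for `kind`, then restore the option."""
    with pd.option_context("plotting.backend", backend_for(kind)):
        return data.plot(kind=kind, **kwargs)
```

A call such as `plot_with_matching_backend(df, "pie", y="some_column")` would then use `pandas_bokeh` for that one plot and leave the session-wide setting untouched.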

Feel free to visit the source code repository, press the “Binder” button to open the repository in a Binder environment, and explore the `DataFrame` plot interactivity in the notebook `backend_pandas_plotting_api_compatibility_dataframe.ipynb`.