Altair Interactive Multi-Line Chart

Bind a legend to a multiline display of stock close price. Add a tooltip line and text for the selected stock.

Simi Talkar
Analytics Vidhya
6 min read · Mar 5, 2021

I recently watched a very neat trick that BI Elite’s Parker Stevens employed to create a ticker chart in Power BI. You can watch it here. What I find quite interesting is that, whether in Power Query or Power View, quite a few of the concepts used in Python libraries to clean and visualize data can be transferred over to Power BI, and vice versa.

Take, for instance, the above chart created in Altair (a Python library for visualization). In Altair, you can break a chart down into its elements, build each one from the underlying data as a standalone unit, and then layer and concatenate them. You can position them next to each other or on top of each other, like jigsaw puzzle pieces, until they all fit together into a cohesive whole.

The most significant thing about creating visualizations in Altair is that you can separate the style of the graph, whether line, scatter or bar (called the mark), from the encoding of the data, whose type can be categorical (nominal or ordinal), quantitative, temporal and so on.

In Power BI, you will see Parker overlaying one visualization on top of another to highlight one stock line chart at a time, based on a customizable selection. The rest of the stocks recede into the background as they are greyed out. The “detail on demand” is provided by the tooltip line rolling along the x-axis. We will generate this very graphic from raw data in this article using Pandas and Altair.

In this article, we will layer charts and text, tooltip line and images and position them so they all fit together snugly to emerge as the chart seen above.

Installations

The Python packages used to create the graph, and their versions, can be retrieved using the following commands in a Jupyter Notebook cell:

%reload_ext watermark
%watermark -v -m -p pandas,altair,numpy,vega_datasets

Data

The data is generously provided by MarketWatch.

Code to clean and prep the data:

After the downloaded files are consolidated into a folder called data (relative path), Pandas is used to read the Excel files. The data is cleaned to set the proper types and drop columns that are not required for analysis. Additional required columns, such as the stock name (derived from the file name using a regular expression) and the rolling mean, are added with the code seen here.
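As a rough illustration of these cleaning steps, here is a minimal sketch. The file-name pattern (`aapl_history.xlsx`), the column names (`Date`, `Close`, `Volume`) and the helpers `stock_name_from_path` and `clean_prices` are all assumptions made up for this example, not the original code; a small in-memory frame stands in for the result of `pd.read_excel`.

```python
import re
from pathlib import Path

import pandas as pd

# Hypothetical naming scheme: the file name starts with the ticker,
# e.g. "data/aapl_history.xlsx". Adjust the pattern to your own files.
NAME_PATTERN = re.compile(r"^([A-Za-z]+)")


def stock_name_from_path(path):
    """Return the leading run of letters in the file name, upper-cased."""
    match = NAME_PATTERN.match(Path(path).stem)
    if match is None:
        raise ValueError(f"no stock name found in {path!r}")
    return match.group(1).upper()


def clean_prices(raw, path):
    """Keep the needed columns, set proper dtypes and tag rows with the stock name."""
    df = raw[["Date", "Close"]].copy()           # drop columns not needed for analysis
    df["Date"] = pd.to_datetime(df["Date"])      # text -> datetime64
    df["Close"] = pd.to_numeric(df["Close"])     # text -> float
    df["stock"] = stock_name_from_path(path)     # name derived from the file name
    return df.sort_values("Date").reset_index(drop=True)


# Stand-in for: raw = pd.read_excel("data/aapl_history.xlsx")
raw = pd.DataFrame(
    {
        "Date": ["2021-03-02", "2021-03-01"],
        "Close": ["125.12", "127.79"],
        "Volume": [1, 2],
    }
)
prices = clean_prices(raw, "data/aapl_history.xlsx")
print(prices)
```

With one such frame per file, `pd.concat` would stack them into a single long-form dataframe, which is the shape Altair works with most naturally.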

Note: you can find detailed information on the usage of rolling window functions here.

We’re creating a new column “rolling_mean” which takes the moving average of the close price within a window. To do this, we apply .rolling(2).mean() to the Close column, where we specify a window of “2” and calculate the mean for every window along the DataFrame. Each row gets a “rolling_mean” equal to the sum of its “Close” value and the previous row’s “Close”, divided by 2 (the window size). In essence, Moving Avg = ([t] + [t-1]) / 2.

We also pass min_periods=1 to the method, which reduces the minimum number of valid observations required in the window from 2 to 1. This avoids the NaN in the first row that would otherwise result from the window size of 2.
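To make the effect of the window and of min_periods concrete, here is a small sketch on made-up prices (the numbers are illustrative only, not the MarketWatch data). The extra column `rolling_mean_strict` is added purely for comparison.

```python
import pandas as pd

# Made-up closing prices for a single stock
df = pd.DataFrame({"Close": [10.0, 12.0, 11.0, 15.0]})

# Window of 2; min_periods=1 lets the first row use a window of one value
df["rolling_mean"] = df["Close"].rolling(2, min_periods=1).mean()

# Same window without min_periods: the first row has no complete window
df["rolling_mean_strict"] = df["Close"].rolling(2).mean()

print(df)
#    Close  rolling_mean  rolling_mean_strict
# 0   10.0          10.0                  NaN
# 1   12.0          11.0                 11.0
# 2   11.0          11.5                 11.5
# 3   15.0          13.0                 13.0
```

If several stocks share one dataframe, the window would need to be applied per stock (for example through a groupby on the stock column) so that it does not straddle the boundary between two stocks.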

FINAL DATAFRAME

Altair optimization with JSON Caching

alt.data_transformers.enable('json')

Before we set out to create the visualization, let’s discuss the performance gained by including this bit of code way up top. With this code, the Pandas dataframe we use is sanitized, serialized and stored in JSON format, as seen below. Altair, after all, is a Python API for building statistical visualizations on top of Vega-Lite’s visualization grammar. With the JSON transformer enabled, the data is written once to a separate .json file that the chart specification references by URL, instead of being embedded in every chart specification. This keeps the notebook small and speeds up rendering.

Generate the chart

The code is dotted with comments to explain the various elements that encode the data and create the marks to chart the data. The notable elements are the binding between the line chart and the chart of circles representing the choice of stocks. The “fields” argument of the selection_multi function ties these together.

The alt.condition function in Altair is the “if” condition we need to selectively display the color and the tooltip text of the price, depending on the choice of stock. Altair’s transform_filter function reduces the dataframe to the selected stock and the “nearest” Date chosen on the x-axis.

The output is the graph seen at the top of the article, where only the selected stock is charted in color and the tooltip shows its price, whereas the rest are sent to the background in grey.

The layering operator “+” (or alt.layer), side-by-side concatenation with “|” (the pipe operator, or alt.hconcat) and vertical concatenation with “&” (or alt.vconcat) have all been employed in a single line of code to bring all the elements together! The documentation and examples for further study are here.

Data manipulation with Altair or Pandas?

The choice is yours!

We got a little taste of Altair’s manipulation of the dataset in the code above with the use of “.transform_filter(radio_select)”. The dataframe is reduced to the rows containing the stock as indicated by the selection. But Altair has several other powerful transform functions.

In the code below, the selection elements used above are discarded in favor of a hover selection, which allows us to create a tooltip displaying the Date that the tooltip line lies on as well as the price of ALL stocks on that Date. Here we see the power of another Altair transform function, transform_pivot, which turns each stock into its own column so that every Date becomes a single row holding the rolling mean of the close price for every stock (the Date being the one the tooltip is on along the x-axis).

The advantage of these transform functions is that you do not have to create an additional dataframe for each of these aggregations that you would like to display in the chart.

And finally...

This article outlines my journey of discovery through the grammar of graphics that Altair visualizes data with. The interplay of elements through layering, binding and data transformations was demonstrated with a visualization that has simple data but a lot of potential. If you have more to add, do comment!

Simi Talkar
Analytics Vidhya

Certified DS Associate (DP-100, DA-100), pursuing Masters in University Of Michigan’s Applied Data Science Program https://www.linkedin.com/in/simi-talkar/