Better Pandas DataFrame Visualization (with a Taco Bell Example)
TL;DR: The data science team at ShopRunner is releasing a small helper function, df_to_html, that converts a DataFrame to HTML while supporting image embedding, per-column CSS styling, and more. The code for this can be found here.
For most data scientists working in Python, the Pandas library is at the core of data exploration. For us at ShopRunner, we don’t start any modeling without first looking at a Pandas DataFrame that we’ve generated from the products sold in our network, which usually includes columns for price, image URLs, and various other features we track. But while Pandas is chock-full of helpful functions for visualization and exploration, there is one glaring issue that has yet to be resolved: HTML!
You might be thinking that Pandas already supports HTML through its to_html function, but the issue is best seen through an example. Take the finest food of them all, in my opinion: Taco Bell. If we scrape the Bell’s website for its various products and keep columns similar to what we would normally find during ShopRunner data exploration, we might end up with a DataFrame like this:
While the DataFrame technically has all the information we need, someone sadly unfamiliar with an item would have to manually copy and paste its URL and image URL into a browser, and then repeat that for each of the many, many rows in the DataFrame.
Since Jupyter supports displaying HTML, we might want to use Pandas’ built-in to_html function to alleviate this issue. Trying that, however, leaves us with:
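As a minimal sketch of that call: the rows, prices, column names, and example.com URLs below are made-up stand-ins for the scraped data, not the real menu.

```python
import pandas as pd

# Hypothetical rows standing in for the scraped menu data.
df = pd.DataFrame({
    "name": ["Crunchwrap Supreme", "Cheesy Gordita Crunch"],
    "price": [4.19, 4.49],
    "url": ["https://example.com/crunchwrap", "https://example.com/gordita"],
    "image_url": [
        "https://example.com/crunchwrap.jpg",
        "https://example.com/gordita.jpg",
    ],
})

# Plain conversion: every cell is written out as text.
plain_html = df.to_html()

# The URLs appear as raw strings: no <a> links and no <img> tags.
print(plain_html)
```

In a notebook, the resulting string can be rendered with IPython's display tools, but the table looks the same as the default DataFrame output.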
Converting this DataFrame to HTML leads to the most disappointing version of “Spot the Difference” ever: while we can now display this exact DataFrame in an HTML webpage, it doesn’t make data exploration or visualization any easier. And even with all the possible arguments we can pass to to_html, this issue isn’t alleviated much at all:
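For illustration, here is one such call using to_html's render_links argument. The data is again hypothetical, and this is not necessarily the exact set of arguments used in the original code.

```python
import pandas as pd

# Hypothetical rows; URLs are kept short so the default column width
# setting does not truncate them.
df = pd.DataFrame({
    "name": ["Crunchwrap Supreme", "Cheesy Gordita Crunch"],
    "url": ["https://example.com/crunchwrap", "https://example.com/gordita"],
    "image_url": [
        "https://example.com/crunchwrap.jpg",
        "https://example.com/gordita.jpg",
    ],
})

# render_links=True wraps anything that looks like a URL in an <a> tag.
linked_html = df.to_html(render_links=True, index=False)

# The image URLs become clickable links too, but never <img> elements.
print(linked_html)
```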
Sure, the code in the above image solves one issue, since all links are now clickable hyperlinks, but even if we try to get clever, we still can’t display the images behind the image URLs or apply any sort of CSS styling to the other columns right within our notebook:
Alas, we are left with another boring DataFrame without any Taco Bell flair. Is there any hope for this incredibly pressing issue?!
Today, there finally is.
With this blog post, the ShopRunner data science team is open-sourcing a small helper function we have written called df_to_html, which combines the ability of Pandas’ to_html with greater flexibility, visualization capability, and much more. Let’s see it in action with the same initial DataFrame example as above:
With df_to_html, in a single function call, we’re able to pass all the keyword arguments that we would to Pandas’ to_html function while also displaying columns as images with a set width, using custom HTML and CSS tags for any number of columns, and transposing the DataFrame HTML code, all through a simple interface! For us at ShopRunner, this is an essential function for displaying DataFrames in Jupyter or on an HTML webpage.
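To give a feel for how a helper along these lines could work, here is a rough sketch. It is not the ShopRunner df_to_html: the function name df_to_html_sketch and the parameters image_cols, image_width, and transpose are all hypothetical, and it covers only image columns and transposition.

```python
import html

import pandas as pd


def df_to_html_sketch(df, image_cols=(), image_width=100, transpose=False, **kwargs):
    """Illustrative only; names and parameters are assumptions, not the real API."""
    out = df.copy()
    for col in image_cols:
        # Replace each image URL with an <img> tag of the requested width.
        out[col] = out[col].map(
            lambda u: f'<img src="{html.escape(str(u), quote=True)}" width="{image_width}">'
        )
    if transpose:
        out = out.T
    # The tags must not be escaped, so escaping is off by default here;
    # note that this also leaves text in every other column unescaped.
    kwargs.setdefault("escape", False)
    # Lift the column width limit so long cell contents are not truncated.
    with pd.option_context("display.max_colwidth", None):
        return out.to_html(**kwargs)


# Hypothetical single-row menu.
menu = pd.DataFrame({
    "name": ["Crunchwrap Supreme"],
    "image_url": ["https://example.com/crunchwrap.jpg"],
})
page = df_to_html_sketch(menu, image_cols=["image_url"], image_width=80, index=False)
```

Any remaining keyword arguments, such as index=False above, are forwarded unchanged to to_html.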
To run the gist above, you’ll need Pandas installed, and that’s it! We hope you’ll use this to visualize all the Taco Bell (and other) DataFrames you need!
We are releasing this code under the WTFPL license. Enjoy!