Convert Pandas DataFrames to images using IMGKit

Pandas is a wonderful Python tool for data analysis, and from time to time it’s nice to be able to integrate some Pandas tables into printed or PDF data products.

One simple approach is to generate CSS-styled HTML versions of Pandas DataFrames and “print” them to images. It’s possible to do this using IMGKit with minimal code, and it’s convenient because it leverages any existing knowledge of CSS layouts and styling that you may have.

What’s also convenient is that you get relatively well-tested and predictable WebKit rendering when your tables are output: IMGKit wraps wkhtmltoimage, which is built on Qt WebKit, a relative of the engine behind Safari (Chrome's Blink engine is itself a WebKit fork). This means that things like multi-line cells in tables, control over padding, and advanced rules like alternating row colors should be handled fairly elegantly.


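First install wkhtmltopdf, which provides the wkhtmltoimage binary that IMGKit calls (the command below is for recent Debian-like distros; for other environments see the project’s home page):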
apt install wkhtmltopdf

pip install imgkit

Building the table

1. First, build your dataset as a Pandas DataFrame. I used the biostats.csv sample CSV file from here:

import pandas
# read_csv accepts a file path directly, so no explicit open() is needed
data = pandas.read_csv("biostats.csv")
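
If you don't have the sample file to hand, any DataFrame will do for the remaining steps. The following is a minimal sketch that reads a few rows from an in-memory string; the column names and values are made-up placeholders, not the contents of biostats.csv.

```python
import io

import pandas

# Placeholder rows for illustration only; these are NOT taken from biostats.csv.
csv_text = """Name,Age,Height
Ann,34,66
Bob,41,71
Cy,29,68
"""

# read_csv accepts any file-like object, so StringIO stands in for a file on disk.
data = pandas.read_csv(io.StringIO(csv_text))
print(data)
```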

2. Next, define a string containing some CSS to style your table. Note the triple quotes for pasting in multi-line string literals.

# modified from
css = """
<style type="text/css">
table {
  color: #333;
  font-family: Helvetica, Arial, sans-serif;
  width: 640px;
  border-spacing: 0;
}
td, th {
  border: 1px solid transparent; /* No more visible border */
  height: 30px;
}
th {
  background: #DFDFDF; /* Darken header a bit */
  font-weight: bold;
}
td {
  background: #FAFAFA;
  text-align: center;
}
table tr:nth-child(odd) td {
  background-color: white;
}
</style>
"""

3. Concatenate the CSS and table data into a single file

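# "w" overwrites earlier output; "a" would append a duplicate table on every rerun
with open("filename.html", "w") as text_file:
    # write the CSS
    text_file.write(css)
    # write the HTML-ized Pandas DataFrame
    text_file.write(data.to_html())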
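
The file produced in this step is an HTML fragment: a style block followed by a table. Renderers generally cope with that, but if you would rather hand over a complete page with an explicit character set, something like the sketch below may help. `build_html_document` is a hypothetical helper, not part of Pandas or IMGKit, and the two sample strings stand in for the `css` string and the output of `data.to_html()`.

```python
def build_html_document(css, table_html, title="table"):
    """Wrap a <style> block and a table fragment in a complete HTML page."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{title}</title>\n"
        f"{css}\n"
        "</head>\n<body>\n"
        f"{table_html}\n"
        "</body>\n</html>\n"
    )


# Stand-ins for the real inputs: in the walkthrough these would be the
# `css` string and data.to_html().
sample_css = '<style type="text/css">td { text-align: center; }</style>'
sample_table = "<table><tr><th>Name</th></tr><tr><td>Alex</td></tr></table>"

document = build_html_document(sample_css, sample_table)
print(document)
```

Declaring the charset up front is mainly useful when the table contains non-ASCII text such as accented names.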

4. Render the HTML to a file

import imgkit
outputfile = "filename.png"  # example name; use any output path you like
imgkitoptions = {"format": "png"}
imgkit.from_file("filename.html", outputfile, options=imgkitoptions)
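
If you don't need the intermediate HTML file, IMGKit also offers `imgkit.from_string`, which renders an HTML string directly. The sketch below is one possible way to wrap it; `make_options` and `render_html` are hypothetical helper names, and the option keys are assumed to map onto the wkhtmltoimage flags `--format`, `--encoding`, `--width` and `--quiet`.

```python
def make_options(width=None, quiet=True):
    """Build an options dict for imgkit; keys mirror wkhtmltoimage flags."""
    options = {"format": "png", "encoding": "UTF-8"}
    if width is not None:
        options["width"] = str(width)  # --width, in pixels
    if quiet:
        options["quiet"] = ""  # valueless flag: imgkit takes an empty string
    return options


def render_html(html, output_path, width=None):
    """Render an HTML string straight to an image, skipping the temp file."""
    # Imported here so make_options can be used without imgkit installed.
    import imgkit

    return imgkit.from_string(html, output_path, options=make_options(width))
```

With the names from the steps above, a call might look like `render_html(css + data.to_html(), "filename.png", width=640)`. Setting the width close to the table's CSS width should help avoid a wide empty margin in the image.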

5. Finally, marvel at the result

You can find a slightly nicer version of this approach in Jupyter Notebook form in the GitHub repo: