Full Stack Data Scientist 5: Automating report generation with Jupyter Notebooks

Daniel Sharp
Applied Data Science
3 min read · Sep 21, 2020


Automatically generating reports is useful in a wide range of scenarios, from regularly sharing data within a company or with the public to personal use, such as comparing the performance of different models side by side without having to manually run a Jupyter Notebook n times.

In my most recent project, I wanted to be able to train several models and then calculate a set of metrics and draw result exploration plots for each. I started by building a notebook with a menu at the top which would allow me to select one of the models I had run, and then execute all the cells in the notebook to explore the results. Although it worked, this quickly became tedious to do manually, so I explored how one could programmatically run Jupyter Notebooks and export HTML versions of them, which led me to the following solution.

For this to work, you will need a template notebook where only one or two values need to be changed before execution. For this example, I am going to use this simple notebook that explores the normal distribution for different means and standard deviations.

Example notebook
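As a rough stand-in for the kind of cells such a notebook could contain, here is a minimal sketch. It assumes only the standard library (the names `LOC` and `SCALE` and their starting values 0 and 1 come from the text; the seed, the sample size, and the summary variables are illustrative assumptions, and a real notebook would more likely plot with a scientific library).

```python
import random
import statistics

# Parameters the report is built around; these two lines are what changes per run.
LOC = 0    # mean of the distribution
SCALE = 1  # standard deviation of the distribution

# Draw samples from a normal distribution with a fixed seed (an assumption made
# here so that reruns of the same parameters are comparable).
rng = random.Random(42)
samples = [rng.gauss(LOC, SCALE) for _ in range(10_000)]

# Summary statistics that a later cell could print or plot.
sample_mean = statistics.mean(samples)
sample_std = statistics.stdev(samples)
print(f"mean={sample_mean:.3f}, std={sample_std:.3f}")
```

Everything below the two parameter lines depends only on `LOC` and `SCALE`, which is what makes a notebook like this a good candidate for templating.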

The idea with this notebook will be to programmatically change the LOC and SCALE parameters and run the rest of the cells. This will allow me to open the HTML exports side by side and explore the results.

To do this, I’ll replace the 0 and 1 values with placeholders that are easier to find and replace (PUT_LOC_HERE and PUT_SCALE_HERE) and clear the cell outputs to avoid conflicts in version control:

Template

Then I need to set up a function that will load this notebook, replace the placeholders with the values I want to use, run the notebook and finally export it to an HTML file. This function looks as follows:
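A minimal sketch of such a function, assuming the `nbformat` and `nbconvert` packages and a `python3` kernel are installed. The function names, the `values` mapping, and the timeout are illustrative choices, not necessarily what the original code used.

```python
from pathlib import Path


def fill_placeholders(source: str, values: dict) -> str:
    """Replace each placeholder token in a cell's source with its value."""
    for token, value in values.items():
        source = source.replace(token, repr(value))
    return source


def generate_report(template_path, output_path, values, timeout=600):
    """Fill the template, execute it, and write the result as an HTML file."""
    # Imported here so fill_placeholders stays usable without Jupyter installed.
    import nbformat
    from nbconvert import HTMLExporter
    from nbconvert.preprocessors import ExecutePreprocessor

    # Load the template as a version 4 notebook.
    with open(template_path, encoding="utf-8") as f:
        nb = nbformat.read(f, as_version=4)

    # Only code cells carry the parameters; markdown cells are left untouched.
    for cell in nb.cells:
        if cell.cell_type == "code":
            cell.source = fill_placeholders(cell.source, values)

    # Run every cell top to bottom, using the template's folder as working directory.
    executor = ExecutePreprocessor(timeout=timeout, kernel_name="python3")
    executor.preprocess(nb, {"metadata": {"path": str(Path(template_path).parent)}})

    # Render the executed notebook, outputs included, to a standalone HTML page.
    body, _resources = HTMLExporter().from_notebook_node(nb)
    Path(output_path).write_text(body, encoding="utf-8")
    return Path(output_path)
```

A call could then look like `generate_report("template.ipynb", "report.html", {"PUT_LOC_HERE": 0, "PUT_SCALE_HERE": 1})`, where both file names are hypothetical. The template on disk is never modified, because the replacement happens on the in-memory copy only.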

Then, with this function, we could run a script like the following to generate reports for a set of parameters that we would like to explore:
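One way such a script could be laid out is sketched here. The parameter values, the file naming scheme, and the helper names are assumptions for illustration; `generate` stands in for whatever report function is used and is expected to accept a mean, a standard deviation, and an output path.

```python
from itertools import product
from pathlib import Path


def build_jobs(locs, scales, out_dir="."):
    """Pair every LOC with every SCALE and give each pair its own HTML path."""
    out_dir = Path(out_dir)
    return [
        (loc, scale, out_dir / f"normal_loc_{loc}_scale_{scale}.html")
        for loc, scale in product(locs, scales)
    ]


def run_jobs(jobs, generate):
    """Call generate(loc, scale, output_path) once per job and collect the paths."""
    written = []
    for loc, scale, output_path in jobs:
        generate(loc, scale, output_path)
        written.append(output_path)
    return written


# Hypothetical parameter grid: 2 means x 2 standard deviations = 4 reports.
jobs = build_jobs(locs=[0, 5], scales=[1, 2])
```

Keeping the parameter grid separate from the function that produces a report makes it easy to add values later, or to dry-run the loop with a function that only prints the paths.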

This generates four HTML files that I can open side by side, all of which were created in less than a minute!

Hopefully this will help you save time if you periodically have to run and export Jupyter Notebooks!

Applied Data Science Partners is a London-based consultancy that implements end-to-end data science solutions for businesses, delivering measurable value. If you’re looking to do more with your data, please get in touch via our website. Follow us on LinkedIn for more AI and data science stories!
