Tips for Developing User-Facing Tools in Jupyter Notebooks
At Zymergen, we use Jupyter notebooks to quickly create user interfaces for rapidly developed tools. In this post, we share code samples for developing custom IPython extensions, ipywidgets, and pandas dataframe displays to create user interfaces for Jupyter notebooks.
Solutions Engineers at Zymergen rapidly prototype software tools needed by our internal users (scientists). They work on the Technology team and serve as a support system for software, providing last-mile support and one-off solutions. Once these prototypes are shipped to users and proven valuable, many of the tools are transitioned out of Jupyter notebooks and into our production codebase.
Most of our users do not have software backgrounds, and command line tools can be overwhelming and complicated to use. Instead, we use Jupyter notebooks because they provide a user-friendly interface, delivered via the browser, that can be developed quickly. This means Solutions Engineers can focus on the core logic of the tool and spend less time developing a user interface, without having to hand users a command line tool.
One of our Solutions Engineers, Danielle Chou, traveled to the 2017 JupyterCon in New York to talk about using Jupyter notebooks at Zymergen for rapid development. As outlined in the presentation she delivered, this blog post describes three key features of Jupyter notebooks that allow us to create standard user interfaces for our solutions tools: magic commands and custom extensions, ipywidgets, and pandas dataframe displays. We also include some sample code for each feature, using our “Hitpick Plan” Jupyter notebook as a specific example.
Problem Background: Hitpicking
At Zymergen, we genetically engineer strains of bacteria and other organisms to be more efficient at producing products of interest. When we make genetic changes to a strain, we need a way to test the results of these changes. These tests can include measurements of yield and productivity of our newly designed strains, as well as production issues such as contamination. Normally, these strains are sent for testing in a 96-well plate with a new strain sample in each of the wells (pictured below).
Within a single plate, factors such as exposure to the elements, evaporation, and temperature can create microenvironments that affect test results. These microenvironments have been known to dramatically impact the measurements taken from wells. To minimize the chance that test results are biased by each strain’s location on the plate, samples that are ready for testing are “picked” from a source plate to a destination plate (or plates) and placed in random locations. At Zymergen, we call this process hitpicking.
A hitpick plan defines the source well locations, destination well locations, and the volume of liquid to be transferred. Once a hitpick plan is generated, we send it to our liquid handling robots to perform the pipetting from plate to plate.
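As an illustration of the idea, here is a minimal sketch of how such a plan could be generated. It is not the code from our notebook: the function names, the field names (source_well, destination_well, volume_ul), and the single 96-well destination plate are assumptions made for this example.

```python
import random
from string import ascii_uppercase


def plate_wells(n_rows=8, n_cols=12):
    """Return well names A1..H12 for a 96-well plate, row by row."""
    return [
        f"{ascii_uppercase[row]}{col}"
        for row in range(n_rows)
        for col in range(1, n_cols + 1)
    ]


def make_hitpick_plan(source_wells, volume_ul=10.0, seed=None):
    """Assign each source well to a distinct, randomly chosen destination well.

    Hypothetical helper: assumes one destination plate, so at most 96
    source wells can be placed (random.sample raises ValueError otherwise).
    """
    rng = random.Random(seed)
    destinations = rng.sample(plate_wells(), k=len(source_wells))
    return [
        {"source_well": src, "destination_well": dst, "volume_ul": volume_ul}
        for src, dst in zip(source_wells, destinations)
    ]


# Example: four samples picked into random wells of one destination plate.
plan = make_hitpick_plan(["A1", "A2", "A3", "B1"], seed=0)
for transfer in plan:
    print(transfer)
```

Each entry in the plan corresponds to one transfer for the liquid handling robot. Passing a seed makes the random layout reproducible, which is convenient when a plan needs to be regenerated or checked.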
Most lab protocols are defined in automated workflows at Zymergen. Because of the specific logic involved in creating a hitpick plan, however, these plans were originally generated manually by scientists or via complicated Microsoft Excel tools. There was also little standardization across teams, even though they were working towards the same goal. After identifying the need for a more standardized tool, Zymergen Solutions Engineers implemented a solution using a Jupyter notebook. This notebook illustrates the three main features we rely on to create a standard Jupyter notebook at Zymergen.
Magic Commands and Custom Extensions
We use magic commands to load custom extensions. A custom extension is a Python module that adds custom behavior to the IPython shell. In our case, we use these custom extensions to load a display of widgets for a user to interact with. Extensions also allow us to display minimal code, making the notebook much cleaner for the user.
Magic commands are special commands defined by the IPython kernel that runs behind a Jupyter notebook, and they are identifiable by the “%” prefix. While magic commands have many capabilities, we will focus on the %load_ext command, which loads custom IPython extensions by their module names.
To create your own extension, save a Python file in the .ipython/extensions folder (or anywhere else on the Python import path). You can then load the extension by its module name, which is the filename without the .py suffix (in this case, generate_hitpick_template). For the extension to load properly, the module must define a function named load_ipython_extension; IPython calls this function, passing in the active shell, when the extension is loaded. This is what the function looks like in the extensions file:
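A minimal sketch of such an extensions file is shown here. The file name matches the module name mentioned above, but the build_interface helper and the placeholder widgets are assumptions for illustration, not the contents of our actual notebook.

```python
# generate_hitpick_template.py
# Hypothetical contents; save this file where IPython can import it.
import ipywidgets as widgets
from IPython.display import display


def build_interface():
    """Build the widgets shown to the user (placeholder content only)."""
    return widgets.VBox([
        widgets.HTML("<b>Hitpick Plan</b>"),
        widgets.Label("Work through the tabs from left to right."),
    ])


def load_ipython_extension(ipython):
    """Called by IPython when the user runs %load_ext generate_hitpick_template.

    `ipython` is the active shell instance; this sketch does not need it,
    but it can be used to register magics or add names to the user namespace.
    """
    display(build_interface())
```

With this file in place, the only line a user needs to run in the notebook is %load_ext generate_hitpick_template. Note that IPython calls load_ipython_extension only the first time an extension is loaded in a session; %reload_ext runs it again.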
The magic command to load custom extensions is extremely useful because it allows us to load a lot of code into the notebook without exposing it to the user. Thanks to the custom extension, the single line %load_ext generate_hitpick_template is the only “code” exposed to users in our Hitpick Plan Jupyter notebook. All other code is hidden from the users, and upon running this command, they see an interface created by ipywidgets.
Ipywidgets
We use ipywidgets to create the UI for our solutions notebooks, including standard navigation and input elements. Ipywidgets are reusable interface elements that can be included in Jupyter notebooks. For full details, see the official ipywidgets documentation.
We use the tabs widget for guiding a user through different inputs and instructions. Through tabs you can embed individual widgets into a “menu” of widgets. This steps users through a process that reads left to right. Below is a snippet of code that creates the tabs widget:
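A minimal sketch of a tabs widget follows. The tab titles (apart from “Volume”, which is described later in this post) and the placeholder page contents are assumptions for illustration.

```python
import ipywidgets as widgets

# Hypothetical step names; each step becomes one tab, read left to right.
step_titles = ["Instructions", "Source Plates", "Volume", "Generate Plan"]

# Each page is itself a widget. Here the pages are simple HTML placeholders.
step_pages = [
    widgets.HTML(f"<p>Content for the {title} step goes here.</p>")
    for title in step_titles
]

tabs = widgets.Tab()
tabs.children = step_pages
for index, title in enumerate(step_titles):
    tabs.set_title(index, title)

tabs  # as the last expression in a notebook cell, this renders the tabs
```

In a real tool, each page would usually be a container such as a VBox holding the instructions and input widgets for that step.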
This is what the widget looks like in the Jupyter notebook. You can see that each step in the instructions is represented by a tab as you move your way from left to right. Each of these tabs is a widget itself.
Ipywidgets also provide simpler interactive controls that keep our scientists focused on providing input, without needing to dive into the code. Specifically, we like to take advantage of the dropdown and box widgets for user inputs. These help us restrict inputs to valid values.
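As a small sketch of how a dropdown restricts input, the example below offers a fixed set of choices. The parameter (number of destination plates) and its options are hypothetical and chosen only for illustration.

```python
import ipywidgets as widgets
from traitlets import TraitError

# Hypothetical input: the user may only choose one of the listed values.
plate_count = widgets.Dropdown(
    options=[1, 2, 3, 4],
    value=1,
    description="Plates:",
)

# Assigning a value that is not among the options is rejected.
try:
    plate_count.value = 7
except TraitError:
    rejected = True
else:
    rejected = False

print(rejected, plate_count.value)
```

Because the widget only accepts values from its options, downstream code does not need to re-validate this input.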
Below is an example of the “Volume” tab of the Hitpick Plan Jupyter notebook. On this tab, we ask users how much volume the robot should draw up from each well on the plate. Based on the specifications of the machine, we know that the input volume must be between 1 and 100 microliters (µL) of liquid. We also know that the volume is normally 10 µL. Using the bounded float text widget, we can limit users to entering values within the range in which the robot can operate, while also predefining a default volume. This is really helpful when we need to collect a parameter whose valid range is known in advance.
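A minimal sketch of such a volume input, using the limits and default described above; the variable name and the description label are assumptions.

```python
import ipywidgets as widgets

# Volume in microliters: default 10, restricted to the robot's 1-100 range.
volume_input = widgets.BoundedFloatText(
    value=10.0,
    min=1.0,
    max=100.0,
    description="Volume (µL):",
)

# Out-of-range values are clamped to the nearest bound rather than accepted.
volume_input.value = 250.0
print(volume_input.value)  # 100.0

volume_input.value = 10.0  # restore the default before showing the widget
volume_input
```

The rest of the tool can then read volume_input.value and rely on it being within the range the robot supports.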
Pandas Dataframe Display
Finally, we use pandas to display a visualization after the user runs the Jupyter notebook.
By default, Jupyter notebooks display the value of the last expression in a cell as rich output. We use this to show users a representation of what the result of executing the Hitpick Plan will look like.
Pandas interacts well with Jupyter notebooks, and for our Hitpick Plan notebook, the data is held in a pandas dataframe throughout the computation steps. Pandas can also display dataframes through its styling functionality (the DataFrame.style accessor), which builds an HTML representation of the dataframe in which you can add color to table cells. Below is a block of code that generates this:
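A minimal sketch of this kind of styled display is shown here. The plate layout (rows A to H, columns 1 to 12) follows the 96-well plate described earlier, while the strain names, colors, and helper functions are assumptions for illustration rather than our notebook's actual code.

```python
import pandas as pd

ROWS = list("ABCDEFGH")        # 8 rows of a 96-well plate
COLUMNS = list(range(1, 13))   # 12 columns

FILLED = "background-color: lightgreen"
EMPTY = "background-color: lightgrey"


def empty_plate():
    """Return an 8 x 12 dataframe in which every well is an empty string."""
    return pd.DataFrame("", index=ROWS, columns=COLUMNS)


def color_wells(plate):
    """Return a same-shaped dataframe of CSS strings, one per well."""
    return plate.apply(lambda col: col.map(lambda v: FILLED if v else EMPTY))


def style_plate(plate):
    """Return a Styler that colors filled and empty wells differently."""
    # axis=None passes the whole dataframe to color_wells in one call.
    return plate.style.apply(color_wells, axis=None)


# Hypothetical destination plate with three samples placed on it.
plate = empty_plate()
plate.loc["A", 1] = "strain_001"
plate.loc["C", 7] = "strain_002"
plate.loc["H", 12] = "strain_003"
```

Ending a notebook cell with style_plate(plate) renders the plate as a colored HTML table. The styling features of pandas require the jinja2 package to be installed.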
And here is an example of the visualization displayed to the user. This figure represents the 96-well plate described above, where each cell is a well that holds a sample in the plate:
For the Hitpick Plan notebook in particular, the ability of Jupyter notebooks to display visualizations is really helpful for users: they can see what the destination plates will look like when a certain hitpick plan is executed. This is not only important for error checking, but also helps users gain trust in the tool. The output reproduces a visual similar to the one produced by the previous Excel tools, which eases the transition to a new implementation.
Overall, Jupyter notebooks are extremely important to the Solutions Engineering team at Zymergen, and they have become a standard for deploying rapid prototypes of new tools to our users. Using custom extensions and magic commands, ipywidgets, and pandas data displays, the notebooks provide a platform for us to create standardized, user-friendly interfaces, without our team needing a background in front-end development. Finally, Jupyter notebooks have enabled Solutions Engineers to provide proof-of-concept solutions that can be easily transitioned to Zymergen’s production systems and out of Jupyter notebooks.
Acknowledgements: Thanks to Marc Colangelo, who originally authored the hitpick tool.
Danielle Chou is a Solutions Engineer on the Technology Products team at Zymergen.