Reactive notebooks in Deepnote

Introducing reactivity and how it solves the main challenges of data science notebooks

Simon Sotak
Deepnote
4 min read · Feb 16, 2021

Traditional computational notebooks like Jupyter receive a lot of criticism: it’s too easy to create a non-reproducible notebook through out-of-order execution or hidden state. We believe that reactive notebooks address some of these issues head-on. This article outlines how reactivity works in Deepnote.

Reactive notebooks are like Excel spreadsheets — you just input the data and code, and the whole document updates itself accordingly. No need to click to manually run cells. This makes reactive notebooks easier to reason about and more reproducible.

If you want to try out a reactive notebook for yourself, just open the demo project.

Key challenges of notebooks today

Out-of-order execution

Traditional notebooks are executed cell by cell, which allows cells to be executed out of order or even repeatedly. While this is very powerful, it can also create hidden state that is extremely difficult to reason about.

Let’s take a look at an example:

Here, the cells were executed in the order: 1st, 2nd, 3rd, 2nd. This created an out-of-order computational narrative that is very unintuitive. Someone who just reads the notebook (e.g. a supervisor or a colleague) usually reads it from top to bottom and assumes that the cells were executed in that order, too.
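To make the problem concrete, here is a minimal sketch in plain Python. The three cells are hypothetical (they are not the code from the original example), and a shared dictionary stands in for the kernel's namespace. It replays the same 1st, 2nd, 3rd, 2nd order and compares it with a clean top-to-bottom run.

```python
# Hypothetical cells, used only for illustration.
cells = {
    1: "x = 1",
    2: "x = x + 1",
    3: "result = x * 10",
}


def run(order):
    """Execute cells in the given order against one shared namespace,
    roughly the way a traditional notebook kernel does."""
    namespace = {}
    for number in order:
        exec(cells[number], namespace)
    return namespace


top_to_bottom = run([1, 2, 3])
out_of_order = run([1, 2, 3, 2])

print(top_to_bottom["x"], top_to_bottom["result"])  # 2 20
print(out_of_order["x"], out_of_order["result"])    # 3 20
```

After the out-of-order run, the kernel holds `x == 3` next to `result == 20`, a combination that no top-to-bottom reading of the three cells can explain.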

More hidden state

Every time a notebook is edited but not executed, it becomes stale, because executing it again would produce different outputs than the ones already present. To build on the above example, let’s say I delete the last cell, and the notebook now becomes this:

This notebook obviously isn’t reproducible anymore, even if we account for out-of-order execution.
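A related variation of this problem can be sketched in plain Python. The cell contents and the `run_cells` helper below are assumptions made for illustration: a cell defines a variable, later cells use it, and then the defining cell is deleted. The live kernel still remembers the variable, while a fresh run does not.

```python
def run_cells(cells, namespace):
    """Execute each cell's source against the given namespace."""
    for source in cells:
        exec(source, namespace)
    return namespace


# The live kernel already ran a cell that has since been deleted.
kernel_namespace = {}
exec("scale = 10", kernel_namespace)

# Only these cells are still visible in the notebook.
remaining_cells = [
    "values = [1, 2, 3]",
    "scaled = [v * scale for v in values]",
]

# In the live session everything still works, thanks to hidden state.
live = run_cells(remaining_cells, kernel_namespace)
print(live["scaled"])  # [10, 20, 30]

# Anyone running the visible cells from scratch hits an error instead.
try:
    run_cells(remaining_cells, {})
    fresh_error = None
except NameError as error:
    fresh_error = error
print(type(fresh_error).__name__)  # NameError
```

The author keeps seeing valid outputs, so nothing signals that the notebook can no longer be run from scratch.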

Iterating on notebooks in this manner is extremely common when doing exploratory programming, which is what notebooks are intended for. So these kinds of issues are very common.

What is reproducibility?

A notebook is reproducible when someone else can take all of its code, run it on a different computer than the author’s, and reliably get the same results — in this case, the same cell outputs. Reproducibility is very important in science, where it’s a part of a standard peer review process. Since data science is also a science (as the name suggests), it’s very important for any data science work to also be reproducible.

A recent paper found that only 25% of Jupyter notebooks on GitHub were reproducible.

It is not easy to create a notebook that is reproducible — it’s not just about executing your notebook from top to bottom. A reproducible notebook also includes a reproducible environment, and code that is deterministic. These are all the issues we’re solving at Deepnote.
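Deterministic code is the easiest of these ingredients to illustrate. The sketch below uses only Python's standard library, with a hypothetical `sample_mean` function. It shows how seeding a random number generator makes a result repeatable. It is not a description of how Deepnote handles determinism.

```python
import random


def sample_mean(seed=None, n=5):
    """Average n pseudo-random draws; a fixed seed makes the result repeatable."""
    rng = random.Random(seed)  # local generator, no global state involved
    draws = [rng.random() for _ in range(n)]
    return sum(draws) / n


# With the same seed, two runs agree exactly.
first = sample_mean(seed=42)
second = sample_mean(seed=42)
print(first == second)  # True

# Without a seed, each call draws from a differently initialized generator,
# so repeated runs will almost certainly disagree.
unseeded = sample_mean()
```

A local `random.Random` instance is used instead of the module-level functions so that the result does not depend on how many random numbers other cells have already consumed.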

Reactive notebooks

A reactive notebook is a notebook that is always kept up-to-date. Whenever its code is changed or a cell is deleted or moved, the notebook’s outputs are automatically updated as if the notebook was executed fresh, from top to bottom.

This is what reactive notebooks look like in Deepnote:

At the same time, reactive notebooks also aid exploratory programming by making iteration loops tighter. When building charts, for example, updating the code automatically updates the visualizations, so there’s an instant feedback loop on changes in the code and data. This makes reactive notebooks a powerful tool for exploration.

The ceiling of reactivity

Reactivity has its limitations. While a reactive environment is a perfect model for small and fast scripts, there are situations in which a reactive notebook doesn’t shine: e.g. when training a model on a large dataset. Training a non-trivial model can often take a few minutes or even hours, so a reactive notebook wouldn’t have enough time to update itself between edits.

Once your use case becomes complex and your bottleneck is not the notebook but the actual execution of code, you can simply opt out of reactivity and go back to a conventional notebook.

At Deepnote, we start all new projects with reactive execution turned on to speed up the iteration cycles and feedback loops. As projects grow and reactivity stops being practical, users can easily switch to the traditional non-reactive environment and execute each cell manually.

Reactive notebooks are ✨

Everyone who works with a computer is familiar with spreadsheet tools like Excel, which makes the spreadsheet arguably the most popular computational document. It contains cells with data and cells with code, and it is reactive: everything is kept up-to-date after edits. Data science notebooks deserve the same level of convenience and ease of use.

You can give Deepnote reactive notebooks a spin today! We’ve recently opened up our public beta, so sign up and try it for free.

This article was originally published on Deepnote.

Made with 💙 by the Deepnote team. Deepnote is a new kind of data science notebook — Jupyter-compatible with real-time collaboration and running in your browser. Try it out for free.
