Hacking productivity in data science notebooks

The tools we use don’t just influence how fast we work, but also the kind of results we deliver.

Elizabeth Dlha
Dec 7, 2020 · 5 min read
Illustration by GoodStudio on Shutterstock

Notebooks have become the go-to tool for data scientists in both academia and industry, but they have also introduced a new set of challenges. Many conference talks and white papers have described the drawbacks. Notebooks don’t support collaboration, hinder reproducibility, encourage sloppy coding practices, don’t play well with the rest of your stack…you get the idea.

At Deepnote, we recognize the power of notebooks, but we see a lot of areas in which they can be improved. We want to create an interface that makes data scientists more productive and helps them reason and collaborate. We are building on top of the familiar Jupyter experience, but also making some significant design changes. In this article, we will discuss how Deepnote addresses the following four key challenges:

Challenge 1 — Environment setup and management

The potential of data science teams is often limited by the engineering support that’s available to them. Managing Python packages, researching connectors to disparate data sources, and maintaining data pipelines are particularly difficult. The learning curve is steep and the experience is often exasperating. Setting up a development environment and keeping it consistent takes up valuable time, and it can end up being an expensive and frustrating task.

At Deepnote, we aim to reduce this overhead and free up the hands of data scientists to focus on what they are best at, which is why we abstracted all this complexity away. Deepnote is built for the browser and is platform-agnostic. No pre-installs are needed — you can simply sign in, create a new notebook, and get to work while Deepnote handles the rest (you can try it here).

All hardware is remotely managed, and every hardware tier is connected to the internet and can execute long-running tasks. Your projects in Deepnote are always available, with the hardware up and running in just a few seconds. There is no need to spend time picking the right Linux version or managing multiple Python environments. If you need to upgrade your hardware, you can do so in just one click. We believe that by enabling a frictionless setup, we can move the data science community closer to reproducibility. Deepnote supports that in two major ways.

First, Deepnote makes dependency management easy. When you pip-install a package in a notebook cell, we prompt you to move it into requirements.txt and pin it to the specific version you used. Second, you can create Teams in Deepnote, which allows you to share datasets, integrations, projects, and environment configurations. This way, when your colleague shares a project with you, it includes the environment it runs in, not just the .ipynb. To learn more, take a look at my previous article discussing how Deepnote fosters collaboration in notebooks.
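To make the idea of pinning concrete, here is a minimal sketch of how an installed package could be turned into a versioned requirements.txt entry. It is not Deepnote's implementation; the helper names pin_requirement and merge_requirement are hypothetical, and the sketch assumes every existing line uses the simple name==version form.

```python
from importlib import metadata


def pin_requirement(package: str) -> str:
    """Return a 'name==version' line for a package installed in this environment."""
    # Raises importlib.metadata.PackageNotFoundError if the package is missing.
    version = metadata.version(package)
    return f"{package}=={version}"


def merge_requirement(lines: list[str], pinned: str) -> list[str]:
    """Add the pinned line, replacing any older pin of the same package.

    Assumption: existing lines look like 'name==version'. Other specifiers
    (>=, extras, URLs) are left untouched and are not deduplicated.
    """
    name = pinned.split("==")[0].strip().lower()
    kept = [
        line for line in lines
        if line.split("==")[0].strip().lower() != name
    ]
    return sorted(kept + [pinned], key=str.lower)


if __name__ == "__main__":
    current = ["numpy==1.19.0", "pandas==1.1.4"]
    # A literal pin keeps the demo independent of what is installed locally.
    print(merge_requirement(current, "numpy==1.19.4"))
```

In practice you would call pin_requirement on the package you just installed and write the merged list back to requirements.txt, so that a colleague opening the project gets the same versions.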

Challenge 2 — Code assistance

Although writing and managing code is a fundamental activity in the computational notebook paradigm, a lack of code intelligence in notebooks can make the experience difficult. Code editors and IDEs used by software engineers are not the right tools for the job either. As a data scientist, you most likely look up function and class names in a separate browser window, or you get the job done by switching back and forth between an IDE and a notebook.

Whether you are transforming your data, exploring, or building ML models, Deepnote helps with advanced code assistance. An IDE-style autocomplete system lets you work faster, and configurable linting tools point out bugs before they break your long training jobs.

IDE-style autocomplete in Deepnote
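As a rough illustration of what checking code before it runs means, the sketch below inspects a cell's source with Python's standard ast module without executing it. This is only an assumption-laden toy, not the linter Deepnote ships; check_cell is a hypothetical name and the single rule it applies (flagging a bare except) is just an example.

```python
import ast


def check_cell(source: str) -> list[str]:
    """Return a list of problems found in the source without running it."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        # A syntax error stops parsing, so report it and return early.
        return [f"line {err.lineno}: syntax error: {err.msg}"]

    problems = []
    for node in ast.walk(tree):
        # A bare 'except:' swallows every exception, including real bugs.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            problems.append(f"line {node.lineno}: bare 'except:' hides errors")
    return problems


if __name__ == "__main__":
    cell = "try:\n    train()\nexcept:\n    pass\n"
    for problem in check_cell(cell):
        print(problem)
```

Catching a problem like this takes milliseconds, whereas finding it hours into a training job costs the whole run.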

Challenge 3 — Understanding data quickly

Discovering patterns in data takes up a lot of time before we’re able to start using those insights and building models. Initial data exploration lacks immediacy and often ends up being a never-ending cycle of “copy-paste and tweaking bits of code made worse by feedback latency and kernel crashes”.

Deepnote has a built-in variable explorer so that you can instantly review the contents of your variables without having to print them. It contains additional information, including histograms for each column of a data frame so that you can quickly get an overview of the current state. Discovering patterns is also made easier with the help of interactive plots.
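To show the kind of per-column summary a variable explorer can display, here is a small sketch that uses only the standard library. It is not how Deepnote computes its histograms; the table is modeled as a plain dict of column name to list of values (standing in for a data frame), and histogram and summarize are hypothetical helper names.

```python
from collections import Counter


def histogram(values, bins=5):
    """Count numeric values into equal-width bins between min and max.

    Assumes a non-empty list of numbers.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are identical, so a single bin holds everything.
        return [len(values)]
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def summarize(table):
    """Return a histogram for numeric columns and top values for the rest."""
    summary = {}
    for name, values in table.items():
        if all(isinstance(v, (int, float)) for v in values):
            summary[name] = histogram(values)
        else:
            summary[name] = Counter(values).most_common(3)
    return summary


if __name__ == "__main__":
    table = {"age": [1, 2, 3], "city": ["a", "b", "a"]}
    print(summarize(table))
```

Having this summary visible next to the notebook removes the need to write and rerun throwaway cells just to see how a column is distributed.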

Challenge 4 — Maximizing your speed

As data scientists, we need interfaces that help us explore data efficiently, prototype quickly, and move towards actionable insights. With Deepnote, we’ve introduced a number of features that save you time and help you iterate on your experiments faster.

We’ve also introduced a powerful Command palette, as well as shortcuts that provide quick access to all your files and the most popular actions. Simply press Ctrl+P (or ⌘+P if you’re on Mac) and start typing to switch to another file, open a terminal, or execute an action.

Command palette gives you access to all popular actions

Like what you see? We’ve recently opened Deepnote for public beta, so you can try it out for yourself.

This post is Part II in a series on how Deepnote tackles the common challenges of data science notebooks.

