Polynote vs Zeppelin: A Comparison

Ceren Güzelgün
Insider Engineering


In the Predictive Business Unit, we use many different tools to keep our workflow as smooth as possible. Notebooks are among the most important of them: they act as a sandbox where we engineer new models, improve existing ones, and sometimes just do some good old-fashioned debugging.

A notebook interface, sometimes called a computational notebook, is a virtual notebook environment used for literate programming. It combines the role of a word processor with the shell and kernel of the notebook's programming language. Notebooks are used in many fields, most prominently data science, journalism, and education. A well-prepared notebook is also an effective learning tool in computational lectures, online courses, and workshops.

Apache Zeppelin and Polynote are the two notebooks our team uses most, and this post compares them.

Apache Zeppelin is a multi-purpose, web-based notebook built for data ingestion, discovery, analytics, visualization, and collaboration. It supports many interpreters, the most important of which are Apache Spark, Python, JDBC, Markdown, and Shell.

Polynote is an open-source, experimental notebook environment created and used at Netflix. It supports Scala and Python (with or without Spark), SQL, and Vega. It grew out of the need for fundamental features that existing notebook environments lacked, such as autocompletion and parameter hints, which IDEs provide but which are missing or laggy in other notebooks.

Its most important feature is that it is a polyglot notebook: it supports mixing multiple languages within a single notebook while sharing data seamlessly between them. Its immutable data model supports reproducible notebooks.
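To make the link between an immutable data model and reproducibility more concrete, here is a minimal Python sketch. It is not Polynote's implementation. It assumes that a cell's input is built only from the outputs of the cells above it in the notebook, and the names run_cell and input_state are hypothetical helpers invented for this illustration.

```python
from typing import Any, Dict, List

State = Dict[str, Any]


def run_cell(code: str, inputs: State) -> State:
    """Run one cell on a copy of its inputs and return only what it defined."""
    scope = dict(inputs)  # the caller's state is never mutated
    exec(code, scope)
    scope.pop("__builtins__", None)  # added by exec, not part of the cell's output
    return {
        name: value
        for name, value in scope.items()
        if name not in inputs or inputs[name] is not value
    }


def input_state(outputs: List[State], index: int) -> State:
    """Merge the outputs of the cells above `index`, in notebook order."""
    state: State = {}
    for cell_outputs in outputs[:index]:
        state.update(cell_outputs)
    return state


cells = ["x = 1", "y = x + 1", "x = 100"]
outputs: List[State] = [{} for _ in cells]
for i, code in enumerate(cells):
    outputs[i] = run_cell(code, input_state(outputs, i))

# Re-running the middle cell still sees x = 1 from the cell above it,
# not x = 100 from the cell below it.
rerun = run_cell(cells[1], input_state(outputs, 1))
print(rerun)  # {'y': 2}
```

In a notebook where every cell writes into one shared, mutable scope, re-running the middle cell after the last one would produce y = 101 instead, so the result would depend on the order in which cells happened to be executed.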

So what are the main differences between Apache Zeppelin and Polynote? The most obvious one is language support. Zeppelin supports both Scala and Python, but each paragraph is bound to a single interpreter, and a variable defined under one interpreter is not automatically visible under another. Moving data between the two languages takes an explicit extra step.

In Polynote, you can define a variable in Python in one cell and compute with it in Scala in the next. This is what puts the poly in Polynote.

Source: https://polynote.org

Each cell can be written in any supported language. The kernel provides the input values to that cell's language interpreter, and the interpreter returns the resulting output values to the kernel. This is how cells in Polynote work within the same context.
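The exchange between kernel and interpreters described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how Polynote is built: ToyKernel, run_python, and run_toysum are hypothetical names, and "toysum" is a made-up second language (it can only add names and integers) that stands in for a real interpreter such as Scala.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Interpreter = Callable[[str, State], State]


def run_python(code: str, inputs: State) -> State:
    """Python interpreter: run the cell with the inputs as its variables."""
    scope = dict(inputs)
    exec(code, scope)
    scope.pop("__builtins__", None)  # added by exec, not a cell value
    return scope


def run_toysum(code: str, inputs: State) -> State:
    """Hypothetical toy language: 'target = a + b + 3' sums names and integers."""
    target, expression = (part.strip() for part in code.split("=", 1))
    total = 0
    for term in expression.split("+"):
        term = term.strip()
        total += inputs[term] if term in inputs else int(term)
    return {target: total}


class ToyKernel:
    """Holds the shared values and routes each cell to its interpreter."""

    def __init__(self) -> None:
        self.state: State = {}
        self.interpreters: Dict[str, Interpreter] = {
            "python": run_python,
            "toysum": run_toysum,
        }

    def run_cell(self, language: str, code: str) -> State:
        # The kernel hands the current values to the interpreter...
        outputs = self.interpreters[language](code, dict(self.state))
        # ...and stores whatever the interpreter hands back.
        self.state.update(outputs)
        return outputs


kernel = ToyKernel()
kernel.run_cell("python", "x = 20\ny = x + 1")  # values defined in Python
kernel.run_cell("toysum", "z = x + y + 1")      # used from the other language
print(kernel.state["z"])  # 42
```

Because every interpreter receives and returns plain values through the kernel, no cell needs to know which language produced the values it consumes.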

Another difference between the two environments is dependency management and configuration. In Polynote, dependencies are handled directly from a menu in the notebook itself instead of being configured at the cluster or server level. In Zeppelin, configuration is handled through a plugin that gives access to the processing engines and data sources from the user interface. If you want to work with a custom JAR file, you can upload it easily from the Polynote UI; to achieve the same thing with Zeppelin, you need to connect to your virtual machine and load the JAR file there. In our workflow we often need to switch between JARs, and Polynote's interface makes that a much smoother process.

Let’s talk about data visualization. Handling data is the largest part of what we do, so a notebook’s visualization capabilities can be crucial for us. In Zeppelin, basic charts are available by default. They are not limited to SparkSQL queries; output from other language backends can be visualized as well.

When writing Scala, which is mostly what we do, inspecting tables without SQL yields a raw view that can be a bit hard on the eye.

Unlike Polynote, the Zeppelin interface does not come with built-in sliders or beautified output. Polynote offers a more practical table display, and its polyglot support makes it easy to use Python visualization tools and Scala data analysis methods in the same notebook.

When it comes to browser support, Zeppelin is more stable than Polynote. Polynote notebooks work best in Google Chrome, while Zeppelin works equally well in Safari, Mozilla Firefox, and Google Chrome. This is the main reason we still use both notebooks instead of settling on Polynote for good. We will keep following new releases of both environments and keep researching new technologies to make sure our workflow is as efficient as it can be.

Want to work with us and with new technologies? Check out our career page now!
