Jupyter Dashboarding Workshop
Back in June, I attended the four-day Jupyter Dashboarding Workshop in Paris. A big thanks to Sylvain, Pascal and the rest of the team; it was a great event. Also a big thank you to everyone who lent me a hand over the week, especially Philipp and Yuvi.
There were lots of great conversations, ideas and demos, and I’ve got loads of ideas buzzing around my head as a result. I hope to write many of them up in due course, but I’m going to start with a discussion of the three key libraries in the Dashboarding space.
Caveat: I know relatively little about the three technologies I’m going to talk about, so please take this with a pinch of salt. Your thoughts and corrections are welcome in the comments!
Voila / Panel / Dash
I think it’s fair to say the star of the show at the dashboarding workshop was Voila. There were many great demos, templates, ideas, talks and more flying around focusing on Voila. There was even a great demo that looked a lot like the old Jupyter Dashboards / Jupyter Dashboard Server drag, drop and deploy demos that got us all so excited many moons ago. However, I think focusing exclusively on Voila misses the more exciting discussion about the context that Voila sits in.
Voila is one of three key players that were heavily discussed at the dashboarding workshop with Dash by Plotly and Pyviz Panel being the other two. All three offer you a way of making interactive Python-based web apps. Given that’s the high-level similarity let’s have a look at the differences.
I would describe Voila as Jupyter native. The aspiration (that's rapidly becoming a reality) is that if it works in JupyterLab / Notebook, it works in Voila. This is Voila’s superpower and probably why it’s gaining such traction. You don’t have to change what you do or the way you do it (assuming you are in the Jupyter ecosystem) and you’ve suddenly got a way of sharing/deploying your work as interactive web apps. What’s more, Voila will happily work with all kernels (i.e. not just Python), giving it a wider appeal than the other two technologies.
Dash, on the other hand, is another platform completely separate from Jupyter. Many of the tools/libraries that you might use to make a Dash dashboard (such as Plotly’s plotting tools) will work in a Notebook but to build a rich Dashboard in Dash you’ll likely be leaving Jupyter behind. You can embed a Plotly Dash app inside a Notebook which will be a welcome feature for many. However, I have a hunch that the two paradigms won't ‘gel’ and working in this way will reduce both Plotly and Jupyter to the lowest common denominator rather than getting the best out of either.
Somewhere in the middle of this is Panel. Panel is based on Bokeh widgets and server, and whilst a lot of effort has been made to allow it to work well in the Jupyter ecosystem, the odd crack certainly shows. In my experience, you fairly often find yourself frustrated with how Jupyter and Bokeh don’t play well together, in a way that isn’t so true of ipywidget-based libraries.
(Note: I’d generally consider ipywidget-based libraries Jupyter native and great candidates for working in Voila.)
Scalable by design
The architectural differences between these three tools really stood out to me. Dash is scalable by design. The architecture is client-side state with a stateless server providing transitions and returning the result back to the client. The client/server communication is over HTTP. The beauty of this architecture is that it is extremely easy to scale horizontally, inherently cacheable and a good candidate for edge compute.
For comparison, both Panel (in this context essentially Bokeh Server) and Voila have at least one process/thread per session which must live for as long as the user wishes to interact. The user's session is intimately tied to this process (with at least some state being stored there) and the user cannot be moved between machines without ‘starting again’. This makes horizontal scaling and caching much harder. It also adds a responsibility on the server to manage and kill processes that aren’t being used anymore. This can easily be misjudged, killing processes that are still desired or leaving live processes that aren't being used. I’m inclined (but have no evidence) to believe this style of architecture is likely to be significantly more resource-heavy than Dash’s, though it would depend on the demand profile.
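The session-per-process model can be sketched the same way. Again this is an illustration under my own assumptions, not how Voila or Bokeh Server are implemented: `SessionServer`, `cull_idle` and the 60-second timeout are hypothetical, and a plain dictionary stands in for the live kernel or process.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Session:
    last_seen: float  # time of the user's most recent interaction
    state: dict = field(default_factory=lambda: {"count": 0})


class SessionServer:
    """Keeps per-user state on the server (stand-in for a live process)."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        self.sessions: Dict[str, Session] = {}

    def handle(self, session_id: str, event: str, now: float) -> int:
        # An unknown session id means the user starts from scratch.
        session = self.sessions.setdefault(session_id, Session(last_seen=now))
        session.last_seen = now
        if event == "increment":
            session.state["count"] += 1
        return session.state["count"]

    def cull_idle(self, now: float) -> List[str]:
        """Remove sessions idle for longer than the timeout; return their ids."""
        expired = [
            sid for sid, s in self.sessions.items()
            if now - s.last_seen > self.idle_timeout
        ]
        for sid in expired:
            del self.sessions[sid]
        return expired


server = SessionServer(idle_timeout=60)
server.handle("alice", "increment", now=0)
server.handle("alice", "increment", now=10)   # alice's count is now 2
server.handle("bob", "increment", now=50)

culled = server.cull_idle(now=100)            # alice idle for 90s, bob for 50s
restarted = server.handle("alice", "increment", now=101)
print(culled, restarted)
```

Alice only stepped away for a minute and a half, but her session was culled and her count is back to 1. Pick a longer timeout and the server instead holds on to sessions nobody is using. A second machine would not help her either, since her state only ever existed in this one server's memory.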
For what it’s worth I suspect Panel/Bokeh Server would perform better than Voila under a production scenario. Bokeh Server has been around and maintained for a long time now and has methods for improving performance such as using a static server for static assets. Voila is still young and many of the foundations it’s built on were not necessarily designed to do the exciting work Voila is doing. That said if I was building an application to be used by hundreds (let alone thousands) of simultaneous users my money would be on Dash.
Dashboard as a service
One of Dash’s killer features is that it offers dashboards as a service. Plotly will host, manage, serve and scale your app for you (for a cost). This is something that has been sorely missed from the Jupyter ecosystem. The closest we’ve got is probably My Binder, which is fine for exploratory demonstrations but not for rapidly accessible interactive dashboards. The success of R and R Shiny over the last few years is undoubtedly due to a number of factors, but shinyapps.io is definitely one of them. It’s always felt to me that Python has lacked an equivalent to the ‘shiny stack’, so it’s good that this has (at least partially) been addressed.
With Voila and Panel, if you want to host your dashboard it’s up to you. There are tutorials and blogs that will help you, but you will need to manage, maintain, patch, scale and support your servers, load balancers, storage etc. This is no mean feat and, given the one-process-per-session architecture, it will become really tricky for a highly popular app. Again, I’d be more confident doing this with Panel than Voila, but Dash would be my first choice.
An uncomfortable middle ground?
Voila is clearly aligned with Jupyter and aspires to be (and looks likely to become) part of the core ecosystem. Dash’s raison d’etre, on the other hand, is offering scalable dashboarding as a service. I have a worry that Panel is falling into a slightly uncomfortable middle ground. If I’m invested in Jupyter I’ll likely use Voila, and if I want an easy-to-deploy, scalable solution I’ll go Dash. I can even imagine a workflow where I explore, test and develop in Jupyter with Voila but re-write/tweak to deploy on Dash, if and when the time comes. Where does Panel fit into this?
Gridded data and the wider ecosystem
Panel’s saving grace is the wider ecosystem that it is part of. HoloViews/GeoViews (which I consider part of Panel’s ecosystem) offers the best option for dealing with gridded and geospatial data. Added to this is intake’s growing popularity, with its plotting features based on HoloViews. It feels like there is a niche here that HoloViews/GeoViews has carved out without any good competition from either Plotly or any ipywidgets-backed library. This means that whilst I would like to explore more with Dash or ipywidget dashboards (served with Voila), I’m likely to spend most of my time in Panel land.
Dash for deploy, Panel for Geo and Voila for Jupyter
Given what I’ve learned, I think my approach to dashboarding would be to take Jupyter Notebooks as a starting point: test and explore through Notebooks, served with Voila for dashboarding. If I wanted to take things to a larger audience or was concerned about stability and security, I’d look to re-write for Dash. If my case was heavily in the geospatial field, I’d likely find myself using Panel.
I don’t think the dashboarding question is settled yet so I’m sure that there will be lots more interesting stuff to come. Watch this space.