The Alternative to Google Cloud’s Vertex AI Managed Notebooks

Byron Allen
Contino Engineering
5 min read · Apr 12, 2022


MLOps tooling comes in a variety of flavours, but generally speaking, I see two camps: lightweight, tactical options and end-to-end, platform options. Think MLflow vs Kubeflow, respectively. I wrote a cheesy analogy about those two tools a while back. In a way, that story is a ‘David vs Goliath’ archetype. And like any good archetype, we see it again and again.

This post is about another such story in the MLOps pantheon of tools.

A David and Goliath Story with a Small Twist

David and Goliath. A tale as old as the cheese samich. A tale told by the underdog with a spring in their step as they anticipate a win despite the odds. Unfortunately, David doesn’t fare so well…

It won’t be the ending you’re thinking of

Pointing a Spotlight on ‘David’

Google Cloud marketing will tell you all about the end-to-end pipeline capabilities of Vertex AI. They’re not wrong. Thanks to features like Pipelines and Feature Store, that’s quite true, and there are a number of other very interesting features in varying states of General Availability (GA) or Preview. Needless to say, I’m a big fan of Vertex AI and recommend it. Sometimes, though, like other ‘end-to-end’ products, talk of a ‘Goliath’ overshadows a ‘David’. In this case, the underdog I’m referring to is Managed Notebooks within Vertex AI Workbench.

Managed Notebooks in Google Cloud Vertex AI Workbench

This feature would likely not have caught my eye if it weren’t for the fact that (A) I had a client that needed a low-skill way to schedule and deploy notebooks and (B) that option arrived in Managed Notebooks recently, albeit in Preview. Currently, users can schedule any notebook straight from the JupyterLab GUI with Managed Notebooks, which is frankly pretty awesome!

Executed notebooks on a 4-hour schedule

I am now using this in a personal project, which is where the images below come from. This feature allowed me to quickly set up a recurring job that produces an output: a Jupyter notebook with key indicators and graphs stored in Google Cloud Storage. I check the output daily, and it has proven to be a critical component of the project. It also allowed me to quickly implement a man-in-the-middle step (i.e. the core intervention) driven by the analytics performed in the notebook.

Review the output notebook by clicking on ‘VIEW RESULT’

When David Loses

However, when I took this into the client’s use case, we bumped up against an issue. Come hell or high water, David is just not going to win if he can’t write a table to BigQuery!

Yes! That deserves an exclamation mark!

In ‘theory’, this shouldn’t be an issue. It’s not surprising that someone using a Vertex AI notebook would want to write out to BigQuery. I can write to BigQuery when the job is not scheduled, and users can even provide a custom Service Account for the recurring job; yet the scheduled execution still fails. This is where I take a deep breath…

Again, this product is in Preview, so give it some time and a few support tickets before Google Cloud works out the kinks.

So, that option was a bust. What to do now? Turn to Goliath (aka Cloud Composer)? Truly, that would be overkill, so instead, like any decent engineer, I tried to reverse engineer Managed Notebooks. Thankfully, it’s actually not that hard to create a pattern that automates notebook deployments, and it might even be a better one.

How do you execute a notebook?

I know how to set up a simple scheduled job in Google Cloud using a Cloud Scheduler job that submits a message to a Pub/Sub topic — the same topic that is used to trigger a Cloud Build job. The scheduling and orchestration part was a known quantity, so that left one remaining question: How do you execute a notebook like a file?
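For reference, the scheduling half amounts to a cron-style Cloud Scheduler job that publishes to a topic. Below is a rough sketch using the Cloud Scheduler Python client; the project, region, topic, and message payload are hypothetical placeholders, and the Cloud Build trigger that subscribes to the topic is configured separately.

# A sketch of the Cloud Scheduler piece: a cron-style job that publishes to Pub/Sub every 4 hours.
# Project, region, topic, and message payload are hypothetical placeholders.
from google.cloud import scheduler_v1

project, location = "my-project", "us-central1"
parent = f"projects/{project}/locations/{location}"

job = scheduler_v1.Job(
    name=f"{parent}/jobs/run-notebook-every-4-hours",
    schedule="0 */4 * * *",  # standard cron syntax
    time_zone="Etc/UTC",
    pubsub_target=scheduler_v1.PubsubTarget(
        topic_name=f"projects/{project}/topics/notebook-runs",
        data=b'{"notebook": "gs://my-bucket/notebooks/report.ipynb"}',
    ),
)

client = scheduler_v1.CloudSchedulerClient()
client.create_job(request={"parent": parent, "job": job})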

Most data scientists are familiar with executing one cell in a notebook at a time. But did you know that not only can you execute a notebook like a file but you can also pass in parameters? This wasn’t entirely surprising to me, as I’d done it in Databricks, but I had never done it with Jupyter notebooks.

In short, there are two main options: nbclient and papermill (the latter is an abstraction on top of the former). I recommend papermill from an ease-of-use perspective.

# Sample code to execute `notebook_in`
import papermill as pm

pm_execute_out = pm.execute_notebook(
    notebook_in,                               # path of the notebook to execute
    notebook_out,                              # where the executed notebook is written
    parameters=dict(bucket_name=bucket_name),  # injected into the notebook's "parameters" cell
    kernel_name="this-kernel",                 # kernel available in the execution environment
)

I went about building a small Python script to execute any notebook with papermill, which has been incredibly handy and surprisingly easy. One downside to this approach is the limited set of machine types available for Cloud Build. The largest default far surpassed my needs, but some individuals may find this restrictive. In that instance, users will have to turn to Private Pools for more options but will find that GPUs are absent. Those who need GPUs or TPUs for training or inference will likely have to turn to Vertex AI Training and Vertex AI Batch Prediction / Endpoints (i.e. call one of those services from the notebook or from a step in Cloud Build).
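For the curious, the wrapper doesn’t need to be much more than the sketch below. The flag names are illustrative rather than the exact script I wrote, and papermill can also read and write gs:// paths directly when its GCS extras (gcsfs) are installed.

# run_notebook.py: a minimal sketch of a papermill wrapper (flag names are illustrative).
import argparse

import papermill as pm


def main() -> None:
    parser = argparse.ArgumentParser(description="Execute a Jupyter notebook with papermill")
    parser.add_argument("--notebook-in", required=True, help="Notebook to execute (local or gs:// path)")
    parser.add_argument("--notebook-out", required=True, help="Where to write the executed notebook")
    parser.add_argument("--bucket-name", required=True, help="GCS bucket passed into the notebook")
    parser.add_argument("--kernel-name", default="python3", help="Kernel to execute the notebook with")
    args = parser.parse_args()

    # papermill injects `parameters` into the cell tagged "parameters", then runs every cell in order.
    pm.execute_notebook(
        args.notebook_in,
        args.notebook_out,
        parameters=dict(bucket_name=args.bucket_name),
        kernel_name=args.kernel_name,
    )


if __name__ == "__main__":
    main()

A Cloud Build step then only needs to install papermill and call this script with the appropriate arguments.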

Moral of the story? Create a ‘Davíd’ when the promise of David isn’t quite there.

Potential Improvements to Managed Notebooks

Not only did I pivot to this secondary option in my client’s project, but I did so in my personal project as well, even though I did not need to write to BigQuery. Why? For one thing, it’s cheaper (by more than half in my case). But mostly, it allows me to integrate execution of the notebook into a more robust CI/CD system rather than a single schedule. I could even string together multiple notebooks with this pattern.
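As a rough illustration of that chaining, the loop below runs a few notebooks in sequence with papermill; the notebook names and bucket are hypothetical, and in practice this would live in the wrapper script or a Cloud Build step.

# A sketch of chaining notebooks with papermill; notebook names and bucket are hypothetical.
import papermill as pm

pipeline = ["prepare_data.ipynb", "train_model.ipynb", "build_report.ipynb"]

for notebook in pipeline:
    # Each run writes an executed copy alongside the source, leaving an inspectable artefact
    # much like the output notebooks that Managed Notebooks produces.
    pm.execute_notebook(
        notebook,
        notebook.replace(".ipynb", "-output.ipynb"),
        parameters=dict(bucket_name="my-bucket"),
    )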

I hope to see some improvements to Vertex AI Managed Notebooks. I like what Google Cloud is trying to do with it, making notebooks and scheduling more approachable. Hopefully, they add a trigger for Vertex AI Managed Notebooks similar to the triggers provided for Cloud Build jobs. This, along with the ability to write out to BigQuery from scheduled runs, would be a huge improvement to the tool in my eyes.

In the meantime, I got Davíd.

Like what I write? Follow me on LinkedIn or Medium.
