How to create a serverless JupyterLab IDE experience on IBM Cloud CodeEngine using KNative

Romeo Kienzler
IBM Data Science in Practice
4 min read · Jul 28, 2022

Having your IDE on your laptop has a couple of advantages. You can use your favorite desktop tools. One example is meld, which only runs on Linux; it's a visual diff tool I use when I've totally screwed up my local git repository. And you can manage multiple environments with venv or conda.

But local development also has disadvantages. Mainly, setting up an environment for a particular task (e.g. testing a pull request before merging) can become tedious. In data science you often have long-running tasks that would benefit from additional resources like CPU cores, RAM, GPUs or storage.

Things like Binder or Colab have really taken off, but neither can go beyond what's in the free tier. In addition, the latter isn't really portable, so it's definitely not hybrid cloud ready.

Therefore I've created container images for JupyterLab and Elyra which can simply be deployed to a serverless cloud environment like IBM Cloud Code Engine, Red Hat OpenShift Serverless or similar, which is basically KNative as a Service.

As CodeEngine has a generous, never expiring free tier let’s just use it since we have it :)

CodeEngine, among other services, encapsulates KNative, which allows continuously running Kubernetes Pods or Jobs to be published. A job is a simple container image running from start to end. A Pod on KNative must provide an HTTP endpoint.
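Under the hood, such an application corresponds roughly to a KNative Service. A minimal manifest for this image (a sketch of my own, not something CodeEngine exposes; the service name is a placeholder) could look like:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: jupyterlab               # placeholder name
spec:
  template:
    spec:
      containers:
        - image: docker.io/romeokienzler/elyra-ce:0.8
          ports:
            - containerPort: 8888   # the HTTP endpoint KNative routes to
```

KNative routes traffic to this port and can scale the pod down to zero when no requests arrive.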

In addition, on CodeEngine and other Function-as-a-Service offerings only ephemeral storage is available. In other words, when the container restarts (which can happen at any time in a serverless environment), your local file system is gone.

To prevent this, I’m using rclone to

a) pull all data from S3 on container start

b) continuously sync data to S3 while the container is running

That way I can create a fully serverless, scale to zero JupyterLab experience on IBM Cloud, RedHat OpenShift Serverless and anything supporting KNative as a Service.
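The two rclone steps above can be sketched as a small entrypoint script. This is my reconstruction, not the actual script inside the image; it assumes an rclone remote named "cos" is configured and that BUCKET and PROJECT come from the environment:

```shell
#!/bin/sh
# Sketch of the start-up sync logic (assumption: rclone remote "cos"
# exists; BUCKET/PROJECT are set as environment variables).
REMOTE="cos:${BUCKET:-remotedev}/${PROJECT:-default}"
WORKDIR="${WORKDIR:-/home/jovyan/work}"

sync_cmd() {
  # degrade to a dry-run echo when rclone is unavailable or unconfigured
  if command -v rclone >/dev/null 2>&1; then
    rclone sync "$1" "$2" || echo "sync failed: $1 -> $2" >&2
  else
    echo "[dry-run] rclone sync $1 $2"
  fi
}

# a) pull all data from S3 on container start
sync_cmd "$REMOTE" "$WORKDIR"

# b) continuously sync data back to S3 while the container runs; the real
# entrypoint would background "while true; do ...; sleep 30; done" and
# then start JupyterLab. Bounded here so the sketch terminates.
for _ in 1 2 3; do
  sync_cmd "$WORKDIR" "$REMOTE"
done
```

The pull-then-push order matters: syncing to S3 before the initial pull has finished would overwrite the bucket with an empty workspace.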

Let’s get started:

First we need to create a bucket in S3-compatible Cloud Object Storage (COS) for our persistent data.

  • Create a new COS service (feel free to use the always-free tier with 25GB of storage)
  • Create a regional bucket in the “Smart Tier” in a region close to you. I’ve named mine “remotedev”
  • Find out the endpoint for your regional bucket location
  • Create a credential — make sure “Include HMAC credentials” is enabled
  • Note down access_key_id and secret_access_key

Now you should have the bucket name, endpoint URL, access_key_id and secret_access_key of your bucket; that's all you need.
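You can sanity-check these credentials locally before deploying. A hypothetical rclone remote definition for an IBM COS bucket (the remote name "cos" and the eu-de endpoint are my examples, substitute your own values) could look like this in ~/.config/rclone/rclone.conf:

```
[cos]
type = s3
provider = IBMCOS
access_key_id = <your access_key_id>
secret_access_key = <your secret_access_key>
endpoint = s3.eu-de.cloud-object-storage.appdomain.cloud
```

Afterwards, rclone ls cos:remotedev should list the (still empty) bucket without errors.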

Now we can start our JupyterLab in KNative on CodeEngine.

  • Go to CodeEngine and create a project
  • In the project click on Applications->Create
  • Put romeokienzler/elyra-ce:0.8 as image and 8888 as listening port
  • In “Runtime settings” change “Max number of instances” to “1”
  • Please add the environment variables shown below, plus BUCKET and PROJECT. (Hint: with the PROJECT variable you can have multiple scale-to-zero cloud IDEs available for different projects)
Only six variables need to be set
  • Click on “Create”
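The console clicks above can also be scripted. A hypothetical ibmcloud CLI session (requires the code-engine plugin and an active login; all names and values below are placeholders of mine) might look like:

```shell
#!/bin/sh
# Dry-run guard for illustration: if the ibmcloud CLI is not installed,
# print the commands instead of executing them.
if ! command -v ibmcloud >/dev/null 2>&1; then
  ibmcloud() { echo "ibmcloud $*"; }
fi

# create a Code Engine project (placeholder name)
ibmcloud ce project create --name remote-dev || true

# deploy the image as a scale-to-zero application
ibmcloud ce application create --name elyra \
  --image romeokienzler/elyra-ce:0.8 \
  --port 8888 \
  --max-scale 1 \
  --env BUCKET=remotedev \
  --env PROJECT=myproject \
  --env JL_PASSWORD=changeme || true
# ...plus the COS credential variables from the step above
```

Scripting the deployment makes it easy to stamp out one IDE per project by varying the PROJECT value.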

After some time you should see your application entering the status "Ready". Using the "Test application" button you can obtain and open the application URL. This is the URL of your serverless cloud IDE!

The cool thing: by default, CodeEngine scales to zero, so you won't consume any resources besides the data on S3 COS, where 25GB are free anyway. A short while after you've hit the URL, a login screen appears. Please enter the password you've set as JL_PASSWORD above.

your personal login screen

And here you are, with your personal instance of your serverless JupyterLab/Elyra cloud IDE

JupyterLab/Elyra running serverless in CodeEngine in IBM Cloud
