Using Jupyter Notebooks as your cloud IDE for DevOps

ML-Guy
Sep 5, 2018 · 8 min read

In this post, we will explore using a Jupyter notebook as a development environment for DevOps on AWS. Jupyter notebooks are the preferred tool of data scientists, and, somewhat surprisingly, I found one just as useful for developing an AWS Lambda function that automatically controls the state of Amazon SageMaker notebook instances. It had all the benefits I enjoy when developing machine learning models: interactivity, documentation, sharing, and so on. The most interesting benefit of this exercise is using the same tool across the machine learning process, for both the data scientist and the DevOps engineers working with them. Standardizing on Python and Jupyter notebooks removes many of the complexities of applying agile methodologies to the project, and the common language and tools allow easy transitions across the team and the project flow.

Saving 70% of Notebook Costs

We will use the example of a mechanism that saves 70% of the cost of notebook instances in SageMaker by automatically stopping and starting them at the end and beginning of every working day. Since the Jupyter notebooks are used interactively during working hours, stopping them during nights and weekends, and with them the billing, can save around 70% of the cost. We could ask our data scientists or DevOps teams to shut them down manually when they leave the office or don't need them, but they won't.

Automatic Start/Stop Solution Architecture

We will use the following automation flow:

  • A CloudWatch Scheduled Event called OnDuty that fires every working morning
  • A Lambda function that looks for notebook instances with the tag InDuty=Yes and starts them
  • A CloudWatch Scheduled Event called OffDuty that fires every working evening
  • The same Lambda function stops the InDuty instances

Why Use a Jupyter Notebook?

We will be using a Jupyter notebook to develop the Lambda function above and the Terraform scripts that create the environment. At AllCloud, we have a lot of DevOps code and a lot of infrastructure to manage, and we found it hard to collaborate between teams, team members, and our customers. Since Python is also the most common language in our DevOps development, using a Jupyter notebook as the development environment allows us to share more easily, both internally and externally.

You can check out the full notebook in this GitHub repository:

You are welcome to fork the notebook or use it as is to create the functionality above and save Jupyter notebook costs in your environment. Here we will explore how a Jupyter notebook supports developing such functionality.

How to interactively develop an AWS Lambda function?

AWS released Cloud9 last year as a cloud IDE and made it easy to use with Lambda. The Cloud9 IDE opens when you author or edit a Lambda function in the Lambda management console, and you can also connect to your Lambda functions remotely when you open the Cloud9 service. IDE integrations are also available for other tools, such as Eclipse and PyCharm. Nevertheless, if you decided to develop your Lambda function in Python (why wouldn't you?), the interactive nature of Jupyter makes it easy to add functionality gradually and test it quickly.

Let's start developing the Lambda code. First, we want to see what information we get when we ask the SageMaker API for the list of notebook instances:

Jupyter Notebook Starting Cells
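If you want to follow along outside the shared notebook, that first cell is essentially a call to boto3's SageMaker client. Below is a trimmed sketch of the response shape; the field names come from the SageMaker API, but the values are invented for illustration:

```python
# The first code cell boils down to:
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.list_notebook_instances()
# A trimmed example of the response shape (values are made up):
sample_response = {
    "NotebookInstances": [
        {
            "NotebookInstanceName": "ds-notebook",
            "NotebookInstanceArn": "arn:aws:sagemaker:us-east-1:123456789012:notebook-instance/ds-notebook",
            "NotebookInstanceStatus": "InService",
            "InstanceType": "ml.t2.medium",
        }
    ]
}

# Pull out the piece the automation cares about: the instance names
names = [nb["NotebookInstanceName"] for nb in sample_response["NotebookInstances"]]
print(names)
```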

If you have never used a Jupyter notebook: what you see in gray are the code cells, surrounded by markdown cells for documentation and output cells (for example, Out[10] in the image above). The numbers in brackets show the order of cell execution. If you are wondering why the first cell shows 1 and the second shows 10, you can guess that I executed the second cell nine times before I was happy with the output; I had issues with permissions and other API trial and error. Nevertheless, now I can share what works with my team and with you.

I continue to explore the API and see what I need in order to query the tags of the instances (hint: the instance name) and so on.

Amazon SageMaker API Exploration
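The tag lookup reduces to a small filtering step. Here is a sketch of that logic over response-shaped data; the helper name is mine, and I pass the tags in keyed by instance ARN, since that is what the SageMaker list_tags call identifies resources by:

```python
def in_duty_instances(instances, tags_by_arn):
    """Pick the notebook instances tagged InDuty=Yes.

    `instances` is shaped like the NotebookInstances list returned by
    list_notebook_instances(); `tags_by_arn` maps an instance ARN to the
    Tags list that list_tags() returns for it.
    """
    return [
        inst["NotebookInstanceName"]
        for inst in instances
        if any(
            tag["Key"] == "InDuty" and tag["Value"] == "Yes"
            for tag in tags_by_arn.get(inst["NotebookInstanceArn"], [])
        )
    ]
```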

Once I’m happy with the flow of my Lambda logic, I can put it all together.

The final Lambda Function
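In outline, the assembled function might look like the sketch below. This is my reading of the flow, not a copy of the repository code: it ignores pagination of list_notebook_instances, and the client parameter is there only to make the logic testable without AWS credentials.

```python
def lambda_handler(event, context=None, client=None):
    """Start or stop every SageMaker notebook instance tagged InDuty=Yes.

    The OnDuty rule invokes it with {"event": "On"}, the OffDuty rule
    with {"event": "Off"}.
    """
    if client is None:
        import boto3  # deferred import keeps the logic testable without AWS
        client = boto3.client("sagemaker")

    # Map the event payload to the SageMaker API call to make
    action = {
        "On": client.start_notebook_instance,
        "Off": client.stop_notebook_instance,
    }[event["event"]]

    # Apply the action to every instance carrying the InDuty=Yes tag
    for inst in client.list_notebook_instances()["NotebookInstances"]:
        tags = client.list_tags(ResourceArn=inst["NotebookInstanceArn"])["Tags"]
        if any(t["Key"] == "InDuty" and t["Value"] == "Yes" for t in tags):
            action(NotebookInstanceName=inst["NotebookInstanceName"])
```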

How to create the needed AWS environment?

Since we are in a DevOps context, I don't want to click my way through the AWS management console manually; I want to create the AWS infrastructure as code, with automation. There are a few options for that, from AWS CloudFormation to Netflix's Spinnaker. I'll skip the religious war and jump straight to Terraform. It is the most straightforward tool for me to use, and I could (install it and) use it in the Jupyter notebook, thanks to the built-in shell support.

Terraform deploys AWS Lambda functions from zip files in S3, so I'll zip the Lambda code and put it in S3 using the AWS CLI:

Zipping Lambda and Putting in S3
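In Python terms, the packaging step is just building a zip with the module at the archive root. A self-contained sketch follows; it writes a stub lambda_function.py so it runs on its own, whereas in the notebook the real file already exists and the upload then happens with `!aws s3 cp lambda_function.zip s3://<your-bucket>/`:

```python
import pathlib
import zipfile

# Stub handler so the example is self-contained; the notebook already
# has the real lambda_function.py on disk.
pathlib.Path("lambda_function.py").write_text(
    "def lambda_handler(event, context):\n    pass\n"
)

# Lambda expects a zip with the module at the archive root
with zipfile.ZipFile("lambda_function.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("lambda_function.py")

packaged = zipfile.ZipFile("lambda_function.zip").namelist()
print(packaged)
```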

Now we can start creating the configuration files that describe the infrastructure:

Please note the small trick of creating the files on the file system directly from the Jupyter notebook, using the magic annotation on the first line (%%writefile). This way I don't need to open a text editor for the configuration file, and I can execute the cell again to overwrite the file on the file system.

This is an excellent time to point to some valid criticism of Jupyter notebooks: the ability to jump around the notebook and execute the cells in any order can lead to confusion about the state of the system, and the current contents of a file on the file system are not apparent from the notebook cells or outputs. You should be careful when applying the %%writefile magic, making sure that you are overwriting or appending as intended. You should also use Git or another version control system to make sure that you are committing and integrating what you want, and not some intermediate state of your development. This is why I don't keep my Git commands in the notebook and run them from an external shell instead. It adds some complexity, but I enjoy the flexibility of each tool.
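A minimal example of the trick looks like this (the provider block is illustrative; pick your own region):

```
%%writefile main.tf
provider "aws" {
  region = "eu-west-1"
}
```

Re-running the cell silently overwrites main.tf, which is exactly the behavior to keep in mind.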

The next step is to give the Lambda function permission to start and stop SageMaker notebook instances. We do that in Terraform format, using the %%writefile magic with the append flag (-a).

There is some complexity in this cell, such as embedding a JSON document (the IAM policy definition) within a TF file (using a heredoc: <<EOF { JSON } EOF), and the added EC2 permissions the function needs in order to create the network interfaces for the notebook instances.
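For reference, the shape of such a cell in Terraform terms might be roughly as follows; the resource names and the exact action list are illustrative, not copied from the repository:

```hcl
resource "aws_iam_role_policy" "notebook_start_stop" {
  name = "notebook-start-stop"
  role = "${aws_iam_role.lambda_role.id}"

  # JSON policy document embedded in the .tf file via a heredoc
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sagemaker:ListNotebookInstances",
        "sagemaker:ListTags",
        "sagemaker:StartNotebookInstance",
        "sagemaker:StopNotebookInstance"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DeleteNetworkInterface"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}
```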

Using Terraform within a Jupyter Notebook

Terraform is a command-line tool, and thanks to Jupyter's support for shell commands, we can see the commands and their output in the notebook and share them.

Terraform Installation and Setup in a Jupyter Notebook

I'm using SageMaker as my notebook environment, and therefore I need the Linux version of Terraform. If you run Jupyter on your local device (Windows or Mac), you can find the other installation versions here.

In the notebook I've shared, I also ran validate and plan to see what Terraform was going to do based on the configuration files above, and once I was mostly happy with the results, I applied it using:

!./terraform apply -input=false -auto-approve

Next, we can verify that the Lambda function is in place using the AWS CLI:

!aws lambda list-functions

and then also test the function from the notebook

!aws lambda invoke --function-name StopStartSageMakerNotebookInstances \
--payload '{"event":"On"}' \
--invocation-type RequestResponse \
--log-type Tail /dev/null | jq -r '.LogResult' | base64 --decode

Testing my Lambda function from a Jupyter notebook

As you can see, my Lambda function failed, but the failure was due to its inability to start my notebook instance, which was already running. Once the Lambda function is triggered in the morning on a stopped instance, it will start it correctly. All we need now are two triggers: one in the evening to turn the instances off ({"event":"Off"}) and one in the morning to turn them on ({"event":"On"}).

In this cell we create the CloudWatch events, set the Lambda function that we created before as the target of the events, and grant the events permission to invoke the Lambda function.

The events I've created are meant for our Israeli office, which works Sunday to Thursday in the GMT+3 time zone. Therefore, the schedule is "cron(0 16 ? * SUN-THU *)" for stopping the instances at 7 PM every working evening and "cron(0 5 ? * SUN-THU *)" for starting the instances at 8 AM every working morning. You should change these values to match your office's hours (Monday to Friday, for example).
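For the OffDuty half, such a cell might look roughly like this in Terraform; the resource names are illustrative, and the OnDuty rule is symmetric, with the morning cron expression and an "On" payload:

```hcl
resource "aws_cloudwatch_event_rule" "off_duty" {
  name                = "OffDuty"
  schedule_expression = "cron(0 16 ? * SUN-THU *)"
}

resource "aws_cloudwatch_event_target" "off_duty_lambda" {
  rule  = "${aws_cloudwatch_event_rule.off_duty.name}"
  arn   = "${aws_lambda_function.stop_start_notebooks.arn}"
  input = "{\"event\": \"Off\"}"
}

resource "aws_lambda_permission" "allow_off_duty" {
  statement_id  = "AllowOffDutyEvent"
  action        = "lambda:InvokeFunction"
  function_name = "${aws_lambda_function.stop_start_notebooks.function_name}"
  principal     = "events.amazonaws.com"
  source_arn    = "${aws_cloudwatch_event_rule.off_duty.arn}"
}
```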

I also tried to create some diagrams of the environment using the Terraform graph command and Lucidchart, but none of the output was something I could use to improve the visibility of the environment, and I had to draw my architecture diagram manually (see the top of this post). Terraform is better at explaining to the cloud what to build than at explaining to people what it built.

Conclusion

In this post, I shared the development process of a typical DevOps task using Python and shell code. I used a popular tool, Terraform, which covers a lot of AWS (and other cloud) resources and turns the "Infrastructure as Code" promise into reality. I developed and shared it using a Jupyter notebook, which I believe can improve the lives of DevOps engineers, especially those who live in Python environments and interact with data scientists and data engineers on data and machine learning projects.

Finally, I hope that you will also use the system I developed here (a Lambda function and two CloudWatch events) to reduce your SageMaker notebook costs. You can create the Lambda code and event configurations from the AWS management console, or better yet, use the Jupyter notebook and Terraform to learn their capabilities.


Guy Ernest is the co-founder and chief engineering officer of Revealio, an AI company serving large enterprise AI transformation in the cloud.
