Amazon SageMaker ml.p2.xlarge Udacity Nanodegree Request Limit Increase and Troubleshooting ResourceLimitExceeded Error

In your Machine Learning Engineering Nanodegree (version 4.0), you will be asked to request a limit increase for the ml.p2.xlarge instance. Unfortunately, the screenshot in the course does not show the best steps to achieve this, and you may get a ResourceLimitExceeded error. Use this one-page guide to resolve the issue.

Summary:

  • How to request limit increase for ml.p2.xlarge
  • What is EC2 P2, and what is Amazon SageMaker ml.p2.xlarge?
  • How to fix the ResourceLimitExceeded Error

How to request limit increase for ml.p2.xlarge

First of all, this is a manual process for a reason. You should only request this increase if you know exactly what you are doing or are following a tutorial closely. The Udacity nanodegree constantly reminds students to shut down resources to avoid wasting resources and, more importantly, money!

It is important not to confuse ml.p2.xlarge with p2.xlarge. The ml stands for machine learning, for a good reason.

This resource limit request is for Amazon SageMaker, AWS’ machine learning engine, for training purposes, not for the EC2 p2.xlarge instance. There’s a link to read more about the EC2 p2 instance at the end of this section.

Steps to Request Limit Increase for ml.p2.xlarge

  1. Visit the AWS console: https://console.aws.amazon.com/
  2. Click Support in the top right corner.
  3. Click Create a case (orange button).
  4. Select the Service Limit radio button.
  5. On the case page, search for and select SageMaker as the Limit Type.
  6. Select the same region as the one displayed in the top right corner of your AWS console.
  7. After selecting the region, select SageMaker Training as the Resource Type.
  8. Select ml.p2.xlarge as the Limit.
  9. Enter 1 as the New Limit Value.

It may take 48 hours for this manual support ticket to be turned around.
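
The form values from the steps above can be written down as a small checklist before you open the case. The following is a minimal Python sketch, not an AWS API: the function name `build_limit_request` and the dictionary keys are hypothetical and simply mirror the form labels listed in the steps, and `us-east-1` is only an example region. Its main purpose is to catch the p2.xlarge versus ml.p2.xlarge mix-up early.

```python
def build_limit_request(region, instance_type,
                        resource_type="SageMaker Training", new_limit=1):
    """Return the values to type into the support form (hypothetical helper)."""
    # SageMaker instance types carry the "ml." prefix; plain "p2.xlarge" is EC2.
    if not instance_type.startswith("ml."):
        raise ValueError(
            f"{instance_type!r} looks like an EC2 instance type; "
            f"did you mean 'ml.{instance_type}'?"
        )
    if new_limit < 1:
        raise ValueError("new_limit must be at least 1")
    # Keys mirror the form labels from the steps; they are not API fields.
    return {
        "Limit Type": "SageMaker",
        "Region": region,
        "Resource Type": resource_type,
        "Limit": instance_type,
        "New Limit Value": new_limit,
    }


# Example region only; use the region shown in your own console.
for field, value in build_limit_request("us-east-1", "ml.p2.xlarge").items():
    print(f"{field}: {value}")
```

Passing "p2.xlarge" instead of "ml.p2.xlarge" raises a ValueError, which is exactly the mistake this guide is trying to help you avoid.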

What is EC2 P2, and what is Amazon SageMaker ml.p2.xlarge?

It is important to not confuse EC2 p2 and ml.p2.xlarge!

Read more about EC2 instance type p2 here

Amazon EC2 P2 Instances are powerful, scalable instances that provide GPU-based parallel compute capabilities. For customers with graphics requirements, see G2 instances for more information.

The p is for GPU, not parallel.

Currently there seems to be no way to view your current ml.p2.xlarge limit on a dashboard. A January 2019 GitHub issue mentions this as an enhancement request. It might have changed now that it is 10 months later.
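
If your account exposes SageMaker limits through the AWS Service Quotas API, you may be able to read the current value programmatically instead. The sketch below rests on that assumption, which the course material does not cover: it uses the boto3 `service-quotas` client with the service code `sagemaker`, and the helper names `find_quotas` and `list_sagemaker_quotas` are made up for illustration. Treat it as a starting point, not a confirmed method.

```python
def find_quotas(quotas, instance_type):
    """Keep the quota entries whose name starts with the instance type."""
    return {
        q["QuotaName"]: q["Value"]
        for q in quotas
        if q["QuotaName"].startswith(instance_type + " ")
    }


def list_sagemaker_quotas(region):
    """Fetch SageMaker quotas for a region (needs boto3 and AWS credentials)."""
    import boto3  # imported here so find_quotas works without boto3 installed

    client = boto3.client("service-quotas", region_name=region)
    paginator = client.get_paginator("list_service_quotas")
    quotas = []
    # Assumption: SageMaker quotas are listed under ServiceCode "sagemaker".
    for page in paginator.paginate(ServiceCode="sagemaker"):
        quotas.extend(page["Quotas"])
    return quotas
```

With credentials configured, calling `find_quotas(list_sagemaker_quotas("us-east-1"), "ml.p2.xlarge")` should return a dictionary of limit names and values for that instance type, where `us-east-1` is only an example region.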

What is ml.p2.xlarge?

Of course, it is also an EC2 p2 instance, but it is important to note that the namespace is definitely under machine learning (SageMaker). Read more about machine learning instance types here

Amazon SageMaker provides a selection of instance types optimized to fit different machine learning (ML) use cases. Instance types comprise varying combinations of CPU, GPU, memory, and networking capacity and give you the flexibility to choose the appropriate mix of resources for building, training, and deploying your ML models. Each instance type includes one or more instance sizes, allowing you to scale your resources to the requirements of your target workload.

Instance type: ml.p2.xlarge
vCPU: 4
GPU: 1xK80
Mem (GiB): 61
GPU Mem (GiB): 12
Network Performance: High

How to fix the ResourceLimitExceeded Error

Why are you still getting a ResourceLimitExceeded Error?

ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateEndpoint operation: The account-level service limit ‘ml.p2.xlarge for endpoint usage’ is 0 Instances, with current utilization of 0 Instances and a request delta of 1 Instances. Please contact AWS support to request an increase for this limit.

For Udacity students, it is very possible that, because of the out-of-date course content, you requested a p2.xlarge increase the first time instead of an ml.p2.xlarge increase.

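Please revisit the first section and request a limit increase, typing in the words exactly as stated above rather than following the current content of the Machine Learning Nanodegree. Also compare the limit named in the error message with the one you requested: the message above refers to ‘ml.p2.xlarge for endpoint usage’, so a request filed only for training may not cover it.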
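
The error message quoted earlier already contains the limit name and the numbers involved, so it can help to pull them out before filing a new request. Here is a minimal sketch using only Python's standard library. The function name `parse_limit_error` is hypothetical, and the pattern is written against the one message shown in this post, so other wordings may not match.

```python
import re

# Accepts straight or curly quotes around the limit name.
LIMIT_PATTERN = re.compile(
    r"service limit\s+[\u2018'\"](?P<name>.+?)[\u2019'\"]\s+is\s+(?P<limit>\d+)\s+Instances?"
    r".*?current utilization of\s+(?P<used>\d+)\s+Instances?"
    r".*?request delta of\s+(?P<delta>\d+)\s+Instances?",
    re.DOTALL,
)


def parse_limit_error(message):
    """Extract the limit name and counts from a ResourceLimitExceeded message."""
    match = LIMIT_PATTERN.search(message)
    if match is None:
        return None  # message does not follow the expected wording
    info = {"name": match.group("name")}
    for key in ("limit", "used", "delta"):
        info[key] = int(match.group(key))
    # The call fails when utilization plus the requested delta exceeds the limit.
    info["exceeds"] = info["used"] + info["delta"] > info["limit"]
    return info
```

The `name` field is the text to look for when you choose the limit in the support form.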

Dan Sun, thanks for your question. If you followed the exact steps, it may be because the instance is not available in your region. What is your region? You can also share a screenshot and send it to hi@uniqtech.co (Uniqtech, I write for them). We can have their data science bootcamp publication team take a look.


Introductory Data Science, Machine Learning and Artificial Intelligence for Bootcamp and Nanodegree Graduates. By a bootcamp grad for bootcamp grads.

Y Sun

Silicon Valley tech, startup, machine learning, data, food! & travel! Worked at 2 YC startups, quoted on USAToday TechCrunch VentureBeat