AWS Parking Lot Classification

Byron Kim
Published in Slalom Data & AI
Jun 19, 2020 · 9 min read

Where can I find a space to park?

Retailers that operate large department stores, like Macy’s, can gauge the number of customers coming into their stores by watching the occupancy of their parking lots. By counting how many cars are parked throughout the business day, brick-and-mortar retailers can uncover consumer shopping trends that may not be obvious from sales data alone.

Knowing this information can improve retail operations as well. With a clear picture of customer flow, department stores can streamline employee schedules. Operations can detect a spike in consumer traffic in real time and adjust accordingly by opening or closing additional cash registers.

Knowing how many and which parking spaces are available does not have to be an expensive undertaking. A quick Google search turns up many companies that leverage IoT technology, installing sensors on each parking space to keep track of availability. This seems like a very cost-intensive approach, though, and it will require a large amount of ongoing maintenance.

Instead of having someone manually count the number of cars during the day or installing IoT sensors on every parking space, we can leverage machine learning on AWS and Python scripting to automate this process with minimal cost.

How do we accomplish this?

This blog walks through how our team used machine learning on AWS to solve this problem. We are going to train an image classification model to detect whether a parking space is occupied or empty. Once the model is trained, we can serve it images of each parking space in a lot. Since we know how many total spaces are in the lot and how many of them are occupied, this gives us the occupancy rate of the parking lot at that time.

In a perfect world, we would have a live feed of a parking lot that we can base our solution on, but for this demo, we will use a picture of a parking lot to simulate this process.

The suite of AWS services used:

  1. S3: Location of the training and testing data. The final model artifacts were also uploaded to S3 automatically once training completed.
  2. SageMaker: The environment where the bulk of the Python development work was performed, including model training and hyperparameter tuning. The Jupyter notebook IDE made it simple to code.
  3. API Gateway: A fully managed service for creating and publishing APIs at any scale. This service exposed the SageMaker model through an API endpoint so that external applications could access it.
  4. Lambda: Orchestration service that runs serverless code whenever the API endpoint is called.
  5. CloudWatch: Contains the logs used for debugging during development.

Main steps:

  1. Data Gathering
  2. Data Engineering
  3. Image Classification Model Training
  4. Host model on exposed API
  5. Synchronize using Lambda & API Gateway
  6. Monitoring using CloudWatch

Data Gathering

To train an image classification model, the algorithm needs to understand which images represent an empty space and which images represent an occupied space. We need a set of labeled images that we can train on. Fortunately for us, a kind stranger on the internet has already labeled this set of images that we can use. Each file is an image of a single parking space. All of them are sorted so that the empty spaces are in an “Empty” directory and occupied spaces are in an “Occupied” directory. This will be important later when we start training models.

[Example images: an occupied space and an empty space]

If this pre-labeled set were not available, another option would be AWS SageMaker Ground Truth. Ground Truth is a service provided by AWS that outsources labeling to a third-party workforce. They will label a set of images based on your requirements; all you need to do is provide an S3 bucket location.

Data Engineering

Data engineering often takes up most of the time in the ML process and can be the most frustrating part of the process. Luckily, AWS provides documentation on how to format the data before the training can take place.

First, you will need a few Python packages that let us alter and refine our images: cv2 (OpenCV) and PIL (Pillow), plus glob, which ships with Python and is used to collect file paths.
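A minimal setup cell might look like the following; glob comes with Python, so only OpenCV and Pillow need to be installed.

```python
# Install the imaging libraries into the notebook environment, e.g.:
#   !pip install opencv-python pillow
import glob

import cv2                # OpenCV
import numpy as np
from PIL import Image     # Pillow
```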

When images are loaded as NumPy arrays in our Python environment, they are represented as 3-dimensional arrays of 8-bit unsigned integers.

The first dimension is the height and the second dimension is the width. Images are stored this way because they are made up of many rows of pixels strung together in order. The third dimension holds the RGB channels, a 3-byte value that represents the color of each pixel. You can also have images without a third dimension that are just height × width; these pixels have no color information, so they are called greyscale images. There are several other types of images out there, but we will only use these two for this exercise.

The default array order is channel-last, where the RGB channels occupy the third dimension. To be sure, you can confirm by checking the shape with this code:
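The original snippet isn’t reproduced here, but the check is a one-liner (the file name below is just a placeholder), reusing the imports from the setup cell above:

```python
# Load one labeled image and print its dimensions.
img = cv2.imread("Occupied/spot_001.jpg")   # placeholder file name
print(img.shape)   # a channel-last image prints as (height, width, 3)
# Note: OpenCV stores channels in BGR order, but the shape is the same.
```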

If you find that your images are channel-first, you can reorder your dimensions in this manner:
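One way to do the reordering, assuming a NumPy array with shape (3, height, width):

```python
# Move the channel axis from the front to the back: (3, H, W) -> (H, W, 3).
if img.ndim == 3 and img.shape[0] == 3:
    img = np.transpose(img, (1, 2, 0))
```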

Once the images are channel-last, we need to make sure the image sizes are uniform. SageMaker model training requires that the training images share the same dimensions. We can enforce this with OpenCV’s resize function.
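A sketch of that step, continuing from the setup cell, assuming the labeled folders described earlier and an arbitrary target size of 224 × 224 pixels:

```python
TARGET_SIZE = (224, 224)   # (width, height); any consistent size works

for path in glob.glob("Occupied/*.jpg") + glob.glob("Empty/*.jpg"):
    img = cv2.imread(path)
    img = cv2.resize(img, TARGET_SIZE)
    cv2.imwrite(path, img)   # overwrite in place with the uniform size
```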

Now we are ready to upload the images to S3. Organize the images so that they are separated into train and validation sets as well as by their respective labels.
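One way to push the files up with boto3, assuming a local train/validation folder layout and a hypothetical bucket name:

```python
import glob
import os

import boto3

s3 = boto3.client("s3")
bucket = "parking-lot-classification-demo"   # hypothetical bucket name

# Mirror the local folder layout as train/validation prefixes split by label.
for channel in ("train", "validation"):
    for label in ("Occupied", "Empty"):
        for path in glob.glob(f"{channel}/{label}/*.jpg"):
            key = f"{channel}/{label}/{os.path.basename(path)}"
            s3.upload_file(path, bucket, key)
```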

Once the images are uploaded into S3, we’re ready to train our model!

Image Classification Model Training

The image classification algorithm supported by AWS SageMaker is a supervised learning algorithm that can handle binary classification as well as multi-label classification. The algorithm is a convolutional neural network, a type of neural network that is highly accurate in image recognition.

Next, convert all the .jpg files in our S3 directory into RecordIO files. You can train with .jpg files, but RecordIO files are optimized for model training, so using them reduces time and saves costs. If you are training with .jpg files, you will just update the content_type to “image/jpg”.

The SageMaker session class creates the definitions for the input data. The location of the data needs to be specified, as well as its distribution, content type, and data type. The valid values are found here. Next, the SageMaker estimator class sets the parameters for model training. You can find the documentation here. Then select which EC2 instance type you want to train on. In this case, we used one ml.p3.2xlarge instance, which is optimized for high-performance machine learning. The model training time depends on the instance type and count. Because the training and validation datasets are comprised of images, AWS recommends instances with sufficient memory for training image classification models.
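A sketch of the input and estimator definitions using the SageMaker Python SDK (v2 naming; the bucket name, image shape, and hyperparameter values are placeholders and the exact parameters used in the original post may differ):

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = sagemaker.get_execution_role()
bucket = "parking-lot-classification-demo"   # hypothetical bucket name

# Container image for the built-in image classification algorithm.
training_image = image_uris.retrieve("image-classification", session.boto_region_name)

train_input = TrainingInput(
    f"s3://{bucket}/train/",
    distribution="FullyReplicated",
    content_type="application/x-recordio",
    s3_data_type="S3Prefix",
)
validation_input = TrainingInput(
    f"s3://{bucket}/validation/",
    distribution="FullyReplicated",
    content_type="application/x-recordio",
    s3_data_type="S3Prefix",
)

estimator = Estimator(
    image_uri=training_image,
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    output_path=f"s3://{bucket}/output",
    sagemaker_session=session,
)
estimator.set_hyperparameters(
    num_classes=2,              # Occupied / Empty
    num_training_samples=1000,  # placeholder; set to your actual image count
    image_shape="3,224,224",
    epochs=10,
)
```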

To kick off model training, run the following script:
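The original script isn’t shown, but kicking off the job comes down to a fit call on the estimator with the two input channels the built-in algorithm expects:

```python
estimator.fit({"train": train_input, "validation": validation_input})
```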

Hyperparameters are the different levers you can pull before you start training your model. The model performance will be dependent on the hyperparameters used to train the model. At this point we do not know the best set of hyperparameters to use to maximize model performance. To find the best combination, we used hyperparameter tuning.

Each combination of hyperparameters will result in distinct model performance. AWS SageMaker has a straightforward hyperparameter tuning service that can easily be called from python code:
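A hedged sketch of such a tuning job; the objective metric name is the one the built-in algorithm reports, while the ranges and job counts below are illustrative rather than the values used in the original post:

```python
from sagemaker.tuner import (
    CategoricalParameter,
    ContinuousParameter,
    HyperparameterTuner,
    IntegerParameter,
)

hyperparameter_ranges = {
    "learning_rate": ContinuousParameter(0.001, 0.1),
    "mini_batch_size": IntegerParameter(8, 64),
    "optimizer": CategoricalParameter(["sgd", "adam"]),
}

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:accuracy",
    hyperparameter_ranges=hyperparameter_ranges,
    objective_type="Maximize",
    max_jobs=10,
    max_parallel_jobs=2,
)
tuner.fit({"train": train_input, "validation": validation_input})
```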

Because some of the hyperparameters are continuous, there is an infinite range of values that could be tested. The parameters were limited to set ranges so that tuning searches over a reasonably sized grid. Once your hyperparameter tuning job finishes, you select your best-performing model.

You will then create a SageMaker model from the best training job. This means the best-performing model’s artifacts and inference code are saved to an S3 bucket so that they can be used to make predictions. The model will be hosted and deployed on an HTTPS endpoint so that model inferencing can be used externally to AWS.
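With the tuner above, the SageMaker SDK can deploy the best training job’s model directly; the instance type and endpoint name here are placeholders:

```python
# Deploy the best model from the tuning job behind a real-time HTTPS endpoint.
predictor = tuner.deploy(
    initial_instance_count=1,
    instance_type="ml.m4.xlarge",
    endpoint_name="parking-spot-classifier",   # placeholder endpoint name
)
```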

Lambda & API Gateway

Lambda and API Gateway are AWS services that allow you to integrate solutions as part of a serverless architecture. This means the operational responsibilities of your solution are handled by AWS, allowing for faster improvement and innovation. Serverless lets you build and run applications and services without thinking about servers. It is highly scalable, and you only pay for execution. It removes infrastructure management tasks like server or cluster provisioning, patching, and operating system maintenance.

The AWS pipeline looks like the following:

  1. A client application calls an API Gateway action and passes parameter values. API Gateway is a layer that provides API to the client.
  2. API Gateway passes the parameter values to the Lambda function.
  3. The Lambda function parses the payload and sends it to the SageMaker model endpoint (a minimal handler sketch follows this list).
  4. The model performs the prediction and returns the predicted value to Lambda.
  5. The Lambda function parses the returned value and sends it back to API Gateway.
  6. API Gateway responds to the client with that value.
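Here is a minimal sketch of the Lambda handler in step 3, assuming the client sends a base64-encoded JPEG of a single parking space and the placeholder endpoint name from earlier; the label ordering is an assumption and depends on how the classes were indexed during training:

```python
import base64
import json
import os

import boto3

# SageMaker runtime client, reused across invocations of this Lambda function.
runtime = boto3.client("sagemaker-runtime")
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "parking-spot-classifier")  # placeholder


def lambda_handler(event, context):
    # API Gateway delivers the request body as a string; decode the image bytes.
    image_bytes = base64.b64decode(event["body"])

    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/x-image",
        Body=image_bytes,
    )

    # The endpoint returns one probability per class; the order matches the
    # label order used during training (assumed here to be [Empty, Occupied]).
    probabilities = json.loads(response["Body"].read())
    label = "Occupied" if probabilities[1] > probabilities[0] else "Empty"

    return {
        "statusCode": 200,
        "body": json.dumps({"label": label, "probabilities": probabilities}),
    }
```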

Build Parking Space Pipeline

Now that we have a model ready, we can classify whether a parking spot is occupied or not. Next, we need to figure out a way to parse individual parking spots out of a parking lot image. The strategy is to locate the white lines that mark the boundaries of each space. In this case, Canny edge detection was used.

A Hough line transform was then applied to detect straight lines from the edges. Next, we divided each column of parking spots evenly so that the lines delineated each space equally.
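A sketch of the edge and line detection with OpenCV, reusing the earlier imports; the thresholds are illustrative and the file name is a placeholder:

```python
lot = cv2.imread("parking_lot.jpg")             # placeholder overhead photo
gray = cv2.cvtColor(lot, cv2.COLOR_BGR2GRAY)

# Canny edge detection, followed by a probabilistic Hough transform
# to recover the straight painted lines between spaces.
edges = cv2.Canny(gray, 50, 150)
lines = cv2.HoughLinesP(
    edges,
    rho=1,
    theta=np.pi / 180,
    threshold=50,
    minLineLength=30,
    maxLineGap=10,
)
# Each entry in `lines` is [[x1, y1, x2, y2]]; grouping the segments by column
# and spacing them evenly yields a bounding box for every parking spot.
```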

We can parse out each spot so that it can be run through the model.
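Putting it together, here is a sketch of the loop that crops each spot and sends it to the deployed endpoint; spot_boxes is a hypothetical list of bounding boxes derived from the lines above, and lot is the image loaded in the previous sketch:

```python
import json

import boto3
import cv2

runtime = boto3.client("sagemaker-runtime")

occupied = 0
for (x1, y1, x2, y2) in spot_boxes:              # bounding boxes from the previous step
    crop = lot[y1:y2, x1:x2]
    _, jpeg = cv2.imencode(".jpg", crop)
    response = runtime.invoke_endpoint(
        EndpointName="parking-spot-classifier",  # placeholder endpoint name
        ContentType="application/x-image",
        Body=jpeg.tobytes(),
    )
    probabilities = json.loads(response["Body"].read())
    if probabilities[1] > probabilities[0]:      # assumes index 1 is "Occupied"
        occupied += 1

print(f"{occupied} of {len(spot_boxes)} spots occupied")
```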

Out of the 52 total spots, the model detected 16 spots that were occupied.

Conclusion

There is an endless number of ways to improve on this framework. While we only looked at top-down images of parking lots, one possible avenue to explore is how the approach performs on lot images taken at an angle.

Image classification can also be applied to real-time live video. This technology opens new opportunities to apply ML classification methods to automate laborious and cost-intensive tasks, which translates to a reduction in task hours and ultimately savings for the business. Companies across industries see this as a means to reduce costs and leverage the power of image recognition to streamline outdated processes. This type of modeling is so in demand that AWS even has a service, Rekognition, that does this exact task of identifying and tracking objects in video.
