Kitchenware Image Classification Model (CNN) Deployment to AWS Lambda

Mustapha Adedayo Alude
9 min read · Jan 26, 2023


Kitchenware model deployment and framework (Part 2)

The business model always dictates that we build a highly predictive model, guarded by a set of milestones to be achieved before deployment to production. A model that lives only on our local computer is as useless as a drop of water to a thirsty man. This model needs to be deployed for public or business use.

Remember the scenario I painted about John’s situation, which needs the efficacy this model brings? If YES, you’re on the right track. If NO, have a glimpse at Part 1 of the whole project here.

We’ve expended resources on building a predictive model using a CNN. To complete the final phase of the DL workflow, this model needs to be deployed and served. This way, people can interact with the model once a product is created around the model’s API. John stands to benefit from the effort of DataTalks after they’ve taken it a step further to create a product around the model.

How do we now deploy this model? Where do we deploy it? I know, these are the questions you have. Follow along as I brighten the way with my flashlight!

Model Deployment and Service

I intend to create a Kitchenware Classification Service for the model. This service will house the model together with the dependencies it needs to make predictions in real time.

Here is how the service works: the user (John) uploads an image of any kitchenware class the model was trained on, a request containing the uploaded image is sent to the service, and the service returns its prediction.

I intend to use the AWS Lambda Service Framework to deploy the model.

Why the choice of AWS Lambda Service?

Serverless computing led me here. Wrapping my head around a service that runs on a server I have to manage is much more of a headache, considering the resources it requires. AWS Lambda is a solution built for running serverless code, which is exactly what I need. It gives me a framework to write functions without creating EC2 instances or servers; Lambda handles all of that. It also affords me the opportunity to deploy my containerized code.

After deciding on the Service for deployment, what next? Follow along as I break down the steps I took to deploy the model.

Kitchenware Service Framework

This contains all the dependencies the model needs to make its predictions.

TENSORFLOW LITE

TensorFlow Lite, the lighter version of TensorFlow, focuses on inference (model.predict(X)). Unlike the full version of TensorFlow, TF-Lite can’t be used for training the model; it focuses solely on making predictions (inference). This makes the prediction runtime faster.

Why the choice of TensorFlow Lite?

1. Due to the size constraints of AWS Lambda, the entire TensorFlow package cannot be used. This prompts the use of TensorFlow Lite, which is smaller in size.

2. Unlike TF-Lite, the full TensorFlow package leads to a large Docker image, which results in slower running times and higher charges for the resources used within the AWS Lambda service.

To use it, the model built earlier needs to be converted to the TF-Lite format.

Below is the process followed in converting the model:

# importing libraries
import tensorflow as tf
from tensorflow import keras

# loading our model
model = keras.models.load_model('xception_v2_299_15_0.967.h5')

# converting the model to lite
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# saving the model in tensorflow lite format
with open('kitchenware-model.tflite', 'wb') as f_out:
    f_out.write(tflite_model)
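
As an optional check (my own addition, not part of the original workflow), comparing the size of the converted file with the original Keras model shows the reduction that motivated the choice of TF-Lite:

import os

# optional check: compare the original Keras model with the converted TF-Lite file
for path in ['xception_v2_299_15_0.967.h5', 'kitchenware-model.tflite']:
    size_mb = os.path.getsize(path) / (1024 * 1024)
    print(f'{path}: {size_mb:.1f} MB')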

Let's make some predictions now that the model has been converted to the TF-Lite format. Remember that, while training the model, the full version of TensorFlow was used: its packages handled image preprocessing, model loading, model serving, model prediction, and so on. With TF-Lite, none of these packages are available. How do we make predictions without them?

Relax, don’t be rattled; this can be done without pulling in the full TensorFlow package, with the help of “keras_image_helper” and the TensorFlow Lite runtime (“tflite_runtime”). Now, let’s roll!

TO-DO:

- Importing create_preprocessor from keras_image_helper: this helps in preprocessing the sample image
- Importing Interpreter from the TensorFlow Lite runtime (tflite_runtime): this helps to load the converted model
- Loading the converted model with the Interpreter
- Allocating the model tensors (loading the model weights)
- Getting the input index
- Getting the output index
- Creating the preprocessor matching the model architecture (Xception, 299×299)
- Getting the sample image URL
- Preprocessing the sample image through its URL
- Feeding the interpreter the preprocessed sample image via the input index
- Making a prediction and reading it via the output index

# importing the frameworks 
import tflite_runtime.interpreter as tflite
from keras_image_helper import create_preprocessor
# interpreting/loading the converted model
# initialize the model
interpreter = tflite.Interpreter(model_path='kitchenware-model.tflite')
interpreter.allocate_tensors()

# get the input index of the interpreter
# get the output index of the interpreter
input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']

# create the preprocessor matching the model architecture (Xception, 299x299)
preprocessor = create_preprocessor('xception', target_size=(299, 299))

# preprocessing the image (using image url)
url = 'https://bit.ly/3IYiPAX'
X = preprocessor.from_url(url)

# supplying the model with sample image for prediction
# model prediction
interpreter.set_tensor(input_index, X)
interpreter.invoke()
preds = interpreter.get_tensor(output_index)
# prediction classes 
classes = ['cup', 'forks', 'glass', 'knife', 'plate', 'spoon']

# combining label classes and model prediction
dict(zip(classes, preds[0]))
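
The dictionary maps each label class to its prediction score. To read off the top class programmatically (an optional addition on my part, not in the original notebook), take the label with the highest score:

# optional: pick the label with the highest prediction score
result = dict(zip(classes, preds[0]))
print(max(result, key=result.get))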

The model predicted the exact image class we served it. We can now move forward with turning this into a Python script. The frameworks and dependencies above are what need to be deployed to the Lambda service for prediction.

Combining the model infrastructure files to be deployed to the Lambda Service

Service lambda_function script file

This is an important file that contains all the dependencies needed for the model's serving and prediction. You can access the script here. Below are the steps followed in writing the script:

- Importing the dependencies:
— the model `Interpreter` from `tflite_runtime`
— the image preprocessor (`create_preprocessor`) from `keras_image_helper`
- Initializing the image preprocessor using the model's internal details:
— the type of Keras application (Xception)
— the image size (299)
- Model interpretation:
— Load the model with `Interpreter` via the model path
— Get the model input and output indexes
- Label classes: saving the classes the model will predict on
- Writing a predict function, which does the following:
— Preprocess the supplied image through its URL
— Feed the model the preprocessed image via the input index
— Make a prediction on the image and read it via the output index
— Convert the model predictions to floats
— Return the prediction as a dictionary containing each label class and its accompanying prediction score
- Writing a lambda handler function, which lets the AWS Lambda service make predictions using:
— the image URL from the incoming event
— the predict function called with that URL

lambda_function script

import tflite_runtime.interpreter as tflite
from keras_image_helper import create_preprocessor


preprocessor = create_preprocessor('xception', target_size=(299, 299))

interpreter = tflite.Interpreter(model_path='kitchenware-model.tflite')
interpreter.allocate_tensors()

input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']

classes = ['cup', 'forks', 'glass', 'knife', 'plate', 'spoon']

#url = 'https://bit.ly/3IYiPAX'


def predict(url):
    X = preprocessor.from_url(url)

    interpreter.set_tensor(input_index, X)
    interpreter.invoke()
    preds = interpreter.get_tensor(output_index)

    float_predictions = preds[0].tolist()

    return dict(zip(classes, float_predictions))


def lambda_handler(event, context):
    url = event['url']
    result = predict(url)
    return result
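
Before building the image, a quick sanity check (my own addition, not part of the published script) is to call the handler directly with a sample event; the payload mirrors what the test script sends later:

# quick local check of the handler with a sample event
event = {'url': 'https://bit.ly/3IYiPAX'}
print(lambda_handler(event, None))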

Service Dockerfile

This Dockerfile serves as a conduit for our lambda_function script to be wrapped in a Docker container image and pushed to the AWS Lambda service. You can access the file here. The resources within the file are:

- AWS base image (using python 3.8)
- Installation of model prediction dependencies:
— Keras-image-helper
— tflite_runtime
- Model file
- lambda_function file
- The command telling AWS Lambda how to locate the `lambda_handler` entry point inside `lambda_function`

Dockerfile

FROM public.ecr.aws/lambda/python:3.8

RUN pip install keras-image-helper
RUN pip install https://github.com/alexeygrigorev/tflite-aws-lambda/raw/main/tflite/tflite_runtime-2.7.0-cp38-cp38-linux_x86_64.whl

COPY kitchenware-model.tflite .
COPY lambda_function.py .

CMD [ "lambda_function.lambda_handler" ]

To create the Docker container, I built the Docker image using the `Dockerfile` and its accompanying dependencies, then ran the image to create the container that will be pushed to AWS Lambda. Afterward, I tested the container locally by sending a request using a `test` script.
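
For reference, below is a minimal sketch of that build-and-run step, assuming the standard Docker CLI; the image name `kitchenware-model` is my own placeholder. Port 8080 is used because the AWS Lambda base image runs its runtime interface emulator there, which is the port the test script calls.

# build the image from the Dockerfile in the current directory (image name is a placeholder)
docker build -t kitchenware-model .

# run the container locally; the Lambda runtime interface emulator listens on port 8080
docker run -it --rm -p 8080:8080 kitchenware-model:latest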

Service test script

This script contains the request that will be sent to the model for prediction. Below is a summary of the whole framework:

- Importing requests: this lets us send requests to the Lambda service housing the model that will make the prediction
- URLs:
— the local Lambda URL, used for testing locally
— the Lambda API URL, used after the Docker container image is published to the AWS Lambda service
- data: the URL of the image to predict, placed in a dictionary
- The request itself: a POST sent to the model for prediction using one of the URLs above (depending on your use case), passing the data as JSON and converting the result to JSON

import requests

url = 'http://localhost:8080/2015-03-31/functions/function/invocations'

# API url
# url = 'https://********.execute-api.us-east-1.amazonaws.com/test/predict'

data = {'url': 'https://bit.ly/3IYiPAX'}

result = requests.post(url, json=data).json()
print(result)

You can access the file here.

After preparing our service infrastructure, is that the end of the model deployment process? Certainly not, unless the model is not intended for public use. What we’ve strategically done is wrap the model we built in a Docker container along with its dependencies. Privately, this can be accessed locally through our test script. However, for this project we want the model open to the public, so we need to push this Docker container to the cloud, where it becomes accessible to everyone. Now, let’s deploy to the cloud using AWS.

Amazon Web Services (AWS)

Amazon Web Services is one of the cloud computing platforms available to the public. How do we deploy our Docker container here? Below is a summary of the deployment process:

AWS Lambda Service

This is one of the Amazon Web Services; the choice of Lambda was explained earlier.

With the Docker container image created, the next step is to publish the container to AWS Lambda. To achieve this, I did the following:

- Created an AWS Elastic Container Registry (ECR) repository
- Pushed the Docker container image to ECR (see the sketch after this list)
- Used the ECR image URI to create an AWS Lambda function from a container image
- Tested the Lambda function using a sample image URL
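
A minimal sketch of that push, assuming the AWS CLI is installed and configured; the repository name, region, and account ID below are placeholders rather than values from the project:

# create the ECR repository (the repository name is a placeholder)
aws ecr create-repository --repository-name kitchenware-images

# authenticate Docker with ECR (region and account ID are placeholders)
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com

# tag the local image with the ECR repository URI and push it
docker tag kitchenware-model:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/kitchenware-images:latest
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/kitchenware-images:latest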

How do we now access this Lambda service? API Gateway Service will be a comrade in arms in this endeavor.

AWS API Gateway Service

I did the following here:

- Created and configured a new API gateway for the Lambda function
- Deployed the API: this exposes the Lambda function as a web service
- Extracted the API URL for testing, which can also be used for subsequent requests to the web service for model prediction

How the Service Works for the Solution Provider and Receiver

For the service to benefit the user, it’s imperative for the solution provider to create a software solution around the API.

API workflow
- A user uploads a kitchenware image (belonging to one of the classes the model was trained on) to the solution provider’s platform
- A request containing the uploaded image is sent through the API gateway
- The API gateway transmits this request to the Lambda function
- The Lambda function receives the request and makes a prediction
- The Lambda function sends its response back through the API gateway
- The API gateway receives the response and passes it on
- The user then receives the response

Note: all of this happens within seconds of the user’s request.

Check out the project repository here.
