The Age of Smart(er) Chatbots
Using AWS to develop a chatbot capable of making business decisions
Introduction to (Serverless) Chatbots
Machines that converse with humans are no longer the stuff of science fiction; they are part of daily life in the form of personal assistants like Alexa, Google Assistant and Siri. With the advent of cloud computing platforms, the AI-based engines that power these assistants have made their way into the business realm as chatbots, proving their value in customer support, order processing and tracking, and data-collection applications.
But have you ever wished for a chatbot that goes a step further and predicts your business outcomes in addition to providing the necessary business intelligence? If so, that’s what this post is about.
In this post, we will be leveraging Amazon Web Services (AWS) to develop an intelligent chatbot that is capable of understanding human intents, making real-time predictions and surfacing the relevant results back to the user — all through a serverless architecture framework.
The AWS products that we will be leveraging in this post to build and deploy the smart chatbot are:
- Amazon Lex — A conversational engine to understand and respond to user intents and questions
- Amazon SageMaker — A Machine Learning service to develop and deploy predictive models at scale through a Jupyter notebook environment
- Amazon Athena — A serverless interactive querying service to analyze data in Amazon S3 using standard SQL
- AWS Lambda — A serverless code execution service that can connect to almost all Amazon services programmatically
- Amazon S3 — The popular object storage service on Amazon Web Services
Now let us take a deeper dive into building the chatbot with the above AWS services and deploying it on Slack. We will first develop the prediction model, then the chatbot, and finally integrate the two.
The data used in this post is publicly available from the UCI ML Repository.
Building & Deploying the Prediction Model
Suppose you are part of a credit card company that is interested in building a chatbot for its potential customers to serve as a platform to predict and communicate whether they pre-qualify for a credit card approval. For this purpose, the chatbot will converse with the user to collect some vital information and return with the pre-qualification decision. To keep the blog simple, we will train a logistic regression model on historic credit card approval data and deploy it via Amazon SageMaker.
Note: We won’t focus much on feature engineering and hyper-parameter tuning in this blog, though they are important steps in building any Machine Learning model. Instead, we will focus on cloud-based model training, accuracy evaluation and model deployment.
Feature Selection
Feature selection/importance is a key step in developing a chatbot. You don’t want users to enter information that doesn’t have significant predictive power. Our objective is to develop a simple, conversational chatbot, not a hard-to-use data-entry tool.
After performing the necessary feature engineering, the data set is split into training, validation and test sets in a 70/15/15 proportion. Performing Recursive Feature Elimination (RFE) with a Random Forest then reveals which features carry the most predictive power.
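A minimal sketch of this step, assuming the engineered features live in a pandas DataFrame X with a binary approval target y (names are hypothetical):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split

# 70/15/15 split: hold out 30% first, then split it evenly into validation and test
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.50, random_state=42)

# Recursively eliminate weak features, ranked by a Random Forest's importances
rfe = RFE(estimator=RandomForestClassifier(n_estimators=100, random_state=42),
          n_features_to_select=5)
rfe.fit(X_train, y_train)

# Rank 1 marks the features RFE keeps
print(pd.Series(rfe.ranking_, index=X.columns).sort_values())
```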
Thus, the basic information we are interested in collecting from potential customers to pre-screen them for credit card approvals are:
- Prior Defaults: if the user has made any payment defaults in the past
- Credit Score: the user’s credit score
- Debt: the user’s existing debt in USD
- Income: the user’s annual income in USD
- Years Employed: the user’s years of employment
We will leverage the ‘Linear Learner’ algorithm provided by Amazon SageMaker to build the logistic regression model. More details on this algorithm can be found here.
Model Training & Deployment
In order to train a logistic regression model on SageMaker, we have to provide some parameters including the type of algorithm (regressor vs. classifier), number of predictors and so on. Specifying these parameters will look like this:
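A sketch of this setup using the SageMaker Python SDK (v2); the output bucket and path are hypothetical:

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = sagemaker.get_execution_role()   # IAM role of the notebook instance
bucket = session.default_bucket()       # default S3 bucket for artifacts

# Pull the built-in Linear Learner container for the current region
container = image_uris.retrieve("linear-learner", session.boto_region_name)

linear = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    output_path=f"s3://{bucket}/credit-approval/output",
    sagemaker_session=session,
)

# predictor_type='binary_classifier' makes Linear Learner act as logistic regression
linear.set_hyperparameters(
    feature_dim=5,                      # the five selected predictors
    predictor_type="binary_classifier",
    mini_batch_size=100,
)
```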
Once the training job completes, the model is containerized and hosted behind an HTTPS API endpoint. The model can be trained and hosted with a few lines of Python like the following:
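A sketch of the training call and deployment, assuming the train and validation channels were already staged in S3 as CSV (paths and the endpoint name are hypothetical):

```python
from sagemaker.inputs import TrainingInput

# Launch the training job against the channels staged in S3
linear.fit({
    "train": TrainingInput(f"s3://{bucket}/credit-approval/train.csv",
                           content_type="text/csv"),
    "validation": TrainingInput(f"s3://{bucket}/credit-approval/validation.csv",
                                content_type="text/csv"),
})

# Containerize and host the trained model behind a real-time HTTPS endpoint
predictor = linear.deploy(
    initial_instance_count=1,
    instance_type="ml.t2.medium",
    endpoint_name="credit-card-approval",
)
```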
Now that the prediction model is active, we will test the model for its prediction accuracy using the test data set.
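A sketch of that check, assuming test_features and test_labels are NumPy arrays built from the held-out 15%:

```python
import numpy as np
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import JSONDeserializer

# Send features as CSV, read back Linear Learner's JSON predictions
predictor.serializer = CSVSerializer()
predictor.deserializer = JSONDeserializer()

response = predictor.predict(test_features)
predicted = np.array([p["predicted_label"] for p in response["predictions"]])

accuracy = (predicted == test_labels).mean()
print(f"Test accuracy: {accuracy:.0%}")
```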
This resulted in a prediction model that is 75% accurate in pre-screening potential customers. Once we configure the chatbot, we will use this hosted model to make real-time predictions.
Note: Model accuracy primarily depends on a few key factors, including the number of training observations and the type of prediction model used. We obtained an accuracy of 75% with close to 700 observations and a simple logistic regression model. Had we had more observations, or used a tree-based or neural-network-based model, the accuracy would likely have been higher.
Building & Publishing the Chatbot
Now that we have our prediction model ready, we can start building the chatbot on Amazon Lex. Using deep learning techniques, Amazon Lex performs Natural Language Processing & Understanding to carry on conversations with users. For the purpose of this blog, let us build a text-based chatbot capable of answering the following two user questions:
- Average credit score required for credit card approval (a reporting task)
- Whether the user will pre-qualify for the credit card (a prediction task)
After specifying the chatbot’s name and the session timeout window, we need to create the ‘Intents’ and select the respective ‘Slot Types’. An ‘intent’ is an action that the user wants to perform, while a ‘slot type’ is the data type of the expected user input.
While configuring the intents, you need to provide some sample utterances — different ways the user can ask the chatbot the same question. Lex uses these training examples to understand the intent’s context. The next step is to configure the slot values and provide the corresponding slot types. An example of the intent & slot configuration is below:
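The Lex console is the easiest place to do this, but the same configuration can also be expressed in code. Here is a sketch using boto3’s Lex (V1) model-building API, with hypothetical intent, slot and utterance names:

```python
import boto3

lex = boto3.client("lex-models")  # Lex V1 model-building API

lex.put_intent(
    name="CheckPreQualification",
    sampleUtterances=[
        "Do I qualify for a credit card",
        "Can you check my credit card eligibility",
        "I would like to get pre-qualified",
    ],
    slots=[
        {
            "name": "CreditScore",
            "slotType": "AMAZON.NUMBER",
            "slotConstraint": "Required",
            "valueElicitationPrompt": {
                "messages": [{"contentType": "PlainText",
                              "content": "What is your credit score?"}],
                "maxAttempts": 2,
            },
            "priority": 1,
        },
        # ...similar slots for PriorDefault, Debt, Income and YearsEmployed
    ],
)
```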
Once the chatbot is configured, it needs to be built and deployed on a platform. Amazon Lex lets you deploy chatbots on:
- Facebook Messenger
- Slack
- Kik
- Twilio SMS
For the purpose of this blog, we will deploy the chatbot on Slack. To do so, we first need to create an app on Slack, gather the app credentials and tie them back to the chatbot, and provide the chatbot’s callback URL to the Slack app. More information on this can be found here.
Connecting the Chatbot to a Data & Prediction Model
Now that we have a working front end for the chatbot and a hosted SageMaker-based prediction model, the next task is to connect the two. The chatbot should also be capable of querying the original data for reporting tasks. To perform both of these tasks, we will use AWS Lambda.
Lambda can programmatically access the SageMaker model and interact with the data stored in an S3 bucket via Amazon Athena. For an AWS Lambda function written in Python, the boto3 library provides the necessary logistics to interact with different AWS services.
For example, a Python-based Lambda function can interact with the data stored in S3 via Amazon Athena in the following manner:
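A sketch of such a query, with hypothetical database, table and result-bucket names (Athena queries run asynchronously, so the function polls for completion):

```python
import time
import boto3

athena = boto3.client("athena")

def average_approved_credit_score():
    """Query the S3-backed approvals table through Athena and return the average."""
    started = athena.start_query_execution(
        QueryString="SELECT AVG(credit_score) FROM approvals WHERE approved = 1",
        QueryExecutionContext={"Database": "credit_card_db"},
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )
    query_id = started["QueryExecutionId"]

    # Poll until Athena reports a terminal state
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    results = athena.get_query_results(QueryExecutionId=query_id)
    # Row 0 holds the column headers; row 1 holds the value we want
    return results["ResultSet"]["Rows"][1]["Data"][0]["VarCharValue"]
```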
Similarly, the hosted SageMaker model can be accessed for real-time predictions via AWS Lambda as shown below:
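A sketch of that call through the SageMaker runtime client, assuming the endpoint name used at deployment and CSV input in the order of the five selected features:

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def predict_approval(prior_default, credit_score, debt, income, years_employed):
    """Invoke the hosted Linear Learner endpoint with the user's answers."""
    payload = f"{prior_default},{credit_score},{debt},{income},{years_employed}"
    response = runtime.invoke_endpoint(
        EndpointName="credit-card-approval",
        ContentType="text/csv",
        Body=payload,
    )
    result = json.loads(response["Body"].read())
    # Linear Learner returns a score and a 0/1 predicted label per record
    return result["predictions"][0]["predicted_label"]
```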
Once the Lambda function is configured, it needs to be hooked back into the chatbot in the ‘Fulfillment’ section of each intent so that it can fulfill users’ requests.
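Putting the pieces together, a minimal Lambda handler routing Lex (V1) fulfillment events to the two helpers sketched above could look like this; the intent and slot names are hypothetical:

```python
def lambda_handler(event, context):
    """Fulfill Lex intents: one reporting task, one prediction task."""
    intent = event["currentIntent"]["name"]
    slots = event["currentIntent"]["slots"]

    if intent == "GetAverageCreditScore":
        message = (f"The average credit score of approved applicants is "
                   f"{average_approved_credit_score()}.")
    else:  # CheckPreQualification
        label = predict_approval(slots["PriorDefault"], slots["CreditScore"],
                                 slots["Debt"], slots["Income"],
                                 slots["YearsEmployed"])
        message = ("Congratulations, you pre-qualify for the credit card!"
                   if label == 1
                   else "Sorry, you do not pre-qualify at this time.")

    # The response shape Lex V1 expects in order to close the conversation
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {"contentType": "PlainText", "content": message},
        }
    }
```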
This completes the integration process. We will need to save the intents and publish the chatbot to Slack again to reflect the latest changes.
Functional Demonstration of the Chatbot
Now that we have a fully functional, smart chatbot, let’s go ahead and try it out.
Consider a potential customer with a low annual income and a low credit score who wants to know the average credit score required for a credit card approval, and whether he/she pre-qualifies. The interaction with the chatbot can look like this:
The responses are automated through the app — credit_card_approval. But it is a sad day for this user: the request for a credit card pre-qualification was declined.
Now, consider another potential customer with a better annual income and a better credit score who wants to know his/her chances of passing the pre-qualification screening. The interaction is shown below:
This user made it through the pre-qualification screening, which shows that our prediction model is capable of making sensible pre-screening decisions and communicating them through the chatbot.
Bigger Picture
Cutting-edge research and development in Artificial Intelligence is making it hard for users to distinguish chatbots from their human counterparts. By acting as a close liaison between users and data, modern chatbots can now play the role of business decision-makers. This can free up significant time for customer care agents, who can then focus on serving customers with larger concerns. With data residing in a cloud platform like AWS, it is now possible to develop and deploy these ‘smart chatbots’ through a serverless infrastructure. Hopefully, the trend of automated solutions will continue in the coming years, making human hours more productive.