How to Get Started with Chatbots — Part 2
From Dialogflow to a Production Environment
So you’ve built your first chatbot on Dialogflow. PLEASE, GIVE YOURSELF A PAT ON THE BACK (really though, there are a lot of moving parts and tools of the trade).
But that’s enough of that; we’re going to use this momentum. Google Cloud Platform has done a lot of the heavy lifting by providing accessible databases, NLP, and the ability to create a cloud service out of your own script (App Engine, discussed below).
Be sure to go through this tutorial, and use my part 1 article to illuminate what’s going on behind the scenes.
OK, now that you’ve completed part 1, our discussion today will follow the second tutorial in the documentation (https://cloud.google.com/architecture/securing-scaling-chatbots).
What you get from reading this article:
I’m not going to lie to you. It’s going to take a while to get through this, but I firmly believe that to be true for anything that is worth learning about (and this topic is certainly worth the effort).
My goal for you, the reader, is to bring more efficiency, clarity, and enjoyment to this process.
1. The webhook: A journey from a temporary local script to a permanent service
- Motivation: From part 1, the webhook connects Dialogflow to Datastore, giving the agent access to the entities defined through our dataset. It is hosted in our local environment on port 5000.
- HTTP Tunneling: Even with the local script running, Dialogflow cannot reach it, because a webhook on localhost is not visible from the internet, so Dialogflow still has no path to the entities in Datastore. That’s where ngrok comes in handy: it tunnels a public URL on its servers to the local port, making the webhook accessible to Dialogflow.
- HTTP Basic Authentication: Through a chunk of code provided in the documentation (please try to read this chunk of code), we can give both Dialogflow and this script a specific username and password, to further secure our very public webhook.
- Python Decorator: A quick note that the above authentication is achieved with a function that takes a function and returns a function (NIFTY!). If you’re interested in the theory of computing, then check out the Wikipedia explanation of a higher-order function.
- App Engine: Google Cloud Platform provides this service to turn your Python code into a hosted web service. The code is converted from Datalab to plain Python, placed in a directory, and deployed by running gcloud app deploy. More importantly, running gcloud app browse then provides a public URL to access this service.
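To make the Basic Authentication and decorator bullets above a little more concrete, here is a minimal sketch using only the Python standard library. It is not the code from the documentation: the names requires_auth, parse_basic_auth, and webhook, the credentials, and the (headers, body) handler shape are all assumptions made for illustration. The idea is simply that the decorator takes a handler, and returns a new handler that checks the username and password first.

```python
import base64
import hmac
from functools import wraps

# Hypothetical credentials for illustration only; a real service
# would load these from configuration, never from source code.
EXPECTED_USER = "dialogflow"
EXPECTED_PASSWORD = "change-me"


def parse_basic_auth(header_value):
    """Return (username, password) from a 'Basic ...' header value, or None."""
    if not header_value or not header_value.startswith("Basic "):
        return None
    try:
        raw = base64.b64decode(header_value[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None


def requires_auth(handler):
    """Decorator: takes a handler function and returns a guarded handler."""
    @wraps(handler)
    def wrapper(headers, body):
        creds = parse_basic_auth(headers.get("Authorization"))
        ok = (
            creds is not None
            # compare_digest avoids leaking information through timing
            and hmac.compare_digest(creds[0].encode(), EXPECTED_USER.encode())
            and hmac.compare_digest(creds[1].encode(), EXPECTED_PASSWORD.encode())
        )
        if not ok:
            return 401, {"WWW-Authenticate": 'Basic realm="webhook"'}, "Unauthorized"
        return handler(headers, body)
    return wrapper


@requires_auth
def webhook(headers, body):
    # Placeholder for the real fulfillment logic (e.g. a Datastore lookup).
    return 200, {}, "fulfillment response goes here"


def basic_auth_header(username, password):
    """Build the header a client would send for these credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Calling webhook(basic_auth_header("dialogflow", "change-me"), "{}") reaches the handler, while a missing or wrong header gets a 401 before the handler ever runs. Keep in mind that Basic Authentication only encodes the credentials, it does not encrypt them, so it should travel over HTTPS.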
2. Iframes are nice (kinda), but custom UIs are better
- Useful Cloud Shell commands: When running through Cloud Shell, you’ll need to access your second project, as the documentation states. To list your projects, in case you forget the ID, run gcloud projects list. To switch projects, run gcloud config set project <project-id>. Also, once you’re in the project, you may find it difficult to navigate without Datalab, so you can print a file to the terminal with cat <file-name>. Or, if you’re more comfortable with the command line, run vim <file-name> to read and edit the file directly.
- API Key: WARNING, THE DOCUMENTATION PROVIDED BY GOOGLE HAS NOT BEEN UPDATED FOR V2. In the past, you would integrate the Angular-based UI with the client access token provided by Dialogflow. As this was found to be insecure, V2 instead authenticates with service account keys, which are downloaded as JSON files. See this great Medium article for instructions on how to connect your Chatbot UI more securely with API V2.
- NPM: This tool, and others like it (e.g., Yarn), provides access to the many JavaScript libraries that exist. With NPM and a well-configured package.json, you can install many libraries, and easily share the package.json with other developers so that they can recreate the same environment on their machines. Here, it is used to integrate Angular.
- Angular: Angular is a front-end framework built on JavaScript that allows for dynamic, rather than static, front-end code, to name one thing. As the UI will need to make frequent calls to Dialogflow, this can be a great option for a more modern interface. I recommend taking a look at this Medium article as a guide to follow in your implementation.
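To give a feel for what a V2 call looks like once the client access token is out of the picture, here is a rough sketch of a detectIntent request against the Dialogflow V2 REST API, built with the Python standard library. The function name build_detect_intent_request and the sample IDs are hypothetical, and the access token is assumed to have been obtained separately from a service account key (that step is not shown). The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_detect_intent_request(project_id, session_id, text,
                                access_token, language_code="en"):
    """Build (but do not send) a Dialogflow V2 detectIntent request."""
    # Session path: projects/<project>/agent/sessions/<session>
    session = "projects/{}/agent/sessions/{}".format(
        urllib.parse.quote(project_id, safe=""),
        urllib.parse.quote(session_id, safe=""),
    )
    url = f"https://dialogflow.googleapis.com/v2/{session}:detectIntent"
    body = {"queryInput": {"text": {"text": text, "languageCode": language_code}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Short-lived OAuth 2.0 token, assumed to come from a service account
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_detect_intent_request(
    "my-chatbot-project", "session-123", "What are your hours?", "fake-token"
)
```

Passing the request to urllib.request.urlopen would actually send it. One design note: since the service account key is a secret, I would keep this call on a server you control and have the UI talk to that server, rather than shipping the key to the browser.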
3. Identity Aware Proxy: Google Cloud Platform has it all
- Definition and key terms: From the IAP documentation: “IAP establishes a central authorization layer for applications accessed by HTTPS, so you can adopt an application-level access control model instead of using network-level firewalls.” In other words, a network cannot always be trusted, but a verified user, and device, can be trusted.
- Proxy: A proxy is a kind of server that stands in between the client, which makes a request, and the server that will provide the resources. The proxy server makes the request on the client’s behalf, in a way masking the client, but also allowing for more control over the request. See Wiki for more information on how a proxy server works, and the different kinds of proxies.
- Firewall: This system oversees the incoming and outgoing requests within a network. A firewall is a safety precaution that should always be included, and IAP improves upon this barrier with user-based authentication. With this type of authentication, an attacker cannot make requests simply by being on an already trusted network, which is what makes this a zero-trust model.
- Central Authentication Service: I don’t want to repeat too often that this is user-based authentication, so I’ll just add that Central Authentication Service (CAS) is a web protocol that is also used for things like single sign-on.
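The difference between a network-level check and a user-level check is easier to see in a toy model. The sketch below is not how IAP is implemented; every name in it (TRUSTED_NETWORK, ALLOWED_USERS, identity_aware_proxy, chatbot_backend) is made up for illustration, and the "verified_user" field stands in for whatever identity a real sign-in flow would have established.

```python
import ipaddress

# Hypothetical values for illustration only.
TRUSTED_NETWORK = ipaddress.ip_network("10.0.0.0/8")
ALLOWED_USERS = {"alice@example.com"}


def network_level_check(source_ip):
    """Firewall-style rule: trust anything coming from the internal network."""
    return ipaddress.ip_address(source_ip) in TRUSTED_NETWORK


def identity_aware_proxy(request, backend):
    """Stand between client and backend; forward only verified, allowed users."""
    user = request.get("verified_user")  # None means nobody has signed in
    if user is None:
        return 401, "sign-in required"
    if user not in ALLOWED_USERS:
        return 403, "access denied"
    return 200, backend(request)


def chatbot_backend(request):
    # Placeholder for the protected application.
    return f"hello, {request['verified_user']}"


# An attacker already inside the trusted network, but not signed in.
insider = {"source_ip": "10.1.2.3", "verified_user": None}
# A legitimate user connecting from outside the network.
remote_user = {"source_ip": "203.0.113.7", "verified_user": "alice@example.com"}
```

Here the insider passes network_level_check but is stopped by identity_aware_proxy, while the remote user fails the network rule yet is served by the proxy. That is the zero-trust idea in miniature: the decision follows the user, not the network.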
I hope this inspires you to dig further
Chatbot development isn’t just one thing! In its early stages, you’re going to have to be a jack-of-all-trades to get involved. I find this to be true with almost all emerging technologies, and that’s OK. Software engineering is about not only finding solutions, but also making them readily available. Then, you just have to iterate and improve, so try to have fun doing so.
In part 3, I’d like to talk more about the Machine Learning techniques. Thanks for reading, and thanks for embarking on this journey with me.