Creating and Deploying a Python Flask Microservice on Amazon Fargate — Part II
In this part, we will learn how to connect to MongoDB using PyMongo and how to create and validate schemas using Marshmallow.
If you don’t know what this post is about, take a look at Creating and Deploying a Python Flask Microservice on Amazon Fargate — Part I.
Creating Schemas for our TODOs
Now that we have the base routes defined, we have to create the schemas that are going to be used for serialization and deserialization of our content (input and output). First, let’s define which information a TODO should have:
Based on the schema we defined above, we can start creating the Marshmallow schemas that will check whether the payload sent when creating a new TODO is valid and that will also serialize TODOs from the database to JSON format.
We are going to need two different schemas for our TODOs: one for creating and updating a TODO and another for retrieving information about existing TODOs. Every schema should be stored in the schemas package. Let’s create a new file inside it named todos.py and add the following content:
As explained, there are two schemas: TODOBaseSchema and TODODetailSchema. Note that the second one is a child of the first one, meaning that it has all the attributes of the first plus the _id.
- The required=True argument means that marshmallow checks that the field is present in the JSON; if it is missing, a validation error is raised;
- The validate=Length(...) argument validates the length of title and description against limits of 50 and 200 characters respectively; if the check fails, an error is raised;
- We also set a format on the DateTime field. This format tells marshmallow how the date arrives in the JSON and how it should be serialized back to JSON (for example, if I send the due_date as 21/02/2020, marshmallow won’t be able to parse it, and an error will be raised).
Now that we have the schemas, let’s start using them on our routes.
Adding Schemas to Routes
Since we haven’t set up the integration with MongoDB yet, we are going to “simulate” creating and getting TODOs by building a static list of TODOs in the GET endpoint of the apis/todos.py file. We will also add validation to the POST endpoint, so we can guarantee that every TODO is created correctly in the database. Let’s change our apis/todos.py as follows:
Note that this is not the final version of this file; it is only an example to use while we are not connected to MongoDB yet.
- We’ve created a TODO_LIST variable within the GET endpoint (the one that lists all TODOs). This list is composed of TODOs that follow the schema we defined above. Note that the due_date is not a string; it is a Python datetime object, as expected by the schema. The line TODODetailSchema().dump(TODO_LIST, many=True) is responsible for serializing (dumping) this list into JSON-compatible data that is returned to the client. Note the many=True argument: it is used when we want to dump (or load) a list rather than a single object, and it applies the schema to each object in the list;
- In the POST method (the one that creates a new TODO), we’ve added validation to make sure that the body of the request is correct. First we extract the JSON from the request with payload = request.get_json(). Then we load (deserialize) this JSON using the Marshmallow schema created earlier: data = TODOBaseSchema().load(payload). Note that this statement is inside a try ... except block, because load raises a ValidationError exception if the payload does not match the format defined by the schema. The messages explaining why the error was raised are available in e.messages.
You should now have a basic understanding of how marshmallow works and how to use it with Python and Flask. Of course, we could improve the code, for example by using decorators to validate the body and avoid code repetition, but that is not the purpose of this article.
The next step is to add the MongoDB integration and start saving and retrieving TODOs from the database.
Adding PyMongo and MongoDB Integration
From the MongoDB website:
MongoDB is a general purpose, document-based, distributed database built for modern application developers and for the cloud era.
I’m not saying that you should use MongoDB for every microservice you create. I’m using it here because it is simple to get started with (but I do use MongoDB on my projects too).
Creating the MongoDB Cluster on MongoDB Atlas
First of all, you have to create a free account on MongoDB Atlas. If you already have a MongoDB instance running, you can skip this step.
After creating your account, follow the steps below to create the cluster:
- Click on Cluster on the left-side menu, then go to Build a New Cluster;
- You can keep everything as it is; the cost should be 0 (Free Tier). You can change the name of your cluster to whatever name you want;
- Create a new database user by going to Database Access -> Add New Database User. The authentication method should be Password. Add a username and a password, leave everything else as it is and click on add user;
- It might take a while until the cluster is created. After that, you can get the cluster connection URL by clicking on Connect, then Connect your application, then selecting Python 3.6 or Later and copying that URL. The URL has the following format:
mongodb+srv://<username>:<password>@<cluster>.mongodb.net/test?retryWrites=true&w=majority
Configuring our project to connect to MongoDB Atlas
Now that you have your cluster’s URL, let’s, for now, add it to the config/default.py file of our project:
MONGO_URL = "mongodb+srv://<username>:<password>@<cluster>.mongodb.net/test?retryWrites=true&w=majority"
Note that, to keep things secure, this information should not live in the code. You could set it as an environment variable and read it in the config/default.py file by doing:
import os
MONGO_URL = os.environ.get("MONGO_URL")
However, since we are only running it locally, let’s keep things simple.
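If you do go the environment-variable route, one option is to fail fast when the variable is missing, so that a misconfigured deployment stops at startup instead of failing on the first query. The sketch below is only one way to do it; the load_mongo_url helper is a hypothetical name, not part of the project.

```python
import os


def load_mongo_url(environ=os.environ):
    """Hypothetical helper: read MONGO_URL or stop with a clear error."""
    url = environ.get("MONGO_URL")
    if not url:
        # An unset or empty variable is treated as a configuration error.
        raise RuntimeError("MONGO_URL environment variable is not set")
    return url
```

In config/default.py this could then be used as MONGO_URL = load_mongo_url().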
First, we have to create a new package called database in the root path of our application. Then create a new file named mongodb.py inside it. This file has four functions: connect (responsible for connecting to the database), db (responsible for selecting the database to be used for queries), and set_client and get_client, which handle the global variable client.
Now, let’s connect to the database in our app.py file. To do that, first import the newly created module: from database import mongodb. Then, after the blueprint registration, connect to MongoDB by adding mongodb.connect(application.config["MONGO_URL"]). The current version of our app.py file can be found below:
Now we can connect to mongo and start using it to store and retrieve the information we need.
Updating our API to use MongoDB
Before we start using MongoDB, we have to define the name of the database and the name of the collection we are going to use to store the TODOs. And guess what: I’ll call them todo_service and todos respectively. When using MongoDB, we don’t have to create the database or the collections beforehand, because they are created the first time data is written to them (amazing, right?). So, let’s change our routes to use MongoDB.
- In order to retrieve all TODOs from the database, we can do the following:
from database.mongodb import db
todos = db("todo_service")["todos"].find({})
Basically, we use the db function created before to get the todo_service database from the client, then we tell PyMongo that we want to use the collection named todos. The find call is the query, and there is an empty dict inside it because we are not filtering anything.
- In order to create a new TODO in the database:
todo = db("todo_service")["todos"].insert_one(data)
The data here is the payload parsed from the JSON using Marshmallow (you will see it in the code later).
- Updating a TODO in the database is a bit different:
from bson import ObjectId
todo = db("todo_service")["todos"].update_one({"_id": ObjectId(tid)}, {"$set": data})
IDs in MongoDB are ObjectIds. To be able to query by them, we have to convert the string containing the id to an ObjectId. Also, update_one takes two arguments: the first is the query that selects the document we are updating, and the second is the update itself.
- To delete a TODO from the database, we do the following:
db("todo_service")["todos"].delete_one({"_id": ObjectId(tid)})
So, after this explanation, this is how our todo_list/apis/todos.py should look:
Note that before updating or deleting a TODO, we check whether it exists. We do that to avoid returning 200 to the client when nothing was updated. MongoDB doesn’t return an error if no document matched the query of an update or delete, so this check is how we return an error in those cases.
Conclusion
Now we have a working version of our TODO list service. It is really simple, but it validates schemas, connects to MongoDB, and retrieves, updates, creates and deletes TODOs. What we could do now is start writing some unit tests to guarantee that everything is working fine (we should do that), but that is not the purpose of this project.
In the next part, I’ll explain how we can configure uWSGI, NGINX and Docker in order to run this behind a reverse proxy. Also, we are going to see how we can wrap things up and run it on Amazon Fargate with a CI/CD Pipeline on CircleCI.