How to Deploy Highly Scalable MQTT Broker to AWS (HiveMQ)

Published in Geek Culture · 7 min read · Mar 22, 2021

What is MQTT?

MQTT (Message Queuing Telemetry Transport) is a messaging protocol designed for devices with limited network bandwidth. It is widely used for IoT device communication because it is extremely lightweight and uses a publish/subscribe model with bi-directional capabilities. It is an excellent choice for wireless networks that suffer from latency, bandwidth constraints, or unreliable links. It also reduces complexity on the client side by pushing the resource-heavy logic to the server side.

Why MQTT?

Most developers are already familiar with web services, so an obvious approach is to build one for the IoT use case and let the devices talk to the back-end infrastructure through API calls. This has some disadvantages. HTTP is a synchronous protocol: the client has to wait for the response from the server, which can hurt the scalability of the service, especially since IoT devices are likely to sit on unreliable networks. HTTP communication is also one-way in the sense that only the client can initiate a request, whereas in the IoT world two-way communication lets devices receive commands passively. Finally, HTTP is a heavyweight protocol that is poorly suited to constrained networks.
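To put a rough number on the "heavyweight" claim, here is a minimal sketch that builds the bytes of one MQTT PUBLISH packet (QoS 0) and one deliberately small HTTP POST carrying the same reading. The topic, path, host name, and payload are made up for illustration, and the comparison ignores the one-time connection setup that both protocols need.

```python
import struct


def mqtt_publish_packet(topic: str, payload: bytes) -> bytes:
    """Encode a QoS 0 PUBLISH packet (sketch: body must stay under 128 bytes)."""
    topic_bytes = topic.encode("utf-8")
    # Variable header = 2-byte topic length + topic; QoS 0 has no packet identifier.
    body = struct.pack("!H", len(topic_bytes)) + topic_bytes + payload
    if len(body) > 127:
        raise ValueError("sketch only handles a one-byte remaining length")
    # 0x30 = PUBLISH with QoS 0, DUP and RETAIN flags cleared.
    return bytes([0x30, len(body)]) + body


def http_post_request(path: str, payload: bytes) -> bytes:
    """Build a minimal HTTP/1.1 POST; real requests usually carry more headers."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        "Host: example.com\r\n"  # hypothetical host
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload


reading = b"21.5"  # hypothetical sensor value
mqtt_bytes = mqtt_publish_packet("sensors/temperature", reading)
http_bytes = http_post_request("/sensors/temperature", reading)
print(len(mqtt_bytes), len(http_bytes))
```

Even against this stripped-down HTTP request, the MQTT packet is a fraction of the size, and the gap usually widens once real-world HTTP headers are added.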

Advantages of MQTT

Scalability
Reliability
Lightweight and Efficient
Bi-directional communication
Support for unreliable network connectivity
Security
Binary Protocol

How does it work?

The most important thing to note is that MQTT requires TCP/IP. It uses persistent TCP connections and supports security over TLS. The protocol follows the publish/subscribe pattern, which decouples the clients from the back-end services. The entities that send data are called publishers, and the entities that consume that data are called subscribers. Publishers and subscribers are not connected to each other directly; a third component, the broker, sits between them and handles all the connection overhead. To communicate, a publisher or subscriber only needs to know the hostname and port of the broker.
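The decoupling is easier to see in code. The following is a toy, in-memory stand-in for a broker (no networking, no real MQTT); the class name ToyBroker and the topic names are my own inventions for illustration. Notice that the publisher never holds a reference to the subscriber, only to the broker.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Handler = Callable[[str, bytes], None]


class ToyBroker:
    """In-memory illustration of the broker role; not a real MQTT broker."""

    def __init__(self) -> None:
        # topic name -> callbacks of the clients subscribed to that topic
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        """Forward the payload to every subscriber of the topic; return the count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


received: List[Tuple[str, bytes]] = []
broker = ToyBroker()

# Subscriber side: registers interest in a topic with the broker.
broker.subscribe("sensors/temperature", lambda t, p: received.append((t, p)))

# Publisher side: only talks to the broker and knows nothing about subscribers.
delivered = broker.publish("sensors/temperature", b"21.5")
ignored = broker.publish("sensors/humidity", b"40")  # nobody subscribed here

print(delivered, ignored, received)
```

A real broker such as HiveMQ does the same routing over TCP connections, plus sessions, QoS handling, and authentication.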

Things to know before diving in…

Quality of Service (QoS):

QoS defines the delivery guarantee for a specific message. It is a key feature of MQTT, letting each client choose the level of service it requires. There are three QoS levels.

QoS 0 (At most once): This is the minimal level of service. The message is delivered to the recipient at most once, with no guarantee of delivery. Messages are neither persisted nor retransmitted by the sender.

QoS 1 (At least once): The message is guaranteed to reach the recipient at least once. The sender stores the message until it gets a PUBACK packet from the receiver, and retransmits it if the acknowledgement does not arrive. As a result, message duplication is possible (a single message can be delivered more than once).

QoS 2 (Exactly once): This is the maximum level of service. The message is delivered to the recipient exactly once. The sender publishes a PUBLISH packet with a packet identifier, which the sender and receiver both use to coordinate delivery of the message. When the receiver gets a PUBLISH packet with QoS level 2, it processes the message and replies with a PUBREC packet that acknowledges the PUBLISH packet. Once the sender receives the PUBREC packet, it can safely discard the initial PUBLISH packet, and it responds with a PUBREL packet. After the receiver gets the PUBREL packet, it discards all the stored state for that packet identifier and responds with a PUBCOMP packet. After the sender receives the PUBCOMP packet, the packet identifier is available for reuse.
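The four-packet QoS 2 exchange can be sketched as a tiny simulation. This is a toy model with no networking: the class and function names are hypothetical, packets are represented by their names as strings, and the receiver here hands the message to the application on the first PUBLISH (a real implementation may defer that until PUBREL). The point is to show how the packet identifier stops a retransmitted PUBLISH from being delivered twice.

```python
from typing import List, Set


class Qos2Receiver:
    """Receiver side of the QoS 2 handshake (toy model)."""

    def __init__(self) -> None:
        self.delivered: List[bytes] = []       # payloads handed to the application
        self._awaiting_pubrel: Set[int] = set()  # packet identifiers in progress

    def on_publish(self, packet_id: int, payload: bytes) -> str:
        # Deliver only the first time this packet identifier is seen.
        if packet_id not in self._awaiting_pubrel:
            self._awaiting_pubrel.add(packet_id)
            self.delivered.append(payload)
        return "PUBREC"

    def on_pubrel(self, packet_id: int) -> str:
        # The sender released the message: drop the stored state for this id.
        self._awaiting_pubrel.discard(packet_id)
        return "PUBCOMP"


def send_qos2(receiver: Qos2Receiver, packet_id: int, payload: bytes,
              retransmit: bool = False) -> List[str]:
    """Run one QoS 2 exchange; retransmit=True simulates a lost first PUBREC."""
    trace: List[str] = []
    for _ in range(2 if retransmit else 1):
        trace.append("PUBLISH")
        reply = receiver.on_publish(packet_id, payload)
    trace.append(reply)                           # PUBREC reaches the sender
    trace.append("PUBREL")                        # sender releases the message
    trace.append(receiver.on_pubrel(packet_id))   # receiver answers PUBCOMP
    return trace


receiver = Qos2Receiver()
trace = send_qos2(receiver, packet_id=1, payload=b"valve=open", retransmit=True)
print(trace)
print(receiver.delivered)
```

Even though PUBLISH was sent twice, the application sees the payload once. With QoS 1 there is no such bookkeeping on the receiver, which is why duplicates can get through.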

Keep Alive Time: This is the maximum time interval that is permitted to elapse between packets sent by the client to the broker. If no packet arrives in time (the MQTT specification gives the broker one and a half times the keep-alive period before it must act), the broker forcefully closes the connection with the client. To keep a long-running connection open, the client sends a PINGREQ packet to the broker and the broker responds with a PINGRESP packet. This way clients can stay connected to the broker for long durations without sending or receiving any actual messages.

Note: PINGREQ packets are sent automatically by most of the popular MQTT client libraries, without any additional implementation on the client side.
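Here is a small sketch of the keep-alive bookkeeping on both sides. The two packet constants are the actual two-byte PINGREQ and PINGRESP encodings; the function names and the 75% safety margin on the client side are my own assumptions (client libraries pick their own timing), while the 1.5x grace period on the broker side comes from the MQTT specification.

```python
PINGREQ = bytes([0xC0, 0x00])   # packet type 12, remaining length 0
PINGRESP = bytes([0xD0, 0x00])  # packet type 13, remaining length 0


def client_should_ping(idle_seconds: float, keep_alive: int,
                       margin: float = 0.75) -> bool:
    """True when the client has been silent long enough that it should ping.

    The margin is an assumption of this sketch: ping a bit early so the
    PINGREQ arrives before the keep-alive interval actually runs out.
    """
    return keep_alive > 0 and idle_seconds >= keep_alive * margin


def broker_should_disconnect(idle_seconds: float, keep_alive: int) -> bool:
    """True when the broker has heard nothing for more than 1.5x keep-alive."""
    # A keep-alive of 0 switches the mechanism off entirely.
    return keep_alive > 0 and idle_seconds > keep_alive * 1.5


keep_alive = 60  # seconds, chosen by the client when it connects
for idle in (30, 45, 90, 91):
    print(idle,
          client_should_ping(idle, keep_alive),
          broker_should_disconnect(idle, keep_alive))
```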

Publisher/subscriber Flow

The client establishes a TCP connection with the broker and then sends a CONNECT packet. At this point, the broker authenticates/authorizes the client if required and responds with a CONNACK packet carrying the appropriate status code.
The client can specify the keep-alive timeout for the established connection. If the client has nothing else to send before the keep-alive time elapses, it should send a PINGREQ packet to the broker, and the broker responds with a PINGRESP packet. If the client fails to send a PINGREQ packet or any other packet within the keep-alive time, the broker disconnects the client. Most of the MQTT client libraries handle the PING logic under the covers.
Subscribers can register to a certain topic by using SUBSCRIBE/SUBACK packets.
Subscribers can unregister from a certain topic by using UNSUBSCRIBE/UNSUBACK packets.
If a publisher/subscriber wants to terminate the MQTT session, it first sends a DISCONNECT message to the broker and then closes the connection.
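To make the first step of this flow concrete, the sketch below builds the bytes of a CONNECT packet and parses a CONNACK reply. I am assuming MQTT 3.1.1 (protocol level 4), a clean session, and no username/password; the client ID "demo-client" is a made-up example. In practice a client library does this for you.

```python
import struct


def encode_remaining_length(n: int) -> bytes:
    """MQTT variable-length integer: 7 bits per byte, high bit means 'more'."""
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80
        out.append(digit)
        if n == 0:
            return bytes(out)


def build_connect(client_id: str, keep_alive: int = 60) -> bytes:
    """Encode an MQTT 3.1.1 CONNECT packet with clean session and no credentials."""
    cid = client_id.encode("utf-8")
    variable_header = (
        struct.pack("!H", 4) + b"MQTT"    # protocol name, length-prefixed
        + bytes([0x04])                   # protocol level 4 = MQTT 3.1.1
        + bytes([0x02])                   # connect flags: clean session only
        + struct.pack("!H", keep_alive)   # keep-alive in seconds
    )
    payload = struct.pack("!H", len(cid)) + cid
    body = variable_header + payload
    # 0x10 = CONNECT packet type in the fixed header
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


def parse_connack(packet: bytes) -> int:
    """Return the CONNACK return code (0 means the connection was accepted)."""
    if len(packet) != 4 or packet[0] != 0x20 or packet[1] != 0x02:
        raise ValueError("not a CONNACK packet")
    return packet[3]


connect_packet = build_connect("demo-client", keep_alive=60)
print(connect_packet.hex())
print(parse_connack(bytes([0x20, 0x02, 0x00, 0x00])))
```

SUBSCRIBE, UNSUBSCRIBE, and DISCONNECT follow the same pattern: a fixed header with the packet type, a remaining-length field, and then the packet-specific body.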

Let’s dive into the fun stuff…

Steps to deploy MQTT broker on AWS:

Today we will be deploying a pre-built image of a HiveMQ broker on an AWS EC2 instance.

Step 1: Log in to the AWS console, navigate to the EC2 service, and click on launch instance.

Step 2: Click on the community AMI tab, then search for and select the HiveMQ AMI.

Step 3: Select the instance type (m5.xlarge or c5.xlarge are the minimum requirements from HiveMQ) and click next.

Step 4: Configure the instance details if you want to make any explicit changes. I am leaving them as is for this deployment.

Step 5: Select the storage size (size should not be less than 20GB) and click next.

Step 6: Add instance tags if any and click next.

Step 7: Configure the Security Group and click next. We need three ports enabled in order to communicate with the broker:

SSH port 22 for connecting to the instance to make any changes.
TCP port 80 for the broker dashboard.
TCP port 1883 to enable client connections.

Step 8: Review the instance configuration and launch.

Step 9: Select an existing key pair or create a new key pair for connecting to the instance via SSH.

Step 10: Go to the EC2 dashboard and copy the public DNS URL.

Step 11: Log in to the HiveMQ dashboard using the public DNS URL.

The default credentials are user: admin password: hivemq

Step 12: You can now connect to the broker using any MQTT client. I am using the MQTT Box client to connect to the HiveMQ broker.

Once the client is successfully connected you can see the connection status on the broker dashboard.

You can compose a test message to any topic. In this demo I will be sending a test message to AWS_Test_Topic. I have also configured a subscriber to this test topic.
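If a client refuses to connect, a quick first check is whether the ports opened in the security group (22, 80, and 1883 in this walkthrough) are actually reachable from your machine. The helper below is a small sketch using only the Python standard library; the function names are my own, and you would pass in the public DNS name you copied from the EC2 dashboard.

```python
import socket
from typing import Dict, Iterable

# Ports opened in the security group in this walkthrough.
HIVEMQ_PORTS = (22, 80, 1883)


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_ports(host: str, ports: Iterable[int] = HIVEMQ_PORTS,
                timeout: float = 3.0) -> Dict[int, bool]:
    """Check each port and return a mapping of port -> reachable."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

Calling check_ports with your instance's public DNS name returns a dictionary such as {22: True, 80: True, 1883: False}, which would point at a missing inbound rule for the MQTT port. A successful TCP connection only proves the port is open; it does not confirm that the broker accepted an MQTT session.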