AWS SQS Visibility Timeout Explained
Let’s take a simple example of a producer sending a message to a consumer via an SQS queue.
When the producer sends a message to SQS, it is stored in the queue until consumed by a consumer. When the consumer is ready, it polls SQS for new messages and ultimately receives the message.
Once a message is received by a consumer, SQS doesn’t automatically delete it, because SQS has no way to guarantee that the consumer actually received the message. The message might get lost in transit, or the consumer might fail while processing it.
So the consumer must delete the message from the queue after receiving and processing it.
The consumer might take some time to process the message. While the message is being processed, there’s a chance that another consumer receives the same message and processes it too. How does SQS prevent a message from being processed more than once?
Let’s have a look.
The Visibility Timeout
While a consumer is processing a message in the queue, SQS temporarily hides the message from other consumers.
This is done by setting a visibility timeout on the message, a period of time during which SQS prevents other consumers from receiving and processing the message.
The default visibility timeout for a message is 30 seconds.
How does it work?
The visibility timeout begins when SQS hands over a message to the consumer. During this time, the consumer has to do two things:
1. Complete the processing of the message.
2. Delete the message from the queue.
However, if the consumer fails before deleting the message, the visibility timeout will expire and the message becomes visible to other consumers for receiving.
If a message should be processed only once, the consumer must delete it before the visibility timeout expires.
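The receive, process, and delete cycle described above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes `sqs` is a boto3 SQS client (for example, one created with `boto3.client("sqs")`), and the names `consume_once`, `queue_url` and `handler` are hypothetical and chosen only for illustration.

```python
def consume_once(sqs, queue_url, handler, wait_seconds=10):
    """Receive at most one message, process it, then delete it.

    Assumption: `sqs` is a boto3 SQS client and `handler` is your own
    processing function. Returns True if a message was processed and
    deleted, False if nothing was received.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=wait_seconds,  # long polling
    )
    # The "Messages" key is absent when no message was received.
    for message in response.get("Messages", []):
        # The visibility timeout is already running at this point.
        handler(message["Body"])
        # Reached only if handler() did not raise. Otherwise the message
        # is never deleted and becomes visible again once the timeout expires.
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        return True
    return False
```

If `handler` raises, the exception propagates and the delete call is skipped, so the message reappears for other consumers after the visibility timeout.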
Configuring the Visibility Timeout
Every SQS queue has a default visibility timeout of 30 seconds. The minimum is 0 seconds and the maximum is 12 hours.
This means your consumer can take up to 12 hours to process and delete a message.
You can change the default value for the entire queue or on a per-message basis.
Configure for the entire queue
Usually, the visibility timeout should be set to the maximum time that it takes your consumer to process and delete a message from the queue. You can run a series of load tests on the consumer to get a rough idea of how long that takes.
If you know this value beforehand, you can set it when creating the queue. In the AWS Console, you can use the Visibility timeout field to set the visibility timeout for all the messages in the queue.
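The same queue-wide setting can also be applied programmatically. The sketch below assumes `sqs` is a boto3 SQS client and uses its `set_queue_attributes` call; the helper name `set_queue_visibility_timeout` is hypothetical. The range check mirrors the limits mentioned above (0 seconds to 12 hours).

```python
MAX_VISIBILITY_TIMEOUT = 12 * 60 * 60  # 12 hours, in seconds


def set_queue_visibility_timeout(sqs, queue_url, seconds):
    """Set the default visibility timeout for every message in the queue.

    Assumption: `sqs` is a boto3 SQS client.
    """
    if not 0 <= seconds <= MAX_VISIBILITY_TIMEOUT:
        raise ValueError("visibility timeout must be between 0 and 43200 seconds")
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        # Queue attribute values are passed as strings.
        Attributes={"VisibilityTimeout": str(seconds)},
    )
```

For example, `set_queue_visibility_timeout(sqs, queue_url, 90)` would give consumers 90 seconds per message before it becomes visible again.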
Configure on a per-message basis
There can be situations where you don’t know how long it will take to process a message. For example, a consumer in the middle of processing a message may find that it needs additional time to finish. In that case, the default visibility timeout set for the entire queue is insufficient.
In a situation like that, SQS allows you to set a special visibility timeout for the received messages without changing the overall queue timeout.
The consumer can call the ChangeMessageVisibility operation of the SQS API with a new value to shorten or extend the visibility timeout for a message.
For example, you can specify an initial visibility timeout of, say, 2 minutes and then, for as long as your consumer is still working on the message, reset the visibility timeout to 2 minutes every minute. Each call to ChangeMessageVisibility counts the new timeout from the moment of the call.
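One way to sketch this "heartbeat" pattern is a background thread that keeps resetting the timeout while the main thread works. This is an illustrative sketch under stated assumptions: `sqs` is a boto3 SQS client, `message` is one entry from a `receive_message` response, and `process_with_heartbeat` and `handler` are hypothetical names. The defaults match the example above (2-minute timeout, refreshed every minute).

```python
import threading


def process_with_heartbeat(sqs, queue_url, message, handler,
                           timeout=120, interval=60):
    """Process a message while periodically extending its visibility timeout."""
    stop = threading.Event()

    def heartbeat():
        # Event.wait() returns False when the interval elapses and
        # True as soon as stop is set, which ends the loop.
        while not stop.wait(interval):
            sqs.change_message_visibility(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
                VisibilityTimeout=timeout,
            )

    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        handler(message["Body"])
    finally:
        # Stop extending whether processing succeeded or failed.
        stop.set()
        worker.join()
    # Reached only on success; a failed message is left to time out.
    sqs.delete_message(
        QueueUrl=queue_url,
        ReceiptHandle=message["ReceiptHandle"],
    )
```

Using a thread keeps the handler itself unaware of SQS. If the handler fails, the heartbeat stops and the message becomes visible again after at most `timeout` seconds.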
Calling off the Visibility Timeout
When a consumer doesn’t want to process and delete a message, it can tell SQS to end the visibility timeout for that message immediately. This can be done by calling the ChangeMessageVisibility API operation with VisibilityTimeout set to 0 seconds.
This makes the message immediately visible to other consumers in the system and available for processing.
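As a small sketch, releasing a message looks like this. It assumes `sqs` is a boto3 SQS client; `release_message` is a hypothetical helper name.

```python
def release_message(sqs, queue_url, receipt_handle):
    """Make a received message immediately visible to other consumers.

    Assumption: `sqs` is a boto3 SQS client and `receipt_handle` comes
    from the message returned by receive_message.
    """
    sqs.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle,
        VisibilityTimeout=0,  # end the visibility timeout now
    )
```

A consumer might call this when it recognizes early that it cannot handle a message, rather than holding it until the timeout runs out.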
Best practices when setting the Visibility Timeout
Setting the visibility timeout depends on how long it takes your consumer to process and delete a message.
If you set it too high, a message whose processing attempt failed stays hidden for a relatively long time before any consumer can attempt to process it again.
Conversely, if you set it too low, another consumer may receive the same message while the original consumer is still working on it, resulting in duplicate processing.
Thus, the best strategy is to either:
1. Set the timeout to the maximum time that it takes the consumer to process and delete a message.
Or
2. If you don’t know this value beforehand, set the timeout to a baseline value and keep on extending it until your consumer is done with the message.
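For the first strategy, the timeout can be derived from processing times measured during load tests. The helper below is only a sketch: `suggest_visibility_timeout` is a hypothetical name, and the `safety_factor` headroom is an assumption added for illustration, not something SQS prescribes. The result is capped at the 12-hour maximum.

```python
import math

MAX_VISIBILITY_TIMEOUT = 12 * 60 * 60  # 12 hours, in seconds


def suggest_visibility_timeout(durations, safety_factor=1.5):
    """Suggest a timeout (whole seconds) from measured processing times.

    `durations` holds observed process-and-delete times in seconds.
    `safety_factor` is an illustrative assumption that adds headroom
    on top of the slowest observed run.
    """
    if not durations:
        raise ValueError("at least one measured duration is required")
    suggested = math.ceil(max(durations) * safety_factor)
    return min(suggested, MAX_VISIBILITY_TIMEOUT)
```

With measured times of 4.2, 7.9 and 12.0 seconds and the default factor, this suggests an 18-second timeout.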