
AWS SQS Priority Queue Pattern

11 min read · Jul 20, 2024


Many companies incorporate the concept of “priority” into their systems and architecture. This concept can manifest in various ways, such as prioritizing requests from paid customers over those from unpaid customers, or giving precedence to certain types of notifications in a notification system. The idea of priority is versatile and can be applied across different technologies and patterns.

In this blog post, we will explore how to implement a Serverless Priority System using AWS services. We will discuss two different flows for handling task prioritization and how they can help you efficiently allocate resources and maintain system responsiveness.

Understanding the Problem

Before we dive into our implementation, let’s first understand the problem we’ll be addressing using the Priority Queue Pattern.

Consider a content-sharing platform where users upload videos that need to be processed before they are made available to viewers. This processing includes tasks like resolution adjustment, compression, and adding watermarks. The platform receives videos of varying lengths and qualities from a diverse user base, including both regular users and premium users who pay for enhanced features and faster processing.

Our task is to create a system component that satisfies three requirements:

  1. Prioritization: Premium users expect faster processing times as part of their subscription benefits. Therefore, we need to prioritize their video processing over videos from regular users.
  2. Resource Efficiency: Efficiently manage compute resources to handle periods of high upload activity without causing long wait times, particularly for premium users.
  3. Scalability: Ensure the system scales effectively during high upload periods without incurring unnecessary costs during quieter periods.

When considering this problem and how to develop a solution, the main challenge is efficiently distributing our available workers (AWS Lambda Functions) to ensure both free and paid user-uploaded videos are processed. At the same time, we must ensure that paid users do not experience the same wait times as free users for their video processing. This involves creating a priority system that can dynamically allocate resources based on user type and current system load.

Selecting Our Services

To address the high volume of user requests and efficiently manage video processing, we selected several AWS technologies to create a fully serverless architecture. Here are the services we chose:

  1. AWS Lambda for Video Processing: Lambda functions will be used to process the videos, performing tasks such as resolution adjustment, compression, and adding watermarks.
  2. Amazon SQS (Simple Queue Service): SQS will act as a queue to hold the large volume of requests before they are handled by the Lambda workers. This ensures that video processing requests are managed in an orderly fashion and that the system can scale to handle high volumes of uploads.
  3. Amazon S3 (Simple Storage Service): S3 will be used as a blob storage solution for our videos. Uploaded videos will be stored in S3, where they can be accessed by the Lambda functions for processing and then served to users once processing is complete.
  4. Amazon DynamoDB: DynamoDB will store user data, file information, and other metadata related to the videos and their processing status. This provides a fast and scalable NoSQL database solution to support our application’s needs.
  5. Amazon EventBridge: EventBridge will be used to trigger various components of the system based on specific events. For example, it can invoke Lambda functions when new videos are uploaded or when processing is complete, ensuring seamless integration between different parts of the system.

High-Level Overview of the System

Now that we have selected all our services, let’s take a quick look at the overall flow.

System Flow

The system consists of several components, and it all starts when the user uploads a video to our S3 Storage. The video will be uploaded directly through a pre-signed URL. Here’s how the process works:

  1. Generate Pre-Signed URL: The user sends a request to the server to generate a pre-signed URL that will be used to upload the video file.
  2. Communicate with S3: The server communicates with the S3 bucket to generate the pre-signed URL.
  3. Store Metadata in DynamoDB: The server saves the file key along with the user ID in DynamoDB. This allows us to keep track of the uploaded videos and their respective users.
  4. Return Pre-Signed URL: The pre-signed URL is returned to the user.
  5. User Uploads Video: The user uses the pre-signed URL to upload the video directly from their device to the S3 bucket.
Initial Upload Flow
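
To make this upload flow concrete, here is a minimal Python (boto3) sketch of steps 1 through 4. The bucket name, table name, and key layout are hypothetical placeholders, and the DynamoDB table is assumed to use fileKey as its partition key:

import uuid

import boto3

s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")

# Hypothetical resource names for illustration only.
BUCKET_NAME = "video-uploads-bucket"
TABLE_NAME = "video-metadata"

def create_upload_url(user_id: str) -> dict:
    """Steps 1-4: generate a pre-signed URL and record the file metadata."""
    file_key = f"uploads/{user_id}/{uuid.uuid4()}.mp4"

    # Step 2: ask S3 for a pre-signed PUT URL (valid for 15 minutes).
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET_NAME, "Key": file_key},
        ExpiresIn=900,
    )

    # Step 3: store the file key and user ID so a later worker can
    # resolve the uploader from the S3 object key.
    dynamodb.Table(TABLE_NAME).put_item(
        Item={"fileKey": file_key, "userId": user_id, "status": "PENDING_UPLOAD"}
    )

    # Step 4: the client uses this URL to upload directly to S3 (step 5).
    return {"uploadUrl": url, "fileKey": file_key}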

Once a new video is uploaded to the S3 bucket, EventBridge is triggered. It notifies a Lambda worker, which pulls the file information from the event and fetches the uploader’s details using the S3 file key. Based on the user’s subscription level, the system prioritizes the request to handle the uploaded video.

EventBridge Payload

The EventBridge event payload for a new video upload looks similar to this:

{
  "id": "c3677697-xmpl-47b5-bdeb-5b5e5a7b0d66",
  "source": "aws.s3",
  "region": "us-east-1",
  "detail": {
    "userIdentity": {
      "type": "IAMUser",
      "principalId": "EXAMPLPINCIPALID",
      ...
    },
    "eventTime": "2022-07-20T21:28:49Z",
    "eventSource": "s3.amazonaws.com",
    "eventName": "PutObject",
    "awsRegion": "us-east-1",
    "requestParameters": {
      "bucketName": "example-bucket",
      "key": "filename.jpg",
      "requestSource": "PutObject"
    },
    "responseElements": {
      "x-amz-request-id": "EXAMPLE123456789",
      "x-amz-id-2": "ExAmPlEID2"
    },
    "requestID": "EXAMPLE123456789",
    "eventID": "59a4213f-xmpl-4261-9a78-aa3f1eEXAMPLE",
    "readOnly": false,
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "recipientAccountId": "123456789012"
  }
}

Lambda Worker Processing

  1. Fetch File Information: The Lambda worker fetches the file information from the event payload.
  2. Retrieve User Information: Using the S3 File Key, the Lambda function queries DynamoDB to retrieve information about the user who uploaded the video.
  3. Determine Priority: Based on the user’s subscription level (e.g., regular or premium), the system determines the priority of the video processing request.
  4. Queue the Task: The task is then placed in the appropriate SQS queue based on its priority. Premium users’ videos are placed in a high-priority queue, while regular users’ videos are placed in a low-priority queue.
File Uploaded Event-Driven System Architecture
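
Putting these four steps together, a minimal Python sketch of the routing worker might look like the following. The table name, queue URLs, and the "tier" attribute are assumptions for illustration:

import json

import boto3

dynamodb = boto3.resource("dynamodb")
sqs = boto3.client("sqs")

# Hypothetical names and URLs for illustration only.
TABLE_NAME = "video-metadata"
HIGH_PRIORITY_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/video-high-priority"
LOW_PRIORITY_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/video-low-priority"

def handler(event, context):
    # Step 1: pull the object key out of the EventBridge payload.
    file_key = event["detail"]["requestParameters"]["key"]

    # Step 2: resolve the uploader from the S3 file key.
    item = dynamodb.Table(TABLE_NAME).get_item(Key={"fileKey": file_key})["Item"]

    # Step 3: determine priority from the subscription level
    # (assumes a "tier" attribute stored alongside the file key).
    is_premium = item.get("tier") == "premium"

    # Step 4: place the task on the matching priority queue.
    queue_url = HIGH_PRIORITY_QUEUE_URL if is_premium else LOW_PRIORITY_QUEUE_URL
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps({"fileKey": file_key, "userId": item["userId"]}),
    )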

Now that we have everything in place and the messages are arriving at the correct priority queues, we need to implement priority queue logic to ensure that messages are processed in the correct order of priority.

How Does Priority Queue Flow Work?

Priority queues can be implemented in two different ways or patterns, each with its own advantages and trade-offs. Let’s explore these two flows:

Flow 1: Priority At the Queue Level

AWS allows us to specify the maximum number of concurrent Lambda invocations per SQS queue. This “maximum concurrency” setting is configured on the SQS trigger (the event source mapping) rather than on the Lambda function itself, giving you granular control over how resources are allocated to different queues.

SQS Configuration Screen

Let’s assume we have a total concurrency limit of 1000 Lambda functions for our AWS account, and we need to allocate these resources between a high-priority queue and a low-priority queue.

  • High Priority Queue: Allocate 900 concurrent invocations
  • Low Priority Queue: Allocate 100 concurrent invocations

By setting these limits, we ensure that the high-priority queue can handle up to 900 concurrent Lambda executions. This allocation allows the high-priority queue to process a large volume of tasks efficiently. On the other hand, the low-priority queue is restricted to 100 concurrent executions, ensuring it cannot overwhelm the system and consume the resources needed by the high-priority queue.

This flow guarantees that your available worker capacity is divided predictably between the two priority levels. The low-priority queue is capped at 100 concurrent invocations, preventing it from consuming the entire pool of available resources, while the high-priority queue can scale up to 900 concurrent invocations, handling a significantly higher volume of tasks.
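
As a sketch, these per-queue caps can be set through the maximum concurrency option on each event source mapping. Here is what that might look like with boto3; the queue ARNs and function name are placeholders:

import boto3

lambda_client = boto3.client("lambda")

# Hypothetical ARNs and function name for illustration only.
HIGH_PRIORITY_QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:video-high-priority"
LOW_PRIORITY_QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:video-low-priority"

# High-priority queue: may scale up to 900 concurrent invocations.
lambda_client.create_event_source_mapping(
    EventSourceArn=HIGH_PRIORITY_QUEUE_ARN,
    FunctionName="video-processor",
    ScalingConfig={"MaximumConcurrency": 900},
)

# Low-priority queue: capped at 100 concurrent invocations.
lambda_client.create_event_source_mapping(
    EventSourceArn=LOW_PRIORITY_QUEUE_ARN,
    FunctionName="video-processor",
    ScalingConfig={"MaximumConcurrency": 100},
)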

Pros & Cons of this implementation:

PROS:

  • Simplicity: The architecture provides a straightforward method for developing and monitoring priority tasks, reducing complexity in implementation.
  • Infrastructure-Based Control: This approach operates on the infrastructure level, eliminating the need for additional logic within the Lambda function when adding a new priority queue. You simply provide a new SQS queue, attach it to the Lambda consumer, and assign the appropriate concurrency based on priority.
  • Concurrent Handling: This method allows for the simultaneous handling of both high and low-priority queues, ensuring that neither queue is neglected.

CONS:

  • Fixed Allocation: This approach can be limiting when the high-priority queue requires more resources than allocated. It cannot borrow resources from the lower-priority queue, potentially leading to bottlenecks as we add more priorities to the system.
  • Resource Wastage: In cases where the high-priority queue is not fully utilized, the allocated resources might remain idle, resulting in inefficiency and underutilization of available capacity.

Flow 2: Priority At the Consumer Level

At this level of implementation, the concept of priority is handled differently.

Here, the priority is managed based on the availability of records in the high-priority queue. The idea is to ensure that as long as there are records in the high-priority queue, they are processed first. Only after all high-priority tasks are handled do we start polling from the low-priority queue.

The following diagram shows how we would expect to handle the priority queue in a perfect world:

However, in the real world, the implementation behaves as this diagram shows:

There is no guarantee that messages will be handled in the right order from the right queue.

We can work around this limitation with the following approach:

  1. We point both queues to the same Lambda Function.
  2. When the function is invoked, check if its source is the High-Priority queue. If it is, process it.
  3. If it isn’t, check if messages are available on the high-priority queue. If there are, return the current message to its queue. If there aren’t, process the message.
  4. Repeat.

The following diagram explains how this logic works:

Logic For Handling Priority At The Function Level
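
Here is a minimal Python sketch of this logic. It assumes both queues trigger the same function and that each event source mapping has partial batch responses (ReportBatchItemFailures) enabled, so that messages reported as “failed” return to their queue; the ARN and URL values are placeholders:

import boto3

sqs = boto3.client("sqs")

# Hypothetical ARN/URL values for illustration only.
HIGH_PRIORITY_QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:video-high-priority"
HIGH_PRIORITY_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/video-high-priority"

def high_priority_backlog() -> int:
    """Approximate number of messages waiting on the high-priority queue."""
    attrs = sqs.get_queue_attributes(
        QueueUrl=HIGH_PRIORITY_QUEUE_URL,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    return int(attrs["Attributes"]["ApproximateNumberOfMessages"])

def process_video(body: str) -> None:
    ...  # resolution adjustment, compression, watermarking, etc.

def handler(event, context):
    failures = []
    for record in event["Records"]:
        from_high_priority = record["eventSourceARN"] == HIGH_PRIORITY_QUEUE_ARN

        # A low-priority message yields whenever high-priority work is
        # waiting: reporting it as failed returns it to its queue after the
        # visibility timeout expires, without affecting the rest of the batch.
        if not from_high_priority and high_priority_backlog() > 0:
            failures.append({"itemIdentifier": record["messageId"]})
            continue

        process_video(record["body"])

    return {"batchItemFailures": failures}

Note that a re-queued low-priority message only becomes visible again after its visibility timeout, so that timeout directly controls how quickly deferred messages are retried.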

Until now, we have been using non-FIFO (standard) queues. However, when implementing the priority queue logic, there is a significant challenge related to the SQS GetQueueAttributes API call. This API call is used to determine the number of messages available in a queue, but it is eventually consistent. This means there is a considerable chance that we might process some messages from the low-priority queue even when high-priority messages are available. This is a limitation of the SQS API.

The documentation clearly defines this behavior:

The ApproximateNumberOfMessagesDelayed, ApproximateNumberOfMessagesNotVisible, and ApproximateNumberOfMessages metrics may not achieve consistency until at least 1 minute after the producers stop sending messages. This period is required for the queue metadata to reach eventual consistency.

In our use case, eventual consistency manifests in two ways:

  1. When messages are placed on the high-priority queue, the ApproximateNumberOfMessages value generally becomes larger than 0 after a few seconds.
  2. When all messages on the high-priority queue are processed, the ApproximateNumberOfMessages value generally resets to 0 after a few seconds.

Eventual consistency is less of a problem with FIFO queues, which are therefore preferred for this kind of implementation.

Pros & Cons of this implementation:

PROS:

  • Prioritization: This flow prioritizes all messages from the high-priority queue, ensuring that resources are efficiently allocated to handle high-priority tasks first.
  • No Wasted Resources: With this approach, resources are fully utilized. If the high-priority queue is empty or has a low volume of messages, resources will be used to handle low-priority tasks, ensuring efficient utilization.

CONS:

  • Delay of Low-Priority Handling: Messages from the low-priority queue will experience delays in processing as the system prioritizes high-priority messages. Additionally, the need to check and potentially re-queue low-priority messages adds latency, potentially slowing down overall processing times.
  • SQS API Limitation: Due to SQS’s eventual consistency model, there might be delays in recognizing new high-priority messages. This can lead to situations where low-priority tasks are processed ahead of high-priority ones.
  • Incremental Cost: This flow involves additional API calls and message re-queuing, which can increase overall costs. Each batch containing a low-priority message requires an API call, contributing to higher billing.
  • Less Flexibility in Adding New Priorities: Adding a new priority queue in this flow requires coordination between the infrastructure team and the development team. This necessitates changes to the Lambda function logic and infrastructure setup, making it less agile in adapting to new priority levels.

What We Achieved with These Flows

With the new flow we’ve created, we were able to separate the processing of each task by assigning it a “Priority.” This priority moves the task into one of two queues (or potentially more, depending on how many priorities your system has). For each priority level, we looked at two different flows to handle the messages; AWS’s SQS features made building either flow straightforward.

1. Prioritization:

  • Flow 1: By allocating dedicated concurrency limits at the queue level, we ensured that high-priority tasks always have sufficient resources. This straightforward setup allows us to manage priorities effectively without additional complexity in the Lambda function logic.
  • Flow 2: This flow dynamically prioritizes high-priority messages by processing them first and only handling low-priority messages when no high-priority tasks are pending. This ensures that high-priority tasks receive immediate attention, although it introduces complexity and potential latency.

2. Resource Efficiency:

  • Flow 1: Provides a fixed allocation of resources, which ensures that high-priority tasks are not starved of resources by lower-priority tasks. However, this can lead to potential resource wastage if high-priority queues are underutilized.
  • Flow 2: Utilizes resources dynamically, ensuring that all available resources are used efficiently. When high-priority queues are empty, resources are automatically redirected to handle low-priority tasks, minimizing wastage.

3. Scalability:

  • Flow 1: Allows for independent scaling of each priority level by adjusting the concurrency limits for each queue. This can be done easily, enabling quick responses to changing demand.
  • Flow 2: Supports scalability through dynamic resource allocation, allowing the system to adapt to varying volumes of high and low-priority tasks. However, this requires careful management of Lambda function logic and SQS configurations to maintain performance and efficiency.

In summary, our new flows ensure that high-priority tasks receive the attention and resources they need while still efficiently processing lower-priority tasks. This setup offers a flexible, scalable, and resource-efficient solution that meets our operational requirements.

Conclusion

Many companies implement the concept of priority in their system architectures in various ways. In this post, we explored two possible implementations for prioritizing tasks using AWS services, each with its own advantages and trade-offs.

It’s now up to you to determine which flow best suits your needs. Have you ever implemented task priority in your system? If so, have you used any of the flows discussed above? I would love to hear your experiences and insights on this topic.

Follow Me For More Content

If you made it this far and want to receive more content like this, make sure to follow me on Medium and on LinkedIn.


Written by Joud W. Awad

AWS Community Builder, Principal Software Engineer and Solutions Architect with 10+ years in backend, AWS Cloud, DevOps, and mobile apps.
