Exploring Parallel Processing in AWS Lambda: Leveraging SQS and Other Approaches

Ehis Idialu
Published in SSENSE-TECH · 7 min read · Jun 16, 2023

AWS Lambda is a serverless computing service for building event-driven applications without managing infrastructure. It automatically provisions and scales computing resources, enabling real-time responsiveness to events, which makes it ideal for unpredictable workloads and traffic spikes. You can read more about Lambda computing here.

While AWS Lambda is often associated with small, short-lived tasks, it can also be used to process large amounts of data. However, doing so introduces a range of challenges, including managing resource consumption to stay within Lambda’s limits and optimizing data processing and storage strategies to handle large data volumes. Even after addressing these challenges, the complexity of processing large amounts of data can create additional problems such as system failures, increased latency, and higher costs.

Single Compute Architecture vs. Big Data

Figure 1. Event-driven architecture for processing large CSV files using AWS Lambda and S3 in a single compute.

In this particular use case, we have an event-driven architecture where a Lambda function is invoked in response to a CSV upload to the S3 bucket, and this single Lambda function bears the responsibility of executing all the necessary operations. The workflow entails parsing the incoming data, iterating through the voluminous dataset to execute the relevant operations on each row, and ultimately consolidating the results into a bulk upsert parameter to be mass inserted into the datastore for durability.
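To make the flow concrete, here is a minimal sketch of that single-Lambda pipeline. The column name `sku`, the helper names, and the stubbed S3/database steps are all assumptions for illustration, not the actual implementation.

```javascript
// Parse a CSV string into row objects, using the first line as the header.
function parseCsv(text) {
  const [header, ...lines] = text.trim().split("\n");
  const columns = header.split(",");
  return lines.map((line) => {
    const values = line.split(",");
    return Object.fromEntries(columns.map((col, i) => [col, values[i]]));
  });
}

// Build a bulk-upsert parameter from the parsed rows, keyed on "sku"
// (an assumed column name for this example).
function toBulkUpsert(rows) {
  return rows.map((row) => ({
    updateOne: {
      filter: { sku: row.sku },
      update: { $set: row },
      upsert: true,
    },
  }));
}

// In the real handler the CSV would come from S3 (s3.getObject) and the
// operations would go to the datastore (e.g. collection.bulkWrite).
const rows = parseCsv("sku,price\nA1,100\nB2,250");
console.log(toBulkUpsert(rows).length); // 2 upsert operations
```

The entire file is parsed, transformed, and written in one invocation, which is exactly where the limits discussed below start to bite.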

Even though the operations may appear straightforward, there is a significant possibility that they are synchronous. This means the system has to wait while these operations run, increasing the total execution time. Additionally, if these operations rely on external services beyond the compute context, they can cause our Lambda function to fail, especially if they exceed the function’s maximum timeout of 900 seconds (15 minutes).

In terms of persisting the processed data, let’s consider using DocumentDB in this example. There are a couple of issues that can arise:

  1. A lambda function may encounter timeouts when processing a large dataset that demands extensive compute resources. As a result, incomplete data can be stored in the database, leading to failed or incomplete bulk writes.
  2. DocumentDB has a limit on the number of requests that can be made per second. If a large number of requests are made at once, it can exceed this limit and lead to throttling.

To avoid these issues, it’s important to design your system with scalability in mind. This may involve breaking up bulk operations into smaller batches, optimizing your lambda functions to process documents quickly, exploring multi-threading, and using AWS services like Amazon SQS to manage concurrency.
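The first of those ideas, breaking bulk operations into smaller batches, can be as simple as a chunking helper (a generic sketch, not tied to any particular datastore):

```javascript
// Split a large list of operations into fixed-size chunks that can be
// written (or enqueued) one batch at a time, instead of in one bulk call.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const operations = Array.from({ length: 25 }, (_, i) => i);
const batches = chunk(operations, 10);
console.log(batches.length); // 3 batches: 10, 10, and 5 operations
```

Each batch can then be written with its own retry and error handling, so one failure no longer jeopardizes the entire workload.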

Concurrency

Concurrency is the ability of a system to make progress on multiple tasks or requests at the same time. Managing concurrency is important for ensuring that your system can handle large workloads without experiencing timeouts or performance issues.

Multithreading

Multithreading is a powerful concept that enables the concurrent execution of multiple tasks, improving processing speed and mitigating timeouts and throttling. However, managing threads can be complex in languages without native multi-threading support. For example, JavaScript is single-threaded and executes code sequentially.
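A quick illustration of that limitation: even when CPU-bound work is wrapped in Promises, JavaScript still runs it to completion one task at a time, since Promise executors run synchronously on the single thread.

```javascript
// Each "task" does synchronous, CPU-bound work. Wrapping it in a Promise
// does not parallelize it: task 1 runs start-to-end before task 2 starts.
const order = [];

function cpuBoundTask(id) {
  return new Promise((resolve) => {
    order.push(`start-${id}`);
    let sum = 0;
    for (let i = 0; i < 1e6; i++) sum += i; // simulated CPU-bound work
    order.push(`end-${id}`);
    resolve(sum);
  });
}

Promise.all([cpuBoundTask(1), cpuBoundTask(2)]).then(() => {
  console.log(order.join(",")); // start-1,end-1,start-2,end-2 — no interleaving
});
```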

To overcome this limitation, AWS provides alternative solutions. One approach is a divide and conquer strategy using services like Amazon Simple Queue Service (SQS), AWS Step Functions, and AWS Batch. These services facilitate distributing workloads across multiple components, allowing parallel processing and efficient handling of large workloads. By leveraging these AWS services, developers can simulate multithreading behavior and achieve optimal performance even in languages without native support for multithreading. In this article, we will take a deep dive into Amazon Simple Queue Service (SQS).

Amazon Simple Queue Service (SQS)

One approach to managing concurrency in Lambda functions is to use Amazon SQS. This involves breaking up tasks into smaller chunks and adding them to a queue. A Lambda function can then retrieve tasks from the queue and process them at a controlled rate, ensuring that the concurrency limit is not exceeded. This approach also helps with managing errors and retries, as failed tasks can be routed to a dead-letter queue for later processing. For a more detailed exploration of concurrency control using Lambdas and SQS, you can refer to this article.
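As a rough sketch of the producer side, each task batch can be serialized into one SQS message. Note that `SendMessageBatch` accepts at most 10 entries per call, so the messages themselves must also be grouped; the AWS SDK v3 call is shown commented out so the sketch stays self-contained, and the queue URL would be your own.

```javascript
// Turn task batches into SQS message entries (one message per batch).
function toSqsEntries(taskBatches) {
  return taskBatches.map((batch, i) => ({
    Id: String(i), // entry id, must be unique within one SendMessageBatch call
    MessageBody: JSON.stringify(batch),
  }));
}

// SendMessageBatch accepts at most 10 entries per call, so group them.
function groupEntries(entries, size = 10) {
  const groups = [];
  for (let i = 0; i < entries.length; i += size) {
    groups.push(entries.slice(i, i + size));
  }
  return groups;
}

// const { SQSClient, SendMessageBatchCommand } = require("@aws-sdk/client-sqs");
// const sqs = new SQSClient({});
// for (const Entries of groupEntries(toSqsEntries(taskBatches))) {
//   await sqs.send(new SendMessageBatchCommand({ QueueUrl: queueUrl, Entries }));
// }

const taskBatches = Array.from({ length: 23 }, (_, i) => [i]);
console.log(groupEntries(toSqsEntries(taskBatches)).length); // 3 API calls
```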

Multiple Compute Architecture vs. Big Data

Figure 2. Event-driven architecture for processing large CSV files using AWS Lambda, Amazon Simple Queue Service and S3 in multiple compute.

In the diagram above, we have a second iteration of the system that uses SQS to split the workload across multiple Lambda functions. In this system we have two Lambdas: one is responsible for identifying the necessary operations and efficiently splitting them into batches to be sent to SQS, and the other consumes a single batch and performs the required operations on it.
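The consumer side can be sketched as a standard SQS-triggered handler; `processTask` is a hypothetical stand-in for the real per-row operation.

```javascript
// With an SQS event source, Lambda delivers messages in event.Records,
// each message body carrying one batch produced by the splitting Lambda.
const processed = [];

function processTask(task) {
  processed.push(task); // stand-in for the real per-row operation
}

async function handler(event) {
  for (const record of event.Records) {
    const batch = JSON.parse(record.body);
    for (const task of batch) {
      processTask(task);
    }
  }
}
// module.exports = { handler }; // how it would be exported in a real Lambda

// Local usage with a synthetic SQS event:
handler({ Records: [{ body: JSON.stringify(["A1", "B2"]) }] })
  .then(() => console.log(processed)); // processed now holds "A1" and "B2"
```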

This new approach has allowed us to break up the workload into smaller, more manageable components, reducing the overall processing time. By running the bulk of the operation in parallel, we have significantly decreased the likelihood of timeouts or throttling, which can negatively impact system performance.

Overall, this updated architecture demonstrates the importance of designing systems with scalability and concurrency in mind. By exploring approaches like multithreading and leveraging AWS services like SQS, developers can create event-driven applications that are more efficient, scalable, and responsive to real-time events. However, this is not without challenges as several factors can still lead to issues with system performance. Potential issues and likely solutions are listed below.

Exhausting Resources Outside of the Computing Context

One issue you could run into when triggering a lot of parallel processes is that you could exhaust resources outside of the compute context. A lambda function can only scale to accommodate loads that are in its compute context but external dependencies like API gateways, databases, etc. need to rely on different scaling strategies to accommodate the volume of compute resources that may be required. Possible scenarios include:

  1. Using up all the available memory of the database server due to high concurrent connections and operations
  2. Baseline I/O operations per second potentially being a bottleneck for the database server’s performance
  3. Exceeding the maximum number of requests per second of an external API
  4. Timeouts and connection errors due to any of the issues listed above

While this list is not exhaustive, these issues can easily be addressed by either scaling your resources to accommodate your computing needs or adjusting your SQS queue configuration to work within the limits of the external resources.
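For the second option, adjusting the queue configuration, one relevant knob is the SQS event source mapping, which lets you cap both the batch size and the number of consumer Lambdas running at once. A sketch of the parameter shape (AWS SDK v3 `CreateEventSourceMapping`), with hypothetical names:

```javascript
// Capping consumer concurrency keeps the fan-out from overwhelming
// downstream resources like DocumentDB or an external API.
// The ARN and function name below are placeholders.
const eventSourceMappingParams = {
  EventSourceArn: "arn:aws:sqs:us-east-1:123456789012:tasks-queue",
  FunctionName: "consumer-lambda",
  BatchSize: 10,                            // messages per invocation
  ScalingConfig: { MaximumConcurrency: 5 }, // cap on parallel consumers
};
console.log(eventSourceMappingParams.ScalingConfig.MaximumConcurrency); // 5
```

Lowering `MaximumConcurrency` trades throughput for safety: fewer consumers run in parallel, so the external resource sees fewer concurrent connections and requests.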

How SSENSE Leveraged AWS SQS and Lambda to Streamline Large Data Processing

At SSENSE, we have many different services that require an up-to-date copy of our catalog. While updating the product information in real-time is ideal, it’s not always possible. In some cases, running a bulk update may be necessary, especially during the initial phase of a project. Given the number of products we currently have, it would be naive to try to process and sync them using a single compute process.

This is a perfect example of where we can rely on parallel processing to perform large data processing.

To tackle similar challenges, the following 4 steps using AWS Services are proposed:

  1. A Scheduled Event: This is going to be the driver of the whole process and can run as frequently as the business needs it to run. It performs the role of triggering the lambda function in step 2.
  2. Lambda Function: This will have the responsibility of locating and streaming the file, and determining how the workload should be divided, in this case using buffer chunks.
  3. AWS SQS: Once the Lambda has determined how the workload should be split, it sends the batches to an SQS queue, which dispatches each individual unit of work to different instances of our consumer Lambda function, i.e., this triggers several parallel processes to handle the different batches of the total workload.
  4. Consumer Lambda Function: This will have the sole responsibility of taking a predetermined section of the total workload and handling the work independently.

Figure 3. Event-driven architecture for processing large CSV files using a scheduled event, AWS Lambda, Amazon Simple Queue Service and S3 in multiple compute.

By decoupling the processing from the input and distributing it across multiple instances of our consumer lambda function, we will be able to significantly reduce the processing time. Thanks to the combination of AWS SQS and Lambda, we will be able to streamline our data processing pipeline and achieve efficient and scalable processing of large volumes of data at SSENSE.

If you’ve found the concept of parallel processing intriguing and would like to explore it further, there’s an article titled “Massive Parallel Processing with AWS Step Functions: Distributed Maps to the Rescue” that delves further into the topic. It covers the use of AWS Step Functions to achieve efficient parallel processing.

Conclusion

Success when working with AWS Lambda and other cloud-based computing services requires vigilance around system performance and continual exploration of new strategies and tools to optimize performance, scalability, and reliability. By proactively designing and managing systems, and remaining open to new ideas and approaches, developers can build highly effective, scalable, and responsive event-driven applications that easily handle even the most demanding workloads.

Editorial reviews by Catherine Heim & Mario Bittencourt

Want to work with us? Click here to see all open positions at SSENSE!

