Serverless Smart Radio — Part III — Lambda Functions

Bahadir Cambel · Published in Big Data WAL · 8 min read · Oct 31, 2018

You’re currently reading Part III of the Serverless Smart Radio series. Head over to Part I if you missed the introduction, or check out Part II on Step Functions if you haven’t read it yet.

Abstract

In the previous Step Functions article we talked about our pipeline and how we use Step Functions as an orchestration layer on top of our Lambda functions. In this article, we will look at the benefits of using Lambda functions, plus the most important parts of the feature set.

If you are new to Lambda functions, please read an introductory tutorial first.

At the end of the day, the actual work and heavy lifting is done by the Lambda service, and that’s where the core of our business logic lives.

Introduction

The core of a FaaS platform is the “Function”, which accepts an input and produces an output. What the managed Lambda service does is ensure your function(ality) is available. The AWS documentation describes it as follows:

Every time an event notification is received for your function, AWS Lambda quickly locates free capacity within its compute fleet and runs your code. Since your code is stateless, AWS Lambda can start as many copies of your function as needed without lengthy deployment and configuration delays. There are no fundamental limits to scaling a function. AWS Lambda will dynamically allocate capacity to match the rate of incoming events.

However, you have no direct control over this scaling; you can throttle requests if necessary, but you cannot configure the scaling behaviour itself. For services that need to handle huge spikes of traffic, such as those used by advertisement companies, this can be a major problem (made worse by the cold start problem). While designing your architecture, you have to carefully consider your load patterns and latency requirements to see whether Lambda functions would be a good fit.

Lambda Functions in Smart Radio

Currently our system runs around 30 Lambda functions, and most of them are called within the context of a Step Function.

We use Lambda functions for the following purposes:

  1. (Transcode) Call the AWS Elastic Transcoder service to convert WAV into MP3
  2. (Transcribe) Make an HTTP request to third-party software that transcribes the audio into text
  3. (Segmentation) Check the XML files generated by the Audio System to split the audio into segments and cut the commercials out
  4. (Prediction) Call the ECS task that is responsible for predicting the topics of each of those segments (using the text)
  5. (Storage) Store all the available segments in the transactional database
  6. (Notification) Notify all interested parties that a new segment is available
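As a sketch of what a step like the first one can look like, here is a hypothetical Transcode handler that builds an Elastic Transcoder job converting a WAV key into an MP3 output. The pipeline ID, event shape, and helper names are illustrative, not our actual code; the preset ID is the built-in “Audio MP3 - 128k” system preset.

```python
def build_transcode_job(pipeline_id, wav_key):
    """Return the keyword arguments for elastictranscoder.create_job()."""
    mp3_key = wav_key.rsplit(".", 1)[0] + ".mp3"
    return {
        "PipelineId": pipeline_id,
        "Input": {"Key": wav_key},
        "Output": {
            "Key": mp3_key,
            # "1351620000001-300040" is the AWS system preset "Audio MP3 - 128k"
            "PresetId": "1351620000001-300040",
        },
    }

def handler(event, context):
    # event is assumed to carry the S3 key of the recorded audio
    import boto3  # requires AWS credentials when actually invoked
    job = build_transcode_job("my-pipeline-id", event["wav_key"])
    resp = boto3.client("elastictranscoder").create_job(**job)
    return {"jobId": resp["Job"]["Id"]}
```

Keeping the request-building logic in a pure function makes the interesting part testable without touching AWS.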

All of this functionality is connected via the Audio Pipeline that we talked about in the previous Step Functions article.

Transcoding and transcribing require a “WAIT” operation in our design, which we accomplish using a Step Functions wait-task loop. An alternative would be to require the called services to publish a notification when their work is done; although the transcoding service supports SNS messages, the transcription service still lacks that functionality.
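As a rough sketch, such a wait-task loop can be expressed in Amazon States Language like this (state names and ARNs are illustrative, not our actual definition):

```json
{
  "StartTranscribe": {
    "Type": "Task",
    "Resource": "arn:aws:lambda:...:function:start-transcribe",
    "Next": "WaitForTranscribe"
  },
  "WaitForTranscribe": { "Type": "Wait", "Seconds": 30, "Next": "CheckTranscribe" },
  "CheckTranscribe": {
    "Type": "Task",
    "Resource": "arn:aws:lambda:...:function:check-transcribe",
    "Next": "TranscribeDone"
  },
  "TranscribeDone": {
    "Type": "Choice",
    "Choices": [
      { "Variable": "$.status", "StringEquals": "COMPLETED", "Next": "Segmentation" }
    ],
    "Default": "WaitForTranscribe"
  }
}
```

The Choice state loops back to the Wait state until the polled status reports completion.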

Where Lambda shines

  • Scalability
  • Language independent (Java, .NET, Go, Python, JS; hence also all JVM-based languages such as Scala, Clojure, Groovy)
  • Pay per use
  • Easy integration with the rest of the AWS application services (SQS, Kinesis, CloudWatch)
  • Versioning
  • Updates
  • Event Triggers
  • Monitoring

Let’s take a look at some of these features independently.

Language Independent

Currently AWS supports the following languages as a platform for their Lambda functions infrastructure:

Node.js (JavaScript), Python, Java (Java 8 compatible), C# (.NET Core) and Go.

This flexibility lets you pick whichever language suits the job best. You can also use other languages as long as they interface with one of the host runtimes; for example, running Rust via a Python or Node.js container.

One of the great aspects of using language-independent Lambda functions to implement any functionality is that the rest of the system is not aware of what technology, OS, or environment the Lambda operates in. Its job is very simple: input in, output out, in complete isolation.
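That whole contract fits in one function. A minimal sketch, with purely illustrative field names: callers only ever see the input and output shapes, never the runtime behind them.

```python
# The entire public surface of a Lambda function: a handler that takes a
# JSON-like event in and returns a JSON-like result out.
def handler(event, context):
    # hypothetical segment-processing step; field names are illustrative
    return {"segment_id": event["segment_id"], "status": "processed"}
```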

Availability
You can think of AWS as the global hub for your functional feature set, with your Lambda functions and Step Functions acting as the gateway to your systems and services. Otherwise unconnected networks can assign tasks to each other by calling Lambda functions or Step Functions, as long as they are authorized to execute the function, using S3 as the data bridge between them. The possibilities here are endless: you might implement a Lambda function called “Insert to Database” that accepts an input and, behind the scenes, places information into a queue, starts a Step Function, or writes into multiple databases. If the operation fails, the input to the Lambda function is written to the Dead Letter Queue.

Security
Each Lambda function has VPC and Security Group configurations that define where it operates, so you can largely isolate Lambda execution from both external and internal access.
For example, in VPC mode a Lambda function has no internet access by default, so you have to explicitly configure your VPC to provide it.

Update
Almost every system needs some kind of update, enhancement, or bug fix, and the bigger a system gets, the harder it is to update. In a Lambda-oriented environment, you have the ability to update only the part where it’s necessary: update the JAR/code file of the corresponding Lambda, and you are good to go. Not only that, but AWS also gives you the ability to version your Lambda function.

Versioning
Each Lambda function can be versioned; by default $LATEST is used to execute the function. For each global release you can publish a new version of your Lambda function, so when necessary the callers can keep using the older version while the new one is being rolled out. We currently do not use this feature, but it may come in handy in some cases.

Concurrent
1000 instances of your functionality running at the same time sounds exciting! What would be better? 10K instances of your function; the possibilities are endless. This is the pinnacle of scalability for your architecture: no machines to manage or roll out, no logs to push, no network to configure.

You should be careful with open connections to your database and other fragile systems. Using a proxy can prevent your server from being overloaded with DB requests and from having to deal with a flood of incoming TCP connections.
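A common complementary pattern (not specific to our system) is to hold the connection in module-level state, outside the handler, so each warm container reuses one connection instead of opening a new one per invocation:

```python
# Module-level state survives across warm invocations of the same container,
# so each container opens at most one DB connection rather than one per call.
_conn = None

def get_connection():
    global _conn
    if _conn is None:
        # placeholder for a real call such as psycopg2.connect(...)
        _conn = {"connected": True}
    return _conn

def handler(event, context):
    conn = get_connection()  # reused on warm starts, created on cold starts
    return {"reused": conn is get_connection()}
```

Note that this only caps connections per container; under a spike, many containers can still each open one connection, which is why a proxy in front of the database remains useful.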

AWS internally manages the instances of your Lambda function depending on the incoming requests. You have no control over this, and callers might be affected by it, so I highly suggest you test the performance characteristics for your use case. Although, as I mention later, we use a dry-run feature in our Lambda functions to keep them warm, there is no guarantee that when multiple requests come in there will be enough warm capacity to handle them at the same time. New instances of the Lambda function might be started, and each of those will be hit by the cold start problem. It is handy to have proper monitoring across your overall system to track these latency metrics and diagnose the possible downsides of cold starts.

Logging

Each Lambda function publishes its logs to CloudWatch Logs, under the /aws/lambda/<function_name> log group.

In the logs you can see the start and end markers, all the log lines generated by your Lambda function, and the total time it took to execute.

However, by default you will not be able to search across all the logs, since each instance of your function publishes to a separate stream under the log group. CloudWatch Logs supports subscriptions to forward the logs to different systems such as AWS ES (Elasticsearch) or Kinesis Firehose.
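For ad-hoc searches you can also filter across all streams of a log group at once via the CloudWatch Logs `filter_log_events` API. A hedged sketch, with the AWS call itself left as a comment since it needs credentials (function name and time window are illustrative):

```python
import time

def error_filter_params(function_name, minutes=60):
    # Parameters for CloudWatch Logs filter_log_events: search the last
    # `minutes` of a function's log group for lines containing ERROR,
    # across every log stream in the group.
    now_ms = int(time.time() * 1000)
    return {
        "logGroupName": f"/aws/lambda/{function_name}",
        "filterPattern": "ERROR",
        "startTime": now_ms - minutes * 60 * 1000,
    }

# usage sketch (requires AWS credentials):
# boto3.client("logs").filter_log_events(**error_filter_params("transcode"))
```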

Another way to quickly inspect CloudWatch Logs is to use a tool like Saw.

Monitoring

AWS has built-in monitoring features to display the overall performance of a Lambda function; however, AWS displays no information about the parallelisation of your Lambda functions. Your function might be alive on 0 or 100 machines, but you have no control over this.

Each Lambda function publishes CloudWatch metrics out of the box.

We use DataDog to aggregate all the CloudWatch metrics in a single location.

Tricky things

Inputs

Although Step Functions let you transparently inspect each and every step, Lambda does not have that functionality. The only facility Lambda supports is the DLQ (Dead Letter Queue), which publishes failed invocations to an SQS queue or SNS topic. These settings can be configured on the Lambda function details page.

One way to deal with possible issues is for either the caller or the function itself to publish the parameters and other necessary information to an event queue/stream.

Updates

While you are updating your Lambda function, some incoming event requests might still run against the old version within a couple of seconds of the start of the update operation.

Cold Start
Each of our Lambda functions implements a dry-run feature out of the box, and there is a very important reason for that. One of the downsides of Lambda functions is that their containers might have been removed, so they have to be started on a (possibly) new machine. In order to reduce the overall runtime of our jobs and prevent timeouts, we schedule all Lambda functions to run in dry mode every 4 minutes using AWS CloudWatch Rules. These rules can trigger Lambda functions with parameters.
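Inside the handler, the dry-run check amounts to an early return. A minimal sketch, assuming the CloudWatch Rule sends a flag like `{"dry_run": true}` (the flag name and event shape are our own convention, shown here illustratively):

```python
def handler(event, context):
    # The scheduled rule fires every few minutes purely to keep this
    # container warm; bail out before doing any real work.
    if event.get("dry_run"):
        return {"warmed": True}
    # ...real work would happen here...
    return {"warmed": False, "processed": event.get("segment_id")}
```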

The same cold start problem also occurs when you update your Lambda function, since all the containers your Lambda function is deployed to must be re-initialised. The moment you have updated your Lambda function, I highly encourage you to do a dry-run execution to warm the containers back up.

Scheduling Function Calls — CloudWatch Rules

These super handy rules can execute any Lambda function available in your account. Using cron-like expressions or a simple rate definition, AWS will start your functions, and you can specify the JSON input passed to the Lambda function.

We start our Elastic Container instances at 4:40 am Monday to Friday via an Event Rule and shut them down in the evening. We also continuously run one rule to keep all the Lambda functions alive, approximately every 4 minutes.
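In CloudWatch Rules schedule-expression syntax, those two rules could be written roughly as follows (the cron fields are minutes, hours, day-of-month, month, day-of-week, year; note that schedules are evaluated in UTC, so the exact hour field depends on your timezone):

```
rate(4 minutes)           # keep-warm rule for the Lambda functions
cron(40 4 ? * MON-FRI *)  # start the ECS containers at 4:40 am, Mon-Fri
```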

Local Testing

Something that is on our radar for further development is a local testing tool for Lambda functions.

If you already have experience with such tools, let us know your case.

Part IV — Clojure

Bahadir Cambel

(Ultra)Runner — Distributed Software/Data/ML engineer, Clojure & Python craftsman. Built a recsys