Reusing Connections in Lambda Functions (POC)

Lambda, being an event-driven, serverless compute service, does not keep your application's container instances under your control. Written the naive way, it rebuilds a connection for each and every incoming request, and this behaviour will quickly make you run out of connections.

Let’s look at a simple use case. I wanted to create a pub-sub system using one of the queueing mechanisms available in the market. I looked at the different flavours of queueing systems available, such as SQS, ActiveMQ, and Kafka. I chose to go with RabbitMQ, the reason being my prior experience and knowledge with this message broker, and it fits the use case as well. SQS was another option, but I personally hate involving HTTP protocols in queueing systems. 😛

Now we had to support 1k requests/sec. Lambda's ability to scale automatically means the developer does not have to worry about system scalability. But the question was how to handle each request once it hits your Lambda. I had to push all the requests to RabbitMQ. We can have one or multiple consumers, since RabbitMQ supports round-robin delivery automatically. Doing it the usual way (publishing a message), we would write something like this:

import json

import pika

IP = 'localhost'  # RabbitMQ host (placeholder)

def lambda_handler(event, context):
    print(context)
    credentials = pika.PlainCredentials('test', 'test')
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(IP, 5672, '/', credentials))
    channel = connection.channel()
    channel.basic_publish(exchange='simple-direct',
                          routing_key='events-routing-key',
                          body=json.dumps(event))
    return {
        'status': 'Event Pushed to Queue'
    }

Pretty simple, right?

I was delighted to see that the above code worked like a charm and that events showed up in my queue. Who would have thought it could be this simple. 😄

Well! My happiness lasted only until I opened the Connections tab in the RabbitMQ management plugin. 😟

My heart skipped a beat seeing so many open connections.

Any serverless platform involves two major concepts:

  1. Cold Starts
  2. Warm Functions
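A quick way to see warm reuse in action is to initialise something at module scope and report its age on each invocation. A minimal sketch (the return shape is just for illustration):

```python
import time

COLD_START_AT = time.time()  # runs once, when the container cold-starts

def lambda_handler(event, context):
    # ~0 on a cold start; grows across warm invocations of the same container
    age = time.time() - COLD_START_AT
    return {'container_age_seconds': round(age, 3)}
```

Invoke it twice in quick succession and the reported age keeps growing, proving the container (and its module-level state) survived between calls.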

Could reading about these solve my problem, I wondered. 😕

The Workaround!

After execution, AWS keeps your execution context alive in anticipation of another invocation of the Lambda function. Any declarations outside your handler function remain initialised, providing an optimisation when the function is invoked again.

I just moved two lines (the ones creating the connection) outside the lambda_handler function. Take a look at the delta code below.

import json

import pika

HOST = 'localhost'  # RabbitMQ host (placeholder)

# Created once per container; reused across warm invocations
credentials = pika.PlainCredentials('test', 'test')
connection = pika.BlockingConnection(
    pika.ConnectionParameters(HOST, 5672, '/', credentials))

def lambda_handler(event, context):
    print(context)
    channel = connection.channel()
    channel.basic_publish(exchange='simple-direct',
                          routing_key='events-routing-key',
                          body=json.dumps(event))
    print('Sent Event')
    return {
        'status': 'Event Pushed to Queue'
    }

Let’s look at the results after altering the code.

Simple but effective !

Now you can just keep track of your connection object, and before publishing any event you can check its status (open or closed). If the connection is open you can continue publishing; otherwise you create a new connection object. (You need this because RabbitMQ auto-closes a connection that has been idle for more than 2–3 minutes.)

Let’s talk about an effective way to solve this using AWS infrastructure itself if you are dealing with DB connections (a MySQL DB).

There is a school of thought that using a global connection variable in the handler, though it ensures that new connections stay open for subsequent calls, carries the risk (for MySQL or PostgreSQL) of session leakage across the open connections, resulting in locking issues.

1.) Serverless-mysql (if your DB is MySQL-compatible, including MariaDB). It adds a connection management component to the mysql module specifically for use with serverless apps. The module constantly monitors the number of connections in use and then, based on your settings, manages those connections so that thousands of concurrent executions can share them. It also cleans up zombies, enforces connection limits per user, and retries connections using trusted backoff algorithms. This solution is cross-platform.

  • Return promises for easy async request handling
  • Exponential backoff (using Jitter) to handle failed connections
  • Monitor active connections and disconnect if more than X% of connections are being used
  • Support transactions
  • Support JIT connections
  • Assume AWS endorsed best practices from here

See more — https://github.com/jeremydaly/serverless-mysql