How to use serverless technologies and AWS Lambda properly, in our opinion

German Gensetskiy
Published in Go Wombat
Sep 27, 2019

Recently I read an article about using serverless technologies for a web API and shared it with the team: http://einaregilsson.com/serverless-15-percent-slower-and-eight-times-more-expensive/

In short, the author investigated whether he should move his API from his old setup to serverless technologies. He found that Lambda was a little slower than an EC2 instance, but that was not the main problem. The main problem was pricing, and specifically the price of API Gateway, not of Lambda itself.

I liked the author’s research and his approach. It also reminded us of one of our own successful uses of Lambda.

A Lambda function is a great fit for small independent functions (it’s all in the name). It is especially good when you need to call different APIs or scrape some web pages. In that case, you can invoke as many Lambdas as you need in parallel, scaling from one instance to a million (subject to your account’s concurrency limit) without changing the infrastructure code. It just works and that’s it.

Concurrent invocations run in separate execution environments, and their requests usually leave from different addresses in the AWS IP pool, which makes it hard to block your scraper. The only way to block it would be to ban the whole pool of AWS IP addresses, which is probably a bad decision.

Example

Words are cool, but what about a code example? Sure! Here we go.

In our example, we will work with cryptocurrency exchange markets to retrieve the latest info about the rate of exchange between two cryptocurrencies.

We’ll use the CCXT library for that. It supports 127 different cryptocurrency exchange markets and has a Python implementation.

First of all, we need to create a simple lambda function. Its responsibility is to retrieve data about specified tickers from the specified market.

Lambda function to retrieve data using CCXT
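A minimal sketch of such a function might look like the following. The event shape (the `exchange` and `symbols` keys), the helper names `build_exchange` and `fetch_last_prices`, and the optional `exchange_factory` argument are assumptions made for illustration. The only CCXT features relied on are the per-exchange classes exposed as attributes of the `ccxt` module and the unified `fetch_ticker` method.

```python
def build_exchange(exchange_id):
    """Instantiate a CCXT exchange by its id, e.g. 'binance'."""
    import ccxt  # imported lazily so the rest of the module loads without it

    exchange_class = getattr(ccxt, exchange_id)
    return exchange_class({"enableRateLimit": True})


def fetch_last_prices(exchange, symbols):
    """Return {symbol: last traded price} for each requested symbol."""
    prices = {}
    for symbol in symbols:
        ticker = exchange.fetch_ticker(symbol)  # unified CCXT ticker dict
        prices[symbol] = ticker.get("last")
    return prices


def lambda_handler(event, context, exchange_factory=build_exchange):
    """Lambda entry point.

    Hypothetical event: {"exchange": "binance", "symbols": ["BTC/USDT"]}.
    exchange_factory is an illustrative hook so the handler can be tested
    without network access; Lambda itself only passes event and context.
    """
    exchange = exchange_factory(event["exchange"])
    return {
        "exchange": event["exchange"],
        "prices": fetch_last_prices(exchange, event["symbols"]),
    }
```

Keeping the handler this small is the point: one invocation handles one market, so the caller decides how many markets to query at once.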

Once the Lambda is ready, the next step is to run it. We will retrieve data from many sources, and we need to do it as fast as possible because rates can change very quickly. For that purpose, we will make our code asynchronous and invoke the Lambdas in parallel.

To invoke AWS Lambda we need Boto, the AWS SDK for Python. In our case, we use its asynchronous counterpart, aioboto3.

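import json

import aioboto3

# Module-level aioboto3.client() is the older API; newer aioboto3 releases
# create clients with `async with aioboto3.Session().client('lambda', ...)`.
client = aioboto3.client(
    'lambda',
    region_name='[YOUR_LAMBDA_REGION]',
    aws_access_key_id='[YOUR_AWS_KEY_ID]',
    aws_secret_access_key='[YOUR_AWS_KEY]'
)
# Must run inside a coroutine; RequestResponse waits for the function's result.
response = await client.invoke(
    FunctionName='[YOUR_LAMBDA_FUNCTION_NAME_OR_ARN]',
    InvocationType='RequestResponse',
    LogType='Tail',  # include the tail of the execution log in the response
    Payload=json.dumps([PAYLOAD]),
)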

This is a simple example of how to invoke a Lambda. You can always find more info in the Boto3 docs.
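The response of `invoke` still needs unpacking: the function result arrives as a stream in `Payload`, and with `LogType='Tail'` the log tail arrives base64-encoded in `LogResult`. A small helper for that could be sketched as below. The name `parse_invoke_response` is hypothetical, and the sketch assumes an aioboto3-style response whose `Payload.read()` is awaitable (with plain Boto3 the same call is synchronous).

```python
import base64
import json


async def parse_invoke_response(response):
    """Split a Lambda invoke() response into (result, log_tail).

    Assumes an aioboto3-style response where Payload.read() is awaitable.
    """
    raw = await response["Payload"].read()
    result = json.loads(raw) if raw else None

    # FunctionError is only present when the function itself failed.
    if "FunctionError" in response:
        raise RuntimeError(f"Lambda failed: {result}")

    # LogResult is present when LogType='Tail' was requested; it is base64 text.
    log_tail = base64.b64decode(response.get("LogResult", "")).decode("utf-8")
    return result, log_tail
```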

And here is the code of the runner I ended up with:

Asynchronous lambda runner

The main logic lives in the methods __invoke_lambda and invoke.

__invoke_lambda uses the same code as above to run the Lambda and retrieve its response. It’s an asynchronous method.

invoke maps the __invoke_lambda coroutine over each payload, and asyncio.gather schedules them to run concurrently.
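A runner with that shape could be sketched roughly as follows. This is an illustration under assumptions, not the original runner: the class name `LambdaRunner`, the `open_client` hook, and the helper `aioboto3_client_factory` are hypothetical, and the client creation assumes a recent aioboto3 release where `Session().client(...)` is used as an async context manager.

```python
import asyncio
import json


def aioboto3_client_factory(region_name):
    """Return a callable that opens an aioboto3 Lambda client."""
    def open_client():
        import aioboto3  # imported lazily; only needed for real invocations

        return aioboto3.Session().client("lambda", region_name=region_name)
    return open_client


class LambdaRunner:
    def __init__(self, function_name, open_client):
        self.function_name = function_name
        self.open_client = open_client  # returns an async context manager

    async def __invoke_lambda(self, client, payload):
        """Invoke the function once and return its decoded JSON result."""
        response = await client.invoke(
            FunctionName=self.function_name,
            InvocationType="RequestResponse",
            Payload=json.dumps(payload),
        )
        body = await response["Payload"].read()
        return json.loads(body)

    async def invoke(self, payloads):
        """Run one invocation per payload concurrently, keeping input order."""
        async with self.open_client() as client:
            tasks = [self.__invoke_lambda(client, p) for p in payloads]
            return await asyncio.gather(*tasks)
```

With real credentials configured, usage would be along the lines of `asyncio.run(LambdaRunner('[YOUR_LAMBDA_FUNCTION_NAME_OR_ARN]', aioboto3_client_factory('[YOUR_LAMBDA_REGION]')).invoke(payloads))`. Results come back in the same order as the payloads, because asyncio.gather preserves the order of its arguments.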

Bonus

If you don’t want to write or use asynchronous code, here is the same thing using the Python multiprocessing module:

Lambda runner using multiprocessing
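One way to write such a runner is sketched below. The function names `invoke_lambda` and `invoke_all`, the optional `client` argument, and the default of 8 worker processes are assumptions for illustration. The sketch uses the synchronous Boto3 client and expects the region and credentials to come from the environment or the AWS config files.

```python
import json
from functools import partial
from multiprocessing import Pool


def invoke_lambda(payload, function_name, client=None):
    """Invoke the function synchronously and return its decoded JSON result."""
    if client is None:
        import boto3  # imported lazily; a client per call keeps the sketch simple

        client = boto3.client("lambda")
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(payload),
    )
    return json.loads(response["Payload"].read())


def invoke_all(payloads, function_name, processes=8):
    """Fan the payloads out over a pool of worker processes."""
    worker = partial(invoke_lambda, function_name=function_name)
    with Pool(processes=processes) as pool:
        return pool.map(worker, payloads)
```

On platforms where worker processes are spawned rather than forked, `invoke_all` should be called from under an `if __name__ == "__main__":` guard.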

Thanks for reading! Hope you liked it.

German Gensetskiy, with the support of the Go Wombat team.

P.S. If you run into any trouble, I will be happy to hear your feedback. You can reach me by email: Ignis2497@gmail.com
