Making Strides Toward Serverless

Matt Robinson
DraftKings Engineering
Mar 30, 2021

If you have explored recent trends in cloud application development, you have probably come across mentions of the “serverless” deployment model. This deployment model places heavy emphasis on the business logic itself, shifting the onus onto the cloud provider to provision servers and to deploy and scale the application code. As our business grows and expands at DraftKings, we’ve found ourselves spending a considerable amount of development time managing the scalability of processes that are constantly being fed more and more data, which led us to explore serverless as an option.

Cloud providers like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud have developed deep integrations between their function-as-a-service platforms (another term for a cloud provider’s serverless offering) and a majority of their other product offerings. These integration points allow serverless functions to be executed in response to changes that occur in those products: for example, a trigger can fire when new data is inserted into a database or when new events are published to an event bus. DraftKings is an Amazon Web Services customer, and we have benefited greatly from the variety of their serverless triggers. At the time of writing, AWS’s function-as-a-service product, known as Lambda, has 20+ integration points with other AWS services and third-party SaaS providers. These integrations cover most of the core product offerings:

  • Messaging services like SQS, SNS, and Amazon MQ
  • Streaming data services like Kinesis, DynamoDB Streams, and Apache Kafka
  • HTTP proxying services like API Gateway and Application Load Balancers
  • Data storage services like S3 and RDS
Available AWS serverless triggers. Not listed are third party SaaS integrations via the EventBridge product.
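
To make the trigger model concrete, here is a minimal sketch of a Python Lambda handler wired to an SQS trigger. The event shape (a “Records” list whose entries carry the message under “body”) follows the documented SQS event format; process_message is a hypothetical stand-in for real business logic.

import json

def process_message(payload):
    # Hypothetical stand-in for actual business logic.
    print(f"processing {payload}")

def handler(event, context):
    # SQS-triggered invocations deliver a batch of messages under the
    # "Records" key; each record's "body" holds the message payload.
    for record in event["Records"]:
        process_message(json.loads(record["body"]))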

Engineering teams at DraftKings have found the serverless deployment model to be beneficial for scalability and speed of iteration. The emphasis on simple functions that each play one small part in a larger application architecture has allowed us to express larger, more complex chains of code execution that meet the needs of the business, with less operational overhead. The on-demand nature of Lambda also allows us to scale down automatically when resources are not needed, providing cost savings for the business and helping to make our global infrastructure more environmentally sustainable. For example, we have an event streaming architecture composed of eight separate Lambda functions that manage integration points with third-party SaaS providers for analytics, monitoring, and CRM purposes.

Serverless Functions vs Microservices

Using the example above of our event streaming architecture, we can compare two different hypothetical solution architectures. A standard industry recommendation for this problem might be to develop a microservice whose single responsibility is to manage the delivery of this input data to the various integrations. A simple solution to this problem might look like the following:

fetch data from source
for each integration:
    transform data to correct format
    send data to integration API
    record success / failure, optionally retry
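
A sketch of this sequential loop in Python, using hypothetical fetch/transform/send helpers rather than any actual DraftKings code:

def fetch_source_data():
    # Hypothetical: pull a batch of events from the source system.
    return [{"event": "deposit", "amount": 25}]

def deliver_to_all(integrations):
    for data in fetch_source_data():
        for integration in integrations:
            payload = integration.transform(data)  # provider-specific format
            try:
                integration.send(payload)          # call the provider API
                print(f"{integration.name}: delivered")
            except Exception as err:
                print(f"{integration.name}: failed ({err}), consider a retry")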

In order to achieve greater scale, it becomes obvious that these tasks may need to be parallelized. Developers tend to reach for queues in order to accomplish this — either in-memory queueing, or an external queueing service like Amazon SQS. This allows for separate code flows for each integration.

fetch data from source
for each integration:
    add data to integration queue

... (each integration queue has a listener with the following)

while (data is available on integration queue):
    retrieve data from integration queue
    transform data to correct format
    send data to integration API
    record success / failure, optionally retry

This, in turn, allows each integration queue to be processed in parallel. You would then deploy this microservice application to one or more servers, scaling out to more servers as the volume of input data increases.
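
As an illustration of the fan-out side, here is a sketch in Python with boto3; the queue URLs are made-up examples and would normally come from configuration:

import json
import boto3

sqs = boto3.client("sqs")

# Hypothetical per-integration queue URLs.
INTEGRATION_QUEUES = {
    "analytics": "https://sqs.us-east-1.amazonaws.com/111122223333/analytics-queue",
    "crm": "https://sqs.us-east-1.amazonaws.com/111122223333/crm-queue",
}

def fan_out(event):
    # Publish one source event to every integration's queue so each
    # listener can process it independently and in parallel.
    body = json.dumps(event)
    for queue_url in INTEGRATION_QUEUES.values():
        sqs.send_message(QueueUrl=queue_url, MessageBody=body)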

The Serverless Architecture Approach

The serverless approach to this problem still requires queueing logic in order to achieve parallel work. However, each consuming process does not execute on the same virtual machine as the others. This avoids the “noisy neighbor” problem, where a process or application is constrained because a separate process is using up the available resources. In the microservice architecture, if a bug is shipped that accidentally causes one integration flow to consume 1 GB of memory, all the other integrations executing on that machine could be affected! In a serverless function architecture, these separate integrations scale entirely independently. Each process has its own allocated amount of vCPU and memory, as well as upper and lower bounds on desired concurrency.
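
For illustration, both knobs can be adjusted through the Lambda API. A minimal boto3 sketch, with a made-up function name (Lambda scales vCPU proportionally to the memory setting, and reserved concurrency acts as the upper bound on parallel executions):

import boto3

lam = boto3.client("lambda")

# Raise this function's memory allocation (vCPU scales with it).
lam.update_function_configuration(
    FunctionName="crm-integration",  # hypothetical function name
    MemorySize=512,                  # in MB
)

# Cap how many instances of the function may run concurrently.
lam.put_function_concurrency(
    FunctionName="crm-integration",
    ReservedConcurrentExecutions=20,
)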

It may initially sound daunting to manage 8 separate applications as serverless functions in place of what could be one microservice: suddenly there are 8 different processes to deploy and monitor! We realized that while these tasks all share the similar DNA of transforming data and exporting it to a secondary destination, they did not need to share substantial business logic. Each integration requires its own logic that is specific to the provider. The initial task of creating these functions was really an exercise in defining boundaries within the application logic and carving it out, piece by piece, into separate code packages. For the few pieces of logic we identified that should be shared, we have pulled common code into shared packages. Ongoing maintenance has been less risky with these simpler functions, as there are fewer moving parts within any one process to consider.

An example of how DraftKings monitors serverless functions.

Over the long run, it has proven very beneficial to have these separate integrations maintain isolated resource pools, especially when dealing with a single unhealthy dependency or SaaS provider. Decoupling the resource utilization of data producers and consumers allows for finer-grained performance tuning of each DraftKings process. Since Lambda provides ample monitoring out of the box, it has become much simpler to determine the resource allocation each process needs and to spot code changes with hidden performance impacts, because the scope is isolated to a single function execution. Resource-demanding jobs that can be parallelized are great candidates for evaluating whether a serverless solution fits the problem, as Lambda (and other function-as-a-service products) generally requires no intervention to increase the concurrency of the functions: these products detect when functions should be scaled up or down based on the volume of data coming through the configured function trigger.
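
As an example of that out-of-the-box monitoring, per-function metrics live in CloudWatch under the AWS/Lambda namespace and can be pulled programmatically. A sketch with boto3, again using a hypothetical function name:

from datetime import datetime, timedelta
import boto3

cw = boto3.client("cloudwatch")

# Fetch the last hour of duration statistics for one function.
stats = cw.get_metric_statistics(
    Namespace="AWS/Lambda",
    MetricName="Duration",
    Dimensions=[{"Name": "FunctionName", "Value": "crm-integration"}],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,  # 5-minute buckets
    Statistics=["Average", "Maximum"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], point["Maximum"])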

Patterns for Hybrid Serverless Adoption

Many sources around the web preach the benefits of a fully serverless software solution. But in some instances, moving an application to a serverless deployment model is not feasible, or would provide little value to the business. DraftKings has opted for a hybrid server/serverless approach, developing cloud-native services with clean integrations between our cloud provider’s serverless triggers. This has allowed us to approach adoption in smaller steps, without requiring major changes to existing architecture. With a bit of research into the available integration points, we found that cloud products we were already using (or planning on using) have a rich set of serverless function triggers to kick off business logic in response to changes within that product. These function triggers have opened doors to leverage serverless functions as application glue: a simple way to connect separate processes, whether to another serverless function or to an existing server-deployed application, without writing a ton of code to link the two together. I will provide two common scenarios where serverless functions have been a great piece of foundational technology to connect data producers and consumers within the DraftKings architecture.

Streaming Data

A typical architecture for consuming various stream providers from multiple serverless functions.

One of the more popular opportunities that we’ve found for a hybrid server/serverless software solution is streaming data scenarios. For example, DynamoDB has a feature called DynamoDB Streams that emits a log of database change data. AWS Lambda has an integration that allows a function to be executed in response to new data in this stream, enabling continuous replication of changes to secondary locations such as data warehouses, search indexes, or third-party APIs. By keeping the replication isolated in its own function, our engineering teams can ensure that the primary writes performed by a server-deployed application do not compete with these secondary replication tasks for compute resources, and we have the flexibility to tune knobs that affect the cadence (batch size and batch window) and resource allocation (memory and CPU) of the replication tasks.
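
A sketch of what such a replication function might look like in Python. The event shape (a “Records” list with an eventName and a dynamodb change image) follows the documented DynamoDB Streams format; replicate_to_warehouse is a hypothetical stand-in for whatever secondary destination is in play:

def replicate_to_warehouse(item):
    # Hypothetical: write the change to a warehouse, search index,
    # or third-party API.
    print("replicating:", item)

def handler(event, context):
    # Each invocation receives a batch of change records from the stream.
    for record in event["Records"]:
        if record["eventName"] in ("INSERT", "MODIFY"):
            # NewImage is the item's state after the write (present when
            # the stream's view type includes new images).
            replicate_to_warehouse(record["dynamodb"]["NewImage"])
        elif record["eventName"] == "REMOVE":
            # Keys identify the deleted item.
            replicate_to_warehouse(record["dynamodb"]["Keys"])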

File Storage Processing

Another great way we found to take advantage of a cloud provider’s function triggers is in static file storage. Some applications need to ingest file uploads for further processing or transformation. We traditionally leverage Amazon S3 for storing static content, and the S3 Lambda integration provides a great way to execute processing logic when new files are uploaded to a bucket. There are two patterns you can follow to leverage the S3 Lambda integrations:

  • Invoke the transformation and ingestion of new data into a database directly from a serverless function
  • Kick off long-running processing jobs that are better suited to execute on a server by issuing a REST call to notify the application that a file is available
Two patterns for using serverless triggers to process file uploads.

Deciding which of the two file processing approaches is best for your application really comes down to whether you expect your processing jobs to execute within the constraints of your function-as-a-service product. At DraftKings, we leverage both approaches within our applications. Some processes expect files that can be processed in a few seconds or minutes, well within the Lambda resource limitations. However, other file processing flows may take a number of hours to complete, which might require handing off the processing to an application running on a powerful server.
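
A sketch combining both patterns in one Python handler. The S3 event shape (bucket name, object key, and size under each record’s “s3” key) follows the documented format; the size threshold, the inline ingestion logic, and the internal REST endpoint are made-up illustrations:

import json
import urllib.request
import boto3

s3 = boto3.client("s3")

# Hypothetical cutoff: hand off anything too large to finish within
# Lambda's execution limits.
MAX_INLINE_BYTES = 50 * 1024 * 1024
PROCESSOR_URL = "https://jobs.internal.example.com/process"  # made-up endpoint

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]  # note: keys arrive URL-encoded
        size = record["s3"]["object"]["size"]
        if size <= MAX_INLINE_BYTES:
            # Pattern 1: transform and ingest directly in the function.
            obj = s3.get_object(Bucket=bucket, Key=key)
            for line in obj["Body"].iter_lines():
                pass  # parse each line and write it to the database
        else:
            # Pattern 2: notify a server-deployed application via REST.
            req = urllib.request.Request(
                PROCESSOR_URL,
                data=json.dumps({"bucket": bucket, "key": key}).encode(),
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req)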

Next Steps

With the recent news of AWS Lambda increasing the maximum memory and CPU allocation to 10 GB and 6 vCPUs, it will become even easier to reach for serverless functions to solve new challenges! As DraftKings iterates on our back-end technology, we continue to evaluate whether new development could benefit from the serverless deployment model. We are also refining our shared infrastructure code to enable migration of existing microservices to serverless functions. To learn more about that, check out Dave Musicant’s article about how we migrated our microservices to .NET Core!
