Using lambdas to integrate new services

Kyle Zurawski
Published in ITHAKA Tech · 4 min read · Aug 8, 2024

Over the last few years, JSTOR content has expanded far beyond text-only journal scans. One of our first ventures in this area was the 2023 addition of audio content, supporting a richer collection of primary sources and the integration of Artstor content.

Audio offers users a more immersive and engaging path into certain kinds of research and can prompt richer insights, but ingesting audio presents a technical challenge when your platform infrastructure is historically geared toward text-only or text-first content. The Ingest team needed a time- and cost-effective way to process audio content without significantly increasing the complexity and size of our primarily image- and PDF-based media systems. Lambdas and AWS media solutions to the rescue!

Lambdas vs. microservices

One of the best ways to support audio on JSTOR is to create streamable formats and transcription files. AWS provides these capabilities in their MediaConvert and Transcribe products, with a variety of configuration options.

Our first efforts involved trying to use AWS services like MediaConvert to process audio, but our existing application didn’t have great ways of integrating these external services with our current internally orchestrated designs. We had to decide on the best way to perform new tasks needed to support audio on JSTOR and determine how to integrate that work into existing systems. Lambdas offered a low-cost option.

If we used lambdas to handle the triggering, result processing, and timeout checking on these AWS products, we could abstract that work from our internal systems. The AWS products would help us perform specific media-related functions without building that capability in-house; we wanted to avoid increasing complexity in an already large application, and we wanted a cost-effective solution. We could have accomplished something similar by building a bunch of microservices, but those microservices would have been sitting idle whenever content wasn’t being uploaded to the application. (Our load is sparse, but it also comes in spikes.)

We designed a reusable and asynchronous pattern for integrating these AWS services with our internal systems. Once our system requests an AWS process to start, like a MediaConvert job, our systems move on to other steps while the MediaConvert job executes asynchronously. When the job completes, an event is sent to our systems, processed, and saved for that media item.

Diagram depicting the system described in the article: services progressing from internal systems to lambdas, AWS services, and finally back to internal systems.
Using SNS/SQS events between Internal Systems, Integration Lambdas, and AWS Services
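For illustration, the job-request message our internal systems publish might look something like the minimal sketch below. The queue URL, field names, and helper function are hypothetical placeholders rather than our actual schema:

```python
import json

import boto3

sqs = boto3.client("sqs")

# Hypothetical queue URL for the trigger lambda; real names and accounts differ.
TRIGGER_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/media-job-requests"


def request_streaming_derivative(item_id: str, source_s3_uri: str) -> None:
    """Publish a job-request event and move on; the lambdas take it from here."""
    message = {
        "jobType": "AUDIO_STREAMING_DERIVATIVE",  # tells the trigger lambda which configuration to use
        "itemId": item_id,                        # internal identifier used to correlate results later
        "sourceUri": source_s3_uri,               # s3://bucket/key of the uploaded audio
    }
    sqs.send_message(QueueUrl=TRIGGER_QUEUE_URL, MessageBody=json.dumps(message))
```

Once that message is sent, the internal workflow continues; everything after this point happens asynchronously in the lambdas.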

Integrating the services

We started by setting up a common pattern that would be used for an externally processed job. For example, when we needed to make a streaming derivative for an audio file, we built a set of lambdas around AWS MediaConvert to schedule a job with the appropriate configuration. Scheduled jobs go through an ordered set of steps (sketches of the individual lambdas follow the list):

  1. Internal systems send a job request event, starting the trigger lambda.
  2. The trigger lambda initiates the task, such as sending a job request to AWS MediaConvert, and the job info is saved and correlated with internal identifiers.
  3. The AWS service processes the job and saves the results to S3.
  4. A results-processing lambda listens to AWS EventBridge. When a task completes, or fails within certain parameters, this lambda is triggered to process the results: it looks up the job info in the correlation database and saves the results to internal systems.
  5. In the background, the job request also begins a scheduled timeout check on items that haven’t been updated in more than two hours, with a maximum lifespan of 12 hours. If any matching jobs are found, the status checker lambda is triggered.
  6. The status checker lambda checks job progress and attempts to resolve any issues causing timeouts.
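To make step 2 concrete, here is a rough sketch of what a MediaConvert trigger lambda could look like. The environment variables, table name, and settings skeleton are hypothetical, and a real job would carry a full Settings document describing the streaming output group:

```python
import json
import os
import time
import uuid

import boto3

# MediaConvert uses an account-specific endpoint, discovered once at cold start.
mc_endpoint = boto3.client("mediaconvert").describe_endpoints()["Endpoints"][0]["Url"]
mediaconvert = boto3.client("mediaconvert", endpoint_url=mc_endpoint)
correlation_table = boto3.resource("dynamodb").Table(os.environ["CORRELATION_TABLE"])


def build_settings(source_uri: str) -> dict:
    # Minimal skeleton; a real job carries the full output group configuration.
    return {
        "Inputs": [{
            "FileInput": source_uri,
            "AudioSelectors": {"Audio Selector 1": {"DefaultSelection": "DEFAULT"}},
        }],
        "OutputGroups": [],  # e.g. an HLS group producing the streamable derivative
    }


def handler(event, context):
    """Driven by the job-request SQS queue; starts the external MediaConvert job."""
    for record in event["Records"]:
        request = json.loads(record["body"])
        correlation_id = str(uuid.uuid4())

        job = mediaconvert.create_job(
            Role=os.environ["MEDIACONVERT_ROLE_ARN"],
            Settings=build_settings(request["sourceUri"]),
            UserMetadata={"correlationId": correlation_id},  # echoed back in the completion event
        )

        # Correlate the AWS job with our internal identifiers so the results
        # lambda and the status checker can find it later.
        correlation_table.put_item(
            Item={
                "correlationId": correlation_id,
                "mediaConvertJobId": job["Job"]["Id"],
                "itemId": request["itemId"],
                "status": "SUBMITTED",
                "updatedAt": int(time.time()),
            }
        )
```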
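Step 4 could look roughly like this, assuming an EventBridge rule forwards "MediaConvert Job State Change" events to the lambda's SQS queue. The event fields and queue names here are approximations, not our exact configuration:

```python
import json
import os

import boto3

correlation_table = boto3.resource("dynamodb").Table(os.environ["CORRELATION_TABLE"])
internal_results_queue = os.environ["INTERNAL_RESULTS_QUEUE_URL"]
sqs = boto3.client("sqs")


def handler(event, context):
    """Driven by an SQS queue fed by an EventBridge rule on MediaConvert job state changes."""
    for record in event["Records"]:
        state_change = json.loads(record["body"])  # the EventBridge event, forwarded as-is
        detail = state_change["detail"]

        if detail["status"] not in ("COMPLETE", "ERROR"):
            continue  # ignore intermediate states

        # Look up the job in the correlation database using the id we attached as metadata.
        correlation_id = detail["userMetadata"]["correlationId"]
        job_info = correlation_table.get_item(Key={"correlationId": correlation_id})["Item"]

        # Hand the outcome back to internal systems, correlated to the original item.
        sqs.send_message(
            QueueUrl=internal_results_queue,
            MessageBody=json.dumps({
                "itemId": job_info["itemId"],
                "status": detail["status"],
                "jobId": detail["jobId"],
            }),
        )
```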
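For steps 5 and 6, a scheduled lambda can sweep the correlation table for stale jobs. This sketch collapses the timeout check and the status checker into one function for brevity, and the attribute names are hypothetical:

```python
import os
import time

import boto3
from boto3.dynamodb.conditions import Attr

correlation_table = boto3.resource("dynamodb").Table(os.environ["CORRELATION_TABLE"])
mc_endpoint = boto3.client("mediaconvert").describe_endpoints()["Endpoints"][0]["Url"]
mediaconvert = boto3.client("mediaconvert", endpoint_url=mc_endpoint)

TWO_HOURS = 2 * 60 * 60


def handler(event, context):
    """Runs on a schedule; re-checks jobs that haven't been updated recently."""
    cutoff = int(time.time()) - TWO_HOURS
    stale = correlation_table.scan(
        FilterExpression=Attr("updatedAt").lt(cutoff) & Attr("status").eq("SUBMITTED")
    )["Items"]

    for job_info in stale:
        job = mediaconvert.get_job(Id=job_info["mediaConvertJobId"])["Job"]
        # Depending on the job's real status, we either keep waiting, re-drive the
        # results processing, or mark the item as failed once it exceeds its lifespan.
        print(job_info["itemId"], job["Status"])
```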

Each lambda is designed to be driven by SQS messages. Consistent error handling and logging give us much greater visibility into errors and allow us to automate much of the error handling. If any lambda encounters an error, the request goes back into the SQS queue to try again later. Any request that fails too many times ends up in a “dead letter” queue for further investigation and alerting.
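As a sketch of that shared error-handling skeleton (assuming the event source mapping has ReportBatchItemFailures enabled and the queue has a redrive policy pointing at the dead-letter queue):

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def handler(event, context):
    """Shared SQS-driven skeleton: report failed messages so SQS retries just those."""
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))  # job-specific logic plugs in here
        except Exception:
            logger.exception("Failed to process message %s", record["messageId"])
            # Returning the messageId makes SQS keep, and later retry, only this message.
            failures.append({"itemIdentifier": record["messageId"]})

    # After the queue's maxReceiveCount is exceeded, SQS moves the message to the
    # dead-letter queue, which drives alerting and manual investigation.
    return {"batchItemFailures": failures}


def process(message: dict) -> None:
    raise NotImplementedError  # placeholder for the lambda-specific work
```

With partial batch responses, only the failed messages return to the queue instead of the whole batch, which keeps retries cheap.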

Downstream benefits

Most of our job processing logic is the same, so we’ve multiplied the effectiveness of this project by turning these lambdas into base templates. When we integrate a new service — for example, AWS Transcribe — we just need to add logic describing how to configure the request to Transcribe and how to read the results it produces. This saves us development time and maintains consistency across tasks.
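The template idea can be pictured as a small set of hooks. The class and method names below are hypothetical, but they show the shape of what each new integration has to supply; for AWS Transcribe, that amounts to how to start the job and where to find its output:

```python
import os

import boto3


class ExternalJobTemplate:
    """Shared skeleton: triggering, correlation, timeouts, and error handling live here."""

    def start_job(self, request: dict) -> dict:
        raise NotImplementedError  # each integration describes how to start its job

    def parse_results(self, event: dict) -> dict:
        raise NotImplementedError  # ...and how to read what the service produced


class TranscribeJob(ExternalJobTemplate):
    """The only new logic needed when adding AWS Transcribe."""

    def __init__(self):
        self.transcribe = boto3.client("transcribe")

    def start_job(self, request: dict) -> dict:
        return self.transcribe.start_transcription_job(
            TranscriptionJobName=request["correlationId"],
            Media={"MediaFileUri": request["sourceUri"]},
            IdentifyLanguage=True,
            OutputBucketName=os.environ["TRANSCRIPTION_OUTPUT_BUCKET"],
        )

    def parse_results(self, event: dict) -> dict:
        # Transcribe writes a JSON transcript named after the job to the output
        # bucket; we return its location so it can be attached to the media item.
        job_name = event["detail"]["TranscriptionJobName"]
        return {"transcriptKey": f"{job_name}.json"}
```

Only the two hooks change per service; the triggering, correlation, timeout, and error handling stay in the shared template.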

So far, we’ve used this flow for creating streamable audio, streamable video, transcriptions, and downloadable formats, accelerating development for each new service. We write less code, write fewer new tests, and focus only on changing the things that matter for each job we’re doing. We’ve saved money over traditional solutions — each lambda costs just a few dollars a month — and we’ve saved development time as we tie into new external services.

Adopting this approach will create efficiencies for other ITHAKA programmers, too. We’re creating a template repository based on this code and plan to leverage tools like Backstage to make these templates even easier and faster to use.

Interested in learning more about working at ITHAKA? Contact recruiting to learn more about ITHAKA tech jobs.
