This post gives you a quick walkthrough on AWS Lambda Functions and running Apache Spark in the EMR cluster through the Lambda function. It also explains how to trigger the function using other Amazon Services like S3.
AWS Lambda is one of the ingredients in Amazon’s overall serverless computing paradigm and it allows you to run code without thinking about the servers. Serverless computing is a hot trend in the Software architecture world. It enables developers to build applications faster by eliminating the need to manage infrastructures. …
Apache Beam(Batch + Stream) is a unified programming model that defines and executes both batch and streaming data processing jobs. It provides SDKs for running data pipelines and runners to execute them.
Apache Beam can provide value in use cases that involve data movement from different storage layers, data transformations, and real-time data processing jobs.
There are three fundamental concepts in Apache Beam, namely: