Case Study — Continuous Applications — Spark Structured Streaming

Durga Gadiraju
Published in itversity
Nov 20, 2018

"Continuous Applications" is a new buzzword for applications that let enterprises produce real-time reports with the lowest latency possible. Sarath Varma, Data Engineer at GrubHub, is going to share his experience using Spark Structured Streaming to build Continuous Applications.

Agenda

  • A quick overview of Apache Spark on Amazon Elastic MapReduce (EMR)
  • Overview of Spark Structured Streaming
  • Demo — Continuous Application using Spark Structured Streaming
      • Read data from S3
      • Process using Spark Structured Streaming
      • Write data back to S3
  • Q&A between me and Sarath. We will share the questions up front so that you have an idea of what is going to be asked.
  • Q&A with the public
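
The three demo steps in the agenda (read from S3, process with Spark Structured Streaming, write back to S3) could look roughly like the PySpark sketch below. This is not the code from Sarath's demo. The function name, the order schema, the column names, the window and watermark sizes, and all paths are hypothetical and only illustrate the shape of such a pipeline.

```python
def build_query(spark, input_path, output_path, checkpoint_path):
    """Start an S3 -> Structured Streaming -> S3 query (illustrative sketch).

    `spark` is an existing SparkSession; the three paths are S3 locations
    chosen by the caller. The schema and columns are made up for this example.
    """
    # Imported here so the sketch can be read and loaded without a Spark install.
    from pyspark.sql import functions as F
    from pyspark.sql.types import (
        DoubleType, StringType, StructField, StructType, TimestampType,
    )

    # Streaming file sources need an explicit schema (hypothetical fields).
    schema = StructType([
        StructField("order_id", StringType()),
        StructField("restaurant_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("event_time", TimestampType()),
    ])

    # Step 1: read new JSON files as they land under the input prefix.
    orders = spark.readStream.schema(schema).json(input_path)

    # Step 2: aggregate per restaurant over 5-minute event-time windows.
    # The watermark lets Spark finalize a window and drop its state.
    totals = (
        orders
        .withWatermark("event_time", "10 minutes")
        .groupBy(F.window("event_time", "5 minutes"), "restaurant_id")
        .agg(F.sum("amount").alias("total_amount"))
    )

    # Step 3: append finalized windows as Parquet files under the output prefix.
    # The checkpoint location stores progress so the query can restart safely.
    return (
        totals.writeStream
        .format("parquet")
        .outputMode("append")
        .option("path", output_path)
        .option("checkpointLocation", checkpoint_path)
        .start()
    )

# Hypothetical usage on a cluster with S3 access (bucket name is a placeholder):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("continuous-app-sketch").getOrCreate()
#   query = build_query(spark, "s3://your-bucket/in/", "s3://your-bucket/out/",
#                       "s3://your-bucket/checkpoints/")
#   query.awaitTermination()
```

The file sink only supports append mode, which is why the aggregation uses a watermark: a window's result is written once Spark considers that window complete.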

We will also make an announcement about the live course on AWS Analytics (beginning of March). The tentative curriculum, which might change a bit:

  • Elastic MapReduce
  • Apache Spark on EMR
  • Continuous Applications using Spark Structured Streaming
  • Integration between Kinesis and Spark Structured Streaming
  • Case Study using boto
  • Setup of Azkaban
  • Create ETL workflow using Azkaban and EMR
