Published in The Startup

Running PySpark Applications on Amazon EMR

Methods for Interacting with PySpark on Amazon Elastic MapReduce

Introduction

According to AWS, Amazon Elastic MapReduce (Amazon EMR) is a cloud-based big data platform for processing vast amounts of data using common open-source tools such as Apache Spark, Hive, HBase, Flink, Hudi, Zeppelin, Jupyter, and Presto. Using Amazon EMR, data analysts, engineers, and scientists are free to explore, process, and visualize data. EMR takes care of provisioning, configuring, and tuning…

Gary A. Stafford

AWS Principal Solutions Architect | 9x AWS Certified Pro | Polyglot Developer | DataOps | DevOps | Technology consultant, writer, and speaker