Published in Javarevisited

A cost-effective way of running Spark on AWS EMR

Using transient clusters

Problem statement

Running Spark on EMR can help analyze big datasets in very little time, but when it comes to cost it is definitely not cheap. It becomes expensive quickly, because we are billed for the number of nodes, at a rate that depends on the machine type, multiplied by how long the cluster was running. I definitely love the fact that Amazon can…
