MapReduce vs Spark (!)

A common mistake is to compare MapReduce directly with Spark.

MapReduce is a programming paradigm, while Spark is a data processing engine, so the two cannot be compared directly. What we can compare is how Hadoop applies the MapReduce paradigm and how Spark applies it.
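To make the paradigm concrete, here is a minimal word-count sketch in plain Python. It uses neither Hadoop nor Spark; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are illustrative assumptions, and the shuffle step is a simplified stand-in for the grouping that a real framework performs between the two phases.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    # Map: emit one (word, 1) pair for every word in the input.
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Reduce: fold the values of each key into a single result.
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


def word_count(lines):
    # The paradigm itself: map, then group by key, then reduce.
    return reduce_phase(shuffle_phase(map_phase(lines)))
```

Nothing in this sketch depends on a particular engine, which is why the paradigm on its own cannot be benchmarked against Spark.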

In Hadoop MapReduce, each job consists of exactly one Map phase and one Reduce phase, so a multi-step computation has to be expressed as a chain of separate jobs. In Spark, several map and reduce steps can be combined within a single job. Secondly, Hadoop MapReduce writes the output of each job to a file on disk, whereas Spark can keep intermediate results in memory. As a result, Spark usually shortens the overall execution time of a multi-step workload.
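The difference between the two hand-off styles can be sketched in plain Python. This is only a simulation under stated assumptions, not real Hadoop or Spark code: the two steps (count_words and count_frequencies), the file name job1_output.json, and both runner functions are hypothetical names chosen for illustration.

```python
import json
import os
import tempfile
from collections import Counter


def count_words(lines):
    # Step 1: how many times does each word occur?
    return dict(Counter(word for line in lines for word in line.split()))


def count_frequencies(word_counts):
    # Step 2: how many words occur exactly n times?
    return dict(Counter(word_counts.values()))


def run_with_file_handoff(lines, workdir):
    # Hadoop-style chaining (simulated): the first job writes its output
    # to a file, and the second job reads that file back in.
    path = os.path.join(workdir, "job1_output.json")
    with open(path, "w") as f:
        json.dump(count_words(lines), f)
    with open(path) as f:
        intermediate = json.load(f)
    return count_frequencies(intermediate)


def run_in_memory(lines):
    # Spark-style chaining (simulated): the intermediate result is
    # passed straight to the next step without touching the disk.
    return count_frequencies(count_words(lines))


if __name__ == "__main__":
    sample = ["a b a", "c b a"]
    with tempfile.TemporaryDirectory() as workdir:
        print(run_with_file_handoff(sample, workdir))
    print(run_in_memory(sample))
```

Both runners return the same result. The only difference is the extra write and read between the steps, and that cost grows with the number of chained steps and the size of the intermediate data.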


Originally published at Emre Calisir.