Ai Deng: Spark on Yarn. Spark needs resources to do its computation, and Yarn manages all the resources (CPU cores and memory) on the cluster. Understand how Spark… (Jan 21, 2017)
Ai Deng: Simple queries in Spark Catalyst optimisation (2) join and aggregation. Following my first Spark blog post in August about simple queries in Spark Catalyst, it’s time to write part II. \o/ This one is about the join… (Dec 5, 2016)
Ai Deng: Learn from Facebook 60 TB Spark use case. After reading the Facebook engineering blog post about the Spark 60 TB use case… (Nov 19, 2016)
Ai Deng: Leak a page warning in Spark tasks. Recently, in the logs of one of our Spark applications, we got a warning like this: (Nov 12, 2016)
Ai Deng: Simple queries in Spark Catalyst optimisation (1). In Spark 1.6, the Spark SQL Catalyst optimisation has become very mature. With all the power of Catalyst, we are trying to use the DataFrame… (Aug 16, 2016)
Ai Deng: Spark: Out of memory for driver’s result size. In one of our Spark applications, we tried to use the DataFrame to process our 200 million XMLs. (Jul 31, 2016)