pysparkling
Originally published on May 29, 2015.
pysparkling
is a native Python implementation of the interface provided by Spark’s RDDs. In Spark, RDDs are Resilient Distributed Datasets. An RDD instance provides convenient access to the partitions of data that are distributed on a cluster…