Manish Kumar Pal
Hive Table for avro files
Apache Avro is a data serialisation standard for compact binary format widely used for storing persistent data on HDFS. It is lightweight…
2 min read · Jul 21, 2018

Manish Kumar Pal
Create Spark RDD for MongoDB collection in python
You can create RDD for a MongoDB collection using pymongo-spark library, which integrates PyMongo (the python driver for MongoDB) with…
2 min read · Jul 21, 2018