Hive Table for avro files, by Manish Kumar Pal (Jul 21, 2018). Apache Avro is a data serialisation standard for compact binary format widely used for storing persistent data on HDFS. It is lightweight…
Create Spark RDD for MongoDB collection in python, by Manish Kumar Pal (Jul 21, 2018). You can create RDD for a MongoDB collection using pymongo-spark library, which integrates PyMongo (the python driver for MongoDB) with…