Due to the diversity of data sources and the sheer volume of data that needs to be processed, traditional data processing tools fail to meet the performance and reliability requirements of modern machine learning and data analytics applications.
Part 1 of this series focuses on big data processing engines: Hadoop, Spark, Presto, and Airflow. Subsequent parts will cover how to set up cost-efficient, highly scalable, and reliable data pipelines on GCP and AWS.
Hive is an open-source Apache project built on top of Hadoop for querying, summarising, and analysing large data sets using a SQL-like interface (similar…
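To give a concrete feel for that SQL-like interface, here is the kind of aggregation query Hive accepts. The table and column names (`page_views`, `country`) are invented for illustration, and Python's built-in `sqlite3` stands in for Hive purely to show the query shape — Hive would run the same sort of `GROUP BY` over files stored in HDFS rather than an in-memory database.

```python
import sqlite3

# In-memory SQLite database standing in for a Hive warehouse table;
# the SELECT ... GROUP BY shape is what HiveQL borrows from SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (user_id TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("u1", "IN"), ("u2", "IN"), ("u3", "US")],
)

# A typical Hive-style summarisation query: page views per country.
rows = conn.execute(
    "SELECT country, COUNT(*) AS views "
    "FROM page_views GROUP BY country ORDER BY views DESC"
).fetchall()
print(rows)  # → [('IN', 2), ('US', 1)]
```

In Hive the same statement would be submitted through the CLI or a JDBC connection, and the engine would compile it into distributed jobs over the cluster.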

With the rapid development of ground-breaking cloud-based AI services from Google, Microsoft, IBM Watson, and many other players, it has become quite easy to apply these services across different segments.
One segment where AI has recently been used to the fullest is AI-based IVR systems. With an intelligent IVR, a customer no longer has to waste time listening to rigid menu prompts and pressing keys just to route a problem or query to a specific department. …
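As a toy illustration of that routing step, the sketch below maps an already-transcribed utterance to a department using simple keyword matching. The department names and keywords here are invented; a production IVR would instead call a cloud NLU service for speech transcription and intent detection.

```python
# Hypothetical department -> keyword map; a real system would use a
# trained intent classifier rather than keyword matching.
DEPARTMENTS = {
    "billing": {"bill", "invoice", "charge", "refund"},
    "tech_support": {"error", "crash", "password", "login"},
    "sales": {"buy", "price", "upgrade", "plan"},
}

def route(utterance: str) -> str:
    """Return the department whose keywords best match the utterance."""
    words = set(utterance.lower().split())
    scores = {dept: len(words & kws) for dept, kws in DEPARTMENTS.items()}
    best = max(scores, key=scores.get)
    # Fall back to a human agent when nothing matches.
    return best if scores[best] > 0 else "agent"

print(route("I need a refund"))   # → billing
print(route("hello there"))       # → agent
```

The point of the intelligent IVR is exactly this last step done well: the caller states the problem in natural language, and the system routes the call without any key presses.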

I am a Google-certified Professional Cloud Architect who loves working on distributed computing and designing highly available systems.