Boning Zhang in Towards Data Science: "A Python implementation of concurrent consumers for Google Cloud Platform Pub/Sub". An example shows how to publish messages to Pub/Sub and build a service to consume the messages concurrently. May 31, 2020.
Boning Zhang: "A proactive monitoring/alerting system for ETL data pipelines using table dependency graph and…". In this blog, we will build a monitoring system for ELT data pipelines which can visualize table dependency and send alerts for failed… Feb 18, 2020.
Boning Zhang in Analytics Vidhya: "An Airflow Sub-Dag to Sync Data from On-Premise Hadoop Cluster to Google Cloud Storage". We are now in the process of migrating data from an on-premise Hadoop cluster to Google Cloud Platform (GCP). Since there are a lot of data… Feb 3, 2020.
Boning Zhang in Analytics Vidhya: "Google Cloud Platform User Manual for Big Data Projects". This blog was originally for our team’s internal use, introducing the architectures of our Google Cloud Platform (GCP) projects, our best… Feb 2, 2020.
Boning Zhang in Towards Data Science: "A Python API for Background Requests Based on Flask and Multi-Processing". First things first, let me briefly explain what an API is and how it can help us. Simply put, an API is a program running on a host and… Dec 9, 2019.
Boning Zhang in Analytics Vidhya: "A python solution to run query in Google Cloud Dataproc using API and return back query results". This post can also be found in Big Data Daily. See Git for source code. Nov 24, 2019.
Boning Zhang in Analytics Vidhya: "An Airflow sensor for stamp files in Google Cloud Storage". This blog was originally posted in Big Data Daily. Nov 14, 2019.