Azure Databricks Unity Catalog — up and running
This is the navigation page for the mini series "Azure Databricks Unity Catalog up and running". We include 3 articles to cover from UC…
hwangdb · 1 min read · Mar 12, 2023
Quick Blog: Ingest Spanner Tables to GCP Databricks Delta Tables using Dataflow and Change Data…
In this quick blog, we solve one problem: how to replicate Spanner Tables into Databricks as Delta Tables and keep them in sync.
hwangdb · 5 min read · Oct 29, 2023
Quick Blog: Databricks DLT consuming from AWS MSK Kafka Topics
This quick blog shows an example architecture of Databricks deployed using a customer-managed VPC, with a private MSK cluster in the same VPC…
hwangdb · 4 min read · Sep 17, 2023
Azure Databricks Unity Catalog — up and running — Part 4: UC Storage Account Networking Set Up
Credit: this blog is a collaboration with som.natarajan@databricks.com
hwangdb · 4 min read · May 23, 2023
Azure Databricks Unity Catalog — Part 3: Automate Unity Catalog set up using Terraform
This is Part 3 of the series "Azure Databricks Unity Catalog — up and running"; we walk through how to automate the provisioning of Azure…
hwangdb · 2 min read · Mar 12, 2023
Azure Databricks Unity Catalog — Part 2: Get the infra — build UC metastore and initial set up
This is Part 2 of the series "Azure Databricks Unity Catalog — up and running"; we talk about how to implement the infra required to set up…
hwangdb · 3 min read · Mar 12, 2023
Azure Databricks Unity Catalog — Part 1: UC Concepts and Components
This is Part 1 of the series "Azure Databricks Unity Catalog — up and running"; we lay out the key components of Unity Catalog on Azure…
hwangdb · 5 min read · Mar 12, 2023
SMOTE implementation in PySpark
Probably the most common method of oversampling an imbalanced dataset, SMOTE introduces some randomness in creating synthetic…
hwangdb · 2 min read · Aug 3, 2020
An approximated solution to find co-location occurrences using geohash
Often when dealing with location data, we need to find the neighbours of location points. This is a challenging problem, as if we were to pick…
hwangdb · 2 min read · Jul 31, 2020
PyTorch implementation of Autoencoder based recommender system
An autoencoder is a type of directed neural network that has both encoding and decoding layers. By learning the latent set of features we can…
hwangdb · 3 min read · Jul 21, 2020