Pinned · Emmanuel Davidson · Building a Robust Data Engineering Infrastructure with Apache Spark, Apache Hive, and Delta Lake… · In this guide, we’ll walk you through the step-by-step process of creating a robust data engineering infrastructure using Apache Spark and… · Dec 28, 2023
Pinned · Emmanuel Davidson · Orchestrating Your Data Engineering Platform with Docker Compose · In the dynamic landscape of data engineering, constructing a seamless and scalable platform is paramount. Docker Compose offers a robust… · Dec 12, 2023
Emmanuel Davidson · A Comprehensive Guide to Running Apache Spark on Kubernetes with Airflow in GKE · Apache Spark is a widely-used engine for big data processing, providing powerful capabilities for handling large datasets. Running Spark on… · Aug 31
Emmanuel Davidson · Understanding Kubernetes StatefulSets · Kubernetes provides several ways to manage and deploy applications, with StatefulSets being one of the key components for stateful… · Jul 28
Emmanuel Davidson · Visualizing GIS Data with D3.js in a Quarto Document · In this guide, we’ll explore how to visualize GIS data using D3.js within a Quarto document. We’ll cover map projections, shapefiles, and… · Jul 19
Emmanuel Davidson · Guide to Exploring Geospatial Data within Exclusive Economic Zones (EEZs) · Overview · Jul 5
Emmanuel Davidson · Exploring Spatial Data Operations with Python: An Example Script · Spatial data operations are essential in many fields, enabling the analysis and visualization of geographical information. This article… · Jul 3
Emmanuel Davidson · Efficient Data Management: Vacuuming a Delta Lake · In the world of big data, efficient data management is crucial for maintaining performance and managing storage costs. Delta Lake, a… · Jun 13
Emmanuel Davidson · A Developer’s Guide to Structuring RESTful APIs for Hierarchical Data Relationships · When building a web application that involves complex hierarchical relationships between entities like organizations, projects, and users… · May 13