George Zefkilis, in Data Engineer Things. "Building a Local Data Lake from scratch with MinIO, Iceberg, Spark, StarRocks, Mage, and Docker". Hello again, fellow technology enthusiasts! I am a software/data engineer who transitioned from data science. The learning curve in this… (Jul 13)
Lubomir Franko, in Python in Plain English. "The Truth About PySpark's Repartition: Prepare to Be Surprised!". Partitioning concept, image by author. (Jul 31)
Rindhuja Treesa Johnson, in Towards Data Science. "Apache Hadoop and Apache Spark for Big Data Analysis". A complete guide to big data analysis using Apache Hadoop (HDFS) and PySpark library in Python on game reviews on the Steam gaming… (May 8)
Moshe Zada. "Analyzing Prometheus Metrics with Spark and Athena: Uncovering Hidden Insights". PromQL is a powerful language for querying time series data, but what happens when you want to run some analytics on top of Prometheus? Use… (2d ago)
Meni Shmueli, in System Weakness. "Did you know that your Apache Spark logs might be leaking PIIs?" (Feb 8)