Manuel Mourato | ELK Stack + Alerting: How to Monitor your Business and Infrastructure Data (Part One) | Monitoring your data has never been more important | May 14, 2019
Manuel Mourato | When Airflow isn’t fast enough: Distributed orchestration of multiple small workloads with Celery | DISCLAIMER 1: If you already have a firm knowledge on what orchestration, Apache Airflow and Celery are, go directly to the “An Airflow use… | Apr 20, 2018
Manuel Mourato | How Spark dataframe shuffling can hurt your partitioning | For those who work with Spark as an ETL processing tool in production scenarios, the term shuffling is nothing new. It happens when we… | Feb 21, 2018
Manuel Mourato | Apache Kudu: A baby with potential | Kudu’s place in the Hadoop ecosystem | Feb 7, 2018
Manuel Mourato | Building a Dockerized Cloudera Pseudo Cluster, aka testing the Big Data ecosystem when you have no… | DISCLAIMER 1: Do not ever start your articles with a disclaimer. | Jan 23, 2018