Marin Aglić in Data Engineer Things: "Using Marquez as a lineage tool for Celery — adding the parent-run facet". This story is the second one about integrating Celery with Marquez using the OpenLineage Python package. In this story, we take the first… (4d ago)
Marin Aglić in Data Engineer Things: "A fun experiment: using Marquez as a lineage tool for Celery". Data lineage refers to the journey data takes — where it comes from, how it gets transformed, and where it ends up. This information… (Sep 14)
Marin Aglić in Data Engineer Things: "Circumventing the problem of using data intervals when backfilling dataset scheduled DAGs". Airflow's data-aware scheduling feature allows event-based triggering between DAGs. It enables us to split large pipelines into smaller… (Sep 4)
Marin Aglić: "Getting Celery snapshots with Polaroid". Building scalable systems poses a challenge — from designing the system, to implementation, and tackling unforeseen issues. (Aug 31)
Marin Aglić in Python in Plain English: "Using Celery RedBeat for basic dynamic scheduling of periodic tasks". When using Celery without Django, a question that kept bugging me was — how to disable and re-enable periodic tasks? In this story we take… (Feb 15)
Marin Aglić: "Learning Apache Iceberg — looking at append, update and delete operations". Bridging the gap between what I knew and what I wanted to learn. This is the fourth in a series of articles about Apache Iceberg. In the… (Feb 5)
Marin Aglić: "Learning Apache Iceberg — storing the data to Minio S3". Bridging the gap between what I knew and what I wanted to learn. This is the third in a series of articles where I continue my progress in… (Jan 30)
Marin Aglić in Python in Plain English: "Getting DAG data from the Airflow API — working with Polars". The Airflow API allows us to programmatically gather information about our Airflow deployment, such as DAG runs, task instances, pools… (Nov 6, 2023)
Marin Aglić: "Problem with using data intervals when backfilling dataset scheduled DAGs". Data-aware scheduling is an Airflow feature introduced in Airflow 2.4. The feature basically introduced event-based scheduling between… (Oct 15, 2023)
Marin Aglić in Data Engineer Things: "Designing Dynamic Workflows with Celery and Python". The why and how to insert chains into chains. (Aug 7, 2023)