Ronald Ángel

344 followers

Published in Miro Engineering

Writing data product pipelines with Airflow

A framework to make your data organization transition from simple DAGs to trustworthy data products.

Jan 26, 2023

Published in Miro Engineering

5 Strategies for data workflows scheduling at Miro

Some strategies we follow to make our scheduling capability scalable

Mar 14, 2022

Published in Miro Engineering

Agile Data Engineering at Miro

This is how we do it: at Miro we leverage Agile concepts to drive data engineering projects. We get improved predictability, and a smoother…

Nov 22, 2021

Published in TDS Archive

Introducing a new pySpark’s library: owl-data-sanitizer

A library to democratize data quality within companies with pySpark data pipelines.

May 5, 2020

Published in inganalytics.com/inganalytics

How to review ETL pySpark pipelines

A guide to performing PySpark code reviews that comply with Python standards, guarantee data quality, and keep your code extensible.

Mar 23, 2020

Published in TDS Archive

Write Clean and SOLID Scala Spark Jobs

Nowadays extensive pipelines are written as simple SQL queries, neglecting important development concepts such as writing clean and testable…

Dec 30, 2019

Published in TDS Archive

Understanding the Spark insertInto function

Problems found while using Spark's insertInto with Hive

Oct 22, 2019

Published in The Startup

Ingesting Raw Data with Kafka Connect and Spark Datasets

In this blog post I will explain how we use Kafka Connect and Spark, orchestrated by platforms like Kubernetes and Airflow, to create a Raw…

Oct 15, 2019

Dockerizing a Hand Written Digits Predictor Service

Dockerizing a Flask service that uses Keras to classify handwritten digits.

Mar 21, 2019

Published in TDS Archive

Dataset deduplication using spark’s MLlib

Spark machine learning: two approaches to deduplicating DataFrames using Spark MLlib and Scala.

Mar 17, 2019

Data Products Manager
