Data Engineering Digest #9 (February 2020)

Maycon Viana Bordin
Published in data.plumbers
Mar 13, 2020

New Tools

Data Engineering Role

Events

Podcasts & Presentations

Publications

Experiences with Managing Data Ingestion into a Corporate Datalake

We explain our experiences in designing, building and running a large corporate Datalake. Our platform has been running for over two years and makes a wide variety of corporate data assets, such as sales, marketing, customer information, as well as data from less conventional sources such as weather, news and social media available for analytics purposes to many teams across the company. We focus on describing the management of data and in particular how it is transferred and ingested into the platform.

Real Data Architectures

Data Culture

Data Lake

Data Governance

Event Sourcing

Data Formats

Delta Lake

Data Pipelines

Data Processing

Apache Spark

Apache Hadoop

MR3

Stream Processing

Apache Flink

Apache Spark

Change Data Capture

Messaging

Apache Kafka

Workflow Management

Apache Airflow

Data Quality Libraries

Cloud Providers

AWS

Azure

Databases

NoSQL

Storage

Modern Data Warehouses
