Anup Moncy
Data Engineering
1 min read · Jun 28, 2023


Foundations of Data Engineering (5 days):

  • Relational Databases (e.g., PostgreSQL, MySQL, Oracle, SQL Server)
  • SQL
  • Data Modeling
  • Indexing and optimization
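
To make the indexing topic concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the index name are made up for illustration, and SQLite stands in for the server databases listed above; the same idea (compare the query plan before and after adding an index) carries over, though the plan output differs per engine.

```python
import sqlite3

def plan_for(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]

# Hypothetical table, created in memory so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

query = "SELECT SUM(amount) FROM orders WHERE customer_id = 42"

# Without an index, SQLite has to scan the whole table.
plan_before = plan_for(conn, query)

# With an index on the filter column, it can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_for(conn, query)

total = conn.execute(query).fetchone()[0]
print(plan_before)
print(plan_after)
print(total)
```

The first plan should report a full scan of `orders`, the second a search that uses `idx_orders_customer`. The exact wording of the plan varies between SQLite versions.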

Data Warehousing and ETL Tools (5 days):

  • Cloud Data Warehouses (e.g., AWS Redshift, Google BigQuery, Azure Synapse Analytics)
  • Modern Data Integration Tools (e.g., Apache Airflow)
  • Informatica PowerCenter or other ETL tools

Distributed Computing and Big Data Technologies (5 days):

  • Apache Hadoop
  • Apache Kafka
  • Apache Flink
  • Apache Spark
  • Data streaming architectures

Data Governance and Security (5 days):

  • Apache Atlas
  • Data privacy regulations
  • Data security

Advanced Data Engineering Concepts (5 days):

  • Apache Spark: 2 days
  • Change Data Capture (CDC): 2 days
  • Data streaming architectures: 1 day
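
As a rough illustration of what Change Data Capture produces, the sketch below derives insert, update, and delete events by comparing two snapshots of a table keyed by primary key. This is a simplified snapshot-comparison variant chosen for brevity; production CDC tools usually read the database transaction log instead. The function names and the sample rows are hypothetical.

```python
def diff_snapshots(old, new):
    """Compare two {primary_key: row} snapshots and return change events."""
    events = []
    for key, row in new.items():
        if key not in old:
            events.append(("insert", key, row))
        elif old[key] != row:
            events.append(("update", key, row))
    for key in old:
        if key not in new:
            events.append(("delete", key, old[key]))
    return events

def apply_events(target, events):
    """Replay change events onto a copy of a downstream table."""
    result = dict(target)
    for op, key, row in events:
        if op == "delete":
            result.pop(key, None)
        else:
            result[key] = row
    return result

# Hypothetical snapshots taken before and after some source changes.
old_snapshot = {1: {"name": "a"}, 2: {"name": "b"}}
new_snapshot = {1: {"name": "a2"}, 3: {"name": "c"}}

events = diff_snapshots(old_snapshot, new_snapshot)
replica = apply_events(old_snapshot, events)
print(events)
print(replica == new_snapshot)
```

Replaying the events onto the old snapshot reproduces the new one, which is the property a downstream consumer relies on.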

Cloud Data Platforms and Serverless Computing (5 days):

  • AWS Data Lake
  • Google BigQuery, Azure Data Factory, or Amazon Redshift
  • Snowflake
  • Databricks

Patterns:

  • Lambda Architecture
  • Data Warehousing Patterns (dimensional modeling, star schema, snowflake schema)
  • Event-Driven Architecture (event sourcing, CQRS, message brokers)
  • Microservices Architecture (bounded contexts, domain-driven design, API-first)
  • Data Streaming Patterns (event-driven streaming, windowing, time-based aggregations)
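
The windowing and time-based aggregation pattern in the last bullet can be sketched in a few lines of plain Python. This is only an in-memory illustration of a tumbling (fixed, non-overlapping) window, assuming events arrive as `(timestamp_in_seconds, value)` pairs; the function name and sample events are hypothetical, and a real pipeline would use the windowing operators of a stream processor such as Flink or Spark.

```python
from collections import defaultdict

def tumbling_window_sums(events, window_seconds):
    """Sum event values per fixed window, keyed by the window start time."""
    sums = defaultdict(float)
    for timestamp, value in events:
        # Integer division maps each timestamp to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] += value
    return dict(sorted(sums.items()))

# Hypothetical events: (seconds since start, measured value).
sample_events = [(0, 1.0), (30, 2.0), (61, 5.0), (119, 1.0), (120, 4.0)]
window_totals = tumbling_window_sums(sample_events, 60)
print(window_totals)  # {0: 3.0, 60: 6.0, 120: 4.0}
```

Each event belongs to exactly one window here. Sliding or session windows would need a different assignment rule.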
