Building an end-to-end data pipeline using Azure Databricks (Part-4)
Sep 14, 2022
Use Case Explanation
We will be working with transactional data on loan transactions and customers from GeekBankPE (a famous bank around the world).
You have two requirements from different areas of the bank.
- The Marketing area needs up-to-date customer data so it can contact customers and make offers.
- The Finance area requires daily loan transactions enriched with customer drivers so it can analyze them and improve revenue.
To meet these requirements, we are going to perform incremental loads and use techniques such as upsert.
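To make these two techniques concrete, here is a minimal plain-Python sketch of what an incremental load followed by an upsert does. The record layout (`customer_id`, `email`, `updated_at`), the sample rows, and the helper names `incremental_batch` and `upsert` are assumptions for illustration, not the actual GeekBankPE schema. On Databricks the same logic would typically be expressed with a Delta Lake `MERGE INTO` statement rather than hand-written loops.

```python
from datetime import date


def incremental_batch(source_rows, last_loaded):
    """Keep only rows changed after the previous load (the watermark)."""
    return [r for r in source_rows if r["updated_at"] > last_loaded]


def upsert(target, batch, key="customer_id"):
    """Update rows whose key already exists in target, insert the rest."""
    merged = dict(target)  # shallow copy so the input dict is left untouched
    for row in batch:
        merged[row[key]] = row  # overwrite on match, add on no match
    return merged


# Hypothetical current state of the customer table, keyed by customer_id.
target = {
    1: {"customer_id": 1, "email": "ana@old.example", "updated_at": date(2022, 9, 1)},
    2: {"customer_id": 2, "email": "luis@example.com", "updated_at": date(2022, 9, 1)},
}

# Hypothetical source extract: one changed row, one unchanged row, one new row.
source = [
    {"customer_id": 1, "email": "ana@new.example", "updated_at": date(2022, 9, 14)},
    {"customer_id": 2, "email": "luis@example.com", "updated_at": date(2022, 9, 1)},
    {"customer_id": 3, "email": "eva@example.com", "updated_at": date(2022, 9, 14)},
]

batch = incremental_batch(source, last_loaded=date(2022, 9, 1))
result = upsert(target, batch)
```

Only customers 1 and 3 pass the watermark filter, so customer 1 is updated, customer 3 is inserted, and customer 2 is left as it was.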
Architecture
We are going to follow the Delta Lake (medallion) architecture, which organizes data into three layers:
Bronze: Raw data (stored in its original format)
Silver: Transformed data (stored in Delta format)
Gold: Feature/aggregated data (stored in Delta format)
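As a rough illustration of how data moves through the three layers, the sketch below uses plain Python lists and dicts in place of real storage. The field names, sample rows, and the functions `to_silver` and `to_gold` are hypothetical; in the actual pipeline each layer would be files or Delta tables rather than in-memory objects.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as they arrived (hypothetical CSV-like lines).
bronze = [
    "1001,C1,2022-09-14,250.00",
    "1002,C2,2022-09-14,100.50",
    "1003,C1,2022-09-14,49.50",
]


def to_silver(raw_lines):
    """Parse raw lines into typed transaction records."""
    rows = []
    for line in raw_lines:
        txn_id, customer_id, txn_date, amount = line.split(",")
        rows.append(
            {
                "txn_id": int(txn_id),
                "customer_id": customer_id,
                "txn_date": txn_date,
                "amount": float(amount),
            }
        )
    return rows


def to_gold(silver_rows):
    """Aggregate the total loan amount per customer and day."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[(row["customer_id"], row["txn_date"])] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Each layer is derived only from the one before it, so the raw Bronze data can always be reprocessed if the Silver or Gold logic changes.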