ELT Fundamentals 101:

Habibsahab
2 min readMar 19, 2023

--

What is ELT?

ELT stands for Extract, Load, and Transform is another type of data integration process, like ETL which is Extract, Transform, and Load in which we extract data from one system, load it into a target repository, and then transform it for downstream uses such as business intelligence and data analytics.

How does ETL works?

ETL process works in three steps:

1. Extract the relevant data from data sources.

2. Transform the data for data analysis and business intelligence.

3. Load the data into the target database.

Extract:

Data extraction is the process of obtaining raw data from a source and replicating that data somewhere else. The raw data can come from various sources, such as a database, Excel spreadsheet, a SaaS platform, web scraping, or others. It can then be replicated to a destination, such as a data warehouse. This can also include unstructured data or simply data that is poorly organized. Once the data has proceeded now it waits for transformation.

Load:

Data loading is the process to move the transformed data from the staging area into the target data warehouse. For most organizations that use ETL, the process is automated, well-defined, continuous, and batch driven. Typically, ETL takes place during off-hours when traffic on the source systems and the data warehouse is at its lowest.

Transform:

Data transformation is the process of converting data from one format to another, typically from the format of a source system into the required format of a destination system. This step includes filtering, cleansing, de-duplication, validating, normalizing, and authenticating the data.

ETL vs ELT

The main difference between the two processes is how, when, and where data transformation occurs. The ELT process is most appropriate for larger, nonrelational, and unstructured data sets and when the timeline is important. The ETL process is more appropriate for small data sets which require complex transformations. ELT extracts the data from the source locations, but instead of moving it to a staging area for transformation, it loads the raw data directly to the data store where it can be transformed as needed. ELT does not transform any data in transit.

In ETL, after the extraction of data, all the transformations are done before the data is loaded into a destination system. ELT is more useful that ETL in situations when data needs to be processed in real-time, for example streaming data from IoT devices. If this is the case, ELT is a good choice because the transformation happens after the data is already in its final location and if the analysts want to measure something new, they don’t have to change the whole pipeline, they can just change the transformation part at the end.

To learn about Data Engineering and ETL process please visit the profile.

--

--