What is ETL? | Extract, Transform & Load | Data Integration

ETL: Extract Transform and Load

Click here for images
The ETL & Data Integration Guide 2022 is a thorough, vendor-independent analysis of the ETL tools landscape.

Expert opinions of 20 ETL tools
Access all research data
Over 90 selection criteria
Used in over 50 countries

To the ETL & data integration guide

Extract, Transform, and Load

ETL stands for the three words Extract, Transform, and Load. The ETL process covers extracting data, mostly from different types of systems, transforming it into a structure that’s more appropriate for reporting and analysis, and finally loading it into the database and/or cube(s).

But these days, ETL is much more than that

Most ETL software also covers:

1. Extract data

Extract data from different internal and external sources, structured and/or unstructured. Plain queries are sent to the source systems using native connections, message queuing, ODBC, or OLE-DB middleware. The data will be put in a Staging Area (SA), usually with the same structure as the source. Sometimes we only the data that is new or has been changed. Some ETL tools can do this automatically, providing a changed data capture (CDC) mechanism.

2. Transform the data

Once the data is available in the Staging Area it’s all on one platform and one database. Then we can easily combine tables, filter and sort the data using specific attributes, pivot to another structure, and make calculations. Check on data quality and cleanse the data if necessary. After preparing all the data we can implement slowly-changing dimensions, in which case we want to keep track of when attributes change over time in reports and analyses.

3. Load the data

Finally, data is loaded into a data warehouse, usually into fact and dimension tables. From there, the data can be combined, aggregated, and loaded into datamarts or cubes as is deemed necessary. The business user analyzes and uses the transformed data with BI tools and instruments like data visualization software, dashboards, OLAP tools, and reporting tools. The data can then be used to improve the quality of the organization’s decision-making.

An ideal ETL architecture contains a data warehouse

Below you’ll find the ideal ETL architecture supporting the three major steps in ETL.Business Intelligence Tools

An overview of a data warehouse and ETL architecture displaying what is ETL

Read more about ETL & data

Passionned Group has written extensively about data and everything that comes with it. Take a look at the articles below or consult our knowledge base.

Data profiling and data quality control

Profiling the data will give direct insight into the data quality of the source systems. It can display how many rows have missing or invalid values, or what the distribution is of the values in a specific column. Based on this knowledge, one can specify business rules in order to cleanse the data, or keep really bad data out of the data warehouse. By profiling the data before designing your ETL process, you are better able to design a system that is robust and has a clear structure.

“ETL stands for extract, transform, and load. ETL can be compared to purifying water. First, undrinkable water (data) is extracted from various rivers. This dirty water is purified using a tool. The purified water is stored in a container that you can drink from,” says Rick van Der Linden. The biggest problem with data, according to Rick, is its quality. “I’ve seen many organizations that drive on low-quality data. But it’s pointless to have a dashboard showing the wrong data. It’s like drinking dirty water – you can do it, but it’ll make you sick. So the “T” in ETL is a crucial part of the process.”

Metadata management & ETL

Information about all the data that is processed, from sources to targets by transformations, is often put into a metadata repository; a database containing all the metadata. The entire ETL process can be ‘managed’ with metadata management. For example, one can query how a specific target attribute is built up in the ETL process, called data lineage. Or you want to know what the impact of a change will be, for example the size of the order identifier (id) is changed, and in which ETL steps this attribute plays a role.

Learn more about ETL

Learn more about ETL & Analytics and download our ETL & Data Integration Guide 2022.

Also consider taking our our unique Data Warehousing & Data Governance training course which can help you to understand all aspects of ETL and Data Integration.

About Passionned Group

Logo of Passionned Group, ETL expertPassionned Group is an analyst and global consultancy company specializing in business intelligence, analytics and data integration. We offer in-depth and 100% vendor independent research, consulting and masterclasses.

Contact us

Our ETL experts

Daan van Beek, Author The intelligent organization


Author The intelligent organization

Rick van der Linden, Senior ETL Analist


Senior ETL Analist

What is ETL? ✦ Extract, Transform & Load ✦ Data Integration
Training courses
Knowledge base
Advice & consulting
Interim management

A selection of our customers

Become a customer with us now

Do you also want to become a customer of ours? We are happy to help you with what is etl? (extract, transform & load) or other things that will make you smarter.

Rick van der Linden, Senior ETL Analist


Senior ETL Analist

Contact me directly

Fact sheet

Number of organizations serviced
Number of training courses
Number of participants trained
Overall customer rating
Number of consultants & teachers
Number of offices
Number of years active