Contents
What is ETL?
Why now ELT?
Comparing ETL, ETLT & ELT
What is ETL?
With ETL, data is extracted from the source, is then cleansed/transformed and then loaded into a database. If the logic applied during cleansing/transformation is updated, applying this to the original data may not be possible.
Some Business Intelligence (BI) tools offer manipulation/transformation of data (ETLT), which is fast and great for working on local and offline data files. Manipulating and transforming data, even creating models in an isolated environment, like a BI tool. This is a legacy setup that pre-dates data-warehousing methodologies.
For an analyst to be successful, they need to be trusted.
Analysts are tasked with ensuring that the data presented is consistent and reliable. If data is stored and transformed in a number of places, this becomes increasingly difficult to remain consistent, therefore more difficult to be trustworthy.
Why now ELT?
In ELT, the concept is simple. Load your data to the warehouse in the format, with little to no processing, transformation or cleansing, then transform it in the warehouse (or equivalent) later on. The main benefit is keeping the business logic close to the data, which improves data governance. In literal terms, this means keeping logic away from individual analysts, working in silos within their own divisions (e.g. different teams using their own fx exchange rates).
Storing raw data is much cheaper now, this means that resolving bugs and adjusting business logic in data transformations is faster and easier than ever.
The shift to ELT, is the shift to the cloud data warehouse, which organically follows with cloud analytics, directly on top of the warehouse. With Astrato, the best-fit setup for common use cases is:
Extract from source
Load into snowflake tables (in raw format)
Create a view (or data mart - see our data mart article here)
Views can be dropped and re-created
Astrato reads from updated views (including new fields) without needing to re-create a data view, or adjust joins
ELT has become more feasible and more cost-effective with the shift to cloud-based data warehousing. Shifting to a cloud data warehouse does not only reap the benefits that ELT brings, but also:
Low total cost of ownership
Improved speed and performance
Increased data storage
Near-infinite scalability
Better disaster recovery
Native support for unstructured data (JSON)
Comparing ETL, ETLT & ELT
| ETL | ETLT | ELT |
Advantage | Reduce cost by storing processed data with business logic applied before data load | Better flexibility after loading transformed data, further transformations can be applied in the BI tool without the need to engage DB engineers. | Improve scalability, sharing, disaster recovery by applying business logic on top of raw data. |
Risks / Disadvantage | Logic changes over time, impacting accuracy of historical data and trends. | Logic can change at different points in time, but also per destination and even per business function. | All data needs to be loaded into the warehouse. |