Defining the ETL Process
Extract data from an external source into a data lake
These external sources could include:
- Operational systems
- Third-party APIs
- RDBMS
- etc.
Next, transform that data from the data lake
Transformation tools include:
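- Sqoop (strictly a bulk data transfer tool between relational databases and Hadoop, so it more often serves the extract and load steps)
- HQL (Hive Query Language)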
- etc.
Then, load the transformed data either:
- Back into the data lake
- Or into a data warehouse
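The three ETL steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a temporary directory stands in for the data lake, an in-memory SQLite database stands in for the data warehouse, and `SOURCE_ROWS`, `extract`, `transform`, `load`, and `run_etl` are hypothetical names invented for this example.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical records standing in for an external source (e.g. an operational system).
SOURCE_ROWS = [
    {"order_id": 1, "amount": "19.99", "country": "us"},
    {"order_id": 2, "amount": "5.00", "country": "de"},
]


def extract(lake_dir):
    """Extract: copy raw records from the source into the data lake, unchanged."""
    raw_path = Path(lake_dir) / "orders_raw.json"
    raw_path.write_text(json.dumps(SOURCE_ROWS))
    return raw_path


def transform(raw_path):
    """Transform: read raw data from the lake, fix types, and normalize values."""
    rows = json.loads(raw_path.read_text())
    return [(r["order_id"], float(r["amount"]), r["country"].upper()) for r in rows]


def load(rows, warehouse):
    """Load: write the already-transformed rows into the warehouse."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, country TEXT)"
    )
    warehouse.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    warehouse.commit()


def run_etl():
    """Run extract, transform, load in that order and return the warehouse contents."""
    warehouse = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    with tempfile.TemporaryDirectory() as lake_dir:  # stand-in for a data lake
        raw_path = extract(lake_dir)
        rows = transform(raw_path)
    load(rows, warehouse)
    return warehouse.execute(
        "SELECT order_id, amount, country FROM orders ORDER BY order_id"
    ).fetchall()
```

Calling `run_etl()` returns the cleaned rows from the warehouse table. The point to notice is the ordering: the data is transformed before it ever reaches the warehouse.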
Defining the ELT Process
Extract data from an external source into a data lake
These external sources could include:
- Operational systems
- Third-party APIs
- RDBMS
- etc.
Next, load the data from the data lake into a data warehouse
Then, transform the data in the data warehouse
- Afterwards, the transformed data is written back into the data warehouse
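The ELT ordering can be sketched the same way. As before, this is a minimal illustration rather than a reference implementation: a temporary directory stands in for the data lake, an in-memory SQLite database stands in for the data warehouse, and `SOURCE_ROWS`, `extract`, `load_raw`, `transform_in_warehouse`, and `run_elt` are hypothetical names. The raw data is loaded first, and the transformation runs as SQL inside the warehouse.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical records standing in for an external source (e.g. a third-party API).
SOURCE_ROWS = [
    {"order_id": 1, "amount": "19.99", "country": "us"},
    {"order_id": 2, "amount": "5.00", "country": "de"},
]


def extract(lake_dir):
    """Extract: copy raw records from the source into the data lake, unchanged."""
    raw_path = Path(lake_dir) / "orders_raw.json"
    raw_path.write_text(json.dumps(SOURCE_ROWS))
    return raw_path


def load_raw(raw_path, warehouse):
    """Load: copy the raw records into a staging table with no cleanup yet."""
    rows = json.loads(raw_path.read_text())
    warehouse.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, amount TEXT, country TEXT)"
    )
    warehouse.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :amount, :country)", rows
    )


def transform_in_warehouse(warehouse):
    """Transform: run SQL inside the warehouse and store the result there."""
    warehouse.execute(
        """
        CREATE TABLE orders AS
        SELECT order_id,
               CAST(amount AS REAL) AS amount,
               UPPER(country) AS country
        FROM raw_orders
        """
    )
    warehouse.commit()


def run_elt():
    """Run extract, load, transform in that order and return the final table."""
    warehouse = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    with tempfile.TemporaryDirectory() as lake_dir:  # stand-in for a data lake
        load_raw(extract(lake_dir), warehouse)
    transform_in_warehouse(warehouse)
    return warehouse.execute(
        "SELECT order_id, amount, country FROM orders ORDER BY order_id"
    ).fetchall()
```

Here the warehouse holds both the raw staging table (`raw_orders`) and the transformed table (`orders`), which is the practical difference from ETL: the untransformed data stays available in the warehouse for later re-processing.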