ETL Definition

ETL refers to the process of extracting data from its source systems, transforming it, and loading it into a new destination. It’s a necessary process if you want to optimize your data for analytics. ETL solves two core problems that enable better analytics:

  • Doing analytics in an environment optimized for that purpose: Transactional databases like MySQL and Postgres are excellent at processing transactional workloads. They're great at reading and updating single rows with low latency. They're not great at running large-scale analytics across huge datasets.
  • Cross-domain analysis: By joining data from disparate data sources, business leaders can answer deeper business questions.
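
To make the three steps and the cross-domain join concrete, here is a minimal sketch in Python using only the standard-library sqlite3 module. Everything in it is an assumption made for illustration: the in-memory databases stand in for two separate transactional systems and an analytics destination, and the table names, column names, and sample rows are hypothetical. A real pipeline would read from systems like MySQL or Postgres and load into a dedicated analytics store.

```python
import sqlite3


def extract(conn, query):
    """Extract: read rows out of a source database."""
    return conn.execute(query).fetchall()


def transform(orders, customers):
    """Transform: join orders to customers and total revenue per region."""
    region_by_customer = dict(customers)  # customer id -> region
    totals = {}
    for customer_id, amount in orders:
        region = region_by_customer.get(customer_id, "unknown")
        totals[region] = totals.get(region, 0.0) + amount
    return sorted(totals.items())


def load(conn, rows):
    """Load: write the analytics-ready rows into the destination."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS revenue_by_region "
        "(region TEXT PRIMARY KEY, revenue REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO revenue_by_region VALUES (?, ?)", rows)
    conn.commit()


def run_demo():
    # Hypothetical source 1: a sales system holding orders.
    sales = sqlite3.connect(":memory:")
    sales.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    sales.executemany(
        "INSERT INTO orders VALUES (?, ?)", [(1, 20.0), (2, 35.0), (1, 5.0)]
    )

    # Hypothetical source 2: a CRM holding customer regions.
    crm = sqlite3.connect(":memory:")
    crm.execute("CREATE TABLE customers (id INTEGER, region TEXT)")
    crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "EU"), (2, "US")])

    # Hypothetical destination standing in for an analytics environment.
    warehouse = sqlite3.connect(":memory:")

    orders = extract(sales, "SELECT customer_id, amount FROM orders")
    customers = extract(crm, "SELECT id, region FROM customers")
    load(warehouse, transform(orders, customers))

    return warehouse.execute(
        "SELECT region, revenue FROM revenue_by_region ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    print(run_demo())
```

Neither source could answer "revenue by region" alone, since the orders live in one system and the regions in another. The join happens in the transform step, and the result lands in a table shaped for analytical queries rather than single-row lookups.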

The ETL Process Overview section explains exactly what happens at each stage of the ETL process, as well as the modern shifts that have resulted from advances in technology. For more specific information about each step, see the sections listed below:

More Info on ETL

  • The ETL Architecture section explains the planning process that's required before you start building.
  • The ETL Challenges section lists a number of pitfalls to look out for during the project.
  • The ETL Tools section provides a number of composable options available for each stage of the ETL process.