Introduction

ETL (extract, transform, load) is a fundamental step of any machine learning process, because it produces the dataset on which the model is built. For that reason, data scientists and MLOps experts carefully plan jobs and pipelines that extract data from databases, often of different kinds, clean and normalize it, and finally build a data lake where the data can be refined further during the investigation process.

Usually, this process means defining several steps and coordinating their execution, accessing different databases built on different technologies, preparing many scripts, knowing several query languages to retrieve the relevant data, and so on.

Taking care of all these steps is a daunting task: it requires a lot of expertise and is time-consuming, which undercuts the efficiency of the entire project.

#etl #aws-glue-elastic-views #aws

AWS Glue Elastic Views! An almost no code ETL & Aggregation Framework