Rory West

1619214480

ETL Orchestration on AWS with AWS Step Functions

In recent years, the engineering, governance, and analysis of data have become a very common talking point.

The demand for data-driven decision-making has increased the need to collect and analyze data in many ways, and AWS has shown particular interest in this field, developing multiple tools to achieve these business goals.

Before a data analyst can explore and visualize the data, a crucial step is needed. This procedure is commonly known as ETL (extract, transform, and load) and is usually far from simple.
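To make the three stages concrete, here is a minimal, self-contained sketch in Python. The sample data, the `sales` table, and the function names are hypothetical and chosen only for illustration; a real pipeline on AWS would typically read from and write to managed services rather than an in-memory string and SQLite.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file, API, or database.
RAW_CSV = """order_id,country,amount
1,it,10.50
2,IT,4.50
3,us,
4,US,20.00
"""


def extract(raw: str) -> list:
    """Extract: parse the raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw)))


def transform(rows: list) -> list:
    """Transform: drop rows with a missing amount, normalize types and casing."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["country"].upper(), float(row["amount"]))
        )
    return cleaned


def load(rows: list, conn: sqlite3.Connection) -> None:
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(raw: str = RAW_CSV) -> sqlite3.Connection:
    """Run extract, transform, and load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(raw)), conn)
    return conn
```

Even in this toy version, most of the logic sits in the transform step (handling missing values, casing, and types), which hints at why real ETL work is rarely simple.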

#aws-step-functions #aws #etl #aws-lambda

Orchestrating ETL pipelines on AWS with Glue, Step Functions and CloudFormation

Big Data analytics is becoming increasingly important for informing major business decisions in corporations of all sizes. However, collecting, aggregating, joining, and analyzing (wrangling) huge amounts of data stored in different locations with a heterogeneous structure (e.g. databases, CRMs, unstructured text, etc.) is often a daunting and very time-consuming task.

Cloud computing often comes to the rescue by providing cheap and scalable storage, computing, and data lake solutions. In particular, AWS leads the pack with the very versatile Glue and S3 services, which allow users to ingest, transform, normalize, and store datasets of all sizes. Furthermore, Glue Catalog and Athena allow users to easily run Presto-based SQL queries on the normalized data in S3 data lakes, and the results can easily be stored and analyzed in business intelligence tools such as QuickSight.
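As a rough illustration of how Step Functions can orchestrate Glue jobs, the sketch below builds an Amazon States Language definition in Python. The job names and state names are hypothetical, and the retry settings are arbitrary example values; this is not the definition used in the original article.

```python
import json

# Hypothetical names: replace with the Glue jobs defined in your own account.
EXTRACT_JOB = "example-extract-job"
TRANSFORM_JOB = "example-transform-job"


def glue_task(job_name: str, next_state: str = None) -> dict:
    """Build a Task state that starts a Glue job and waits for it to finish."""
    state = {
        "Type": "Task",
        # The ".sync" suffix makes Step Functions wait for the job run to complete.
        "Resource": "arn:aws:states:::glue:startJobRun.sync",
        "Parameters": {"JobName": job_name},
        # Example retry policy; the values are assumptions, tune them as needed.
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 30,
                "MaxAttempts": 2,
                "BackoffRate": 2.0,
            }
        ],
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


def build_definition() -> dict:
    """Chain two Glue jobs: the second starts only after the first succeeds."""
    return {
        "Comment": "Sketch of a two-step Glue ETL pipeline",
        "StartAt": "Extract",
        "States": {
            "Extract": glue_task(EXTRACT_JOB, next_state="Transform"),
            "Transform": glue_task(TRANSFORM_JOB),
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_definition(), indent=2))
```

The printed JSON could, for example, be supplied as the `DefinitionString` of an `AWS::StepFunctions::StateMachine` resource in a CloudFormation template, alongside an IAM role that is allowed to start the Glue jobs.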

#aws-step-functions #aws-cloudformation #etl #aws-glue #aws

Alycia Klein

1590895500

Combine AWS Step Functions with CloudWatch Events using aws-cdk

AWS Step Functions allow you to execute and coordinate long-running processes. Step Functions is one of the serverless AWS services, and the platform fully manages the execution state.
In the example below we will use the following AWS services:
The example demonstrates how Step Functions manage the execution of a process that involves external events, e.g. human interaction.
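The excerpt does not show how the article wires up the external event, so the following is only an illustrative sketch of one common option, the task-token callback pattern, and not the author's implementation. The Lambda function name `notify-approver`, the state names, and the `approved` field are all hypothetical.

```python
import json

# Hypothetical Lambda function that notifies a human approver.
NOTIFY_FUNCTION = "notify-approver"

definition = {
    "StartAt": "WaitForApproval",
    "States": {
        "WaitForApproval": {
            "Type": "Task",
            # ".waitForTaskToken" pauses the execution until the token is returned.
            "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
            "Parameters": {
                "FunctionName": NOTIFY_FUNCTION,
                "Payload": {
                    # "$$.Task.Token" reads the token from the context object.
                    "taskToken.$": "$$.Task.Token",
                    "input.$": "$",
                },
            },
            "TimeoutSeconds": 86400,  # assumed limit: give up after one day
            "Next": "CheckApproval",
        },
        "CheckApproval": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.approved", "BooleanEquals": True, "Next": "Proceed"}
            ],
            "Default": "Rejected",
        },
        "Proceed": {"Type": "Succeed"},
        "Rejected": {
            "Type": "Fail",
            "Error": "Rejected",
            "Cause": "The request was not approved",
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(definition, indent=2))
```

With this pattern, the execution stays paused at `WaitForApproval` until some external system calls the Step Functions `SendTaskSuccess` API (in boto3, `send_task_success(taskToken=..., output=...)`) with the token and a JSON output such as `{"approved": true}`.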

#workflow #aws-step-functions #aws-cdk #aws-lambda

Seamus Quitzon

1601341562

AWS Cost Allocation Tags and Cost Reduction

Bob had just arrived in the office for his first day of work as the newly hired chief technical officer when he was called into a conference room by the president, Martha, who immediately introduced him to the head of accounting, Amanda. They exchanged pleasantries, and then Martha got right down to business:

“Bob, we have several teams here developing software applications on Amazon and our bill is very high. We think it’s unnecessarily high, and we’d like you to look into it and bring it under control.”

Martha placed a screenshot of the Amazon Web Services (AWS) billing report on the table and pointed to it.

“This is a problem for us: We don’t know what we’re spending this money on, and we need to see more detail.”

Amanda chimed in, “Bob, look, we have financial dimensions that we use for reporting purposes, and I can provide you with some guidance regarding some information we’d really like to see such that the reports that are ultimately produced mirror these dimensions — if you can do this, it would really help us internally.”

“Bob, we can’t stress how important this is right now. These projects are becoming very expensive for our business,” Martha reiterated.

“How many projects do we have?” Bob inquired.

“We have four projects in total: two in the aviation division and two in the energy division. If it matters, the aviation division has 75 developers and the energy division has 25 developers,” the CEO responded.

Bob understood the problem and responded, “I’ll see what I can do and have some ideas. I might not be able to give you retrospective insight, but going forward, we should be able to get a better idea of what’s going on and start to bring the cost down.”

The meeting ended with Bob heading off to find his desk. Cost allocation tags should help us, he thought to himself as he looked for someone who might know where his office was.

#aws #aws cloud #node js #cost optimization #aws cli #well architected framework #aws cost report #cost control #aws cost #aws tags

ETL Data Pipeline In AWS

ETL (Extract, Transform, and Load) is an emerging topic across the IT industry. Companies are often looking for easy solutions and open-source tools and technologies to run ETL on their valuable data without spending much effort on other things.

AWS Glue fits this need: it is an Amazon Web Services offering for creating a simple ETL pipeline.

AWS Glue Introduction

AWS Glue is a serverless ETL (Extract, Transform, and Load) service in the AWS cloud. It is a fully managed, cost-effective service for categorizing your data, cleaning and enriching it, and finally moving it from source systems to target systems.

#etl #aws #aws-glue