Introducing Handoff: Serverless Data Pipeline Orchestration Framework

handoff is a serverless data pipeline orchestration framework that simplifies the process of deploying ETL/ELT tasks to AWS Fargate.
This article presents the business context and a technical deep dive of our serverless approach to data pipeline deployment. Our approach is open-sourced as handoff, a framework for serverless data pipeline orchestration.
More and more companies have started to leverage cloud applications and their data to make their operations lean and effective. For example, D2C (direct-to-consumer) companies produce niche household products and ship directly to their customers instead of relying on a traditional retail distribution channel.
In doing so, a D2C company may use a marketing automation platform like Marketo or Pardot, run ads on Facebook and Google, and manage the supply chain with Flexport. All of these applications produce customer experience data. It is common practice to extract data from cloud services and load it into a data warehouse (DWH). By combining data sources and analyzing them together, the business can continuously improve the quality of the customer experience.
We are a technology-leveraged service company offering custom ETL/ELT solutions. ETL stands for extracting, transforming, and loading data. ELT workflows, in which the last two steps are swapped, are becoming popular: data are extracted and loaded into the data warehouse first, and the transformation is then executed by a massively parallel modern DWH engine.
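The difference between the two orderings can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how handoff itself works: an in-memory SQLite database stands in for the data warehouse, and the sample records, table names, and function names are all hypothetical.

```python
import sqlite3

# Hypothetical raw records, standing in for data extracted from a cloud application.
RAW_ORDERS = [
    {"id": 1, "amount_cents": 1250, "country": "us"},
    {"id": 2, "amount_cents": 875, "country": "jp"},
]


def extract():
    """Pretend to pull records from a source API (assumed for illustration)."""
    return list(RAW_ORDERS)


def run_etl(conn):
    """ETL: transform inside the pipeline process, then load the finished rows."""
    rows = [
        (r["id"], r["amount_cents"] / 100, r["country"].upper())
        for r in extract()
    ]
    conn.execute("CREATE TABLE orders_etl (id INTEGER, amount REAL, country TEXT)")
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", rows)


def run_elt(conn):
    """ELT: load the raw rows first, then let the warehouse transform them in SQL."""
    conn.execute(
        "CREATE TABLE raw_orders (id INTEGER, amount_cents INTEGER, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:id, :amount_cents, :country)", extract()
    )
    conn.execute(
        "CREATE TABLE orders_elt AS "
        "SELECT id, amount_cents / 100.0 AS amount, UPPER(country) AS country "
        "FROM raw_orders"
    )


def fetch(conn, table):
    """Read back the transformed rows from one of the two result tables."""
    query = f"SELECT id, amount, country FROM {table} ORDER BY id"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # SQLite stands in for the DWH
    run_etl(conn)
    run_elt(conn)
    # Both orderings should produce the same transformed rows.
    print(fetch(conn, "orders_etl") == fetch(conn, "orders_elt"))
```

In the ELT variant the pipeline process only moves data, and the transformation is a SQL statement that the warehouse engine executes, which is where a massively parallel DWH can do the heavy lifting.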
The demand for implementing data pipelines is growing at an unprecedented pace. Data replication services such as Fivetran and StitchData provide data connectors for popular cloud applications such as Salesforce and Zendesk. However, those companies cannot support the long tail of cloud applications in the Software as a Service (SaaS) market. If a cloud application is not supported, a business needs custom data engineering work, but not every company can afford to build an internal data engineering team.
That is why fully managed ETL services like ours are becoming a new option: for a fraction of the cost of hiring a data engineer, businesses can get a tailor-made data pipeline service.
While we serve our customers with active monitoring and maintenance, we decided to open-source the core framework as a contribution to the data engineering community. In the rest of the article, we will share the technical solution that made our service possible.