This article gives an overview of configuring AWS RDS for SQL Server with AWS Glue, the AWS service used for metadata cataloging and ETL operations.

Introduction

The AWS cloud offers a variety of data repositories such as AWS RDS, AWS DynamoDB, AWS Redshift and many others. AWS RDS supports several database engines, including Aurora, MariaDB, SQL Server, PostgreSQL, MySQL and Oracle. With so many data repositories on the cloud, there is often a need to keep an inventory of all the repositories, and of the database objects held in them, in a central location. This central inventory is known as the data catalog. AWS Glue is a serverless managed service that supports metadata cataloging and ETL (Extract, Transform, Load) on the AWS cloud. To perform these operations on AWS RDS for SQL Server, one needs to integrate AWS Glue with the AWS RDS for SQL Server instance. In this article, we will learn how to enable this integration.

AWS RDS SQL Server Instance

The first thing we need for this exercise is a working Amazon RDS for SQL Server instance. Readers who are new to AWS RDS for SQL Server can read the article Getting started with AWS RDS SQL Server to create a new instance. Once you have a working instance, it will look as shown below. You can create the instance using any edition of SQL Server supported by the RDS service. Ensure that you have the required privileges to connect to the instance and access its data.

AWS RDS SQL Server instance
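
The same check can be done from code. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The instance identifier and region used in the usage note are hypothetical placeholders, and the helper names are introduced here for illustration only. It reads the engine, status and endpoint from the response of the RDS describe_db_instances call, which are the details a Glue connection will need later.

```python
def summarize_instance(response, identifier):
    """Extract engine, status, host and port for one instance from a
    describe_db_instances response (a plain dict)."""
    for inst in response.get("DBInstances", []):
        if inst.get("DBInstanceIdentifier") == identifier:
            # Endpoint can be missing while the instance is still being created.
            endpoint = inst.get("Endpoint", {})
            return {
                "engine": inst.get("Engine"),
                "status": inst.get("DBInstanceStatus"),
                "host": endpoint.get("Address"),
                "port": endpoint.get("Port"),
            }
    return None


def is_usable_sqlserver(summary):
    """True when the instance is a SQL Server engine and is available."""
    if summary is None:
        return False
    engine = summary.get("engine") or ""
    return engine.startswith("sqlserver") and summary.get("status") == "available"


def fetch_instance_summary(identifier, region_name):
    """Call RDS and summarize the instance. Requires boto3 and credentials."""
    import boto3  # imported here so the helpers above work without boto3

    rds = boto3.client("rds", region_name=region_name)
    response = rds.describe_db_instances(DBInstanceIdentifier=identifier)
    return summarize_instance(response, identifier)
```

For example, fetch_instance_summary("my-sqlserver-instance", "us-east-1") would return the summary for an instance with that (hypothetical) identifier, and is_usable_sqlserver tells you whether it is ready to be connected to.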

Introduction to AWS Glue

AWS Glue is a serverless service from AWS for metadata crawling, metadata cataloging, ETL, data workflows and other related operations. AWS Glue can connect to different types of data repositories and crawl their database objects to create a metadata catalog, whose entries can then be used as sources and targets for moving and transforming data from one point to another. AWS Glue also supports workflows to enable complex data load operations. Usually, the first step for any operation is connecting to the data source of interest by creating a new connection. To learn the configurations required for creating a new connection, navigate to the AWS Glue home page from the AWS Management Console by searching for the Glue service as shown below.

AWS Management Console

The left pane contains different options, grouped mainly into Data catalog, ETL and Security. Once you are on the home page of the AWS Glue service, click on Connections in the left pane and you will be presented with a screen as shown below.

AWS Glue Console
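
The list of connections shown on this screen can also be read programmatically. The sketch below assumes boto3 and configured credentials, and the region passed in is whatever region your Glue catalog lives in. It uses the Glue get_connections call through a paginator and collects the connection names. The function names are illustrative, not part of any AWS API.

```python
def connection_names(response):
    """Return the connection names found in one get_connections response page."""
    return [conn["Name"] for conn in response.get("ConnectionList", [])]


def list_glue_connections(region_name):
    """Collect the names of all Glue connections in a region.
    Requires boto3 and AWS credentials."""
    import boto3  # imported here so connection_names works without boto3

    glue = boto3.client("glue", region_name=region_name)
    names = []
    # get_connections is paginated, so walk every page.
    for page in glue.get_paginator("get_connections").paginate():
        names.extend(connection_names(page))
    return sorted(names)
```

On a fresh account the returned list would simply be empty, matching an empty Connections screen in the console.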

Now it’s time to create a new connection to our AWS RDS SQL Server instance. Click on the Add connection button to start. A wizard will appear with multiple steps that collect details about the data source to which we intend to connect. The first step is to provide a connection name, so enter a relevant name for the connection.

AWS Glue Connection Properties

Next, we must select the type of connection. In the Connection type dropdown, you can find the options as shown below. Of all the supported options, we need to select Amazon RDS as it’s the service that holds our AWS RDS SQL Server instance.
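
For readers who prefer scripting to the wizard, an equivalent connection can be defined through the Glue API. The following is a sketch under stated assumptions, not a definitive implementation: it assumes boto3 and configured credentials, it assumes the connection is registered with the API connection type JDBC (which, to our understanding, is how an Amazon RDS connection made in the console is stored), and every name, host, credential, subnet and security group value is a hypothetical placeholder that you would replace with your own. The helper functions are introduced here for illustration.

```python
def sqlserver_jdbc_url(host, port, database):
    """Build a SQL Server JDBC URL in the form Glue accepts."""
    return f"jdbc:sqlserver://{host}:{port};databaseName={database}"


def build_connection_input(name, host, port, database, username, password,
                           subnet_id, security_group_ids, availability_zone):
    """Assemble the ConnectionInput structure for glue.create_connection."""
    return {
        "Name": name,
        "ConnectionType": "JDBC",  # assumption: RDS connections are stored as JDBC
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": sqlserver_jdbc_url(host, port, database),
            "USERNAME": username,
            "PASSWORD": password,
        },
        # Network details Glue needs to reach an instance inside a VPC.
        "PhysicalConnectionRequirements": {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": list(security_group_ids),
            "AvailabilityZone": availability_zone,
        },
    }


def create_glue_connection(connection_input, region_name):
    """Create the connection in Glue. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the builders above work without boto3

    glue = boto3.client("glue", region_name=region_name)
    glue.create_connection(ConnectionInput=connection_input)
```

Keeping the dictionary construction separate from the API call makes it easy to inspect the connection definition before anything is created. In practice, avoid hard-coding the password in a script.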
