Migration of Hive Metastore To Azure

Introduction

While moving the Hadoop workload from an on-premise CDH cluster to Azure, we also had a task to move the existing on-premise Hive metastore. This article provides two of the best practices for Hive Metadata migration from on-premise to Azure HDInsight.

Method 1: Hive Metastore Migration Using DB Replication

Set up database replication between the on-premises Hive metastore DB and HDInsight Hive metastore DB. The ollowing command can be used to setup the replication between the two instances:

./hive --service metatool -updateLocation hdfs://<namenode>:8020/ wasb://<container_name>@<storage_account_name>.blob.core.windows.net/

The above ‘hive metatool’ will replicate the hive metastore data from the given HDFS to the target WASB/ADLS/ABFS

Recommendation: This approach is recommended when either the source and target metadata DB are identical, or, when you are setting up or migrating existing applications.

Method 2: Hive Metastore Migration Using Scripts

  • Generate the Hive DDLs from the on-premises Hive metastore for myTable as an example, using the following script in the hive_table_dd.sh file:

SQL

  • Run the above shell script by using ‘metastoreDB’ as a parameter: bash hive_table_dd.sh metastoreDB
  • Edit the generated DDL into HiveTableDDL.hql and replace the HDFS URL with WASB/ADLS/ABFS URLs.
  • Run the updated DDL on the target Hive metastore DB being used on HDInsight cluster:

SQL

Ensure that the Hive metastore version is compatible between on-premises and Azure         HDInsight Hive instance.

Recommendation: This approach is recommended when either the source and target metadata DB are not identical, or when you are trying to set up a new environment.

Validation: In order to validate that the Hive metastore has been migrated completely, run bash script in step 1 on both the metastore DBs (i.e. source and target) to print all the Hive tables and their data locations.

Compare the outputs generated from the on-premise and Azure HDI to verify that no tables are missing in the new metastore DB.

#azure #migration #hive #metastore

What is GEEK

Buddha Community

Migration of Hive Metastore To Azure

Migration of Hive Metastore To Azure

Introduction

While moving the Hadoop workload from an on-premise CDH cluster to Azure, we also had a task to move the existing on-premise Hive metastore. This article provides two of the best practices for Hive Metadata migration from on-premise to Azure HDInsight.

Method 1: Hive Metastore Migration Using DB Replication

Set up database replication between the on-premises Hive metastore DB and HDInsight Hive metastore DB. The ollowing command can be used to setup the replication between the two instances:

./hive --service metatool -updateLocation hdfs://<namenode>:8020/ wasb://<container_name>@<storage_account_name>.blob.core.windows.net/

The above ‘hive metatool’ will replicate the hive metastore data from the given HDFS to the target WASB/ADLS/ABFS

Recommendation: This approach is recommended when either the source and target metadata DB are identical, or, when you are setting up or migrating existing applications.

Method 2: Hive Metastore Migration Using Scripts

  • Generate the Hive DDLs from the on-premises Hive metastore for myTable as an example, using the following script in the hive_table_dd.sh file:

SQL

  • Run the above shell script by using ‘metastoreDB’ as a parameter: bash hive_table_dd.sh metastoreDB
  • Edit the generated DDL into HiveTableDDL.hql and replace the HDFS URL with WASB/ADLS/ABFS URLs.
  • Run the updated DDL on the target Hive metastore DB being used on HDInsight cluster:

SQL

Ensure that the Hive metastore version is compatible between on-premises and Azure         HDInsight Hive instance.

Recommendation: This approach is recommended when either the source and target metadata DB are not identical, or when you are trying to set up a new environment.

Validation: In order to validate that the Hive metastore has been migrated completely, run bash script in step 1 on both the metastore DBs (i.e. source and target) to print all the Hive tables and their data locations.

Compare the outputs generated from the on-premise and Azure HDI to verify that no tables are missing in the new metastore DB.

#azure #migration #hive #metastore

Adaline  Kulas

Adaline Kulas

1594166040

What are the benefits of cloud migration? Reasons you should migrate

The moving of applications, databases and other business elements from the local server to the cloud server called cloud migration. This article will deal with migration techniques, requirement and the benefits of cloud migration.

In simple terms, moving from local to the public cloud server is called cloud migration. Gartner says 17.5% revenue growth as promised in cloud migration and also has a forecast for 2022 as shown in the following image.

#cloud computing services #cloud migration #all #cloud #cloud migration strategy #enterprise cloud migration strategy #business benefits of cloud migration #key benefits of cloud migration #benefits of cloud migration #types of cloud migration

Eric  Bukenya

Eric Bukenya

1624713540

Learn NoSQL in Azure: Diving Deeper into Azure Cosmos DB

This article is a part of the series – Learn NoSQL in Azure where we explore Azure Cosmos DB as a part of the non-relational database system used widely for a variety of applications. Azure Cosmos DB is a part of Microsoft’s serverless databases on Azure which is highly scalable and distributed across all locations that run on Azure. It is offered as a platform as a service (PAAS) from Azure and you can develop databases that have a very high throughput and very low latency. Using Azure Cosmos DB, customers can replicate their data across multiple locations across the globe and also across multiple locations within the same region. This makes Cosmos DB a highly available database service with almost 99.999% availability for reads and writes for multi-region modes and almost 99.99% availability for single-region modes.

In this article, we will focus more on how Azure Cosmos DB works behind the scenes and how can you get started with it using the Azure Portal. We will also explore how Cosmos DB is priced and understand the pricing model in detail.

How Azure Cosmos DB works

As already mentioned, Azure Cosmos DB is a multi-modal NoSQL database service that is geographically distributed across multiple Azure locations. This helps customers to deploy the databases across multiple locations around the globe. This is beneficial as it helps to reduce the read latency when the users use the application.

As you can see in the figure above, Azure Cosmos DB is distributed across the globe. Let’s suppose you have a web application that is hosted in India. In that case, the NoSQL database in India will be considered as the master database for writes and all the other databases can be considered as a read replicas. Whenever new data is generated, it is written to the database in India first and then it is synchronized with the other databases.

Consistency Levels

While maintaining data over multiple regions, the most common challenge is the latency as when the data is made available to the other databases. For example, when data is written to the database in India, users from India will be able to see that data sooner than users from the US. This is due to the latency in synchronization between the two regions. In order to overcome this, there are a few modes that customers can choose from and define how often or how soon they want their data to be made available in the other regions. Azure Cosmos DB offers five levels of consistency which are as follows:

  • Strong
  • Bounded staleness
  • Session
  • Consistent prefix
  • Eventual

In most common NoSQL databases, there are only two levels – Strong and EventualStrong being the most consistent level while Eventual is the least. However, as we move from Strong to Eventual, consistency decreases but availability and throughput increase. This is a trade-off that customers need to decide based on the criticality of their applications. If you want to read in more detail about the consistency levels, the official guide from Microsoft is the easiest to understand. You can refer to it here.

Azure Cosmos DB Pricing Model

Now that we have some idea about working with the NoSQL database – Azure Cosmos DB on Azure, let us try to understand how the database is priced. In order to work with any cloud-based services, it is essential that you have a sound knowledge of how the services are charged, otherwise, you might end up paying something much higher than your expectations.

If you browse to the pricing page of Azure Cosmos DB, you can see that there are two modes in which the database services are billed.

  • Database Operations – Whenever you execute or run queries against your NoSQL database, there are some resources being used. Azure terms these usages in terms of Request Units or RU. The amount of RU consumed per second is aggregated and billed
  • Consumed Storage – As you start storing data in your database, it will take up some space in order to store that data. This storage is billed per the standard SSD-based storage across any Azure locations globally

Let’s learn about this in more detail.

#azure #azure cosmos db #nosql #azure #nosql in azure #azure cosmos db

Seamus  Quitzon

Seamus Quitzon

1595205213

How to perform migration rollback in laravel

As we know that laravel migration provides very simple way to create database table structure. We need to create migration file and write table structure then migrate that migration. Sometimes we need to rollback that migration. So here we will discuss about the migration rollback in laravel.

We can run the rollback artisan command to rollback on a particular step. We can execute the following artisan command.

php artisan migrate:rollback --step=1

Every time when we will rollback, we will get the last batch of migration.

**Note: **This rollback command will work on laravel 5.3 or above version. For the version below 5.3, there is no command available for migration rollback in laravel.

We can also use the following command to rollback and re migrate.

php artisan migrate:refresh --step=2

It will rollback and remigrate last two migration.

You can also checkout the article for executing single migration by clicking on the link below.

How to migrate single migration in laravel

#laravel #how to perform rollback migration in laravel #laravel migration rollback #migration refresh in laravel #migration rollback batch in laravel #migration rollback for one specific migration #migration rollback in laravel

Ruthie  Bugala

Ruthie Bugala

1620435660

How to set up Azure Data Sync between Azure SQL databases and on-premises SQL Server

In this article, you learn how to set up Azure Data Sync services. In addition, you will also learn how to create and set up a data sync group between Azure SQL database and on-premises SQL Server.

In this article, you will see:

  • Overview of Azure SQL Data Sync feature
  • Discuss key components
  • Comparison between Azure SQL Data sync with the other Azure Data option
  • Setup Azure SQL Data Sync
  • More…

Azure Data Sync

Azure Data Sync —a synchronization service set up on an Azure SQL Database. This service synchronizes the data across multiple SQL databases. You can set up bi-directional data synchronization where data ingest and egest process happens between the SQL databases—It can be between Azure SQL database and on-premises and/or within the cloud Azure SQL database. At this moment, the only limitation is that it will not support Azure SQL Managed Instance.

#azure #sql azure #azure sql #azure data sync #azure sql #sql server