Introduction

While moving the Hadoop workload from an on-premises CDH cluster to Azure, we also had to move the existing on-premises Hive metastore. This article describes two best practices for migrating Hive metadata from on-premises to Azure HDInsight.

Method 1: Hive Metastore Migration Using DB Replication

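Set up database replication between the on-premises Hive metastore DB and the HDInsight Hive metastore DB. Once the metadata is replicated, the following command can be used to replace the HDFS URLs stored in the metastore with WASB/ADLS/ABFS URLs. Note that metatool expects the new location first and the old location second:

./hive --service metatool -updateLocation wasb://<container_name>@<storage_account_name>.blob.core.windows.net/ hdfs://<namenode>:8020/

The ‘hive metatool’ command above rewrites the filesystem root recorded in the Hive metastore from the given HDFS location to the target WASB/ADLS/ABFS location. It changes metadata only and does not copy any table data.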
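Before changing a replicated metastore, it can help to inspect the current roots and preview the update. The sketch below only builds and prints the command lines. It assumes `hive` is on the PATH, that metatool's `-listFSRoot` and `-dryRun` options are available in your Hive version, and that the host, container, and account names (all hypothetical here) are replaced with your own.

```shell
#!/usr/bin/env bash
# Sketch only. The host, container, and account names are hypothetical.
OLD_ROOT="hdfs://namenode.example.com:8020/"
NEW_ROOT="wasb://mycontainer@myaccount.blob.core.windows.net/"

# Build the metatool command line for one step of the update.
# metatool takes the NEW location first and the OLD location second.
metatool_cmd() {
  case "$1" in
    list)    echo "hive --service metatool -listFSRoot" ;;
    preview) echo "hive --service metatool -updateLocation ${NEW_ROOT} ${OLD_ROOT} -dryRun" ;;
    apply)   echo "hive --service metatool -updateLocation ${NEW_ROOT} ${OLD_ROOT}" ;;
    *)       echo "usage: metatool_cmd list|preview|apply" >&2; return 1 ;;
  esac
}

# Print the commands in the order they would normally be run.
for step in list preview apply; do
  metatool_cmd "$step"
done
```

Printing instead of executing keeps the sketch safe to run anywhere. On a cluster node, each printed line can be reviewed and then run by hand, with the preview step checked before the apply step.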

Recommendation: Use this approach when the source and target metadata DBs are identical, or when you are setting up or migrating existing applications.

Method 2: Hive Metastore Migration Using Scripts

  • Generate the Hive DDLs from the on-premises Hive metastore for myTable as an example, using the following script in the hive_table_dd.sh file:

SQL

  • Run the above shell script by using ‘metastoreDB’ as a parameter: bash hive_table_dd.sh metastoreDB
  • Edit the generated DDL in HiveTableDDL.hql and replace the HDFS URLs with WASB/ADLS/ABFS URLs.
  • Run the updated DDL on the target Hive metastore DB used by the HDInsight cluster:

SQL
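The steps above can be sketched roughly as follows. This is not the article's original script but a minimal illustration under stated assumptions: the `hive` CLI is on the PATH, the database name is passed in as in the `metastoreDB` example, and the function names, file names, and URLs are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: function names, file names, and URLs are hypothetical.

# Dump "SHOW CREATE TABLE" output for every table in one database.
# Assumes the `hive` CLI is on PATH and points at the source metastore.
generate_ddl() {
  local db="$1"
  local out="$2"
  : > "$out"
  hive -S -e "USE ${db}; SHOW TABLES;" | while read -r table; do
    [ -n "$table" ] || continue
    # </dev/null keeps the inner hive call from consuming the table list on stdin.
    hive -S -e "SHOW CREATE TABLE ${db}.${table};" < /dev/null >> "$out"
    echo ";" >> "$out"
  done
}

# Print a DDL file with the old storage root replaced by the new one.
rewrite_locations() {
  local ddl_file="$1"
  local old_root="$2"
  local new_root="$3"
  local old_pattern
  # Escape regex metacharacters so dots in host names match literally.
  old_pattern=$(printf '%s' "$old_root" | sed 's/[]\$*.^[]/\\&/g')
  # '|' is the sed delimiter because both URLs contain '/'.
  # Assumes neither URL contains '|' or '&'.
  sed "s|${old_pattern}|${new_root}|g" "$ddl_file"
}

# Run the rewritten DDL against the target (HDInsight) metastore.
apply_ddl() {
  hive -f "$1"
}
```

A run might look like `generate_ddl metastoreDB HiveTableDDL.hql`, then `rewrite_locations HiveTableDDL.hql "hdfs://<namenode>:8020/" "wasb://<container_name>@<storage_account_name>.blob.core.windows.net/" > HiveTableDDL_azure.hql`, and finally `apply_ddl HiveTableDDL_azure.hql` on the HDInsight cluster. Writing the rewritten DDL to a separate file leaves the generated original available for review.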

Ensure that the Hive metastore version is compatible between the on-premises and Azure HDInsight Hive instances.

Recommendation: Use this approach when the source and target metadata DBs are not identical, or when you are setting up a new environment.

Validation: To validate that the Hive metastore has been migrated completely, run the bash script from step 1 against both metastore DBs (i.e. source and target) to print all the Hive tables and their data locations.

Compare the outputs generated from the on-premises and Azure HDInsight metastores to verify that no tables are missing in the new metastore DB.
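The comparison can be automated with a small helper. The sketch below assumes each listing has one table per line in the form `<db.table> <location>`, separated by whitespace; that format and the function name are hypothetical, so adjust the field handling to match what your script actually prints.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes each input line is "<db.table> <location>", whitespace-separated.

# Print tables that appear in the source listing but not in the target listing.
missing_tables() {
  local src_names
  local tgt_names
  src_names=$(mktemp)
  tgt_names=$(mktemp)
  # Compare names only: locations are expected to differ (HDFS vs. Azure storage).
  awk 'NF {print $1}' "$1" | sort -u > "$src_names"
  awk 'NF {print $1}' "$2" | sort -u > "$tgt_names"
  # comm -23 keeps lines that occur only in the first (source) file.
  comm -23 "$src_names" "$tgt_names"
  rm -f "$src_names" "$tgt_names"
}
```

Calling `missing_tables source_tables.txt target_tables.txt` prints one table name per line; empty output means every source table was found in the target listing.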
