It is not easy to run Hive on Kubernetes. As far as I know, Tez, which is a Hive execution engine, can run only on YARN, not on Kubernetes.
There is an alternative way to run Hive on Kubernetes. Spark can run on Kubernetes, and Spark Thrift Server, which is compatible with HiveServer2, is a great candidate. That is, Spark will be used as the Hive execution engine.
I am going to talk about how to run Hive on Spark in a Kubernetes cluster.
All the code mentioned here can be cloned from my GitHub repo: https://github.com/mykidong/hive-on-spark-in-kubernetes
Before running Hive on Kubernetes, an S3 bucket and NFS-backed Kubernetes storage should be available to your Kubernetes cluster.
Your S3 bucket will be used to store the uploaded Spark dependency jars, Hive table data, etc.
NFS storage will be used to support PVCs with the ReadWriteMany access mode, which the Spark job needs.
If you have no such S3 bucket or NFS available, you can install them on your Kubernetes cluster manually, as I did.
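To make the storage prerequisite concrete, here is a minimal sketch of a PersistentVolumeClaim that requests the ReadWriteMany access mode. It is not taken from the repo above: the claim name `spark-shared-pvc`, the storage class `nfs`, the `10Gi` size, and the `default` namespace are all hypothetical placeholders, and the storage class in particular depends on how the NFS provisioner was installed in your cluster.

```python
import json


def build_rwx_pvc(name, storage_class, size="10Gi", namespace="default"):
    """Build a PersistentVolumeClaim manifest that requests ReadWriteMany access.

    All argument values are supplied by the caller; nothing here is specific
    to any particular NFS provisioner.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteMany lets several pods mount the same volume read-write.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names: replace them with the values used in your cluster.
manifest = build_rwx_pvc("spark-shared-pvc", "nfs")
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest can be piped to `kubectl apply -f -` once the placeholder names match your environment.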