Preface

Kubernetes is a lot of fun: it is feature-rich and, as a container orchestration tool, usually supports most of one’s whims in a straightforward fashion.

However, one request from my direct manager made me sweat: auto-scaling pods according to a complicated logical criterion.

When I tried to tackle the task, my online research yielded only partial solutions, and I ran into so many brick walls along the way that I decided to write this article, to spare future confusion for all the poor souls who might try to scale up their micro-services on a criterion other than CPU or memory.

The Challenge

It all started when we needed to scale one of our deployments according to the number of pending messages in a certain queue of RabbitMQ.

That is a cool, not overly complicated task that can be achieved by utilizing Prometheus, Rabbitmq-exporter, and Prometheus-adapter together (hereinafter referred to as “the trio”).

With much enthusiasm and anticipation, I jumped right into the implementation only to later discover that one of my manager’s magic light-bulbs had switched on in his brain. It happens quite often, fortunately for him, and less fortunately for me as this usually means stretching the capabilities of the technology at hand with advanced and not-often-supported demands.

He came up with a better, more accurate scaling criterion for our deployment. In a nutshell: measure how long a message has been waiting in queue “A” using the message’s timestamp, then perform some logic to determine the final value of the metric, which is always returned as a positive integer.
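The article doesn’t spell out the manager’s exact logic, so here is a minimal sketch of the general shape of such a metric: derive the wait time of the oldest message from its timestamp, then bucket it into a positive integer. The function name and the one-unit-per-minute rule are my own illustrative assumptions, not the real business logic.

```python
import math
import time


def queue_metric(oldest_timestamp, now=None, seconds_per_unit=60.0):
    """Hypothetical sketch: turn the age of the oldest queued message
    into a scaling metric. The real "logic" step is whatever your
    business rules dictate; here we simply bucket the wait time.

    Always returns a positive integer, as the metric requires.
    """
    now = time.time() if now is None else now
    age = max(0.0, now - oldest_timestamp)  # seconds the message has waited
    # One "unit" of the metric per minute of waiting, never less than 1.
    return max(1, math.ceil(age / seconds_per_unit))
```

With `seconds_per_unit=60`, a message that has waited just over two minutes yields a metric value of 3, while an empty or fresh queue floors out at 1.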

Well, that’s nice and all, but as far as my knowledge extends, the trio mentioned above is not able to perform the advanced logic my manager desired. After all, it relies solely on the metrics that RabbitMQ exposes, so I was left to figure out a solution on my own.

The experience of trying to implement the trio gave me a better view of how the Horizontal Pod Autoscaler (HPA) works and where it reads its data from.

As per the documentation, HPA works mainly against three APIs:

  • Metrics (metrics.k8s.io)
  • Custom Metrics (custom.metrics.k8s.io)
  • External Metrics (external.metrics.k8s.io)

My plan was to somehow harness the Custom Metrics API and have it work against an internal application-metrics API of our own, so that the HPA could read data from the internal API and scale accordingly.

This API could, in the future, be extended to serve application metrics for other deployments that need scaling based on internal application metrics, or any kind of metric for that matter.
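For the HPA to consume such an API through the aggregation layer, the internal API has to answer with the `MetricValueList` shape that the custom metrics API (v1beta1) defines. The following sketch builds that JSON body for a single object; the namespace, object, and metric names are placeholders, not values from the article.

```python
import datetime


def metric_value_list(namespace, object_name, metric_name, value,
                      object_kind="Service"):
    """Build a MetricValueList payload, the response format the
    custom.metrics.k8s.io/v1beta1 API is expected to return.
    All names passed in here are illustrative placeholders."""
    now = datetime.datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "kind": "MetricValueList",
        "apiVersion": "custom.metrics.k8s.io/v1beta1",
        "metadata": {"selfLink": "/apis/custom.metrics.k8s.io/v1beta1/"},
        "items": [{
            "describedObject": {
                "kind": object_kind,
                "namespace": namespace,
                "name": object_name,
                "apiVersion": "/v1",
            },
            "metricName": metric_name,
            "timestamp": now,
            # Kubernetes quantities are serialized as strings.
            "value": str(value),
        }],
    }
```

An internal API built this way only has to compute the metric value and wrap it in this envelope for each request the aggregated API forwards to it.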

This in essence involves the following tasks:

  1. Writing the code for our internal API
  2. Creating a Kubernetes deployment and service for our internal API
  3. Creating a Custom Metrics APIService in Kubernetes
  4. Creating the HPA resource
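For orientation, the APIService registration in step 3 might look roughly like the sketch below. The service name `custom-metrics-api` is a placeholder for whatever you call your internal API’s Service; the `custom-metrics` namespace matches the one used throughout this article.

```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  version: v1beta1
  service:
    name: custom-metrics-api   # placeholder: your internal API's Service
    namespace: custom-metrics
  insecureSkipTLSVerify: true  # fine for a demo; use proper TLS in production
  groupPriorityMinimum: 100
  versionPriority: 100
```

This tells the Kubernetes API server to forward requests under /apis/custom.metrics.k8s.io/v1beta1 to the registered Service, which is how the HPA ends up talking to the internal API.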

And with that in mind, let’s get to work.

Please note that, for the sake of demonstration, I used the ‘custom-metrics’ namespace in all YAML definitions. It is an arbitrary choice, however, so feel free to deploy everything in any namespace you want.


Building Your Own Custom Metrics API for Kubernetes Horizontal Pod Autoscaler