Over the past few years, as Red Hat OpenStack Platform has matured to handle a wide variety of customer use cases, the need for the platform to scale has never been greater. Customers rely on Red Hat OpenStack Platform to provide a robust and flexible cloud, and with greater adoption we also see the need for our customers to deploy larger and larger clusters.

To meet that need, the Red Hat Performance & Scale Team has spent the last year pushing OpenStack scale to new limits. Last summer we scale tested Red Hat OpenStack Platform 13 to more than 500 overcloud nodes, and in the process identified and fixed several issues, which led to better out-of-the-box tuning for scale.

Around the beginning of this year, we repeated the exercise with Red Hat OpenStack Platform 16.0 and reached the same scale of more than 500 nodes. More recently, over the last few weeks, we scale tested a pre-GA build of Red Hat OpenStack Platform 16.1 to more than 700 overcloud compute nodes, setting a new record for the largest Red Hat OpenStack Platform Director-driven bare metal deployment tested by our team.

Figure 1 shows how we have increased the number of nodes we test with each new release. We have also invested in building an internal lab with enough bare metal nodes to support these large-scale efforts.


Figure 1: Increasing number of nodes from OpenStack Platform 7 to 16.1

While we have also invested in building tools that help simulate scale clusters without needing as much hardware for testing, we still believe that scale testing with real hardware presents a different set of challenges and adds value by exposing potential problems that customers could run into.

In this post, we will talk about our journey to more than 700 overcloud nodes using Red Hat OpenStack Platform 16.1, including the lessons we learned and the issues we identified and fixed. We will also talk a little bit about how scale testing Red Hat OpenStack Platform 16.0 earlier in the year led to a successful scale test with the latest version of our long-life release, Red Hat OpenStack Platform 16.1.


Scaling Red Hat OpenStack Platform 16.1 to more than 700 nodes