Last week, I shared some emergency ColdFusion code that I had written and deployed that would allow me to view the processlist and kill slow queries in MySQL in one of InVision’s read-replicas on Amazon RDS. After we upgraded to MySQL 5.7.32, a few of our oldest SQL queries started to perform really poorly; and, I needed to create a little breathing room for the database CPU while I refactored the problematic SQL statements. After my post last week, Martie Henry asked me what the performance bottleneck ended-up being; so, I thought I would share a quick case study about what was going on and how I “fixed it” (depending on how hard you squint).

In early December, we put InVision into “maintenance mode”, took the database servers offline, upgraded to MySQL 5.7.32, and then brought everything back online. And when we did this, one our MySQL read-replica CPUs went B-A-N-A-N-A-S:

MySQL database CPU graphs after MySQL 5.7 upgrade showing sustained 100% CPU utilization.

For about 5-hours one of the read-replica CPUs shot up to 100% utilization and stayed there. And, the most terrifying part of this is that the offending SQL queries were being run by just a handful of users. Essentially, a subset of queries started performing so poorly that they gave a single user the power to degrade the performance of an entire database.

#sql #mysql #lucee

Removing Massive IN (ID-List) Clauses For Performance In MySQL 5.7.32
1.35 GEEK