Adopting SRE best practices can be difficult, especially when you need approval from managers, VPs, CTOs, and more. In this blog post, we’ll walk you through crafting a winning pitch for each level of leadership to ensure that SRE buy-in will succeed in your organization. Let’s start at the beginning with your team lead or manager.

The Situation

As one of the first steps towards SRE adoption, incident management is key. You want to implement an effective incident management system within your team. Now it’s time to convince your lead/manager. How will you accomplish this?

First, we need to recognize that your manager will need a lot of support from engineering and DevOps teams for this transition. These teams will need training in this incident management system to use it each time an incident occurs.

Second, you need to define what you mean by incident management. We’ll define incident management as the process of assembling, investigating, resolving, and learning from incidents. This includes incident response playbooks, measuring time-to-detection, monitoring systems, and ticketing workflows.

Once you have a handle on the basic proposal, it’s time to think about what the team (manager included) will gain from an incident management system.

The Incentives

There are four incentives that will motivate your team to adopt incident management best practices:

  • Incident management best practices help you restore your systems as quickly as possible when an incident occurs.
  • A playbook gives everyone a sense of control amidst the chaos. It defines a set of repeatable practices to drive consistency while helping everyone to be thorough with their problem-solving.
  • Measuring time to resolution (TTR) and time to detection (TTD) gives the manager concrete numbers for tracking the team’s improvement over time.
  • Integration with alerting and ticketing systems reduces context switching between different apps. This lowers the stress from mentally keeping track of many systems.
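To make the TTD and TTR incentive concrete for your manager, it can help to show how little is needed to start measuring. The snippet below is a minimal sketch, not a prescribed method. It assumes each incident record carries three timestamps (when the problem started, when it was detected, and when it was resolved), and that both metrics are measured from the start of the incident. Teams define these metrics differently, so treat the `Incident` fields and function names as hypothetical and adjust them to match what your ticketing system records.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record shape; map these to your own ticketing fields.
    started: datetime   # when the problem began
    detected: datetime  # when monitoring or a person noticed it
    resolved: datetime  # when service was restored


def time_to_detection(incident: Incident) -> timedelta:
    # Assumption: TTD is measured from incident start to detection.
    return incident.detected - incident.started


def time_to_resolution(incident: Incident) -> timedelta:
    # Assumption: TTR is measured from incident start to resolution.
    return incident.resolved - incident.started


def mean_minutes(durations: list[timedelta]) -> float:
    # Average a list of durations and express the result in minutes.
    return mean(d.total_seconds() for d in durations) / 60


def summarize(incidents: list[Incident]) -> dict[str, float]:
    # Mean TTD and TTR, in minutes, across a non-empty list of incidents.
    return {
        "mean_ttd_minutes": mean_minutes([time_to_detection(i) for i in incidents]),
        "mean_ttr_minutes": mean_minutes([time_to_resolution(i) for i in incidents]),
    }
```

Running `summarize` over last quarter’s incidents and again over this quarter’s gives the manager two pairs of numbers to compare, which is usually enough to start a conversation about whether the new process is working.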

Yet, explaining these incentives to your manager and hoping for immediate support will not guarantee buy-in. You need to anticipate the resistance your manager will have towards this big change.


Getting SRE Buy-in From a Manager or Lead for Incident Response