In recent polls we’ve conducted with engineers and leaders, around 70% of participants named MTTA and MTTR among their main metrics, 20% cited planned versus unplanned work, and 10% said they currently track no metrics at all. While MTTA and MTTR are good starting points, they’re no longer enough on their own. As systems grow more complex, it becomes harder to gain insight into your services’ operational health.
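For readers who haven’t computed these two numbers by hand, here is a minimal sketch of how MTTA and MTTR might be derived from incident timestamps. The record layout and field names (`triggered_at`, `acknowledged_at`, `resolved_at`) are assumptions made for illustration, not the schema of any particular tool, and MTTR is taken here to mean time from trigger to resolution; some teams define it differently.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records; the field names are illustrative only.
incidents = [
    {"triggered_at": datetime(2024, 1, 1, 10, 0),
     "acknowledged_at": datetime(2024, 1, 1, 10, 4),
     "resolved_at": datetime(2024, 1, 1, 10, 50)},
    {"triggered_at": datetime(2024, 1, 2, 14, 0),
     "acknowledged_at": datetime(2024, 1, 2, 14, 10),
     "resolved_at": datetime(2024, 1, 2, 15, 30)},
    {"triggered_at": datetime(2024, 1, 3, 9, 0),
     "acknowledged_at": datetime(2024, 1, 3, 9, 1),
     "resolved_at": datetime(2024, 1, 3, 9, 40)},
]

def mean_minutes(records, start_key, end_key):
    """Average elapsed time in minutes between two timestamps across records."""
    durations = [
        (r[end_key] - r[start_key]).total_seconds() / 60
        for r in records
    ]
    return mean(durations)

# MTTA: trigger to acknowledgement. MTTR: trigger to resolution (assumed definition).
mtta = mean_minutes(incidents, "triggered_at", "acknowledged_at")
mttr = mean_minutes(incidents, "triggered_at", "resolved_at")

print(f"MTTA: {mtta:.1f} min")  # MTTA: 5.0 min
print(f"MTTR: {mttr:.1f} min")  # MTTR: 60.0 min
```

Both figures are plain averages, so a single long-running incident can skew them. That is one reason they work better as a starting point than as the whole picture.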
In this blog post, we’ll walk you through holistic measures and best practices that you can put to use starting today. We’ll cover the challenges and pain points teams face in gaining insight, as well as key metrics and how they evolve as organizations mature.
It’s easy to fall into the trap of being data rich but information poor. Building metrics and dashboards with the right context is crucial to understanding operational health, but where do you start? It’s important to look at roadblocks to adoption thus far in your organization. Perhaps other teams (or even your team) have looked into the way you measure success before. What halted their progress? If metrics haven’t undergone any change recently, why is that?
Below are some of the top customer pain points and challenges that we typically see software and infrastructure teams encounter.
With these pain points in mind, let’s look at some key metrics other organizations we’ve spoken to have found success with.