Many tech companies struggle to create an effective and efficient on-call schedule for their products and services, which leads to longer downtime when something goes wrong. They often overburden team members with repeated on-call duty, and the result is fatigue. Here’s how to create an on-call schedule that your team might just love.

On-call doesn’t have to suck the life out of your employees. There’s another side to it. A better one.

An on-call schedule ensures that someone competent is available to get services back up and running if they go down, so that customers don’t have trouble using your product or service. Though on-call isn’t a new concept in the world of DevOps and IT Ops, its execution and roles have evolved greatly over the years.

How Has On-Call Evolved Over the Years?

In the past, being on-call and resolving issues as they occurred was the sole responsibility of sysadmins and operations engineers. With the evolution of DevOps, software developers now find themselves part of the on-call rotation, and this has worked well for most companies.

On-call schedules used to be created in spreadsheets (some teams still use this method) and announced to the team without regard for anyone’s availability. The person on-call simply had to be available at that time or on that day. The approach lacked flexibility: it was a nightmare to find a replacement if the person on-call had an emergency, and it was a hassle to find someone who could help if the person on-call couldn’t resolve an issue on their own.

Thanks to ops platforms like Fyipe, which have built-in on-call scheduling, we no longer have to create schedules in spreadsheets or manually notify the person on-call.

What remains an issue, however, is the negative attitude towards being on-call. No one wanted to be on-call then and no one wants it now, but it’s an absolute necessity.

Being on-call doesn’t have to suck! An effective on-call schedule can reduce friction and help keep your engineers happy. A happy on-call team means happy customers!

The only way to achieve this without draining your team is to make sure the schedule respects their work-life balance and doesn’t exhaust any single engineer.
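As a rough illustration of what spreading the load can look like, here is a minimal sketch of a weekly rotation that always picks the available engineer with the fewest shifts so far and avoids back-to-back weeks where it can. The function name build_rotation, the unavailable mapping, and the example names are assumptions made for this sketch; it is not how Fyipe or any other platform builds its schedules.

```python
from datetime import date, timedelta


def build_rotation(engineers, start, weeks, unavailable=None):
    """Return a list of (week_start, engineer) pairs, one per week.

    Hypothetical helper: `unavailable` maps an engineer's name to the set of
    week-start dates on which that engineer cannot take the shift.
    """
    unavailable = unavailable or {}
    shift_counts = {name: 0 for name in engineers}
    schedule = []
    previous = None

    for week in range(weeks):
        week_start = start + timedelta(weeks=week)

        # Everyone who has not opted out of this week.
        available = [
            name for name in engineers
            if week_start not in unavailable.get(name, set())
        ]
        if not available:
            raise ValueError(f"Nobody is available for the week of {week_start}")

        # Prefer not to give the same person two weeks in a row.
        candidates = [name for name in available if name != previous] or available

        # Fewest shifts so far wins; ties go to list order.
        chosen = min(candidates, key=lambda n: (shift_counts[n], engineers.index(n)))

        shift_counts[chosen] += 1
        schedule.append((week_start, chosen))
        previous = chosen

    return schedule


if __name__ == "__main__":
    team = ["ana", "ben", "chi"]  # example names, not from the article
    time_off = {"ben": {date(2024, 1, 8)}}
    for week_start, name in build_rotation(team, date(2024, 1, 1), 6, time_off):
        print(week_start.isoformat(), name)
```

Choosing by fewest shifts rather than following a fixed order means that a week of time off does not push extra shifts onto whoever happens to be next in the list; the counts even out again over the following weeks.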

Why Do You Need to Have Someone On-Call?

Putting engineers on-call is the first step an organization takes towards improving availability and reliability for its customers or users. On-call engineers are the last line of defense against customer-impacting outages, and they ensure that issues are resolved as quickly as possible. You need to be there when your customers need you. On-call ensures this.

“If the idea of being ‘on-call’ sucks for your team, it means they are responding negatively to a symptom.

The cause is less systemic and more a reflection of the team/organization’s basic engineering prowess.

An organization should have a “No Downtime” engineering and ops process in place. Having an on-call schedule for your team is an emergency last line of defense against downtimes.


How to Build an Effective and Sustainable On-Call Schedule For Your Team