Organizations handle a high volume of incidents every day. With so much happening, responders can be overwhelmed and may end up de-prioritizing important incidents. That makes an efficient on-call scheduling and escalation process essential. In this blog, we will explore how Round Robin Escalations can help distribute on-call load and set up efficient on-call schedules.
On-call schedules ensure that there’s someone available 24/7 to fix or escalate any arising issues. Using an on-call schedule helps keep things running smoothly. These on-call responders can be anyone, from doctors required to respond to emergencies, to IT and software engineering staff needed to fix service outages or significant bugs.
Ideally, you would want on-call responders to fix incidents as soon as they arrive. Of course, real-world scenarios are not that straightforward. At times, you need multiple people with specialized skills and knowledge to resolve an incident. Hence, organizations with multiple responders need a process that weighs several parameters and escalates incidents to the right responders for a speedy response.
Incident escalation in Squadcast happens when a responder hands off the task/incident to another member, and this handoff is subject to specific rules. An escalation policy is a collection of rules used to define how and when an incident should be escalated. It answers questions like:
Whenever an organization experiences an incident, a responder is notified and then acts on it. This is manageable as long as each responder has only one incident coming in at a time. However, certain services may have a higher volume of incidents, which creates problems when a single responder is pulled in to address several incidents at once. This is where Round Robin Escalations can help.
Round Robin Escalation is an incident assignment strategy where users are placed in a ring and incidents are sequentially assigned to them. This strategy can help ensure that incidents are equitably distributed. It can also lower incident response time if a service experiences concurrent incidents since the incidents will not all be assigned to the same responder.
Certain teams often benefit from Round Robin Escalations:
Support teams: These teams are expected to take calls and route alerts 24/7/365. During peak hours, they can get overburdened by the volume of requests received. Round Robin Escalations let a predefined group of people take turns handling incidents, thus distributing the load.
IT or Operations teams: IT, Ops or NOC teams often cater to multiple services. Teams that respond first to numerous incidents, in particular, function better with additional help from experts or senior responders. Round Robin Escalations help by distributing the load and making sure a responder is always available.
Users, Squads and Schedules are placed in an assignment ring, in the order in which they will be assigned incidents. When an incident is triggered, the assignee at the start pointer is notified, and the start pointer then moves to the one next-in-line in the assignment ring.
The starting point of the ring is determined by the assignee order when the Escalation Policy is created.
While Round Robin rotation is enabled, you will see a green arrow pointer next to the User, Squad or Schedule that is next in line for incident assignment. The pointer visually indicates who in the Assignment Ring will be notified for the next incident. By default, when an Escalation Policy with Round Robin rotation is created and configured, the pointer is on the first User, Squad or Schedule of the assignment ring.
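The pointer behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not Squadcast's implementation: the `AssignmentRing` class, its field names and the responder names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AssignmentRing:
    """Hypothetical model of one Round Robin level in an Escalation Policy."""

    assignees: list  # Users, Squads or Schedules, in the configured order
    pointer: int = 0  # index of whoever is next in line; starts at the first

    def next_assignee(self) -> str:
        """Return the assignee under the pointer, then advance the pointer."""
        assignee = self.assignees[self.pointer]
        # Wrap around so the last assignee is followed by the first.
        self.pointer = (self.pointer + 1) % len(self.assignees)
        return assignee


ring = AssignmentRing(["alice", "bob", "carol"])
assigned = [ring.next_assignee() for _ in range(5)]
print(assigned)  # ['alice', 'bob', 'carol', 'alice', 'bob']
```

Five consecutive incidents are spread across all three responders instead of all landing on the first one, which is the load-distribution effect Round Robin is meant to provide.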
You can create an Escalation Policy as desired. For each of the levels, Round Robin rotation can be enabled.
Step 1: To enable simple Round Robin rotation, check the option that says Enable Round Robin assignment for this layer.
Step 2: To rotate through the entire Assignment Ring before jumping to the next escalation level, check the option that says Enable rotation within the Assignment Ring.
Step 3: Additionally, you can specify the time (in minutes) after which the next person in the Assignment Ring should be notified.
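To make the effect of rotation within the Assignment Ring concrete, here is a rough Python sketch of the notification order for a single incident that nobody acknowledges. The function name, its parameters and the point at which the next level is reached are assumptions made for illustration; the product's actual timing depends on how the policy is configured.

```python
def notification_timeline(ring, start, rotate_after_min, next_level):
    """Return (minutes_after_trigger, target) pairs for one unacknowledged incident.

    ring             -- assignees in their configured order (hypothetical names)
    start            -- index the Round Robin pointer is currently on
    rotate_after_min -- minutes to wait before notifying the next ring member
    next_level       -- label for the following escalation level
    """
    timeline = []
    for step in range(len(ring)):
        # Walk the ring once, beginning at the pointer and wrapping around.
        target = ring[(start + step) % len(ring)]
        timeline.append((step * rotate_after_min, target))
    # Assumption: once everyone in the ring has been tried, escalate onward.
    timeline.append((len(ring) * rotate_after_min, next_level))
    return timeline


timeline = notification_timeline(
    ["alice", "bob", "carol"], start=1, rotate_after_min=5, next_level="level 2"
)
for minute, target in timeline:
    print(f"t+{minute:>2} min -> {target}")
```

With the pointer on the second assignee and a 5-minute interval, the sketch notifies bob, carol and alice in turn before handing over to the next level.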
When an incident that is using this Escalation Policy is triggered, the incident will be assigned in sequential order to the Users, Squads or Schedules that are participating in the Round Robin rotation.
Step 1: Navigate to Escalation Policies. For an existing Escalation Policy, click on the top-right icon and select Edit.
Step 2: In the level(s) where you would like to enable Round Robin assignment, check the option Enable Round Robin assignment for this layer.
Step 3: Add Users, Squads or Schedules if not added already. In the case of a Schedule, whoever is on-call at the time will be notified of the incident. Then, click Save.
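When a Schedule sits in the ring, the person notified depends on when the incident fires. The Python sketch below illustrates that lookup under simplified assumptions: the shift table, the tuple-based assignee format and the function names are hypothetical, and Squads are left out for brevity.

```python
from datetime import datetime

# Hypothetical daily schedule: (start_hour, end_hour, user), end exclusive.
SHIFTS = [(0, 8, "alice"), (8, 16, "bob"), (16, 24, "carol")]


def on_call_user(shifts, at):
    """Return the user whose shift covers the given datetime."""
    for start_hour, end_hour, user in shifts:
        if start_hour <= at.hour < end_hour:
            return user
    raise LookupError("no shift covers this time")


def resolve(assignee, at):
    """Turn a ring entry into the person to notify at time `at`."""
    kind, value = assignee
    if kind == "user":
        return value  # a User is notified directly
    if kind == "schedule":
        return on_call_user(value, at)  # a Schedule resolves to whoever is on call
    raise ValueError(f"unsupported assignee type: {kind}")


incident_time = datetime(2024, 1, 1, 9, 30)
print(resolve(("user", "dave"), incident_time))        # dave
print(resolve(("schedule", SHIFTS), incident_time))    # bob
```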
You can further enable and configure other options as needed even for existing Escalation Policies.
Escalation Policies can be configured granularly to further suit custom requirements within organizations. In addition to the basic Escalation Policy and Round Robin Escalations, users can configure:
This is how you can enable Round Robin Escalations in Squadcast. To know more, check out this page.
Reliability and availability are the need of the hour, and organizations take every possible step to ensure seamless service delivery to users. Round Robin Escalation and Scheduling can play a vital role in keeping teams functional 24/7, 365 days a year, while helping prevent burnout.
Squadcast is an incident management tool that’s purpose-built for SRE. Get rid of unwanted alerts, receive relevant notifications and integrate with popular ChatOps tools. Work in collaboration using virtual incident war rooms and use automation to eliminate toil.