Anti-patterns in Incident Response that you should unlearn

August 2, 2022

    It is important to invest time and effort in understanding why a system performs the way it does and how we can improve it. Companies tend to continue with practices that have yielded successful results, but ignoring anti-patterns can be far worse than choosing rigid processes. In this blog, we will explore common anti-patterns in incident response and why you should unlearn them.

    Common Anti-patterns in Incident Response

    Just get everyone on the call

    Alerting everyone each time an incident is detected is not a good practice. Sometimes notifying everyone is easier, or it genuinely adds value. For example:

    • Organizations have smaller teams, and it is easier to notify the entire team.
    • The issue is critical and getting everyone on board is a better option.

    This practice stops being ideal as teams scale. You end up notifying people who have nothing to do with the incident. This can result in alert fatigue, where people get accustomed to not paying attention and ignore incidents that do need their attention.

    On-call rotations and targeted alerting help route incidents efficiently and prevent burnout.
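
    As a rough illustration of targeted alerting, the sketch below picks a single on-call engineer from a weekly rotation instead of paging the whole organization. The team names, rosters, service names, and rotation start date are all hypothetical, and a real setup would normally rely on the scheduling features of an incident management tool rather than hand-written code.

```python
from datetime import date

# Hypothetical rosters and service ownership, for illustration only.
ROTATIONS = {
    "payments": ["asha", "ben", "chen"],
    "search": ["dana", "eli"],
}
SERVICE_OWNER = {"checkout-api": "payments", "query-service": "search"}
ROTATION_START = date(2022, 8, 1)  # assumed start of the first weekly shift


def on_call(team: str, today: date) -> str:
    """Return the engineer on call for `team` on `today` (weekly rotation)."""
    roster = ROTATIONS[team]
    weeks_elapsed = (today - ROTATION_START).days // 7
    return roster[weeks_elapsed % len(roster)]


def recipients(service: str, today: date) -> list[str]:
    """Notify only the on-call engineer of the team that owns the service."""
    team = SERVICE_OWNER[service]
    return [on_call(team, today)]


print(recipients("checkout-api", date(2022, 8, 9)))
```

    Only one person is paged per alert here; everyone else on the roster stays undisturbed until their own shift.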

    Using up bandwidth to give status updates

    Responders deal with critical incidents where stakeholders expect constant status updates. Updates are valuable because they keep everyone in the loop and may surface more solutions. With minor incidents, teams can resolve the issue quickly and then pass updates on to the people concerned. With critical incidents, however, responders may be forced to focus more on sending updates than on resolving the incident, which can compromise the resolution process.

    To address this issue, assign a dedicated person to handle communication and provide timely updates to the stakeholders.

    Progress follows Chaos

    There is a perception that, while dealing with critical incidents, people will move around amid lots of discussion, chaos, and panic. This is not always true. When multiple people are responding to an incident, it is absolutely critical that they collaborate and keep everyone in sync with the actions being taken. Chaos and panic can worsen the situation and should be avoided by defining clear roles and responsibilities. Teams should have an incident commander who makes decisions and authorizes changes that can impact the outcome. Teams can also use chat rooms to give updates and maintain records effectively. By setting up these processes, teams can ensure effective communication and prevent chaos and panic.

    Incident Severity and policy discussion during an Incident Call

    Debating the severity of an incident in the middle of the incident call wastes people’s time. That time should be spent resolving the incident. It is important to define unambiguous severity levels, because responses, plans, and policies are chosen based on severity. Ideally, the rules should be technically driven, clear, and automated so that every incident arrives with a pre-defined severity level.
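
    One way to make severity rules unambiguous is to write them down as code that runs when an alert fires. The following is a minimal sketch under assumed rules: the `Signal` fields, the thresholds, and the SEV1 to SEV4 labels are illustrative choices, not a standard, and each team would substitute its own agreed definitions.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    """Measurements attached to an alert (hypothetical fields)."""
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    customer_facing: bool   # does the affected service serve end users?
    data_loss: bool         # is data loss confirmed or suspected?


def classify(signal: Signal) -> str:
    """Map a signal to a severity level; the first matching rule wins."""
    if signal.data_loss or (signal.customer_facing and signal.error_rate >= 0.5):
        return "SEV1"
    if signal.customer_facing and signal.error_rate >= 0.05:
        return "SEV2"
    if signal.error_rate >= 0.05:
        return "SEV3"
    return "SEV4"


print(classify(Signal(error_rate=0.1, customer_facing=True, data_loss=False)))
```

    Because the rules are evaluated in a fixed order, two responders looking at the same signal always get the same severity, which removes the debate from the call.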

    Training and drills should be conducted to educate teams on how to handle these situations better.

    Not escalating incidents to the right responders

    Teams fail to inform the right responders when they have no mechanism to associate incidents with the people responsible for them. To find the right person, teams go back and forth, which slows down the process. The right people also go unnotified when multiple teams are involved and team structures are complex. It is important to have an identifiable and reachable person for every team. There should be a clear, well-oiled mechanism to route alerts to the right responders to ensure smooth routing and escalations.
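
    The routing mechanism described above can be sketched as an ordered escalation chain per team, with a fallback contact for services nobody has claimed. All service, team, and contact names below are hypothetical, and the acknowledgement tracking is reduced to a simple set for the sake of the example.

```python
from typing import Optional

# Hypothetical escalation chains: ordered contacts per team.
ESCALATION = {
    "payments": ["payments-oncall", "payments-lead", "engineering-manager"],
    "search": ["search-oncall", "search-lead"],
}
SERVICE_TEAM = {"checkout-api": "payments", "query-service": "search"}
FALLBACK = ["incident-commander"]  # used when a service has no known owner


def escalation_chain(service: str) -> list[str]:
    """Return the ordered contacts for a service, or the fallback chain."""
    team = SERVICE_TEAM.get(service)
    return ESCALATION.get(team, FALLBACK)


def next_responder(service: str, unacknowledged_by: set[str]) -> Optional[str]:
    """Return the first contact who has not already been paged without acknowledging."""
    for contact in escalation_chain(service):
        if contact not in unacknowledged_by:
            return contact
    return None  # chain exhausted; a human needs to step in


print(next_responder("checkout-api", {"payments-oncall"}))
```

    Keeping the service-to-team mapping in one place means nobody has to ask around for an owner during the incident, and the fallback ensures that even an unmapped service reaches a named person.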

    Postmortem Failures

    Postmortems are important for incident response because they help you learn from the events that happened in the past and help you plan your future actions.

    Postmortems fail for various reasons:

    • Some teams are frequently stressed with deadlines and unplanned incidents. Therefore, once the incident is resolved, no postmortems are done.
    • Sometimes postmortems end up in blame games. A good postmortem happens only when people are open to discussing problems honestly. If people are afraid of getting blamed during a postmortem, it defeats the purpose of having one, which is to find solutions to problems.
    • In some cases, postmortems are done just because the process demands it and not to find answers.

    Without postmortems, you fail to recognize what’s working and where you can improve. Most importantly, they help you avoid making the same mistakes in the future. Hence, postmortems should be an integral part of the incident response process and must be done sincerely.

    (Check out Squadcast's Postmortem Templates)

    Inflexible Policies and Processes

    Organizations find comfort in practices that return successful results and like to continue with those practices. However, some events cannot be anticipated, and established solutions will not always work. Having flexible policies and processes can help you adapt to changing requirements and find the right solutions when needed. This does not mean being reckless: introduce sensible changes, but don't be afraid to make them. Some changes will slow down proceedings in the short term but promise faster and better results in the long run.

    Putting on multiple hats

    Incidents are confusing at the best of times. People taking up different roles without informing others just adds to the confusion. In high-pressure situations, people are expected to act quickly, often with limited information coming in and a lack of clarity on who needs to do what. This only makes the situation worse. Hence, it is important to define the right roles and responsibilities for people. Also, as an individual, you should keep others involved and informed when you make a change.

    Conclusion

    Incident response is a field where we constantly look for processes and stability, but ignoring anti-patterns can be far worse than settling on rigid processes.

    Incident response teams need to identify these issues early on to save time, prevent frustration, and reduce refactoring in the long run. Hence, it is very important to unlearn anti-patterns and learn new processes that can help accelerate incident response.

    Squadcast is an incident management tool that’s purpose-built for SRE. Your team can get rid of unwanted alerts, receive relevant notifications, work in collaboration using the virtual incident war rooms, and use automated tools like Runbooks to eliminate toil.

    Written By:
    Vishal Padghan