Our last blog post, ‘Using a Status Page in your Incident response process’, established what a status page is and how it plays an important role in your incident response process. This article looks at the different information components of a status page.
When something goes down, the first thing a customer does is check whether the problem lies in their own systems or with one of their service providers. So it’s important to make sure that your status page has all the information they need, so that they don’t feel the need to raise an issue or create a ticket, which adds to your support costs.
The most essential information on a status page is which component of the service has been degraded or has had an outage.
Every system has some users who may need to check its status on the status page. The system can be thought of as being made up of multiple components, which in turn consist of services and microservices. Some are user facing, some are integral to the functioning of a user-facing component, and others are purely internal. Note that the users of these components are not just your end customers; they could also be internal to your organization, like other engineers or folks from the product or customer success teams. It is important to lay out a branched overview of all the user-facing components and the components they depend on, so that you can show their operational status. The easiest way to break this down would be to pick a component that is:
Once you have a branched overview of the components that your users are interested in, the first section of your status page will be a list or table of these components, with the operational status shown against each one. The operational status indicates the current working condition of the service or component: whether it is fully operational, partially degraded, or experiencing an outage.
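As a rough illustration of the branched overview and the status list described above, here is a minimal Python sketch. The `Component` and `Status` names, the example components, and the rule that a component is shown with the worst status among itself and its dependencies are all assumptions made for this example, not something a particular status page tool prescribes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import IntEnum


class Status(IntEnum):
    # Ordered by severity so that max() picks the worst status.
    OPERATIONAL = 0
    DEGRADED = 1
    OUTAGE = 2


@dataclass
class Component:
    name: str
    status: Status = Status.OPERATIONAL
    user_facing: bool = True
    dependencies: list[Component] = field(default_factory=list)

    def effective_status(self) -> Status:
        """Worst status among this component and everything it depends on.

        Assumption: a problem in a dependency is surfaced on the parent as-is.
        """
        return max([self.status] + [d.effective_status() for d in self.dependencies])


def status_rows(components: list[Component]) -> list[tuple[str, str]]:
    """Rows for the first section of the page: user-facing components only."""
    return [
        (c.name, c.effective_status().name.title())
        for c in components
        if c.user_facing
    ]


# Hypothetical components: an internal database that the public API depends on.
database = Component("Database", Status.DEGRADED, user_facing=False)
api = Component("API", dependencies=[database])
dashboard = Component("Dashboard")

rows = status_rows([api, dashboard, database])
for name, status in rows:
    print(f"{name}: {status}")
```

In this sketch the internal database never appears on the page, but its degraded state shows up against the API that depends on it. A real page might prefer a softer rule, for example showing a dependency outage as a partial degradation of the parent.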
In some cases, the first section can be a graphical representation of the uptime history of the components, showing the reliability of the system over a period of time. This section also carries an uptime statistic such as “99.988% uptime for the last 90 days”, which, together with the history, helps the customer understand any downtime patterns.
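To make the uptime statistic concrete, here is a small Python sketch of how such a figure could be derived. The `uptime_percent` function and the downtime value are assumptions for illustration; real tools may count partial degradations or scheduled maintenance differently.

```python
from datetime import timedelta


def uptime_percent(downtime: timedelta, window: timedelta) -> float:
    """Share of the window during which the component was up, as a percentage."""
    if window <= timedelta(0):
        raise ValueError("window must be positive")
    down = min(downtime, window)  # downtime cannot exceed the window itself
    return 100.0 * (1 - down / window)


window = timedelta(days=90)
downtime = timedelta(minutes=15, seconds=33)  # assumed total downtime in the window

label = f"{uptime_percent(downtime, window):.3f}% uptime for the last {window.days} days"
print(label)
```

Under these assumptions, roughly fifteen and a half minutes of total downtime over 90 days is what a figure like 99.988% corresponds to.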
The next section is typically the incident history, which provides a day-by-day account of the system-impacting incidents that have occurred, along with their current resolution status.
This gives users an idea of the types of incidents that have occurred and the resolution actions taken on them. It builds trust with your customers because they’re constantly informed of the resolution activity and of the level of impact any incident has on them.
When a service-impacting incident occurs, it is important not just to tell users that it has occurred but also to keep them in the loop on what’s happening on the resolution front. These updates are usually shown in the incident history section of the status page.
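One way to picture the incident history and its updates is as a list of incidents, each carrying a timeline of updates. The Python sketch below assumes hypothetical `Incident` and `Update` types, illustrative state labels such as "Investigating" and "Resolved", and made-up timestamps; it is not modelled on any specific product.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Update:
    at: datetime
    state: str    # assumed labels, e.g. "Investigating", "Identified", "Resolved"
    message: str


@dataclass
class Incident:
    title: str
    updates: list = field(default_factory=list)

    def post(self, at: datetime, state: str, message: str) -> None:
        """Record a new update on this incident."""
        self.updates.append(Update(at, state, message))

    @property
    def current_state(self) -> str:
        """State of the most recent update, or "Reported" if there are none yet."""
        if not self.updates:
            return "Reported"
        return max(self.updates, key=lambda u: u.at).state


def history_by_day(incidents):
    """Group incidents by the date of their first update, newest day first."""
    days = defaultdict(list)
    for incident in incidents:
        if incident.updates:
            first = min(incident.updates, key=lambda u: u.at)
            days[first.at.date()].append(incident)
    return dict(sorted(days.items(), reverse=True))


# Illustrative incident with made-up timestamps and messages.
api_errors = Incident("Elevated API error rates")
api_errors.post(datetime(2024, 3, 4, 9, 15), "Investigating", "We are looking into failed API requests.")
api_errors.post(datetime(2024, 3, 4, 10, 2), "Resolved", "A fix has been rolled out.")

history = history_by_day([api_errors])
for day, incidents in history.items():
    for incident in incidents:
        print(day, incident.title, "-", incident.current_state)
```

Keeping every update, rather than only the latest state, is what lets the history section show how the resolution progressed.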
Most status pages allow their users to subscribe to updates in some form (usually email). Any incident-related update that is pushed to the status page goes out to the subscribers. This is the easiest form of automated customer communication: it lets teams focus on resolution and simply push updates to the status page, instead of worrying about writing emails or dealing with communication silos within the organization.
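The subscription flow described above amounts to a simple fan-out: one update posted to the page, one message per subscriber. Here is a minimal Python sketch under that assumption. The `StatusPage` class and the injected `send` callable are hypothetical; in practice `send` would wrap whatever email or notification service you use.

```python
from typing import Callable


class StatusPage:
    def __init__(self, send: Callable[[str, str], None]):
        self._send = send          # hypothetical delivery function: (recipient, message)
        self._subscribers = []     # email addresses, in subscription order
        self.updates = []          # everything published to the page

    def subscribe(self, email: str) -> None:
        """Add a subscriber, ignoring duplicates."""
        if email not in self._subscribers:
            self._subscribers.append(email)

    def publish(self, message: str) -> None:
        """Post an update to the page and fan it out to every subscriber."""
        self.updates.append(message)
        for email in self._subscribers:
            self._send(email, message)


# For the example, "sending" just appends to an in-memory outbox.
outbox = []
page = StatusPage(send=lambda to, msg: outbox.append((to, msg)))
page.subscribe("a@example.com")
page.subscribe("b@example.com")
page.subscribe("a@example.com")  # duplicate, ignored
page.publish("Investigating elevated API error rates.")
print(outbox)
```

The team only calls `publish` once; delivery to each subscriber is handled by the page, which is the point of treating the status page as the single place to post updates.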
Before status pages became popular as a communication tool, organizations posted updates about outages and service degradations on Twitter. This was the easiest way to reach a massive audience and, most importantly, the customers who pay for their service.
Today, you can also add Twitter updates as a section on the status page, to let your customers know that there is one more channel you use to update people on any downtime. However, a status page has become the de facto standard: people rarely think of checking their Twitter feeds to see if a service is facing downtime, and instead subscribe to status page updates or go straight to the status page.
To know more about what we’re building, check out our public Product Roadmap. Feel free to drop us your ideas and feature requests at [email protected].