Is your organization on the lookout for an incident management tool? If so, you may wonder whether you would be better off building your own. This blog outlines some of the key factors to consider when deciding whether to build or buy incident management software.
When your organization realizes that it needs an Incident Management System (IMS), the first question is almost always, “Build or Buy?” Superficially, the requirements seem simple, and as a technical organization you probably have the skills you need as well. With your deep knowledge of your internal setup, surely you can build one that’s best suited to your needs? This may seem like a solid argument for building your own IMS, but there are some hidden factors that you may not have considered. In this blog, we look at the costs involved in building your own IMS and help you determine whether the return on investment (ROI) makes it worth building one.
First, let’s quickly look at some advantages of building your own IMS.
The biggest advantage is that you can build an IMS that suits your needs perfectly. If your organization does not use Sensu, for example, then why build support for it? Instead, you can directly integrate with any on-premise monitoring tool you have in-place. If you restrict access to your production network, an off-the-shelf SaaS IMS will be difficult to use. An in-house IMS will not face such issues.
Now that we have had a look at the advantages, let’s look at the disadvantages.
When building your own IMS, it can be very difficult to estimate the total cost of ownership. In general, it is easier to get approval for a one-time cost than for an open-ended project. The trouble with building an IMS is that, more often than not, the initial estimate leaves out the long-term cost of maintaining the system and keeping it usable and reliable.
When seeking budgetary approval for the IMS, it is hard to communicate the benefits the system will bring. This is because many of the benefits of having a strong IMS platform in place are qualitative. While the on-call experience and effectiveness of engineers will definitely improve, it is hard to measure the benefits quantitatively. Convincing your management to increase the budget may become harder as time passes and additional features are required. Many organizations may not have the measuring tools necessary to decisively prove the return on investment. You may build your own dashboard that tracks the MTTR (Mean Time To Respond), but unless such metrics were already being tracked before the IMS was introduced, there is no baseline to compare against, and management will be hard to convince.
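To make the MTTR metric concrete, here is a minimal sketch of how it could be computed from incident records. The field names `triggered_at` and `acknowledged_at` and the sample records are hypothetical, chosen only for illustration; your own alerting data will look different.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_time_to_respond(incidents):
    """Average gap between an alert firing and its first acknowledgement.

    Incidents that were never acknowledged are skipped. Returns None when
    there is nothing to average.
    """
    gaps = [
        (inc["acknowledged_at"] - inc["triggered_at"]).total_seconds()
        for inc in incidents
        if inc.get("acknowledged_at") is not None
    ]
    if not gaps:
        return None
    return timedelta(seconds=mean(gaps))


# Hypothetical sample data, not taken from any real system.
incidents = [
    {"triggered_at": datetime(2024, 1, 1, 10, 0),
     "acknowledged_at": datetime(2024, 1, 1, 10, 4)},
    {"triggered_at": datetime(2024, 1, 2, 22, 30),
     "acknowledged_at": datetime(2024, 1, 2, 22, 40)},
    {"triggered_at": datetime(2024, 1, 3, 3, 15),
     "acknowledged_at": None},  # never acknowledged, so excluded
]

print(mean_time_to_respond(incidents))  # 0:07:00
```

Running the same calculation over the months before and after a rollout is one way to get the before-and-after comparison that management tends to ask for.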
Off-the-shelf systems, on the other hand, often don’t have high upfront costs and require little commitment. A small pilot of a commercial product is an easier sell than a potentially long and expensive development project.
But is it, in fact, expensive? Let’s break it down:
Development costs: This includes the cost of assigning programmers to the task, tools required to build the IMS, and infrastructure to test and deploy it. Given a list of features, it is possible to estimate this particular cost, and it is reasonably straightforward to get funding for these expenses.
Maintenance Costs: Like any other piece of software, there will be maintenance costs associated with building an IMS. The costs associated with maintenance include fixing the bugs that crop up during the development and use of the IMS. You will also need to factor in costs when your requirements grow - this can be changes in the production applications, databases, vendor tools, or any other dependencies. As the underlying software is updated, you will also need to consider the associated security fixes. This involves setting aside time to ensure that any newly discovered security vulnerabilities don’t compromise your system. In certain scenarios, you may also need to hire external contractors to validate your security.
Since it is a mission-critical piece of software that alerts you to any problems in your entire infrastructure, you cannot neglect its maintenance. You cannot afford to delay patching any vulnerabilities or making critical fixes to your IMS. Therefore you will need to set aside dedicated and continuous engineering capacity for the maintenance of the IMS. Even if it is a part-time team, there must be someone available at short notice to make any critical fixes required. This team and the overhead of maintaining it is likely to be your single highest cost and the one most difficult to sustain.
Opportunity cost: This is one of the hidden costs that are harder to measure. While developing your own IMS, you will take away engineering capacity from other aspects of your organization. These people could have been working on your organization’s product instead of working on the IMS.
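The three cost categories above can be combined into a rough total-cost-of-ownership estimate. The sketch below is only an illustration: the class name, field names, and all figures are made-up placeholders, not benchmarks, and should be replaced with your own estimates.

```python
from dataclasses import dataclass


@dataclass
class BuildCosts:
    development: int           # one-time: programmers, tools, test infrastructure
    maintenance_per_year: int  # recurring: bug fixes, security patches, upkeep team
    opportunity_per_year: int  # recurring: value of product work that was not done

    def recurring_per_year(self) -> int:
        return self.maintenance_per_year + self.opportunity_per_year

    def total(self, years: int) -> int:
        """One-time development cost plus recurring costs over `years`."""
        return self.development + years * self.recurring_per_year()


# Placeholder figures for illustration only.
in_house = BuildCosts(
    development=100_000,
    maintenance_per_year=40_000,
    opportunity_per_year=30_000,
)

for years in (1, 3, 5):
    print(years, in_house.total(years))
# 1 170000
# 3 310000
# 5 450000
```

With these placeholder numbers, the recurring costs overtake the one-time development cost during the second year, which is why an estimate that covers only development understates the true cost.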
Now that we have looked at the cost of developing your own in-house platform, let us have a look at the cost incurred if you opt for an off-the-shelf incident management platform.
Acquisition Cost: Off-the-shelf platforms are usually more expensive to develop than an in-house tool because they have to offer a more flexible feature set and scale to a larger number of users. Fortunately, you end up paying only a fraction of that cost, because it is shared among all the customers of the product. In fact, if you have a small team, several incident management platforms offer many features free of charge. In general, for a given feature set, the cost of acquisition will be far lower with off-the-shelf systems.
Deployment and Training Cost: Off-the-shelf systems are usually quite flexible, but you may have to spend some time and effort adapting your own systems to them. You may have to change some of your processes or deprecate old, unsupported monitoring tools, for example. This category also includes any training costs for the users in your organization.
Usability and Features: Due to the competitive nature of the market, any off-the-shelf incident management platform will need to keep up and add features to ensure it does not fall behind. An in-house platform often stops being developed as soon as basic minimum functionality is in place. In-house platforms can have poor usability as they are built in an ad-hoc fashion by SREs without input from UX professionals. A better user interface ensures more efficiency and ease of use. Any external product will already have been used by hundreds if not thousands of users in other organizations and therefore will have a highly optimized layout. An external platform will also have the added benefit of a customer support team to answer any queries not covered by the support documentation.
These are the costs and benefits of having an in-house versus an external system. Once you factor in the hidden costs, compliance, and support issues, investing in an in-house incident management platform makes little sense unless you are operating at the scale of Google or Facebook or running an esoteric system that is incompatible with external tools. In the majority of cases, be it a growing team or a small SRE team in a large organization, an off-the-shelf solution is the better choice. For most organizations, the return on investment is not substantial enough to warrant planning and developing an in-house incident management system.
Squadcast is an incident management tool that’s purpose-built for SRE. Your team can get rid of unwanted alerts, receive relevant notifications, work in collaboration using the virtual incident war rooms, and use automated tools like runbooks to eliminate toil.