The Options to Improve Outcomes and Keep Users Happy

Because system outages occur, the main choices to get past the challenges are –

Option 1 – Learn How to Apologize for System Outages

In this scenario, when an outage occurs, the first message to convey is “The system is down, please don’t panic”. How IT communicates an issue often matters more to users than the details of the resolution or how quickly the matter is fixed. An admission of struggle is more forgivable than silence, especially when client-facing applications or services are unavailable. An example of how to inform people is “Sorry, the system is down – we will advise of an ETA on recovery”. Recognize that neither the IT department responsible for those systems, nor the end users, nor customers want to see or hear this message.

Even if the failure is out of IT’s control, saying sorry alone is frequently not enough, since many people are already dealing with the disruption itself, adapting to different tasks or remote work, stress associated with the pandemic, or other issues. Because of this, IT needs to communicate effectively with users through straightforward and clear messaging. In addition, an automated IT incident management system can lead to quicker action and a more detailed overview of how IT is handling the crisis, including reporting and managing system outages.

Importantly, when relaying news of service unavailability, realize there is an art to apologizing well. While help desk personnel often bear the brunt of this challenge, others in IT, including management, often need to be involved in providing assurances that the organization is doing what it takes to re-establish normal operations as soon as possible.

To apologize for system outages in a proactive way, encourage open dialogue with end users and minimize frustration with stakeholders by doing the following –

A. Be Good at Apologizing

Since “Apologies for the inconvenience” has become an over-used phrase that no longer carries any meaning, use other wording that shows a real understanding of the challenges an outage is causing and offers insight into what is being done to avoid similar problems in the future. A good apology includes the elements defined by psychologists –

Acknowledgement – Define what you’re apologizing for: what the outage was, how long it will last or did last, and the scale of the issue (local, regional, or worldwide). Own the problem and, if possible, communicate a root cause.
Empathy – State that the IT team recognizes the negative impact of the system outage on the business and shares the frustration, even if the failure was due to a third-party issue.
Resolution – Address what IT has done or will do to avoid an outage in the future, including lessons learned to keep similar situations from happening again and proactive steps to minimize disruptions from other causes of downtime. If the issue itself can’t be fixed, advise of a workaround. If there is no permanent fix or temporary workaround, explain why. If there’s action IT can take, provide an ETA for its execution. This estimate is useful in the event the issue repeats itself before a resolution.

The explanation needs to be clear, brief, and free of jargon. This is important for users to continue having faith that IT has the situation under control. Further, tact must guide honesty, since if users don’t trust you, it doesn’t really matter what you say!
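As a rough illustration, the Acknowledgement, Empathy, and Resolution parts can be captured in a small message template so that no part is forgotten under pressure. The sketch below is only one possible layout: the `OutageNotice` class, its field names, the `format_notice` function, and the sample wording are all assumptions made for this example, not part of any particular incident management product.

```python
from dataclasses import dataclass


@dataclass
class OutageNotice:
    """Hypothetical container for the three parts of an outage apology."""
    what_failed: str   # Acknowledgement: what the outage was
    duration: str      # Acknowledgement: how long it did or will last
    scope: str         # Acknowledgement: local, regional, or worldwide
    root_cause: str    # Acknowledgement: the cause, if known
    impact: str        # Empathy: how the outage affects the business
    next_step: str     # Resolution: fix, workaround, or why there is none
    eta: str           # Resolution: when the next step is expected


def format_notice(n: OutageNotice) -> str:
    """Build a short message in Acknowledgement, Empathy, Resolution order."""
    lines = [
        f"What happened: {n.what_failed} ({n.scope}, {n.duration}). Cause: {n.root_cause}.",
        f"We know this {n.impact}, and we share your frustration.",
        f"What we are doing: {n.next_step}. Expected by: {n.eta}.",
    ]
    return "\n".join(lines)


# Example usage with made-up incident details.
notice = OutageNotice(
    what_failed="The order entry system is unavailable",
    duration="since 09:15",
    scope="regional",
    root_cause="a failed storage controller",
    impact="is stopping you from serving customers",
    next_step="failing over to the standby system",
    eta="10:30",
)
print(format_notice(notice))
```

Keeping each part as a required field means a notice cannot be built without stating the scope, the impact, and the next step, which mirrors the checklist above.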

B. Address User Frustration

Disgruntled users will share their frustrations with the first point of contact, usually the help desk. However, users typically aren’t frustrated with the help desk personnel; they’re really annoyed with the IT system failure and the people responsible for causing the disruption, along with the resulting inconvenience, lost productivity and revenue, brand damage, etc. Unfortunately, things can get worse, with some people becoming abusive if an outage has a huge negative impact or there are multiple outages in a short time frame. There is no right or wrong way to reconcile these very challenging situations, so after an incident it’s essential that IT be very proactive in mitigating outages going forward, or be prepared for more abuse!

Further, recognize that disgruntled users share their frustrations not only with the help desk but with many others, including peers, management, and anyone else who will listen! The rule of thumb is that bad news is heard by 7 times more people than good news. Because of this, don’t be the cause of issues!

To help manage system outage situations, let users vent their concerns, then offer an answer or an apology. While it’s tempting to cut people off because you have an answer, don’t; give them a chance to be heard before moving on. Once a user has expressed their frustrations, apply an approach similar to the broader communication guidelines: empathize with and validate users’ irritation, and remind them that you’re there to help. The help desk is on their side and wants to see them working happily again.

If a user complaint comes in after the server has been restored, rather than during the resolution process, reiterate what was already covered in your mass communication to the organization. From there, ask users if they want more information, and whether they would like some suggestions for workarounds in case of future issues. Explain that your goal is to reduce their inconvenience if the situation recurs. They might not be aware of options that seem obvious to IT. For example, Outlook is often still accessible through a web browser when there’s an issue affecting the Outlook desktop app.

Next Steps

Option 1 – Continue with the current systems
…. with the view that outages are unavoidable

In this scenario, be good at apologizing: empathize with users about the issue and clearly communicate the root cause, along with an action plan to avoid it in the future. This is important to demonstrate that you are sincere and remorseful about system outages. To be aware of the consequences of outages, see The Importance of High System Availability for additional insights and information on the Challenges in the On-Line, Real-Time, All-The-Time World.

— OR —

Option 2 – Select a platform or service providing 99.99999% (seven 9s) or higher uptime
…. from when the application goes live

An example of this is the HPE NonStop system, either as a platform or in the cloud. For insights on getting past outage issues (and not having to learn how to be good at apologizing), see Improving Business Outcomes with Nonstop.
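To put the 99.99999% uptime figure in perspective, it helps to convert an availability percentage into allowed downtime per year. The short sketch below does that arithmetic. It assumes a 365-day year and treats availability as a simple fraction of total time; the function name is made up for this example, and real service level agreements may measure uptime differently.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the downtime per year implied by an availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * SECONDS_PER_YEAR


# Compare a few common availability levels with seven 9s.
for pct in ("99.9", "99.99", "99.999", "99.99999"):
    seconds = downtime_seconds_per_year(float(pct))
    print(f"{pct}% uptime allows about {seconds:.2f} seconds of downtime per year")
```

Under these assumptions, 99.9% uptime allows close to nine hours of downtime a year, while seven 9s allows roughly three seconds.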

To learn more, please contact CAIL or HPE about this strategy to –

increase IT value to the organization
improve business outcomes
reduce the work and risk to evolve IT and the business
achieve success – in the on-line, real-time, all-the-time world