Why zero downtime is impossible.
Downtime: server outages, software problems, lost internet connections, website failures. We all know the symptoms well. Have you heard the words “We cannot have downtime, ever” when discussing IT needs?
Of course, the goal of good IT is to prevent and reduce downtime to the best of our ability. What we find, though, is that many vendors now actively sell solutions that promise the impossible, “zero downtime”, when these solutions can increase downtime for organisations, and have done so.
Zero downtime is one of the most expensive and difficult concepts to implement. We’re going to walk you through a basic system design to show you why this is an unrealistic request.
In this scenario we have the following hardware for a 10-user office:
- 1 server, sharing files and controlling usernames/passwords.
- 1 file backup storage device.
- 10 computers/laptops.
- 2 printers.
- 1 battery backup power supply (UPS) for the server.
- 2 fibre internet connections with different providers.
We’ll assume the hardware was bought at the mid-range end of the market, with favourable recommendations and reviews online, good support from the manufacturer and a reliable track record of being fit for purpose.
Think about each of these items and what can and does go wrong with them. Does the internet line fail because someone dug up the road and broke the cables? Does the power go down to the building you are in? Does a burst water pipe flood the equipment?
Not everything is within our control, but here are 3 areas you can look at to control downtime.
1 = Schedule Routine Maintenance
Did you know that your server needs to be patched and updated? Whether the software is from Microsoft, Apple or a Linux distribution, most operating systems need patching and maintaining, often requiring a restart of the server.
This block of downtime has to happen somewhere. If your server has not been rebooted in 3–6 months, or in some cases 12–18 months, you are most likely running with live security holes in the system.
Now apply the same thinking to your computers, your printers and the backup battery (UPS) for your server. If proactive maintenance does not happen when planned, you will find yourself with an outage when you least want it.
2 = Test Your File Backups
Hopefully your files are being backed up on a daily basis, but do you know how long it takes to restore those files if disaster strikes?
If you are using a cloud backup, have you tested how long it takes to download those files again? It might even take days or weeks, depending on how much data you have.
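To get a feel for the numbers before you run a real test, here is a minimal sketch that estimates download time from the size of the backup and the speed of the internet line. The function name, the example sizes and the 70% efficiency figure are our own assumptions for illustration. Real restores are often slower because of provider throttling, decryption and the time it takes to write the files back to disk.

```python
def restore_time_hours(data_gb: float, download_mbps: float,
                       efficiency: float = 0.7) -> float:
    """Estimate the hours needed to download a backup.

    data_gb       -- size of the backup in gigabytes (1 GB = 1000 MB)
    download_mbps -- advertised line speed in megabits per second
    efficiency    -- assumed fraction of the line speed you actually get
                     (0.7 is an illustrative guess, not a measured value)
    """
    if data_gb < 0 or download_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    data_megabits = data_gb * 1000 * 8          # gigabytes -> megabits
    effective_mbps = download_mbps * efficiency  # realistic throughput
    seconds = data_megabits / effective_mbps
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical backup sizes on a 100 Mbps fibre line.
    for size_gb in (500, 2000, 10000):
        hours = restore_time_hours(size_gb, download_mbps=100)
        print(f"{size_gb} GB: about {hours:.1f} hours ({hours / 24:.1f} days)")
```

Under these assumptions, 2000 GB over a 100 Mbps line works out at roughly 63 hours, which is more than two and a half days of waiting before anyone can open a file. Treat the estimate as a starting point only; the figure that matters is the one you measure in an actual test restore.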
3 = Hardware Failure
Many things break down, from cars to plumbing systems, and sometimes your IT hardware will fail when you least want it to.
You need to work out how long it would take to replace the broken component. Does it take 7–14 days to deliver the new part? Do you have a warranty with the manufacturer that gets you the parts within a specified period of time?
How much should I spend to prevent downtime?
Here we are going to make some very big assumptions, as every business and every environment needs to be reviewed to find the best solution. So whilst the figures below are broad assumptions, try to think about the principle.
Let’s pretend that Box 1 costs £2000 and will prevent 4 hours of downtime. The cost of 4 hours of downtime to the business, where no one can work, is £7000. Clearly it makes sense to invest in Box 1.
For another business, the cost of 4 hours of downtime might be £1000, so initially Box 1 might not seem worth buying. But what if you had multiple failures across a year? 3 blocks of 4 hours of downtime = £3000, which is more than the cost of Box 1.
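The Box 1 arithmetic can be written down as a small sketch so you can plug in your own figures. The function names are ours, and the comparison makes two simplifying assumptions: Box 1 prevents every one of the outages counted, and its whole cost is set against a single year of downtime.

```python
def annual_downtime_cost(cost_per_outage: float, outages_per_year: int) -> float:
    """Total yearly cost of downtime if nothing is done to prevent it."""
    return cost_per_outage * outages_per_year


def is_worth_buying(box_cost: float, cost_per_outage: float,
                    outages_per_year: int) -> bool:
    """True when the downtime avoided in a year costs more than the box.

    Assumes the box prevents all of the outages counted, which is a
    simplification for illustration only.
    """
    return annual_downtime_cost(cost_per_outage, outages_per_year) > box_cost


if __name__ == "__main__":
    box_1 = 2000  # cost of Box 1 in pounds, from the example above

    # First business: one 4 hour outage costs 7000 pounds.
    print(is_worth_buying(box_1, cost_per_outage=7000, outages_per_year=1))

    # Second business: one 4 hour outage costs 1000 pounds.
    print(is_worth_buying(box_1, cost_per_outage=1000, outages_per_year=1))

    # Second business again, but with 3 outages in the year (3000 pounds).
    print(is_worth_buying(box_1, cost_per_outage=1000, outages_per_year=3))
```

The same £2000 box goes from poor value to good value for the second business purely because the number of outages changed, which is why an honest estimate of how often things fail matters as much as the price tag.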
You have to review the cost of downtime versus the cost of removing that risk, based on the goals and needs of the business. Why not seek out an experienced consultant to help you with this process?
But we are too big to have downtime!
Amazon, Microsoft, Google and Apple: four huge tech companies that have all had issues causing downtime, from faulty settings to hardware failures. Are you really too big to have downtime? If the largest companies in the world suffer from it, is zero downtime a realistic goal to set?
Now that is not to say that we cannot design a system which reduces downtime. There is plenty of good work that can be done. Before you spend any money, ask yourself the following.
- How much downtime can we afford versus the cost of removing that downtime?
- Do we need to spend our money not just on hardware but on the right people?
- Will spending lots of money make the system safer? Does the design work?
We can and will have downtime; natural disasters and environmental problems make it impossible to prevent entirely. What we must do is determine what downtime is acceptable to the business. 15 minutes? 1 hour? 4 hours? 1 day? Then we design our IT systems around that target.
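Vendors usually quote uptime as a percentage, so it helps to be able to translate between a percentage and real hours. The sketch below does that conversion. It assumes the downtime target is a total allowance per year of 365 days, running around the clock; your own target might be per month or per incident, so adjust accordingly. The function names are ours.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a system that runs 24/7


def availability_percent(allowed_downtime_hours: float) -> float:
    """Convert a yearly downtime allowance in hours into an uptime percentage."""
    if not 0 <= allowed_downtime_hours <= HOURS_PER_YEAR:
        raise ValueError("downtime must be between 0 and 8760 hours")
    return 100 * (1 - allowed_downtime_hours / HOURS_PER_YEAR)


def allowed_downtime_hours(availability: float) -> float:
    """Convert an uptime percentage into hours of downtime allowed per year."""
    if not 0 <= availability <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability / 100)


if __name__ == "__main__":
    # The targets from the paragraph above, read as yearly allowances.
    for label, hours in (("15 minutes", 0.25), ("1 hour", 1),
                         ("4 hours", 4), ("1 day", 24)):
        print(f"{label} per year = {availability_percent(hours):.4f}% uptime")

    # Going the other way: what a quoted percentage means in hours.
    print(f"99.9% uptime allows {allowed_downtime_hours(99.9):.2f} hours per year")
```

A quoted 99.9% sounds close to perfect, yet it still allows nearly 9 hours of downtime a year. Agreeing the target in hours first, then asking what percentage that implies, keeps the conversation grounded in what the business can actually tolerate.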