Reducing Downtime in Data Centres

Increasingly, modern business is reliant on data centres. These facilities contain the multitude of server machines on which your business and its customers rely. Any outage will lead to an interruption in trading – and if the business is large enough, this interruption could conceivably cost a significant amount directly. What’s arguably worse, however, is the reputational damage that comes from downtime – especially if it’s persistent. If your customers come to perceive that they can’t rely on your business to be available, then they’re unlikely to keep returning for more.

Anything that can be done to minimise downtime in data centres, and mitigate the impact of the downtime you do experience, is therefore worth pursuing.

1. Equipment Maintenance

If the equipment in your data centre isn’t properly cared for, then it’s a near-inevitability that you’ll eventually experience downtime. It’s difficult to say with certainty when a piece of hardware is nearing the end of its lifespan, but we can anticipate where failures are most likely, and put in place redundancies and failsafes to mitigate the impact of any failures.
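To put a rough number on what redundancy buys you, here is a minimal Python sketch. It assumes that redundant units fail independently of one another and that the service stays up as long as at least one unit is working. Real equipment often shares failure causes (the same power feed, the same firmware bug), so treat the result as an optimistic estimate. The function names and the 99% figure are illustrative assumptions, not measurements.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(unit_availability: float, units: int) -> float:
    """Availability of `units` redundant components in parallel.

    Assumes failures are independent and that one working unit is
    enough to keep the service online.
    """
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    # The service is down only when every unit is down at the same time.
    return 1.0 - (1.0 - unit_availability) ** units


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Illustrative figures: one unit that is available 99% of the time,
# compared with a redundant pair of the same unit.
single = combined_availability(0.99, 1)
pair = combined_availability(0.99, 2)
print(f"single unit: {expected_downtime_hours(single):.1f} h/year")
print(f"redundant pair: {expected_downtime_hours(pair):.1f} h/year")
```

Under these assumptions, a single 99%-available unit is down for roughly 88 hours a year, while a redundant pair is down for under an hour.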

In some cases, you may be able to carry out repairs without bringing in replacement units. For example, you might repair a power distribution unit using inexpensive components that you keep to hand, so that the system can safely handle a little more amperage.

2. Consider the Possibility of Natural Disasters

Managing the staff and equipment at your data centre is one thing; preventing natural disasters is quite another. Such events may be rare, but they can inflict significant and lasting damage on the data centre.

The best defence against natural disaster is your choice of location. If you’re in the middle of a floodplain, then flooding is a question of when rather than if. If you can’t avoid the threat, then the next best thing is to prepare for it.

Conduct drills, and put in place physical defences that can hold the problem back while you work out how to respond. You should also check the protection offered by your insurance.

3. Human Error

Human error accounts for a significant portion of data centre outages. The right training and procedures will help you to lower the impact of human error. Have your staff adopt the right habits, and use methods like pointing and calling to minimise oversights. You might also introduce more AI-driven decision-making into your centre, thereby reducing your exposure to the fallibility of human employees.

4. Guarding against Power Outages

If the power goes out unexpectedly, it’s vital that your servers have enough time to shut down properly and minimise data loss. An uninterruptible power supply (UPS) isn’t designed to keep a whole data centre online for a significant length of time, but it can buy you the half an hour or so you need to ride out a short outage or shut down cleanly during a longer one.
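As a rough way to check whether a UPS gives your servers enough time, the sketch below estimates runtime from usable battery energy and load. It is a simplified linear model that assumes a constant load and a fixed inverter efficiency. Real batteries deliver less energy at high discharge rates and lose capacity as they age, so the manufacturer’s runtime charts should take precedence. All names and figures here are hypothetical, chosen only to illustrate the arithmetic.

```python
def ups_runtime_minutes(battery_wh: float, load_w: float,
                        inverter_efficiency: float = 0.9) -> float:
    """Estimate UPS runtime in minutes.

    Simplified linear model: usable energy divided by load. The 0.9
    default efficiency is an assumed figure, not a vendor specification.
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    if not 0.0 < inverter_efficiency <= 1.0:
        raise ValueError("inverter_efficiency must be in (0, 1]")
    usable_wh = battery_wh * inverter_efficiency
    return usable_wh / load_w * 60.0


def covers_shutdown(battery_wh: float, load_w: float,
                    shutdown_minutes: float,
                    safety_margin: float = 1.5) -> bool:
    """True if the estimated runtime exceeds the shutdown time with margin."""
    runtime = ups_runtime_minutes(battery_wh, load_w)
    return runtime >= shutdown_minutes * safety_margin


# Illustrative figures: a 5 kWh battery bank feeding a 9 kW load,
# with servers that need about 10 minutes to shut down cleanly.
print(f"estimated runtime: {ups_runtime_minutes(5000, 9000):.0f} min")
print("covers a 10-minute shutdown:", covers_shutdown(5000, 9000, 10))
```

The safety margin is there because the estimate is optimistic. If the check fails, the options are to add battery capacity, shed non-essential load, or shorten the shutdown procedure.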

Salman Zafar
