Hotmail downtime and the need for judicious failover testing


Microsoft has confirmed that it was a firmware upgrade that caused the meltdown of the Hotmail and Outlook email systems last month, leaving users without email for 16 hours. As well as email, the company’s SkyDrive service was temporarily disrupted. The upgrade encountered an unexpected issue, leading to a substantial, rapid temperature rise within the data centre and the loss of key services.

Though the problems could not have been predicted, this incident is another example of a data centre without the monitoring and failover testing protocols needed to prevent an issue from escalating into a full-blown outage. Keeping the temperature constant is vital for efficient data centre management, and there are several ways to achieve this. But when a fault occurs, what matters is that you have been carrying out regular failover testing under full load, so that when power is lost, the failover works.

For a private user, the Hotmail downtime will probably have been little more than an annoyance. For a business, however, 16 hours without email and without access to the Cloud could be disastrous. It’s a risk not many can afford to take, and for that reason it is important to keep lines of communication with your data centre provider open at all times. For guidance on what to look for in a truly business-grade data centre, I’d recommend reading our ‘top eight reasons why data centre solutions fail’ document. It should arm you with a wealth of knowledge of good data centre practice, allowing you to identify potential bumps in the road ahead of time.



