Amazon fails to learn redundancy lesson



Who says lightning doesn’t strike twice in the same place?

Thousands of Amazon customers worldwide have been left to twiddle their thumbs for an estimated 24-48 hours while the company works to restore their connections to the Cloud-based services hosted in its Ireland data centre.

A significant power outage last night took down a large portion of Amazon’s hosting/virtual hosting estate and has also impacted its data storage capabilities. According to Amazon’s status page (Aug 7 11:04 PM PDT update):

“Due to the scale of the power disruption, a large number of EBS servers lost power and require manual operations before volumes can be restored. Restoring these volumes requires that we make an extra copy of all data, which has consumed most spare capacity and slowed our recovery process.”

Unbelievably, this isn’t something new. On 21 April this year, the BBC reported: “Scores of well-known websites have been unavailable for large parts of Thursday because of problems with Amazon’s web hosting service.”

Of course, any service provider can suffer an outage due to circumstances beyond its control, be it a lightning strike or otherwise. What distinguishes one provider from another is the mechanisms it has in place to deal with these events. And as Amazon’s customers, and their customers’ customers, face up to two days of lost productivity, they must be asking themselves: “How can this keep happening?”

This prolonged outage suggests that Amazon has not committed to the level of redundancy needed to underpin proper customer care. We know from our own hosting services that it IS possible to give customers near-continuous access to their data in third-party hosted or virtual environments. In fact, we’ve hit 100% availability for the last three years thanks to relentless rolling investment in our infrastructure.

5 questions to ask your Cloud services provider
So if you’re considering moving your data and applications to the Cloud, here are five key questions to ask:

1. When did you last suffer an outage?
2. How long were your customers offline?
3. Are your hosted services in your own data centre or are you reliant on third parties?
4. What is your guaranteed failover time?
5. Do you automatically replicate data to a geographically diverse location?
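
To put the answers to the first two questions in context, it can help to convert hours offline into an annual availability figure. The short Python sketch below is illustrative only: the function name `availability_percent` is our own, and it assumes a 365-day year and a single outage of the length estimated above.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def availability_percent(downtime_hours: float,
                         period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the percentage of the period during which the service was up."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must be between 0 and period_hours")
    return 100.0 * (1 - downtime_hours / period_hours)


# The 24-48 hour estimate quoted above, treated as the only outage in a year.
for hours in (24, 48):
    print(f"{hours} hours offline -> {availability_percent(hours):.2f}% availability")
# 24 hours offline -> 99.73% availability
# 48 hours offline -> 99.45% availability
```

On these assumptions, a single two-day outage is enough to pull a provider below 99.5% availability for the year, which is why the length of the last outage is worth asking about.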

If the Amazon incident proves nothing else, it certainly shows that placing your faith and your business’s reputation in the hands of big brand names isn’t necessarily the best way forward.


