Data centre design and the introduction of ‘unreliability’



The question of ‘when more becomes less’ is one that has confronted designers of everything. Arguably the most successful tank of World War II was also the simplest: the Russian T-34 was simpler than the German Tiger and inferior on paper. In the real world, however, it was far more effective. Its simpler design allowed fast mass production and easy battlefield repair, and its sloping armour gave it strength without excessive weight; together these things made it the saviour of the Eastern Front. Yet if you play Top Trumps, the Tiger always wins!

I see the same in data centres. Top Trumps would have a Tier 4 data centre winning every hand, but in the real world every layer of equipment adds another possible point of failure, along with additional cost and no added daily benefit. As more and more redundancy is added, complexity increases, introducing yet more points of possible failure. Mike Christian of Yahoo! made a wonderful point when he said, “Something as simple as a border router failure can effectively knock an entire building of compute offline, regardless of how many generators it has”.
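The border-router point is really just availability arithmetic: components in series multiply their availabilities together, so one non-redundant dependency caps the whole site no matter how much redundancy sits behind it. A minimal sketch of that calculation (the 99.9% figures are purely illustrative, not measurements from any real facility):

```python
# Availability maths: series components multiply, redundant (parallel)
# components only fail when all of them fail. Figures are illustrative.

def series(*availabilities):
    """System availability when every component must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

def parallel(*availabilities):
    """System availability when any one redundant component suffices."""
    downtime = 1.0
    for a in availabilities:
        downtime *= (1.0 - a)
    return 1.0 - downtime

# Two redundant power chains, each 99.9% available:
power = parallel(0.999, 0.999)   # 1 - 0.001 * 0.001 = 0.999999

# A single border router at 99.9% sits in series with everything behind it:
site = series(power, 0.999)      # ~0.998999 -- the router dominates

print(f"redundant power: {power:.6f}")
print(f"whole site:      {site:.6f}")
```

Doubling up the power chain buys six nines on paper, yet the whole site ends up less available than the lone router in front of it – which is exactly Christian's point.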

Much like the T-34s versus the Tigers, when it’s called upon to do battle, will an overly complex Tier 4 data centre, with all its layers of redundancy, be any better? As Mike points out in the video, many of the Yahoo! data centres have no HVAC and no generators; they simply fail over when there’s a fault.

Don’t just take my word for it: I urge everyone to watch the following video for some added enlightenment. The presentation is from 2012 but illustrates my point perfectly. It includes a history of huge outages from the last decade, along with proven recovery solutions and strategies – it’s also very amusing!


