Open Compute Project cuts failure rates by a factor of three

Frank Frankovsky, vice president of infrastructure at Facebook

Headed by Facebook’s vice president of infrastructure, Frank Frankovsky, the Open Compute Project was launched two years ago in a bid to integrate open source principles into the design of datacentre infrastructure, including servers, switches, cooling and rack design. And though none of the big three – Amazon Web Services, Microsoft and Google – have joined OCP, the designs being developed by the community have already demonstrated significant optimisations over today’s datacentre kit.

“There was so much pent-up demand for something like this because there are so many people that care about their infrastructure. For the same reasons that people engaged with open source software – so they can modify the software for their particular needs – that same thing is now happening in the physical infrastructure,” Frankovsky said at the Structure Europe event in London in September.

The project began with the work being done at Facebook, where the company designs its own datacentre infrastructure – much like Google, Microsoft and Amazon. To catalyse the project, the social-media giant open sourced its datacentre designs, server designs and power distribution infrastructure. The project has since expanded to tackle high-I/O storage integration, through a project called Open Vault (codenamed “Knox” for those who have been following the project), as well as network switches.

“We had these lovely crafted islands of open source technology that were connected to the rest of the world through proprietary black boxes, which is when we decided it was time to open up the appliance model. That really means designing a bare-minimum hardware switch and a pre-boot environment, so that people can really make the choice between the best software and the best hardware at the switch layer – unlike today, where, when you purchase a switch, it comes pretty much pre-installed with that vendor’s software,” Frankovsky says.

“If it’s ancillary to the design it should be removed – not only from a cost and efficiency perspective, because a lot of those ancillary features just consume power without doing useful work, but also because it helps reduce the amount of material you would decommission later on,” he says, adding that the server designs even call for removing the front mesh and plastic housing to boost cooling and energy efficiency.

When Frankovsky compared average failure rates between Facebook’s US datacentres, which run on a combination of OCP and non-OCP hardware, and the company’s new datacentre in Luleå, Sweden, which is 100 per cent OCP-based, he found that failure rates (measured as instances where technicians are called out to sort out an issue) declined from three per cent to one per cent – a fairly significant gain in efficiency.

“From a testing perspective, the depth of testing that can be achieved when you’re doing bespoke design is so much greater. If you look at the challenge some of the large incumbent providers have, they try to design the smallest number of products to meet the widest possible market spec, and that leads to a pretty wide and shallow amount of testing. The ability to really do deep, targeted testing from a QA perspective also leads to higher quality.”

Frankovsky believes that because open source has been so successful in software – particularly with Linux and, for clouds specifically, OpenStack – the same approach should be leveraged in hardware for the benefits it can deliver in the form of scalability, efficiency and interoperability.

OCP is now looking at what is known as the disaggregated rack, which Frankovsky says will help enterprises and cloud service providers standardise on lower-cost commodity hardware, and allow suppliers like HP or IBM to devote their resources to other, more value-adding activities.

“The vision of the disaggregated rack is: how can we create sleds of commodity components – CPU sleds, memory sleds, NAND sleds, disk sleds – so that at the last hour, as we learn more about the way the software is going to exercise the hardware, we can modify the hardware much closer to the time of need. So it’s almost like a just-in-time inventory technique, but applied at the technical level,” Frankovsky says, adding that emerging low-latency interconnect technology, such as the silicon photonics recently commercialised by Intel, will be a key enabler here. He says the project expects to reach the proof-of-concept stage by the beginning of 2014.
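To make the just-in-time idea concrete, here is a minimal Python sketch of how a rack might be composed from commodity sleds only once the workload profile is known. The sled types, the compose_rack function and the workload profile are hypothetical illustrations, not part of any OCP specification; the point is simply that the hardware mix is decided close to deployment time, once the software’s demands are better understood.

from dataclasses import dataclass

@dataclass
class Sled:
    kind: str      # "cpu", "memory", "nand" or "disk"
    capacity: int  # illustrative units: cores, GB or TB depending on kind

def compose_rack(workload_profile, slots=12):
    """Choose sleds for one rack as late as possible, in proportion to how
    heavily the workload is expected to exercise each resource type."""
    total = sum(workload_profile.values())
    rack = []
    for kind, weight in workload_profile.items():
        count = max(1, round(slots * weight / total))
        rack.extend(Sled(kind=kind, capacity=1) for _ in range(count))
    return rack[:slots]

# A memory-heavy workload ends up with more memory sleds than disk sleds.
profile = {"cpu": 3, "memory": 6, "nand": 2, "disk": 1}
print([sled.kind for sled in compose_rack(profile)])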

The idea of creating a highly modular physical system also means that datacentre infrastructure can be upgraded at a level of granularity that has never really been possible before, which has the potential to fundamentally alter datacentre economics – in favour of service providers – and give the industry proven standard architectures and designs to build against.

“These aren’t new promises – these things have been promised to us by the industry for many, many years, and there have been some really beautiful but proprietary, closed designs that have attempted this in the past. I think the reason this is going to be successful this time is because it’s going to be open sourced,” Frankovsky says.

Source: Business Cloud News