Cloud IaaS: Converged Infrastructure or Reference Architecture?

This last fortnight there’s been a cacophony of hyperbole and at times marketing fluff from vendors and analysts with regard to Reference Architectures and Converged Infrastructures. As IBM launched PureSystems, NetApp & Cisco decided it was also a good time to reiterate their strong partnership with FlexPod. In the midst of this, EMC decided to release their new and rather salaciously titled VSPEX. From the remnants and ashes of all these new product names and fancy launch conferences, the resultant war blogs and Twitterati battles ensued. As I poignantly watched on from the trenches in an almost Siegfried Sassoon moment, it was quickly becoming evident that there was now an even more ambiguous understanding of what distinguishes a Converged Infrastructure from a Reference Architecture, what its relation was with the Private Cloud and more importantly whether you, the end user, should even care.

There’s a huge and justified commotion in the industry over Private Cloud because with lower costs, reduced complexity and greater data center agility, the advantages are compelling for any business looking to streamline and optimize its IT. In the pursuit of attaining such benefits and ensuring a successful Private Cloud deployment, one of the most critical components that need to be considered is that of the infrastructure and its underlying resource pools. With resource pools being the foundation of rapid elasticity and instantaneous provisioning, a Private Cloud’s success ultimately depends on the stability, reliability, scalability and performance of its infrastructure. With existing datacenters commonly accommodating legacy servers that require a refresh or new multiprocessor servers that are entrenched within an old and insufficient network infrastructure, one of the main challenges of a Private Cloud deployment is how to upgrade it without introducing risk. With this challenge and the industry’s pressing need for an economically viable answer, the solution was quickly conceived and baptized as “Converged Infrastructure”. Sadly, like all great ideas and concepts, competition and marketing fluff quickly tainted the lucidity of such an obvious solution by introducing other terms such as “Reference Architectures” and “Single Stack Solutions”. Even more confusing was the launch of vendor products that used such terms synonymously, together or as separate distinct entities. So what exactly differentiates these terms and which is the best solution to meet the infrastructure challenge of a Private Cloud deployment?

Reference Architectures for all intents and purposes are essentially just whitepaper-based solutions that are derived from previously successful configurations. Using various vendor solutions and leveraging their mutual partnerships & alliances, Reference Architectures are typically integrated and validated platforms built from server, network and storage components with an overlying hypervisor. NetApp’s FlexPod and EMC’s VSPEX fall into this category and both invariably point to their flexibility as a major benefit as they enable end users to mix and match as long as there remains a resemblance to the reference. With open APIs to various management tools, Reference Architectures are cleverly marketed as a quick, easy to deploy and risk free infrastructure solution for Private Clouds. Indeed Reference Architectures are a great solution for a low budget SMB that is looking to introduce itself to the world of Cloud. As for a company that is either in or bordering on the Enterprise space and looking to seriously deploy their workloads onto a Private Cloud, it's important to remember that sometimes things that are great on paper can still end up being a horrible mess in reality – anyone who's watched Lynch's Dune can attest to that.

The difficulty with Reference Architectures is that fundamentally they still have no hardened solution configuration parameters and ironically what they term an advantage, i.e. flexibility, is actually their main flaw, as their piece by piece approach of using solutions from many different vendors merely masks the same old problems. Due to being whitepaper solutions, integration of specific components is only documented as a high level overview with component ‘a’ being detailed as compatible with component ‘c’. With regard to the specifics and how these components integrate in detail, these are simply not available or realized until the Reference Architecture is cobbled together by the end user, who ultimately assumes all of the risk and financial obligation to ensure it not only works correctly but is also performing at optimum levels. This haphazard trial and error approach is counterproductive to the accelerated, pre-integrated, pretested and optimized model that is required by the infrastructure of a Private Cloud.

Furthermore Reference Architectures are based on static deployments of sizing and architecture that typically have little relation to the end user’s actual environment or needs, posing a problem whenever reconfiguration or resizing is required. With end users being left to resize and consequently reconfigure & reintegrate their solution, they also have to constantly find a way to integrate their existing toolsets with the open APIs. This subsequently eliminates a lot of the benefits associated with “quick time to value” as many deployment projects get caught up in the quagmire of such triviality. Added to this, once you’ve begun resizing or customizing your architecture, you’ve actually made changes that are a deviation from the proposed standard and hence no longer recognizable to the original reference. This leads to the other complication with Reference Architectures, namely support issues.

With more than 90% of support calls being related to logical configuration issues, they are more often than not an occurrence of bugs or incompatibility issues. When the vendor has no responsibility for or knowledge of that logical build, based on the fact that they meet your “requirements” to be flexible, the situation is no better than when you have a traditional infrastructure deployment. Vendor finger pointing is one of the most frustrating experiences you inevitably have to face when deploying an IT infrastructure in the traditional way. Being on a 4am conference call during a Priority 1 with the different organizational silos and the numerous vendors that make up the infrastructure is a painful experience I’ve personally had to face. It’s not a pretty sight when you’re impatiently waiting for a resolution while the networking company blames the firmware on the Storage and the Storage vendor blames the bugs with the servers, while all the time you are sitting there watching your CEO’s face turn into a tomato while the vein in his neck throbs incessantly. When you log a support call for your reference architecture who is actually responsible? Is it the company you bought it from or one of the many manufacturers that you used to assemble your self-built masterpiece? Furthermore which of those manufacturers or vendors will take full responsibility when you’ve ended up building, implementing and customizing the architecture yourself? Even at the point of deployment, the Reference Architecture carries elements of ambiguity for the end user ranging from which software and firmware releases to run to who is responsible for the regression testing of the logical build. For instance what if you decide to proactively update to one of your components’ latest firmware releases and then find out it’s not compatible with another of your components? Who owns the risk?
Also, for example, if you buy a “flexible” Reference Architecture from vendor X, how will vendor X be able to distinguish what it is you’ve actually deployed and how it’s configured without having to spend an aeon on the phone doing a fact finding session, all while your key applications are down? Reference Architectures are great for a test environment or a simple, cheap and cheerful solution, but using them as a platform to take key applications to the Cloud reeks of more 4am conference calls and exploding tomatoes.

Single Stack Infrastructures on the other hand, while also sometimes marketed as a Converged Infrastructure or a “flexible” Reference Architecture (or sometimes both!), are another completely distinct offering in the market. These solutions are typically marketed as “All-in-one” solutions, and come in a number of guises. Products such as Oracle’s Exadata and Exalogic, Dell’s vStart, HP’s CloudSystem Matrix and IBM’s PureSystems are all examples of the Single Stack solution where the vendors have tightly defined software stacks above the virtualization layer. Such solutions will also combine a bundled infrastructure and service offerings making them potential “Clouds in a Box”. While at the outset these seem ideal and quick to deploy and manage, there are actually a number of challenges with the Single Stack solution. The first challenge is that the Single Stack will always provide you their own inherent components regardless of whether they are inferior to other products in the market. So for example, instead of having network switches from the well established Cisco or Brocade, if you opt for the HP solution you’re looking at HP’s ProCurve, 3Com, H3C and TippingPoint. Worse still is if you go with the Oracle stack you’re condemned to have OracleVM as opposed to the market leading and technically superior VMware. Another challenge is that you’re also tied down to that one vendor and are now a victim of vendor lock-in. Instead of just having infrastructure that will fit your existing software toolset and service management, you will inevitably have to rip and replace these with the Single Stack’s product set. Additionally these complex and non-integrated software and hardware stacks require significant time to deploy and integrate, reducing a considerable amount of the value that comes from an accelerated deployment.

A true converged infrastructure is one that is not only pretested and preconfigured but also and more importantly pre-integrated; in other words it ships out as a single SKU and product to the customer. While it may use different components from different vendors, they are still components that are from market leaders and are well established in the Enterprise space. Furthermore while it may not have the “flexibility” of a Reference Architecture, it’s the rigidity and adherence to predefined standards that make the Converged Infrastructure the ideal fit for serious contenders who are looking for a robust, scalable, simply supported and accelerated Private Cloud infrastructure. The only solution that is on the market that fits that category is VCE's Vblock. By being built, tested, pre-integrated and configured before being sent to the end user as a single product, the Converged Infrastructure for the Amsterdam datacenter will be exactly the same as the deployment in Bangalore, Shanghai, Dubai, New York and London. In this instance the shipped Converged Infrastructure merely requires the end user to plug in and supply network connectivity.

With such a model, support issues are quickly resolved and vendor finger-pointing is eliminated. For example the support call is with one vendor (the Converged Infrastructure manufacturer) and they alone are the owner of the ticket because the Converged Infrastructure is their product. Moreover once a product model of a converged infrastructure has been shipped out, problems that may potentially be faced by a customer in Madrid can easily be replicated and tested on a like for like lab with the same product in London, rapidly resolving performance issues or trouble tickets.

Deploying a preconfigured, pretested and pre-integrated standardized model can also quickly eliminate issues with firmware updates and patching. With traditional deployments, keeping patches and firmware up to date with multiple vendors, components and devices can be an operational role by itself. You would first have to assess the criticality of each patch and relevance to each platform as well as validate firmware compatibility with other components. Additionally you’d also need to validate the patches by creating ‘mirrored’ Production Test Labs and then also have to figure out what your rollback mechanism is if there are any issues. By having a pre-integrated Converged Infrastructure all of this laborious and tedious complication is removed. All patches and firmware can be pretested and validated on standardized platforms in labs that are exactly the same as the standardized platforms that reside in your datacenter. Instead of a multitude of updates from a multitude of vendors each year, a converged infrastructure offers the opportunity to have a single matrix that upgrades the infrastructure as a whole and risk free.
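
To make the idea of a single validated matrix a little more concrete, here is a minimal Python sketch of how a deployed build could be compared against a set of pretested releases. Everything in it is an assumption made for illustration: the release names, component names and version strings are invented and do not describe any vendor's actual release matrix or tooling.

```python
# Hypothetical release matrix: each entry is one pretested combination of
# component versions. All names and version strings are invented examples.
VALIDATED_RELEASES = {
    "release-1": {"compute": "2.0", "network": "5.1", "storage": "30.5", "hypervisor": "5.0"},
    "release-2": {"compute": "2.1", "network": "5.2", "storage": "31.0", "hypervisor": "5.0"},
}


def matching_release(deployed):
    """Return the name of the validated release that matches the deployed build exactly, or None."""
    for name, versions in VALIDATED_RELEASES.items():
        if versions == deployed:
            return name
    return None


def drift_from(release, deployed):
    """List the components whose deployed version differs from the named release."""
    expected = VALIDATED_RELEASES[release]
    return sorted(c for c in expected if deployed.get(c) != expected[c])


if __name__ == "__main__":
    # A build where only the storage firmware lags behind release-2.
    build = {"compute": "2.1", "network": "5.2", "storage": "30.5", "hypervisor": "5.0"}
    print(matching_release(build))
    print(drift_from("release-2", build))
```

The point of the sketch is the design choice rather than the code: compatibility is checked once per release as a whole, instead of pairwise between every component each time one of them is patched.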

The other distinctive feature of a Converged Infrastructure is its accelerated deployment. By being shipped to the customer as a ready assembled, logically configured product and solution, typical deployments can take only 30-45 days, i.e. from procurement to production. In contrast other solutions such as Reference Architectures could take twice as long if not longer as the staging, racking and logical build is still required once delivered to the customer. It’s this speed of deployment which makes the Converged Infrastructure the ideal solution for Private Cloud deployments and delivers an immediate reduction in your total cost of ownership, especially when the business or application owners demand an instant platform for their new projects.

The other benefit of having a company that continuously builds standardized and consistent infrastructures that are configured and deployed for key applications such as Oracle, SAP or Exchange is that you end up with an infrastructure that not only consolidates your footprint and accelerates your time to deployment but also optimizes and in most cases improves the performance of your key apps. I’ve recently seen a customer gain a 300% performance improvement with their Oracle databases once they decided to migrate them off their Enterprise Storage Arrays, SPARC servers and SAN switches in favour of a Converged Infrastructure, i.e. the Vblock. Of course there were a number of questions, head scratching and pontifications as to what was seemingly inexplicable; “how could you provide such performance when we’ve spent months optimizing our infrastructure?” The answer is straightforward in that regardless of how good an engineering team you have, it is rare that they are solely focused on building a standardized infrastructure on a daily basis that is customized for a key application and is factoring all of the components comprehensively.

To elaborate, typically customers will have an in house engineering department where they’ll have a Storage team, a Server team, a Network team, an Apps team, a SAN team etc. All of these silos then need to share their expertise and somehow correlate it prior to building the infrastructure. Compare this to VCE and the Converged Infrastructure approach, where instead there are dedicated engineering teams for each step of the building process whose expertise is centred and focused upon a single enabling platform, i.e. the Vblock. Firstly there’s the engineering team that does the physical build (including thermals, power efficiency, cooling, cabling, equipment layout for upgrade paths etc.). This is then passed on to another dedicated engineering team that takes that infrastructure and certifies the software releases as well as tests the logical build configurations all the way through to the hypervisor. There’s then another engineering organization whose sole purpose is to test applications that are commonly deployed on these Vblock infrastructures such as Oracle, SAP, Exchange, VDI etc. This enables the customer that orders for example an “Oracle Vblock” to have an infrastructure that was specifically adapted both logically and physically to not only meet the needs of their Oracle workloads but also optimize its performance. This is just a glimpse of the pre-sales aspect; post sales you have a dedicated team responsible for the product roadmap of the entire infrastructure ensuring that software or component updates are checked and advised to customers once they are deemed suitable for a production environment. The list of dedicated teams goes on but the common denominator is that they are all part of a seamless process that aims at delivering and supporting an infrastructure designed and purpose built for mission critical application optimization.

So whether you’re feeling Pure, Flexy or Spexy the key thing is to distinguish between Reference Architectures, Single Stack Solutions and the Vblock, i.e. a Converged Infrastructure, and align the right solution to the right business challenge. For fun and adventure I'd always purchase a kit car over a factory built car. I'd have great fun building it from all the components available to me and have it based on my Reference handbook. I could even customize my kit car with a 20 inch exhaust pipe, Dr. Dre hydraulics and fluffy dice because it's flexible just like a Reference Architecture. Alternatively because I love Audi so much I could buy an Audi car that has all of its components made by Audi. So that means ripping out the Alpine CD player for an Audi one, the BOSE speakers for Audi ones and even removing the Michelin tyres for some new Audi ones, regardless of whether they're any good or if they’re just OEM’d from a budget manufacturer - just like a Single Stack Solution. Ultimately if I'm serious about performance and reliability I'll just buy a manufactured Audi S8 that's pre-integrated and deployed from the factory with the best of breed components. Sure I can choose the colour, I can decide on the interior etc. but it's still built to a standard that's designed and engineered to perform. Much like a Converged Infrastructure, while I may choose to have a certain amount of CPU for my Server blades and a certain amount of IOPS and capacity for Storage, I still have a standardized model that's designed and engineered to perform and scale at optimum levels. For a Private or Hybrid Cloud infrastructure that successfully hosts and optimizes critical applications as well as de-risks their virtualization, the solution can only mean one thing - it's Converged.