This blog title should probably be something like Peta/Exabyte rather than Tera. Still, to continue the theme…
In my last two blogs (Terabyte…well Everything Tera Actually! (Part 1 - the Servers) and Tera…well Everything Actually! (Part 2 - the Network – slower than the speed of light!)) I described the tremendous progress in the underlying infrastructures of servers and networks underpinning clouds. I wanted to continue this blog series by looking at the state of play in storage, in particular, sustaining clouds using virtualization platforms such as VMware vSphere 4.1.
In many of the client sites that we engage with, we see that one of the areas holding back full-scale virtualization of all x86 workloads is trust in being able to handle large VMware clusters with terabytes of storage hanging off the back. Some customers simply don’t believe this is possible. This slows industry progression and effectively stymies the extolled benefits of virtualization.
Over the last couple of years storage vendors have stepped up to the plate to tackle some of these issues head on. This is typically through management integration allowing the storage to be effectively visible within VMware vCenter (or another appropriate management interface) and to facilitate streamlined provisioning of resources. However, scaling to a full vSphere cluster has still been a non-trivial task. VMware has been doing its bit by opening additional integration API frameworks that allow many tasks to be offloaded to the underlying infrastructure that was specifically designed for them.
During the tail end of 2009 and into 2010 we have seen massive improvements in storage infrastructures supporting the full capabilities of the VMware vSphere platform. The subject area is simply too large to cover in a single blog, and there are some excellent references that go into the nitty gritty of VMware and storage. In this post, I wanted to highlight a couple of areas of storage innovation related to virtualization: effectively offloading tasks from software and dramatically speeding things up all round, particularly for Cloud deployments.
What’s Been Happening with Storage?
I will be taking the EMC storage solutions as the example in this blog, due to their pervasive nature. However, other vendors typically have ‘similar’ functionality, albeit with different names. Hopefully you will be able to find the relevant technology within the platform of your choice. I will look at some of the milestone breakthroughs in vSphere 4.1 and how EMC with its storage technologies is backing up those capabilities. There is a lot here, so bear with me.
During the last 2 years or so EMC & VMware have delivered some great stuff for clients to support the official vSphere 4.1 configuration maximums:
· VMware vStorage VAAI (APIs for Array Integration – initially for block-based storage) which really starts to allow those highly optimized storage arrays to take on the storage operations previously performed by VMware vSphere software. Some specific features of this are:
o Block Zeroing - offloads large, block-level write operations of zeros from virtual servers to the array. vSphere, using an array supporting Block Zeroing, sends the zero block only once, accompanied by a SCSI instruction to "write this block xx times". Imagine the effect of this in an environment with thousands of VMs running!
o Full Copy – effectively allows vSphere to tell the array to perform the copy operation and let vSphere know when it is completed. Previously ESX would have had to control the entire operation block by block. If this was an operation to create a VM from a template, ESX would have read the data out of the template block by block and re-written it to the VM it was creating. This is a tremendous amount of I/O. With Full Copy, vSphere effectively tells the array it wants to create the VM from the template; the array makes a rapid copy of the template to the VM and on completion simply sends an acknowledgement to vSphere. Imagine this happening for the creation of 1,000 VMs (as in the case of virtual desktops)!
o Hardware Assisted Locking – allows vSphere to lock only the block it is writing to. In the past the entire volume would have been locked to let one VM complete write operations. This in turn can cause queuing of I/O at the ESX server itself. This feature allows all those simultaneous writes to different blocks of different VMs on the same LUN to be serviced concurrently. Imagine here thousands of VMs all starting up (think virtual desktops or datacenter restart)!
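To make the offload benefit concrete, here is a toy Python sketch, not the real SCSI/VAAI protocol, that just counts the commands crossing the host/array link. All class and method names are hypothetical; the point is the difference between the host driving every block and the array doing the work internally:

```python
# Toy model (NOT the real VAAI/SCSI API) contrasting host-driven I/O
# with array-offloaded primitives such as Full Copy and Block Zeroing.

class Array:
    """Hypothetical array that counts commands sent over the SAN link."""
    def __init__(self, blocks):
        self.blocks = blocks
        self.commands = 0          # commands crossing the host/array link

    # Without offload: the host reads and writes every block itself.
    def host_copy(self, src, dst, count):
        for i in range(count):
            self.commands += 2     # one READ + one WRITE per block
            self.blocks[dst + i] = self.blocks[src + i]

    # With Full Copy: one command, the array copies the range internally.
    def offload_copy(self, src, dst, count):
        self.commands += 1
        self.blocks[dst:dst + count] = self.blocks[src:src + count]

    # With Block Zeroing: one "write this zero block N times" command.
    def offload_zero(self, dst, count):
        self.commands += 1
        self.blocks[dst:dst + count] = [0] * count

array = Array(list(range(100)))
array.host_copy(0, 50, 25)        # 50 commands over the wire
array.offload_copy(0, 80, 10)     # a single command for the same effect
array.offload_zero(90, 10)        # likewise for zeroing
```

Scale those 25 blocks up to the gigabytes in a real template and the I/O savings of Full Copy become obvious.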
· Intelligent multipathing from EMC PowerPath/VE
o PowerPath/VE is the latest generation of multipathing software from EMC that runs within the ESX server itself. This allows those pipes going out to the SAN fabric to be used fully in parallel.
o ESX provides NMP (Native multipathing) round-robin based schemes to balance storage traffic, but this lacks the intelligence and automatic nature PowerPath/VE brings to the equation (examining number, size, type, and latency of I/Os, history of latency/throughput of I/Os before making a routing decision). All the SAN paths are used to the maximum all the time. Intelligent path testing verifies link availability. All this is done automatically! Think of entire ESX clusters all being able to automatically tune their storage I/O to get the best throughput on all links.
o This can result in around 30-40% better throughput, which translates to fewer HBAs, less CPU usage and more VMs per ESX host server. By the way, data can also be encrypted in transit through PowerPath!
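The difference between the two scheduling styles can be sketched in a few lines of Python. This is illustrative only (PowerPath/VE's actual algorithm is proprietary and weighs far more factors); the path names and latency figures are invented:

```python
# Hypothetical sketch: naive round-robin vs. latency-aware path selection,
# the kind of decision intelligent multipathing automates.
from itertools import cycle

# Recent average I/O latency in ms per SAN path (illustrative numbers).
paths = {"hba0:spA": 2.0, "hba0:spB": 8.0, "hba1:spA": 2.5, "hba1:spB": 9.0}

rr = cycle(paths)                  # NMP-style round robin
def pick_round_robin():
    return next(rr)                # rotates blindly, ignoring path load

def pick_latency_aware():
    # Route the next I/O down the currently best-performing path.
    return min(paths, key=paths.get)

print(pick_latency_aware())        # always the least-loaded path
```

A round-robin scheme keeps sending I/O down the 9 ms path as often as the 2 ms one; a latency-aware scheme does not.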
· EMC FAST – Fully Automated Storage Tiering
o This allows sub-LUN data to be relocated to different storage tiers within storage pools transparently with features such as:
§ Changing RAID protection types on the fly as LUNs and volumes are moved
§ Supporting large-scale movements simultaneously
§ Fast relocation with low impact on the array and applications such as ESX without disruption
o Data can be effectively moved based on I/O load – as vSphere DRS does for VMs on loading. As that changes over time, data moves to the most cost effective storage possible within the array. This is a real time and cost saver in Cloud environments!
o As the number of VMs grows, and the data encapsulated within, this becomes an increasingly important feature. Think of thousands of VMs which on startup need to have a high performance for the operating system virtual disks and all that data moving in background automatically to high speed disks! With VM startup complete, that same data will move to lower performance and cost effective storage tiers automatically, and allow other data areas such as databases to move up to that high performance tier. All automagically!
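The core idea of sub-LUN tiering can be sketched as a ranking problem. The following Python toy is an assumption-laden illustration (FAST's real policy engine considers much more than raw I/O counts, and the extent names are invented):

```python
# Illustrative-only sketch of sub-LUN tiering: promote the busiest extents
# to flash and demote cold ones to cheap disk, as automated tiering does
# inside the array.

def retier(extents, flash_slots):
    """extents: {extent_id: recent_io_count}. Returns (hot_set, cold_set)."""
    ranked = sorted(extents, key=extents.get, reverse=True)
    hot = set(ranked[:flash_slots])    # busiest extents move to flash/EFD
    cold = set(ranked[flash_slots:])   # the rest live on cost-effective disk
    return hot, cold

# Invented workload: OS boot and DB index extents are hot right now.
io_stats = {"os_boot": 9000, "db_index": 7500, "archive": 40, "logs": 300}
hot, cold = retier(io_stats, flash_slots=2)
```

Re-run the same ranking an hour later with fresh statistics and the boot extents, now idle, drop back down while database extents move up, which is the "automagic" behavior described above.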
· SSD – Solid State Disks
o This is the dream of storage – no spinning parts, lightning fast performance, low power use and the ability to instantly improve application performance through simply slotting one of these into your array.
o EMC through offering EFD (Enterprise Flash Disks) in its arrays really allows customers to sample and productively use this great technology. Everyone complains about the price compared to normal disk, but nobody mentions that there are up to 30+ times the I/O performance available per disk.
o I like to think of 2010 +/-3 years as the coming of age of the SSD. The industry is responding by offering this technology and costs are coming down.
o Where virtualization is concerned, this technology really helps to support FAST and some of the VAAI enhancements. The ability of the array to near-instantly satisfy storage I/O demand filters all the way back to the applications and users. When thousands of VMs are competing for I/O, this is definitely one of those magic components that can be used.
· EMC Virtual Provisioning – thin provisioning without the issues of doing this in the vSphere ESX software
o This allows over-provisioning of storage – as vSphere allows vCPU/vRAM to be overprovisioned. The array effectively presents a ‘fictitious’ storage capacity to an ESX host and VM, but consumes space only as needed from a shared pool. This improves total cost of ownership (TCO) by reducing initial over-allocation of storage capacity.
o EMC allows this to be managed automatically in the array facilitating simpler data layout and automated wide striping (as devices are added) supporting dynamic storage growth. No need to use the ESX feature for Virtual Thin Disk Provisioning.
o Shrinking of pools is also supported by repacking the data spread across the pool onto fewer disks (devices). This is almost like an array level defragmentation tool for the entire storage pool.
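The over-provisioning arithmetic is easy to model. This is a minimal sketch with assumed names (the real array tracks allocation at fine-grained extent level, not whole gigabytes):

```python
# Minimal thin-provisioning sketch: hosts are promised a "fictitious"
# capacity, but pool space is consumed only as data actually lands.

class ThinPool:
    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.used_gb = 0
        self.presented_gb = 0          # total capacity promised to hosts

    def provision_lun(self, size_gb):
        # Over-provisioning: promise more than physically exists.
        self.presented_gb += size_gb

    def write(self, gb):
        if self.used_gb + gb > self.physical_gb:
            raise RuntimeError("pool exhausted - add devices to the pool")
        self.used_gb += gb             # space drawn only on actual writes

pool = ThinPool(physical_gb=1000)      # 1 TB of real disk
for _ in range(10):
    pool.provision_lun(500)            # 5 TB promised to ESX hosts
pool.write(120)                        # only 120 GB actually consumed
```

The TCO gain is exactly that gap between `presented_gb` and `used_gb`: capacity is bought when data arrives, not when LUNs are carved.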
· EMC Auto-Provisioning Groups – one of the unsung heroes!
o This is a great feature that vastly simplifies storage provisioning for ESX clusters. It effectively provides a compliant virtual storage infrastructure supporting the needs of VMotion operations or the addition of a new ESX host into a cluster.
o Cloud dynamics force one to think about large scale concurrent operations performed in the background. This feature facilitates concurrency of operations such that storage can be provisioned in minutes for the entire Cloud built on ESX clusters. This is a big deal in organizations typically having long delays before provisioning occurs; not any more – simply provision with Virtual Provisioning & auto-provisioning groups to get started and then add capacity in the background to the relevant storage pools!
· Deduplication of Primary Storage – still evolving.
o This has the potential to further save storage space by using a single instance storage notion for data held in VMs located across VMFS volumes.
o Imagine thousands of VMs running the same operating system (OS). There is a huge duplication of OS files tying up costly storage. Deduplication promises to reduce the number of copies of static files to literally a single version physically stored, with all VMs having effectively pointers to the original. If that data changes, then a copy of the changed data is made.
o The potential here is huge. With EMC's Data Domain technology, we see the big gains of deduplication within backup infrastructures. I believe this will be extended to primary storage with the speed that is required in Cloud infrastructures.
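The single-instance idea described above can be sketched with a content-addressed store. This toy Python example (hypothetical names, real products work at block level with far more sophistication) shows why a thousand identical OS images cost roughly one image's worth of space:

```python
# Toy single-instance store: identical data (e.g. OS files shared across
# VMs) is kept once and referenced by its content hash; changed data
# naturally gets its own new copy.
import hashlib

store = {}                             # content hash -> stored data

def write_block(data):
    """Return a reference to the data; identical content is stored once."""
    ref = hashlib.sha256(data).hexdigest()
    store.setdefault(ref, data)        # store only if not already present
    return ref

# 1,000 VMs writing the same OS file consume one physical copy.
refs = [write_block(b"windows-system-dll-v1") for _ in range(1000)]

# One VM patches the file: the changed content hashes differently,
# so a second physical copy appears, leaving the other 999 untouched.
write_block(b"windows-system-dll-v2-patched")
```

All thousand references resolve to the same stored bytes until something actually diverges.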
· EMC VPlex – storage array virtualization appliances
o As we progress further into the cloud, one clear fact that comes to mind is that nothing is ever big enough! No single server or array can ever handle it all.
o VPlex is EMC’s solution (not its first attempt – think Invista) for grouping together arrays and presenting them as a huge pool of storage regardless of location. This is virtualization of arrays, this is scale-out for storage pools – this is storage federation to a whole new scale. Indeed this technology allows distributed datacenters to be seen as a single large virtual construct.
o Scott Loewe gives a great presentation on this subject. I think we are in the early days of VPlex, and its roadmap looks very promising.
o This technology effectively provides the ability to do long-distance VMotion operations, enabling a whole new slew of use cases for the virtualization industry.
· EMC V-Max – storage support for Cloud infrastructures
o EMC has many storage platforms, but the one I feel really brings all this magic together is the V-Max with its ability to have over 2,000 disks in a single frame driven by multiple engines.
o In the blog The Journey to the Cloud – The Need for Speed & the Private Cloud Platform I outlined the strategic need for speed in the cloud environment. The V-Max is the EMC melting pot where all this great technology and integration starts to come to life.
o The V-Max is a complete Cloud platform all by itself, with hundreds of processors operating inside, the Virtual Matrix tying it all together and the ability to see distributed storage caches as a single large global cache pool. Customers all round the world throw their applications' storage needs at the V-Max and it handles these loads effortlessly.
o Chad Sakac's (EMC) blog is very informative on what the V-Max can do!
o The ability to scale out and up in any way possible is critical to support the variety of activities taking place within your Cloud infrastructure. Think beyond VM hosting to the workload itself. Databases, data warehouses, business intelligence, messaging systems, web platforms, content management platforms, ERP – the list is endless. As all of these can be virtualized, the supporting infrastructure needs to be able to handle these I/O patterns effortlessly and allow new workloads to come online as needed.
So on the storage side EMC is really stepping up to the challenges they envision in the storage infrastructure that supports the Cloud. Expect to see many more enhancements in the coming year to make handling thousands of VMs inside your Cloud simple!
On the subject of storage, some technologies worth keeping an eye on:
· Obviously SSDs are out there at the envelope of performance. Expect prices to drop and capacity per disk to rise. Think of what you could do for your Cloud if you had a V-Max with over 2,000 SSDs inside!
· Memory-storage hybrid models such as those from Fusion-io are beginning to blur the difference between persistent storage such as disks and volatile storage such as RAM. Using NAND Flash, we have for the first time the idea of packing persistent memory into an ESX server to the tune of 5TB. That is 5TB of super-fast memory treated as memory and disk concurrently. For those workloads needing even more performance, this is going to be a serious change in what we think of as the memory of an ESX server.
· MRAM is an upcoming technology providing a single memory type with persistent storage qualities. Imagine the ESX server of the future having hundreds of TB of MRAM available to support local computing needs. Bearing in mind that memory typically tends to be the single biggest constraint in Cloud environments – think how this will change our idea of processing and hosting VMs.
What does this all Mean for Virtualization and the Cloud?
Essentially the Clouds are becoming hyper-scaled environments with millions of virtual machines operating simultaneously.
Through technology integration with VAAI, Cloud storage jobs are being shunted back into the storage arrays. vSphere itself is using the highly optimized array for its Cloud storage needs, offloading storage operations. This in turn requires storage arrays to have not just the raw power to drive this furious I/O pace but also the intelligence to manage themselves in a cost/performance-effective manner across the entire Cloud.
Storage intelligence is not enough. The intelligence to manage all that information, Information Intelligence, with geographic and jurisdictional compliance awareness, starts to become a critical competitive advantage. EMC is working very hard in that direction to allow Cloud-wide information policies to be set up, actively managing information in the Cloud.
V-Max and VPLEX allow scale up and scale out strategies to be employed on massive scales. The idea of federating storage across any geography is innately linked to the idea of Cloud infrastructures. The V-Max as the storage workhorse of large scale Cloud infrastructures coupled with the intelligence of managing information is going to be a critical building block.
When we look at the current limits of VMware vSphere 4.1 in such integration packages as the VBlock, we can see the following:
· vSphere cluster consisting of 32 hosts (probably 64 in the near future), over 4,000 processing cores, 32TB and millions of I/Os as a single large pool
· Storage needs reaching millions of I/Os and petabytes of storage
· The whole shebang moving to full 64-bit removing all constraints on memory and storage
· All needing to be handled as a large pool that can be allocated in any way needed
Storage that fits this model, and the V-Max does, will be needed to underpin this enormous capability. That storage should be as flexible and as easy to dole out as the Cloud it underpins. Information Intelligence will be the glue that brings this all together, ensuring the correct storage capacity, at the right price, in the right geography, protected at the agreed levels anywhere on the planet!
Why is this important for the CIO?
If these abilities are coming into the virtualization industry as a whole, with integration at all levels, it is important that as a CIO one thinks about which strategic assets should be acquired. Knowing the asset roadmap and the enormous capabilities and cost savings inherent in appropriate selection, it surely makes sense to look beyond immediate short-term savings to the immediate transformational gains available for the business.
If all x86 workloads can be virtualized, including large-scale data warehouses, and information will be the new competitive advantage, then SSDs, lightning-fast processing and information retrieval will be the abilities that IT should be providing to the business.
There is no need to wait for hours and days for information. Real-time information is already at your fingertips. Getting the organization to adapt to that capability which the CIOs can now provide, in the form of a full virtual infrastructure supported by enterprise strength storage suitable for the Cloud paradigms, is critical.
CIOs should be challenging the business with what it can do when information is available near instantly. Questions about how quickly infrastructure can be provisioned are questions of yesterday. A full vSphere environment, even a single-cluster environment, provides thousands of processors, backed up by storage for all that historic and current information.
That is enough for most developers. If you as the CIO can now answer positively to being able to provide thousands of VMs within a single day, then what can those developers, business analysts and corporate innovators think of that can improve the position of the organization and the business?