Engineered Systems

Datacenter Rack-scale Capability Architecture (RCA)

For the last couple of years there has been a resurgent theme regarding the disaggregation of components within a rack to support hyperscale datacenters. Back in 2013 Facebook, as founder of the Open Compute Project, and Intel announced their collaboration on future datacenter rack technologies.

This architecture is shaped by the type of business that Facebook itself runs and is therefore very component focused: compute, storage and network components are disaggregated across trays, with the trays interconnected by an internal silicon photonics network fabric.

This has the advantage for hyperscale datacenters of modularity allowing components such as CPU to be swapped out individually as opposed to the entire server construct. Intel, presenting at Interop 2013, had an excellent presentation on the architecture outlining various advantages. This architecture is indeed being used by Baidu, Alibaba Group, Tencent and China Telecom (according to Intel).

This in itself is not earth shattering, and it seems to lack the added "magic sauce". As it stands this is simply a re-jigging of the form factor, and in itself it does nothing to really enhance workload density beyond consolidating large numbers of components in the physical rack footprint (i.e. more cores, RAM and network bandwidth).

Principally it is aimed at reducing cable and switch clutter and the associated power requirements, and at improving upgrade modularity; essentially this increases the compute surface per rack. These are fine advantages when dealing with hyperscale datacenters, as they represent considerable capital expenditure, as outlined in the Moor Insights & Strategy paper on the subject.

Tying into the themes in my previous blog regarding the "Future of the Datacenter" there is a densifying effect taking place affecting the current datacenter network architecture as aptly shown in the Moor study:

Examining this architecture, the following points stand out:

  • Rack level architecture is essential in creating economies of scale for hyper-scale and private enterprise datacenters
  • East-West traffic is coming to front-and-center whilst most datacenters are still continuing in North-South network investment with monolithic switch topologies
  • Simply increasing the number of cores and RAM within a rack does not itself increase the workload density (load/unit)
  • Workload consolidation is more complex than this architecture indicates, utilizing multiple components at different times under different loading
  • Many approaches are already available using an aggregation architecture (HP Moonshot, Calxeda ARM Architecture, even SoCs)

There is a lot of added value to be derived for an enterprise datacenter using some of these "integrated-disaggregated" concepts, but competing with and surviving in spite of hyperscale datacenters requires additional innovative approaches to be taken by the enterprise.

Enterprises that have taken on board the "rack as a computer" paradigm have annealed this with capability architecture to drive density increases up to and exceeding 10x over simple consolidation within a more capable physical rack:

  • General purpose usage can be well serviced with integrated/(hyper)converged architectures (e.g. Oracle's Private Cloud Appliance, VCE Vblock, Tintri, Nutanix)
  • Big Data architectures use a similar architecture but have the magic sauce embedded in the way that the Hadoop cluster software itself works
  • Oracle's Engineered Systems further up the ante in that they start to add magic sauce to the hardware mix and the software smarts – hence engineered rather than simply integrated (other examples are available from Teradata, Microsoft and its appliance partners)
  • In particular, the entire rack needs to be thought of as a workload capability server:
    • If database capability is required then everything in that rack should be geared to that workload
    • In-Platform (in the database engine itself) capability used above general purpose virtualization to drive hyper-tenancy
    • Private networking fabric (Infiniband in the case of Oracle Exadata and most high-end appliances)
    • Storage should be modular and intelligent, offloading not just storage block I/O but also being able to deal with part of the SQL Database Workload itself whilst providing the usual complement of thin/sparse-provisioning, deduplication and compression
    • The whole of database workload consolidation is many times the sum of parts in the rack
  • The datacenter becomes a grouping of these hyper-dense intelligent capability rack-scale servers
    • Intelligent provisioning is used to "throw" the workload type onto the best place for doing it at scale, lowest overall cost and still deliver world-class performance and security
    • Integrate into the overall information management architecture of the enterprise
    • Ensure that new paradigms related to Big Data Analytics, and the tsunami of information expected from the Internet-of-Things, can be delivered in the rack-scale computer form but with additional "smarts" to further increase the value delivered and to provide agility to the business.
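
The "intelligent provisioning" idea in the list above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Rack` record, the capability labels and the cost figures are invented for illustration and do not correspond to any vendor's provisioning API. It simply picks, among the racks geared to the requested workload type that still have capacity, the one with the lowest cost per workload.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rack:
    """Hypothetical description of one capability rack."""
    name: str
    capability: str           # workload type the rack is engineered for
    free_slots: int           # remaining workload capacity (illustrative unit)
    cost_per_workload: float  # illustrative cost figure, arbitrary currency


def place_workload(workload_type: str, racks: List[Rack]) -> Optional[Rack]:
    """Return the cheapest rack geared to this workload type, or None."""
    candidates = [
        r for r in racks
        if r.capability == workload_type and r.free_slots > 0
    ]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda r: r.cost_per_workload)
    chosen.free_slots -= 1  # reserve one slot on the chosen rack
    return chosen


# Invented example estate: two database racks and one general purpose rack.
racks = [
    Rack("db-rack-1", "database", free_slots=2, cost_per_workload=5.0),
    Rack("db-rack-2", "database", free_slots=1, cost_per_workload=3.0),
    Rack("gp-rack-1", "general", free_slots=10, cost_per_workload=8.0),
]
```

Calling `place_workload("database", racks)` here would select `db-rack-2` first and fall back to `db-rack-1` once it is full. A real placement engine would also weigh performance and security constraints, which this sketch leaves out.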

The enterprise datacenter can deliver value beyond a hyper-scale datacenter by thinking about continuous consolidation in line with the needs of the business, not just the IT needs, that have to be delivered. Such platform and rack-scale capability architecture (RCA) has been proven to provide massive agility to organizations and indeed prepares them for new technologies such that they can behave like "start-ups" with a fast-low-cost-to-fail mentality to power iterative innovation cycles.

Opportunities for the CIO

The CIO and senior team have a concrete opportunity here to steal a march on the Public Cloud vendors by providing hyper-efficient capability architectures for their business in re-thinking the datacenter rack through RCA paradigms.

Not only will this massively reduce footprint and costs in the existing premises, but it also focuses IT on how best to serve the business through augmentation with hybrid Cloud scenarios.

The industry has more or less spoken about the need for hybrid Cloud scenarios where private on-premise cloud is augmented with public cloud capabilities. Further, today's announcement that the EU has effectively invalidated the "Safe Harbour" data treaty should put organizational IT on point about how to deal rapidly with such changes.

Industry thinking indicates that enterprise private datacenters will shrink, and the CIO team can already ensure they are "thinking" that way and making concrete steps to realize compact ultra-dense datacenters.

A hyper-scale datacenter can't really move this quickly or be that agile as their operating scale inhibits this nimble thinking that should be the hallmark of the CIO of the 2020s.

In the 2020s perhaps nano- and pico-datacenters may be of more interest to enterprises as a way of competing for business budgetary investment, as post-silicon graphene compute substrates running at 400GHz at room temperature become the new norm!


In Memory Computing (IMC) Cloud - so what about the Memory?

There's been a lot of talk in 2013 about in-memory computing (IMC), with Gartner indicating strategic significance in 2012. Very little has been said about the memory needed for IMC!

IMC is claimed to be "new", "radical", "never before done in the industry" etc. Much of this has come from SAP's HANA marketing, amongst others. The discussion is relevant to the whole IT industry and to all workloads, whether delivered locally or through IMC-enabled Clouds!

Larry Page, after proposing at the Intel Developer Forum in 2000 to hold the Internet in memory, moved forward with the idea - resulting in Google. He had only 2,400 computers in the Google datacenter then!

The industry has responded in kind - by stating that large amounts of memory have been available in platforms for nigh on a decade. Indeed, the latest Oracle Exadata X3-8 Engineered System has 4TB of RAM and 22TB of PCIe Flash - non-volatile RAM.

So IMC is not new in the sense SAP and others would have you believe. It is a natural evolution of economies of scale bringing price/GB down accompanied by technological speed & capacity innovations.

A purist approach based on DRAM (nanosecond access) alone has a vast cost difference from NAND Flash (microseconds - 1,000x slower) and spinning disk (milliseconds) technologies today - whilst being volatile, with data gone on power cycling! Economically speaking, a hybrid approach has to be taken as the road to the IMC-Cloud!
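
To make the hybrid trade-off a little more concrete, here is a rough sketch of the average access latency across tiers. The latencies are assumed round numbers that only follow the orders of magnitude quoted above (nanoseconds, microseconds, milliseconds), and the access mix is entirely hypothetical; neither is a measurement of any real system.

```python
# Representative latencies in seconds; assumed round numbers that follow the
# orders of magnitude in the text (flash roughly 1,000x slower than DRAM).
LATENCY_S = {"dram": 100e-9, "flash": 100e-6, "disk": 10e-3}


def effective_latency(fractions: dict) -> float:
    """Average access latency for a given split of accesses across tiers."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("access fractions must sum to 1")
    return sum(LATENCY_S[tier] * share for tier, share in fractions.items())


# Hypothetical access mix: most accesses served from DRAM, a tail from flash
# and a small remainder from disk.
hybrid = effective_latency({"dram": 0.90, "flash": 0.09, "disk": 0.01})
all_disk = effective_latency({"disk": 1.0})
```

With this assumed mix the rare disk accesses still dominate the average, which is one way of seeing why the slowest tier matters so much in a hybrid design and why the placement of hot data is the interesting problem.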

Amongst the characteristics facilitating wholesale transformation to full IMC (hardware, software, application architectures) are:

  1. Performance - nanosecond to microseconds as DRAM/Flash currently
  2. Capacity - Terabytes initially and then Petabytes as Flash/Disk currently
  3. Volatility - non-volatile on power cycling much in the same way as Flash/Disk today
  4. Locality - as close as possible to the CPU, but needs to be manageable at cloud scale!

Individually, each of these characteristics has been achieved. Combined, they are technically challenging. Many promising technologies are evolving, aiming to solve this quandary and change the face of computing as we know it forever. Basically massive, low-power, non-volatile RAM!

Advances are being made in all areas:

  • HMC - Hybrid Memory Cube using stackable DRAM chips resulting in 320GB/s throughput (vs. DDR3 maxing out at 24GB/s) and 90% less space with 70% less energy. Still volatile though!
  • Phase-change memory (PCM/PRAM) producing non-volatile RAM. Micron already has this shipping in 1GB chips (in 2012). This does not require erase-before-rewriting cycles like Flash so potentially much faster. Current speeds are 400MB/s. 
  • HDIMM - Hybrid DIMMs - combine high-speed DRAM with non-volatile NAND storage (Flash). Micron (with DDR4) and Viking Technology (DDR3 NVDIMM) have these technologies with latencies of 25 nanoseconds.
  • NRAM - Carbon nanotube based non-volatile RAM. Nantero and Belgium's IMEC are working jointly to create this alternative to DRAM and scaling below 18nm sizes. Stackable like HMC and all non-volatile.
  • Graphene (single layer of carbon atoms) based non-volatile RAM such as efforts in 2013 in Lausanne, CH.

Which of these will be driving future architectures remains to be seen in the sensitive price-capacity-performance markets. Post-silicon era options based on carbon/graphene nanotube technology would of course power the next wave of compute as well as memory structures.

Questioning "Accepted Wisdom"

So In-Memory Computing is coming - has been for the last decade or so. So, are datacenter infrastructure and application architectures keeping pace or at least preparing for this "age of abundance" where non-volatile RAM is concerned?

In discussions with folks from IT and industry colleagues there is a clear focus on procurement at low price points, with IT simply saying "everything is commodity"! This is like saying two cars of identical model/make/engine with different chip management software are the same - one clearly performs better than the other! The software magic in these hardware and software stacks makes them anything but commodity.

Many IT shops still think a centralized array of storage is the only way to go. They basically change media within the array to change storage characteristics - 7.2/10/15K RPM spinning disks to SSD drives. That is where their thinking essentially stops!

This short-term thinking will effectively result in the next wave of infrastructure and application sprawl OR revolution through IMC-Cloud enabled vectors turning IT on its collective head.

A centralized array would simply be too slow as a model for IMC Clouds. There are some clear trends emerging that indicate how CIOs, CTOs and CFOs can prepare for IMC based datacenters of the future to drastically increase capability while changing the procurement equations:

  • Modular storage containers located close to processor & RAM driving the move away from islands of central/massive/SAN infrastructures.
  • Internetworking needs to be way faster to leverage IMC capabilities. Think 2013 for 40Gbps Ethernet/Infiniband now. Think 2016 for PCIe4 at 512Gb/s (x16 lane duplex). That speed is needed at least at the intersection points of compute/RAM/Storage!
  • Engineered (hardware and software optimized) for entire platforms. Simply not worth focusing all IT effort on individual best-of-breed components when "the whole needs to be greater than the sum of parts!".
  • Backup architectures need to keep up. Tape remains a cost-effective media for inactive/data-backup data sets, particularly when in open source Linear Tape File System (LTFS) format. A great blog on that from Oracle-StorageTek's Rick Ramsay.
  • Application architectures need to move away from bottleneck-resolution thinking! Most developers don't know what to do with Tbytes of RAM! Applications need to be massively parallel patterns where possible. Developers need to deliver data in real-time!

2013-2016 will see the strong rise of non-volatile memory technology and architectures. CIO/CTOs should be thinking about how they will leverage these capabilities. IT philosophers need to discuss and map out implications before handing over to Enterprise Architects to enact.
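
The interconnect figures quoted in the list above can be sanity-checked with a little arithmetic. The sketch below assumes the published PCIe 4.0 per-lane rate of 16 GT/s and the 128b/130b line encoding used from PCIe 3.0 onwards; it ignores packet and protocol overhead, so real throughput would be lower still. The function names are made up for this example.

```python
def pcie_raw_gbps(gt_per_s: float, lanes: int, duplex: bool = True) -> float:
    """Raw signalling rate in Gb/s (one bit per transfer per lane)."""
    directions = 2 if duplex else 1
    return gt_per_s * lanes * directions


def pcie_usable_gbps(gt_per_s: float, lanes: int, duplex: bool = True) -> float:
    """Rate after 128b/130b line encoding (used by PCIe 3.0 and later)."""
    return pcie_raw_gbps(gt_per_s, lanes, duplex) * 128 / 130


PCIE4_GT_PER_S = 16.0  # PCIe 4.0 per-lane transfer rate

raw = pcie_raw_gbps(PCIE4_GT_PER_S, lanes=16)        # 512.0 Gb/s, as quoted
usable = pcie_usable_gbps(PCIE4_GT_PER_S, lanes=16)  # about 504.1 Gb/s
links_40g = raw / 40.0                               # 12.8 x a 40Gb/s link
```

So the 512Gb/s figure corresponds to the raw x16 rate counted in both directions, roughly an order of magnitude above a single 40Gb/s network link.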

Why is this important for the CIO, CTO & CFO?

Simple server consolidation has had its day! Most IT shops have used server virtualization in one form or another. The early fast returns are almost exhausted! Continuous workload consolidation needs to take center stage again - think thousands of workloads per server, not 20-40!

Private IMC-Clouds provide an ability for CIOs to keep in-house IT relevant to the business.

CTOs should be thinking about how IMC-Clouds can power the next wave of innovative applications-services-products in an increasingly interconnected always-on manner. Scaling, performance, resiliency to failure should be designed into application platforms - NOT applications themselves. Fast moving application development can then proceed without recreating these features in every app.

For the CFO, IMC-enabled Private Clouds represent a dramatic lowering of all costs associated with IT to the business. Consolidating massive chunks of datacenter infrastructure, decommissioning datacenters and simplifying the demands on CFO resources for more performance/capacity will allow CFOs to free trapped financial value that can be used directly by the business. Tech-refresh cycles may need to be shortened to bring this vision to fruition earlier!

IMC-enabled Clouds, combined with Intelligent Storage, will allow fundamental transformations to take place at paces exceeding even those of hyper-Cloud providers such as Amazon. Business IT can choose to transform

Disclaimer

The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by my current employer and does not necessarily reflect the views and opinions of my employer.

Storage Intelligence - about time!

I was recently reading an article about Backblaze releasing its storage designs. This is a 180TB NAS device in 4U! Absolutely huge! A single 42U rack would be able to hold around 1.8 Petabytes.
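
The rack figure follows directly from the pod numbers. As a quick check, assuming the whole rack is given over to storage pods and using decimal units (1PB = 1,000TB):

```python
POD_CAPACITY_TB = 180  # per storage pod, from the text
POD_HEIGHT_U = 4       # rack units per pod, from the text
RACK_HEIGHT_U = 42     # rack height, from the text

pods_per_rack = RACK_HEIGHT_U // POD_HEIGHT_U               # 10 pods
spare_units = RACK_HEIGHT_U - pods_per_rack * POD_HEIGHT_U  # 2U left over
rack_capacity_pb = pods_per_rack * POD_CAPACITY_TB / 1000   # decimal PB
```

Ten pods fit, with 2U to spare, giving the 1.8PB quoted above. In practice some rack space would go to switching and power distribution, so this is an upper bound for the assumed layout.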

When thinking about Petabytes, one thinks about the big players in storage, EMC/NetApp/HDS, selling tens of storage racks covering substantial parts of the datacenter floor space and offering a fraction of this capability.

Clearly, the storage profile of what the large monolithic enterprise arrays offer is different. However, Backblaze clearly highlights the ability to get conventional "dumb" storage easily and at low cost! Packing some flash cache or SSD in front would already bring these boxes to the same I/O capacity ;-)

This makes the case that storage per se is not really a challenge anymore. However, making storage contribute to the overall performance equation, and making sure that it helps accelerate specific workloads, is going to be critical going forward. Basically Intelligent Storage!

Questioning "Accepted Wisdom"

Many IT shops still think of storage as a separate part of their estate. It should simply store data and provide it back rapidly when asked - politely. The continuing stifling of innovation in datacenters due to having a single answer for all questions - namely VMware/hypervisors and server virtualization - tends to stop any innovative thinking that may actually aid an organisation to accelerate those parts of the application landscape leveraging revenue.

Some questions that came to mind and also echoed by clients are:

  • Disk is cheap now. SSD answers my performance needs for storage access. Is there something that together with software actually increases the efficiency of how I do things in the business?

  • For whole classes of typical applications, structured data persistence pools, web servers etc what would "intelligent" storage do for the physical estate and the business consumers of this resource?

  • How can enterprise architecture concepts be overlaid to intelligent storage? What will this mean to how future change programmes or business initiatives are structured and architected?

  • How can current concepts of intelligent storage be used in the current datacenter landscape?

We are seeing the first impact of this type of thinking in the structured data / database world. By combining the database workload with storage and through software enablement we get intelligent acceleration of store/retrieval operations. This is very akin to having map-reduce concepts within the relational database world.

Further combining storage processing with CPU/RAM/Networking offload of workload-specific storage requests facilitates unprecedented scale-out, performance and data compression capabilities.

Oracle's Engineered Systems, the Exadata Database Machine in particular, represent this intelligent storage concept, amongst other innovations, for accelerating the Oracle database workload.

These workload specific constructs foster security of precious data assets, physically and logically. This is increasingly important when one considers that organisations are using shared "dumb" storage for virtual machines, general data assets and application working data sets.

In the general marketplace other vendors (IBM PureSystems + DB2, Teradata, SAP HANA etc) are starting to use variations of the technologies for intelligent storage. The level of maturity varies dramatically, with Oracle having a substantial time advantage as first mover.

2013-2015 will see more workload focused solutions materializing, replacing substantial swathes of datacenter assets built using the traditional storage view.

Why is this important for the CIO, CTO & CFO?

Intelligent workload-focused storage solutions are allowing CIO/CTOs to do things that were not easily implemented within solutions based on server virtualization technology using shared monolithic storage arrays - dumb storage - such as in the VMware enabled VCE Vblock and HP CloudSystem Matrix - which are effectively only IaaS solutions.

Workload specific storage solutions are allowing much greater consolidation ratios. Forget the 20-30-40 Virtual Machines per physical server. Think 100s of workloads per intelligent construct! An improvement of 100s of percent over the current situation!
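
To put "100s of percent" into numbers, here is a small sketch. The densities (40 virtual machines per server versus 200 workloads per intelligent construct) and the 1,000-workload estate are hypothetical values picked only to illustrate the arithmetic, not figures from any benchmark.

```python
import math


def percent_improvement(before: int, after: int) -> float:
    """Relative gain in workloads per unit, expressed as a percentage."""
    return (after - before) / before * 100.0


def units_needed(workloads: int, per_unit: int) -> int:
    """Number of servers or constructs needed to host the given workloads."""
    return math.ceil(workloads / per_unit)


# Hypothetical densities and estate size, for illustration only.
gain = percent_improvement(before=40, after=200)             # 400.0 percent
servers = units_needed(workloads=1000, per_unit=40)          # 25 servers
constructs = units_needed(workloads=1000, per_unit=200)      # 5 constructs
```

Under these assumed numbers the same estate fits on a fifth of the physical units, which is where the knock-on savings in floor space, power and licensing would come from.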

It is important to verify how intelligent storage solutions can be a part of the CIO/CTO's product mix to support the business aspirations as well as simplify the IT landscape. Financing options are also vastly simplified with a direct link between business performance and physical asset procurement/leasing:

  • Intelligent storage removes architectural storage bottlenecks and really shares the compute/IO/networking load more fully.

  • Intelligent storage ensures those workloads supporting the business revenue generating activities are accelerated. Acceleration is linked to the cost of underlying storage assets. As the costs of NAND flash, SSDs and rotating disks drop, more is automatically brought into the storage mix to reduce overall costs without disrupting the IT landscape.

  • Greater volumes of historic data are accessible thanks to the huge level of context sensitive, workload-specific data compression technologies. Big data analytics can be powered from here, as well as enterprise datawarehouse needs. This goes beyond simple static storage tiering and deduplication technologies that are unaware of WHAT they are storing!

  • Workload-specific stacking supports much higher levels of consolidation than simple server virtualization. The positive side effects of technologies such as Exadata include the rationalization of datacenter workload estates in terms of variety: operating systems can be rationalized and the net result is generally a healthier estate. This means big savings for the CFO!

Intelligent storage within vertically engineered workload-specific constructs, what Gartner calls Fabric Based Infrastructure, presents a more cogent vision of optimizing the organization's IT capability. It provides a better understanding of how precious funding from CFOs is invested in those programmes necessary for the welfare of the concern.

CIO/CTOs still talking about x86 and server virtualization as the means to tackle every Business IT challenge would be well advised to keep an eye on this development.

Intelligent storage will be a fundamental part of the IT landscape allowing effective competition with hyperscale Cloud Providers such as Google/Amazon and curtailing the funding leakage from the business to external providers.

A Resurgent SPARC platform for Enterprise Cloud Workloads (Part 2) - SPARC T5

Some time ago, I blogged about the resurgence of the SPARC platform. The then newly designed SPARC T4 was showing tremendous promise in its own right to be able to take up its former mantle of being an innovation leader running extreme workloads with the Solaris 11 operating system.

Indeed, it was used as the driving engine of the SPARC Supercluster for dealing with not just massive acceleration of Oracle database workloads using the Exadata Storage Cell technology, but the ability to combine firmware embedded near-zero overhead virtualization concepts for electrically separate logical domains, carving up the physical hardware, and Solaris zones which allow near-native "virtual machines" sharing an installed Solaris operating system.

Up to 128 virtual machines (zones) are supported on a system - a vast improvement over the 20-30 one typically gets under VMware-like hypervisors!

This welcome addition to the wider Oracle engineered systems family allowed the missing parts of the datacenter to be consolidated - these being typically glossed over or totally skipped when virtualization with VMware-like hypervisors was discussed. Customers were aware that their mission critical workloads were not always able to run with an x86 platform which was then further reduced in performance using a hypervisor to support large data set manipulation.

Well, the rumor mills have started in the run-up to Oracle OpenWorld 2012 at the end of September. One of the interesting areas is the "possible" announcement of the SPARC T5 processor. This is interesting in its own right as we have steadily been seeing the SPARC T4 and now the T5 having ever greater embedded capability in silicon to drive database consolidation and indeed the entire WebLogic middleware stack together with high-end vertical applications such as SAP, E-Business Suite, Siebel CRM and so on.

Speculating on the basis of the "rumors" and the Oracle SPARC public roadmap, I'd like to indicate where I see this new chip making inroads in those extreme cloud workload environments whilst maintaining the paradigm of continuous consolidation. This paradigm, which I outlined in a blog in 2010, is still very relevant - the SPARC T5 provides alternative avenues to simply following the crowd on x86.

Questioning "Datacenter Wisdom"

The new SPARC T5 will, according to the roadmap, include the following features and technologies:

  • Increasing System-on-a-Chip (SOC) orientation providing ever more enhanced silicon accelerators for offloading tasks that software typically struggles with at cloud scale. This combines cores, memory controllers, I/O ports, accelerators and network interface controllers providing a very utilitarian design.
  • 16 cores, up from the T4's 8 cores. This takes them right up to the top end in core terms.
  • 8 threads per core - giving 128 threads of execution per processor providing exceptional performance for threaded applications such as with Java and indeed the entire SOA environment
  • Core speeds of 3.6GHz providing exceptional single threaded performance as well as the intelligence to detect thread workloads dynamically (think chip level thread workload elasticity)
  • Move to 28nm from 40nm - continuous consolidation paradigm being applied at silicon level
  • Crossbar bandwidth of 1TB/s (twice that of the T4) providing exceptional straight line scaling for applications as well as supporting the glueless NUMA design of the T5
  • Move to PCIe Generation 3 and 1TB/s memory bandwidth using 1GHz DDR3 memory chips will start to provide the means of creating very large memory server configurations (think double-digit TB of RAM for all in-memory workload processing)
  • QDR (40Gbps) Infiniband private networking
  • 10GbE Public networking
  • Database workload stacking becomes even more capable and effective than simple hypervisor based virtualization for datacenter estate consolidation at multiple levels (storage, server, network and licensed core levels)
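
A few of the roadmap numbers in the list above are related by simple arithmetic, sketched below. The area estimate rests on an idealized assumption, namely that die area scales with the square of the feature size; real process shrinks rarely scale this cleanly, so treat it as a back-of-the-envelope figure only.

```python
T4_CORES, T5_CORES = 8, 16       # core counts from the list above
THREADS_PER_CORE = 8             # from the list above
T4_NODE_NM, T5_NODE_NM = 40, 28  # process nodes from the list above

threads_per_processor = T5_CORES * THREADS_PER_CORE  # 128 threads
core_ratio = T5_CORES / T4_CORES                     # 2.0x the T4 core count

# Idealized assumption: area scales with the square of the feature size.
area_scale = (T5_NODE_NM / T4_NODE_NM) ** 2          # about 0.49
```

On that idealized assumption the same logic occupies roughly half the area at 28nm, which is consistent with a doubling of the core count in a comparable silicon budget.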

This in itself at the processor level is really impressive, but the features that are on the roadmap aligned to the T5 possibly are the real crown jewels:

  • on-die crypto accelerators for encryption (RSA, DH, DSA, ECC, AES, DES, 3DES, Camellia, Kasumi) providing excellent performance through offloading. This is particularly relevant in multi-tenant Cloud based environments
  • on-die message digest and hashing accelerators (CRC32c, MD5, SHA-1, SHA-224, SHA-256, SHA-384, SHA-512) providing excellent security offloading. Again particularly relevant in multi-tenant environments
  • on-die accelerator for random number generation
  • PCIe Generation 3 opens the door to even faster Infiniband networking (56Gbps instead of the current 40Gbps - with active-active links being possible to drive at wire speed)
  • Hardware based compression which will seriously reduce the storage footprint of databases. This will provide further consolidation and optimization of database information architectures.
  • Columnar database acceleration and Oracle number acceleration will provide extremely fast access to structured information. Further, when combined with in-memory structures, the database will literally be roaring!

Indeed when we think that the Exadata Storage cells will also be enhanced to support new chip generations, flash density as well as other optimizations, the next SPARC Supercluster (which has the embedded Exadata storage cells) will literally be one of the best performing database platforms on the planet!

To ignore the new SPARC T5 (whenever it arrives) is to really miss a trick. The embedded technology provides true sticky competitive advantage to anyone running a database workload or indeed multi-threaded applications. As a Java platform, middleware and SOA platform as well as vertical application platform, the enterprise can seriously benefit from this new innovation.

Why is this important for the CIO & CFO?

CIOs and CFOs are constantly being bombarded with messages from IT that x86 is the only way to go, that Linux is the only way to go, that VMware is the only way to go. As most CFOs will have noted by now:

  • Financially speaking - the x86 servers may have been cheaper per unit, but the number of units is so large to get the job done that any financial advantage that might have been there has evaporated!
  • Overall end-2-end costs for those services that the CIO/CFO signed off on are never really well calculated for the current environment.
  • Investment needs to be focused on those activities that support revenue streams and on those technologies that will continue to do that for at least the next decade, with capacity upgrades of course
  • There must be other ways of doing things that make life easier and more predictable

Well, Engineered Systems with the new SPARC T5 represent a way for the CIO/CFO to power those projects that need investment and which in turn drive revenue and value. The ability to literally roll in the SPARC SuperCluster or any other Engineered System is going to be instrumental in:

  • Shortening project cycles at the infrastructure level
    • don't lose 6 months on a critical ERP/CRM/Custom application project in provisioning hardware, getting unexpected billing for general infrastructure layers such as networking that have nothing to do with this project, IT trying to tune and assemble, getting stuck in multi-vendor contract and support negotiations etc.
    • That time can be literally worth millions - why lose that value?
  • Concentrate valuable and sparse investment strategies literally to the last square meter in the datacenter!
    • If that next project is a risk management platform, then IT should be able to give exactly to the last datacenter floor tile the resources that are needed for that one project alone and the cost
    • Project based or zero-budgeting will allow projects to come online faster and more predictably, reusing existing platforms to deal with the load as well as supporting continuous workload consolidation paradigms
    • Finance enterprise architecture projects that put in the enabling conditions to support faster turnaround for critical revenue focused/margin increasing project activity

Engineered systems are already using the technologies that the rest of the industry is trying to re-package to meet the challenges customers are facing now and in the coming years. The lead is not just in technology but also in the approach that customers are demanding - specific investments balanced with specific revenue generating high-yield business returns.

As a CIO it is important to recognize the value that Engineered Systems and the SPARC platform, as part of an overall datacenter landscape, bring in addressing key business requirements and ensure an overall simplification of the Datacenter challenge and large CAPEX requirements in general.

As Oracle and others proceed in acquiring or organically developing new capabilities in customer facing technologies and in managing exabyte data sets, it becomes strategically important to understand how that can be dealt with.

Hardware alone is not the only answer. Operating systems need to be able to deal with big thinking and big strategy, as do applications and the hardware. By creating balanced designs that can then scale out, a consistent and effective execution strategy can be managed at the CIO/CTO/CFO levels to ensure that business is not hindered but encouraged to the maximum by removing barriers that IT may well have propagated with the state of the art of many years ago.

Engineered Systems enable and weaponize the datacenter to directly handle the real-time enterprise. High-end operating systems such as Solaris and the SPARC processor roadmap are dealing with the notions of having terabyte datasets, millions of execution threads and thousands of logical domains with hundreds of zones (virtual machines) each per purchased core.

Simply carving up a physical server's resources to make up for the deficiencies of the operating system/application in dealing with workloads can't be an answer by itself. This is partly what is fueling the Platform-as-a-Service strategies. How to get systems working cooperatively together to deal with more of the same workload (e.g. database access/web server content for millions of users) or indeed different workloads spread across systems transparently is the question!

High performance computing fields have been doing just this with stunning results albeit at extreme cost conditions and limited workloads. Engineered systems are facilitating this thinking at scale with relatively modest investment for the workloads being supported.

It is this big thinking from organizations such as Oracle and others, who are used to dealing with petabytes of data and millions of concurrent users, that can fulfill requirements expressed by the CIO/CTO/CFO teams. If millions of users needing web/content/database/analytics/billing can be serviced per square meter of datacenter space - why not do it?

Disclaimer

The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by Oracle and does not necessarily reflect the views and opinions of Oracle.

Datacenter Wisdom - Engineered Systems Must be Doing Something right! (Part 1 - Storage Layer)

Looking back over the last 2 years or so, we can start to see an emerging pattern of acquisitions and general IT industry maneuvering that would suggest customer demand and technological capability packaging for specific workloads are more in alignment than ever.

I wanted to write a couple of blogs to capture this in the context of the datacenter and in wider Oracle engineered systems penetration.

I will start with the Storage Layer, as that seems to have seen tremendous changes in the last 6 months alone, although the pattern was already carved out in the early Oracle Exadata release in April 2010 (nice blog on this from Kerry Osborne - Fun with Exadata v2) in its innovative bundling of commodity hardware with specialized software capabilities.

Questioning "Datacenter Wisdom"

As you may know, Oracle's Exadata v2 represents a sophisticated blend of balanced components for the tasks undertaken by the Oracle Database, whether it is being used as a high-transaction OLTP system or a long-running, query-intensive data warehouse. Technologies include:

  • Commodity x86 servers with large memory footprints or high core counts for database nodes
  • x86 servers / Oracle Enterprise Linux for Exadata storage servers
  • Combining simple server based storage in clusters to give enterprise storage array capabilities
  • QDR (40Gbps) Infiniband private networking
  • 10GbE Public networking
  • SAS or SATA interfaced disks for high performance or high capacity
  • PCIe Flash cards
  • Database workload stacking as a more effective means than simple hypervisor based virtualization for datacenter estate consolidation at multiple levels (storage, server, network and licensed core levels)

Binding this together is the Oracle 11gR2 enterprise database platform, Oracle RAC database cluster technology allowing multiple servers to work in parallel on the same database, and the Exadata Storage Server (ESS) software supporting the enhancements to facilitate intelligent caching of SQL result sets, offloading of queries and storage indices. There is a great blog from Kevin Closson - Seven Fundamentals Everyone Should Know about Exadata - that covers this in more detail.

Looking at the IT industry we see:

  • EMC/Isilon acquisition that marries multiple NAS server nodes to an Infiniband fabric for scale-out NAS - indicating that Infiniband has a significant role to play in binding loosely connected servers for massive scalability.
  • EMC/Data Domain+Spectralogic showing that tape is not in fact dead as many are predicting and that it remains an extremely low cost media for Petabyte storage.
  • Flash storage (SSD or PCIe based) being embedded into servers, closer to the workload, rather than simply going across the SAN/LAN wires to an enterprise storage array - showing that local storage with flash across a distributed storage node fabric can be far more effective than SAN storage for enterprise workloads.
  • EMC and NetApp using flash intelligently, rather than as a replacement for spinning disk, to significantly enhance certain workloads, as we see in EMC's VFCache implementation and NetApp's Intelligent Caching.
  • Monolithic SAN attached arrays moving towards modular scalable arrays supporting the approach taken by Oracle's Pillar Axiom which scales I/O, storage capacity and performance independently using smaller intelligent nodes. EMC is using VMAX engines, NetApp with its GX (Spinnaker) architecture, and even IBM is going that way.

All these trends, and it is not so important really in what chronological order they happened or that I took some examples from leaders in their fields, clearly indicate convergence of technological threads.

I often hear from clients that Exadata is too new, uses strange Infiniband bits and has no link to a SAN array. Well clearly the entire industry is moving that way. Customers are indicating with their voices what they would like to have - capability and simplicity for the workloads that drive their revenue.

Why is this important for the CIO?

CIOs are typically confronted with a range of technologies to solve a limited array of challenges. They are constantly asked by the business and more recently CFOs to make sure that they are:

  • using future-proofed technologies,
  • simplifying vendor management,
  • focusing investment on those activities that support revenue streams,
  • aligning IT with the business!

Well, Engineered Systems are exactly that. Oracle literally went back to the drawing board and questioned why certain things were done in certain ways in the past and what direct benefit that provided clients.

Engineered systems are already using the technologies that the rest of the industry is trying to re-package to meet the challenges customers are facing now and in the coming years.

Oracle, I believe, has at least a 2 year advantage in that they:

  • learnt from the early stages in the market,
  • fine-tuned their offerings,
  • aligned with support requirements of such dense capability blocks,
  • helped customers come to grips with such a cultural change,
  • are continuing to add to their "magic sauce" and are still engineering the best of commodity hardware to further increase the value add of Engineered Systems.

The lead is not just in technology but also the approach that customers are demanding - specific investments balanced with specific revenue generating high-yield business challenges.

As a CIO it is important to recognise the value that Engineered Systems bring in addressing key business requirements and ensure an overall simplification of the Datacenter challenge and large CAPEX requirements in general.

Engineered Systems provide the ability for IT to transform itself providing directly relevant Business Services.

It is not a general purpose approach where the IT organisation can hope for transformation - Engineered Systems enable and weaponise the datacenter to directly fulfill requirements expressed by the CIO team through intense, constant dialogue with business leaders!

Inflexibility of Datacenter Culture - 'The way we do things round here' & Engineered Systems

With a focus on large enterprise and service provider datacenter infrastructures, I get the chance to regularly meet with senior executives as well as rank and file administrators - a top-to-bottom view.

One of the things that has always struck me as rather strange is the relatively "inflexible" nature of datacenters and their management operations.

As an anecdotal example, I recall one organization with a heavy focus on cost cutting. At the same time the datacenter management staff decided that they would standardize on all grey racks from the firm Rittal. Nothing wrong here - a very respectable vendor.

The challenge arising was:

  • The selected Rittal racks at that time were around 12,000 Euro each
  • The racks that came from suppliers such as Dell, Fujitsu, Sun etc were around 6,000 Euro each

See the problem? 50% savings thrown literally out of the window because someone wanted all grey racks. When we are talking about a couple of racks - that is no big deal. With, say, 1000 racks we are looking at an over-expense of 6 million Euro - before anything has been put into those racks!
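
The arithmetic behind that figure can be checked with a few lines of Python. This is only a minimal sketch: the prices and rack count are the approximate figures from the anecdote above, and the function name rack_overspend is purely illustrative.

```python
# Approximate figures from the anecdote; rack_overspend is a hypothetical helper.
def rack_overspend(standard_price_eur, vendor_price_eur, rack_count):
    """Return (total extra cost in Euro, fraction of the rack spend that was avoidable)."""
    extra_per_rack = standard_price_eur - vendor_price_eur
    total_extra = extra_per_rack * rack_count
    saving_fraction = extra_per_rack / standard_price_eur
    return total_extra, saving_fraction


total_extra, saving_fraction = rack_overspend(12_000, 6_000, 1_000)
print(f"Over-expense: {total_extra:,} Euro ({saving_fraction:.0%} of the rack budget)")
```

Changing the rack count shows how quickly a cosmetic standard turns into a material budget line as the estate grows.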

Upon asking why it was necessary to standardize the racks, the answers I got were:

  • independence from server manufacturers
  • create rack and cabling rows before servers arrive to facilitate provisioning
  • simpler ordering
  • perceived "better-ness" as enclosures are a Rittal specialization

Sounds reasonable at first glance - until we see that racks are all engineered to support certain loads, and typically optimized for what they will eventually contain. Ordering was also not really simpler - as the server vendors made that a "no brainer". Perception of quality was not validated either - just a gut feel.

The heart of the problem, as it came out later, was the belief that the datacenter would benefit from having everything homogenous. Homogenous = standardized for datacenter staff.

The problem with this is that such datacenters are not flexible at all: they focus on homogeneity and ultimately cost the business financing them a lot of money.

In an age where flexibility and agility mean the literal difference between life and death for an IT organization, it is incumbent on management to ensure that datacenter culture allows the rapid adoption of competitive technologies within the datacenter confines.

Standardized does not mean that everything needs to be physically the same. It involves the processes of dealing with change in such a way that it can be effected quickly, easily and effectively to derive key value!

I indicated the recent trend of CIOs reporting to CFOs and this would have provided financial stewardship and accountability in this case - getting staff and managers to really examine their decision in the light of what was best for the organization.

Questioning "Datacenter Wisdom"

The focus on homogenous environments has become so strong that everything is being made to equate to a zero-sum game. This happens in cycles in the industry. We had mainframes in the past, Unix, Linux (which is Unix basically for x86), Windows - and the latest is VMware vSphere and all x86!

Don't get me wrong here - as a former EMC employee - I have the greatest respect for VMware and indeed the potential x86 cost savings.

What is a concern is when this is translated into "strategy". In this case the approach has been selected without understanding why! It is a patch to cover profligate past spending, in the hope that all possible issues will magically be solved.

After all - it is homogenous. All x86, all VMware, all virtualized. Must be good - everyone says it is good!

See what I mean - the thinking and strategizing has been pushed to the side. Apparently there is no time to do that. That is really hard to believe, as this is one of those areas that fall squarely into the CIO/CTO/CFO's collective lap.

There are other approaches, and indeed they are not mutually exclusive. Rather they stem from understanding underlying challenges - and verifying if there are now solutions to tackle those challenges head-on.

Why is this important for the CIO?

At a time of crisis and oversight, it is incumbent on the CIO to question the approach put on his/her table for Datacenter Infrastructure transformation.

The CIO has the authority to question what is happening in his/her turf.

At countless organisations, I have performed strategic analysis of macro challenges mapped to the IT infrastructure capability of an organization to deal with those changes. Time and again, when discussing with the techies and managers (who were from a technical background but seemed to struggle with strategy formulation itself), it was shown that the marginal differences in technologies were not enough to justify the additional expenditure - or that there were other approaches.

Engineered Systems, in Oracle parlance, are one such challenge. They do not "fit" the datacenter culture. They can't be taken apart and then distributed into the slots that are free in racks spread over the datacenter.

From a strategy perspective, a great opportunity is not being exploited here. Engineered systems, such as Exadata, Exalogic, SPARC SuperCluster, Exalytics, and Oracle Database Appliance, represent the chance to change the Datacenter culture and indeed make the whole datacenter more flexible and agile.

They force a mindset change - that the datacenter is a housing environment that contains multiple datacenters within it. Those mini-datacenters each represent unique capabilities within the IT landscape. They just need to be brought in and then cabled up to the datacenter network, power, cooling and space capabilities of the housing datacenter.

There are other assets like this in the datacenter already - enterprise tape libraries providing their unique capability to the entire datacenter. Nobody tries to take a tape drive or cartridge out and place it physically somewhere else!

Engineered Systems are like that too. Take Exadata as an example. This is clearly assembled and tuned to do database work with the Oracle Database 11gR2 platform. It is tuned and tweaked to do that extremely well. It breaks down some of the traditional barriers between datawarehouse and OLTP workloads and indeed allows database workloads to be "stacked".

Taking the idea of what a datacenter really should be (facilities for storing and running IT infrastructure) and being flexible - Exadata should literally be placed on the floor, cabled to the main LAN and power conduits and the database infrastructure platform is in place. After that databases can be created in this mini-Datacenter. The capability is literally available immediately.

Contrast this with creating lots of racks in rows where it is not certain what will be in those racks, put VMware everywhere, add lots of SAN cabling as I/O will always be an issue - and then spend ages in tuning performance to make sure it works well.

The CIO should identify this as a waste of time and resources. These are clever people who should be doing clever things for the benefit of an organisation. It would be similar to the idea of buying a car to drive around in OR getting the techies to buy all the components of the car and trying to assemble it.

Losing the value inherent in Exadata by taking an x86/hypervisor route, creating many virtual machines that each run a separate database, makes no real sense in the case of databases.

The CIO can use valuable organizational knowledge gained over many years regarding the functions the business needs. If, as in this example, it is the ability to store/retrieve/manage structured information at scale - the answer should literally be to bring in that platform and leverage the cascading value it provides to the business.

Neither x86 nor a certain operating system is "strategically" relevant - in that it is a platform - and the normal DBAs can manage this using tools they already know. This mini-datacenter concept can be used in extremely effective ways and supports the notion of continuous consolidation.

CIOs can get very rapid quick-wins for an organization in this way. Datacenter infrastructure management and strategy should be considered in terms of bringing in platforms that can do their job well with software and hardware well tuned to run together. Further, they should reduce "other" assets and software that is needed.

Exadata does this by not needing a SAN switch/cabling infrastructure - it encapsulates paradigms of virtualization, cloud and continuous consolidation. This will drive deep savings and allow value to be derived rapidly.

Challenge datacenter ideas and culture in particular. Agility requires being prepared to change things and being equipped to absorb change quickly!
