Datacenter Rack-scale Capability Architecture (RCA)

Over the last couple of years, disaggregating the components within a rack to support hyperscale datacenters has re-emerged as a theme. Back in 2013 Facebook, as founder of the Open Compute Project, and Intel announced their collaboration on future datacenter rack technologies.

This architecture essentially reflects the type of business that Facebook itself runs. As such it is very component focused: compute, storage and network components are disaggregated across trays, and the trays are interconnected by an internal silicon photonics network fabric.

For hyperscale datacenters this brings modularity: components such as CPUs can be swapped out individually rather than replacing the entire server. Intel gave an excellent presentation on the architecture at Interop 2013, outlining its various advantages. According to Intel, this architecture is indeed being used by Baidu, Alibaba Group, Tencent and China Telecom.

This in itself is not earth shattering, and it seems to lack the added "magic sauce". As it stands, this is simply a re-jigging of the form factor. In itself it does nothing to really enhance workload density beyond consolidating large numbers of components in the physical rack footprint (i.e. more cores, RAM and network bandwidth).

Principally it is aimed at reducing cable and switch clutter and the associated power requirements, and at improving upgrade modularity. Essentially this increases the compute surface per rack. These are fine advantages for hyperscale datacenters, where such items represent considerable capital expenditure, as outlined in the Moor Insights & Strategy paper on the subject.

Tying into the themes in my previous blog regarding the "Future of the Datacenter", there is a densifying effect taking place that affects the current datacenter network architecture, as aptly shown in the Moor study.

Examining this architecture, the following points stand out:

  • Rack level architecture is essential in creating economies of scale for hyper-scale and private enterprise datacenters
  • East-West traffic is moving front and center, whilst most datacenters still continue to invest in North-South networking with monolithic switch topologies
  • Simply increasing the number of cores and RAM within a rack does not itself increase the workload density (load/unit)
  • Workload consolidation is more complex than this architecture indicates, since workloads use multiple components at different times under different loading
  • Many approaches are already available using an aggregation architecture (HP Moonshot, Calxeda ARM Architecture, even SoCs)
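
The distinction between raw compute surface and workload density (load/unit) can be made concrete with a small sketch. The rack sizes, core counts and workload counts below are hypothetical values chosen purely for illustration; they are not taken from the Moor study or from any vendor figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rack:
    """A rack described by its size and what it actually hosts (illustrative values only)."""
    name: str
    rack_units: int
    cores: int
    workloads: int  # number of workloads actually hosted in the rack


def cores_per_unit(rack: Rack) -> float:
    """Raw compute surface: cores per rack unit."""
    return rack.cores / rack.rack_units


def workload_density(rack: Rack) -> float:
    """Workload density (load/unit): workloads hosted per rack unit."""
    return rack.workloads / rack.rack_units


# Hypothetical racks: the second has twice the cores but hosts the same workloads.
baseline = Rack("baseline", rack_units=42, cores=672, workloads=84)
more_cores = Rack("more cores, same workloads", rack_units=42, cores=1344, workloads=84)

for rack in (baseline, more_cores):
    print(f"{rack.name}: {cores_per_unit(rack):.0f} cores/RU, "
          f"{workload_density(rack):.1f} workloads/RU")
```

In this sketch, doubling the cores doubles the compute surface per rack unit while the workload density stays flat. The density only moves when more workloads are actually consolidated onto the rack.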

There is a lot of added value to be derived for an enterprise datacenter using some of these "integrated-disaggregated" concepts, but competing with and surviving in spite of hyperscale datacenters requires additional innovative approaches to be taken by the enterprise.

Enterprises that have taken on board the "rack as a computer" paradigm have combined it with capability architecture to drive density increases of 10x and more over simple consolidation within a more capable physical rack:

  • General purpose usage can be well serviced with integrated/(hyper)converged architectures (e.g. Oracle's Private Cloud Appliance, VCE Vblock, Tintri, Nutanix)
  • Big Data architectures use a similar architecture but have the magic sauce embedded in the way that the Hadoop cluster software itself works
  • Oracle's Engineered Systems up the ante further in that they start to add magic sauce to the hardware mix as well as the software smarts – hence engineered rather than simply integrated. (Other examples are available from Teradata, and from Microsoft and its appliance partners.)
  • In particular, the entire rack needs to be thought of as a workload capability server:
    • If database capability is required then everything in that rack should be geared to that workload
    • In-platform capability (in the database engine itself) is used above general purpose virtualization to drive hyper-tenancy
    • Private networking fabric (Infiniband in the case of Oracle Exadata and most high-end appliances)
    • Storage should be modular and intelligent, offloading not just storage block I/O but also being able to deal with part of the SQL Database Workload itself whilst providing the usual complement of thin/sparse-provisioning, deduplication and compression
    • The whole of database workload consolidation is many times the sum of parts in the rack
  • The datacenter becomes a grouping of these hyper-dense intelligent capability rack-scale servers
    • Intelligent provisioning is used to "throw" each workload type onto the best place for running it at scale and at the lowest overall cost, while still delivering world-class performance and security
    • Integrate into the overall information management architecture of the enterprise
    • Ensure that new paradigms related to Big Data Analytics, and the tsunami of information expected from the Internet of Things, can be delivered in the rack-scale computer form, but with additional "smarts" that further increase the value delivered and provide agility to the business.
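
The idea of "throwing" a workload onto the best-suited capability rack can be sketched as a simple placement rule. This is a minimal illustration only: the `CapabilityRack` type, the `place` function, the capability labels and the relative costs are all hypothetical names and values, not part of any vendor's provisioning tooling.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CapabilityRack:
    """A rack geared to one workload capability (all values hypothetical)."""
    name: str
    capability: str           # e.g. "database", "big-data", "general"
    free_slots: int           # remaining workload slots in the rack
    cost_per_workload: float  # relative cost of hosting one more workload


def _cheapest_with_room(candidates: Iterable[CapabilityRack]) -> Optional[CapabilityRack]:
    """Return the lowest-cost candidate that still has free slots, or None."""
    with_room = [rack for rack in candidates if rack.free_slots > 0]
    return min(with_room, key=lambda rack: rack.cost_per_workload, default=None)


def place(workload_type: str, racks: List[CapabilityRack]) -> Optional[CapabilityRack]:
    """Prefer a rack built for this workload type; fall back to general purpose."""
    match = _cheapest_with_room(r for r in racks if r.capability == workload_type)
    if match is not None:
        return match
    return _cheapest_with_room(r for r in racks if r.capability == "general")


racks = [
    CapabilityRack("db-rack-1", "database", free_slots=0, cost_per_workload=1.0),
    CapabilityRack("db-rack-2", "database", free_slots=5, cost_per_workload=1.2),
    CapabilityRack("general-1", "general", free_slots=10, cost_per_workload=3.0),
]

print(place("database", racks).name)  # db-rack-2 (db-rack-1 is full)
print(place("big-data", racks).name)  # general-1 (no big-data rack defined)
```

A real provisioning layer would also weigh performance and security constraints. The sketch only shows the ordering of preferences: a matching capability first, then cost, then a general purpose fallback.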

The enterprise datacenter can deliver value beyond a hyper-scale datacenter by thinking about continuous consolidation in line with the needs of the business, not just the needs of IT. Such platform and rack-scale capability architecture (RCA) has been proven to provide massive agility to organizations, and indeed prepares them for new technologies, so that they can behave like "start-ups" with a fast, low-cost-to-fail mentality that powers iterative innovation cycles.

Opportunities for the CIO

The CIO and senior team have a concrete opportunity here to steal a march on the Public Cloud vendors by re-thinking the datacenter rack through RCA paradigms to provide hyper-efficient capability architectures for their business.

Not only will this massively reduce footprint and costs in the existing premises, it will also focus IT on how best to serve the business through augmentation with hybrid Cloud scenarios.

The industry has more or less spoken about the need for hybrid Cloud scenarios, where private on-premise cloud is augmented with public cloud capabilities. Further, today's announcement that the EU has effectively invalidated the "Safe Harbour" data treaty should put organizational IT on alert about how to deal rapidly with these changes.

Industry thinking indicates that enterprise private datacenters will shrink. The CIO team can already ensure they are "thinking" that way and taking concrete steps to realize compact ultra-dense datacenters.

A hyper-scale datacenter can't really move this quickly or be that agile, as its operating scale inhibits the nimble thinking that should be the hallmark of the CIO of the 2020s.

In the 2020s, nano- and pico-datacenters may perhaps be of more interest to enterprises as a way of competing for business budgetary investment, as post-silicon graphene compute substrates running at 400GHz at room temperature become the new norm!

Expressions of Big Data: Big Data Infrastructure BUT Small is Beautiful !!

In an effort to streamline costs, increase efficiency and basically get IT to focus on delivering real business value, a tremendous mind-shift is taking place. The battleground is the "build-your-own" approach versus pre-built, pre-tested, pre-integrated, built-for-purpose appliances using commodity hardware.

Big Data infrastructure is the next wave of advanced information infrastructure, focused on delivering competitive advantage and insight into pattern-based behaviours.

Virtually every hardware vendor in the market has an offering: Greenplum/EMC's (Pivotal) PivotalHD Hadoop distribution with underpinning VMware/Isilon infrastructure, HP’s Haven and AppSystem ecosystem, IBM PureData, Teradata's Portfolio for Hadoop and indeed Oracle’s Big Data Appliance / Exalytics.

Many build on the Cloudera distribution (with Hadoop founder Doug Cutting, formerly of Yahoo) or go directly to the roots in Apache Hadoop. Others are implementing with the MapR or Hortonworks (the spinoff of Yahoo’s Hadoop team) distributions, which are highly performant.

Clearly credibility is important to customers, whether gained directly by using Apache Hadoop or inherited through distributions such as Cloudera. Cluster management tools are critical differentiators when examining operations at scale, and these typically incur licensing costs.

Significant players such as Facebook and Yahoo are developing Hadoop further and feeding back into the core Apache Hadoop. This allows anyone using this distribution to take advantage.

Over the coming blogs I will take a quick peek at these Big Data Infrastructures, their approaches, key differentiators and integration with the larger information management architecture. The focus will be density, speed and smallest form factors for infrastructure.

Questioning "Accepted Wisdom"

Whilst it is nice to know that we can have clusters of thousands of nodes performing some mind-boggling Big Data processing, it is almost MORE interesting to know how this can be performed on much smaller infrastructures, at the same or better speed and with simpler operational management.

With that in mind, we should be asking ourselves if Big Data can be processed in other ways to take advantage of some very important technology drivers:

  • Multi-core/Multi-GHz/Hyper-threaded processors
  • Multi-terabyte (TB) memory chips
  • Non-volatile RAM (mainly PCIe SSD today, but possibly Memristor or even carbon nanotube based NanoRAM (NRAM)) as a replacement for spinning disk storage and volatile RAM
  • Terabit Interconnects for system components or to external resources
  • In-Memory/In-Database Big Data capabilities to span information silos vs. recreating RDBMS systems again.

Systems today such as the recently released Oracle SPARC T5-8 already have 128 cores and 1024 threads at 3.6GHz in an 8RU form factor, so the compute side of the Big Data equation seems to be shaping up nicely: 16 cores/RU or 128 threads/RU.
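
The per-rack-unit figures follow directly from the quoted specification. As a quick check, here is the arithmetic as a small sketch; the function name `per_rack_unit` is just an illustrative helper.

```python
# Figures quoted in the text for the Oracle SPARC T5-8.
CORES = 128
THREADS = 1024
RACK_UNITS = 8


def per_rack_unit(total: int, rack_units: int) -> float:
    """Return how many of something fit into one rack unit (RU)."""
    if rack_units <= 0:
        raise ValueError("rack_units must be positive")
    return total / rack_units


cores_per_ru = per_rack_unit(CORES, RACK_UNITS)
threads_per_ru = per_rack_unit(THREADS, RACK_UNITS)
threads_per_core = THREADS // CORES

print(f"{cores_per_ru:.0f} cores/RU, {threads_per_ru:.0f} threads/RU, "
      f"{threads_per_core} threads per core")
# prints: 16 cores/RU, 128 threads/RU, 8 threads per core
```

The same helper can be applied to any other system to compare densities on a like-for-like basis.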

That is still too small, of course, as Hadoop clusters sport much greater processing power, at the cost of requiring more server nodes.

With Infiniband running at 40Gbps (QDR) and faster, component interconnects are also shaping up nicely. Many vendors are now using Infiniband to really get performance up compared to Ethernet, or even Fibre Channel for storage elements. Indeed some are skipping SAN/NAS altogether and just moving to server-based storage.

Many database vendors are actively using adapters to send Big Data jobs to the infrastructure and pull results back into the database. It will be a matter of time before Big Data is sucked into the RDBMS itself, just as Java and Analytics have been.

However, the memory technology vector is the one that is absolutely critical. With the promise of non-volatile memory whose performance outstrips the fastest RAM chips out there, the very shape of infrastructure for Big Data becomes radically different!

Not only is small beautiful - but it is essential to continuous consolidation paradigms allowing much greater densification of data and processing thereof!

Why is this important for the CIO, CTO & CFO?

Big Data technologies are certainly revolutionizing the way decisions are being informed, providing lightning-fast insight into very dynamic landscapes.

Executive management should be aware that these technologies rely on scale-out paradigms. This would effectively reverse gains made through virtualization and workload optimized systems to reduce the datacenter estate footprint!

CIO/CFOs should be looking at investing in technology infrastructure minimizing the IT footprint, and yet still delivering revenue generating capabilities derived through innovative IT. In some cases this will be off-premise based clouds; in others competitive advantage will be derived from on-premise resources that are tightly integrated into the central information systems of the organization.

Action ideas include:

  • Funding the change for smaller/denser Big Data Infrastructure will ensure server (physical or virtual) sprawl is avoided
  • Funding the continuous consolidation paradigm by structuring business cases for faster ROI
  • Funding efficiency of operations. If Big Data can be done in-memory within a larger server with a single OS image, rather than across 1000 OS images, this will be the option that makes operations simpler and more efficient. There may be a cost premium at the server and integration layers.
  • Advanced integration of the Big Data infrastructure and information management architectures to allow seamless capability across information silos (structured/unstructured)
  • Cloud capabilities for scaling, preferably within the box/system using what Gartner calls fabric-based infrastructures, should be the norm rather than building your own environment.

Continuous workload consolidation needs to take center stage again - think thousands of Big Data workloads per server, not just thousands of servers to run a single workload! Think in-memory for workloads rather than in simple spinning disk terms. Think of commodity hardware/OS as not the only competitive advantage - only the low-hanging fruit!

We'll take a closer look at Big Data infrastructure in the coming blogs, with a view to how Cloud pertains while still ensuring deep efficiency from an information management infrastructure perspective, using relevant KPIs.


The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by my current employer and does not necessarily reflect the views and opinions of my employer.