IBM Z Server: Best In Class For Availability – Does Form Factor Matter?

A recent ITIC 2017 Global Server Hardware and Server OS Reliability Survey classified the IBM Z server as delivering the highest levels of reliability/uptime, with ~8 Seconds or less of unplanned downtime per month.  This was the 9th consecutive year that such a statistic had been recorded for the IBM Z Mainframe platform.  This compares to ~3 Minutes of unplanned downtime per month for several other specialized server technologies, including IBM POWER, Cisco UCS and HP Integrity Superdome running the Linux Operating System.  Clearly, unplanned server downtime is undesirable and costly, impacting the bottom line of the business.  Industry Analysts state that ~80% of global businesses require 99.99% uptime, equating to ~52.6 Minutes downtime per year or ~8.64 Seconds per day.  In theory, only the IBM Z Mainframe platform exceeds this availability requirement, while IBM POWER, Cisco UCS and HP Integrity Superdome deliver borderline 99.99% availability capability.  The IBM Mainframe is classified as a mission-critical resource in 92 of the top 100 global banks, 23 of the top 25 USA based retailers, all 10 of the top 10 global insurance companies and 23 of the top 25 largest airlines globally…
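
The uptime arithmetic above can be checked with a few lines of Python.  This is only a sketch: the function names are made up for illustration, and the "average month" of 365/12 days is an assumption, since the survey figures quoted here do not say how a month is counted.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365
DAYS_PER_MONTH = DAYS_PER_YEAR / 12  # assumption: an average calendar month


def downtime_budget_seconds(availability_pct, period_days):
    """Maximum downtime (seconds) allowed over period_days at a given availability."""
    return (1 - availability_pct / 100) * period_days * SECONDS_PER_DAY


def implied_availability_pct(downtime_seconds, period_days):
    """Availability (%) implied by a measured downtime over period_days."""
    return 100 * (1 - downtime_seconds / (period_days * SECONDS_PER_DAY))


if __name__ == "__main__":
    # Downtime allowed by a 99.99% uptime target.
    per_year_min = downtime_budget_seconds(99.99, DAYS_PER_YEAR) / 60
    per_month_min = downtime_budget_seconds(99.99, DAYS_PER_MONTH) / 60
    per_day_sec = downtime_budget_seconds(99.99, 1)
    print(f"99.99% allows {per_year_min:.2f} min/year, "
          f"{per_month_min:.2f} min/month, {per_day_sec:.2f} s/day")

    # Availability implied by the per-month downtime figures quoted in the survey.
    for label, seconds in [("~8 Seconds/month", 8), ("~3 Minutes/month", 180)]:
        pct = implied_availability_pct(seconds, DAYS_PER_MONTH)
        print(f"{label} implies {pct:.4f}% availability")
```

Turning the monthly downtime figures back into a percentage puts both groups of servers on the same scale as the 99.99% target, so they can be compared directly.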

The requirement for ever increasing amounts of corporate compute power is beyond doubt, driven by the processing of ever increasing amounts of data created from digital sources, including Cloud, Mobile and Social, and requiring near real-time analytics to deliver meaningful information from these oceans of data.  Some organizations select x86 server technology to deliver this computing power requirement, either in their own Data Centre or via a 3rd party Cloud Provider.  However, with unplanned downtime characteristics that don’t meet the seemingly de facto 99.99% uptime availability metric, can the growth in x86 server technology continue?  From many perspectives, including Reliability, Availability & Serviceability (RAS), Data Security via Pervasive Encryption and best-in-class Performance and Scalability, you might think that the IBM Z Mainframe would be the platform of choice.  For whatever reason, this is not always the case!  Maybe we need to look at recent developments and trends in the compute power delivery market and second guess what might happen in the future…

Significant Cloud providers deliver vast amounts of computing power and associated resources, evolving their business models accordingly.  Such business models face many challenges, primarily uptime and data security related, when convincing prospective customers to migrate their workloads from traditional internal Data Centres into these massive rack provisioned infrastructures.  Recently Google has evolved from using Intel as its primary supplier for Data Centre CPU chips to also including CPU chips from IBM and other semiconductor rivals.

In April 2016, Google declared it had ported its online services to the IBM POWER CPU chip and that its toolchain could output code for Intel x86, IBM POWER and 64-bit ARM cores at the flip of a command-line switch.  As part of the OpenPOWER and Open Compute Project (OCP) initiatives, Google, IBM and Rackspace are collaborating to develop an open server specification based on the IBM POWER9 architecture.  The OCP Rack & Power Project will dictate the size and shape or form factor for housing these industry standard rack infrastructures.  What does this mean for the IBM Z server form factor?

Traditionally, and over the last decade or more, IBM has utilized the 24 Inch rack form factor for the IBM Z Mainframe and Enterprise Class POWER Systems.  Of course, this is a different form factor to the industry standard 19 Inch rack, which finally became the de facto standard for the ubiquitous blade server.  Unfortunately, there was no tangible standard for a 19 Inch rack, generating power, cooling and other issues.  Hence the evolution of the OCP Rack & Power Standard, codenamed Open Rack.  Google and Facebook have recently collaborated to evolve the Open Rack Standard V2.0, based upon an external 21 Inch rack form factor, accommodating the de facto 19 Inch rack mounted equipment.

How do these recent developments influence the IBM Z platform?  If you’re the ubiquitous global CIO, knowing your organization requires 99.99%+ uptime, delivering continuous business application change via DevOps, and safeguarding corporate data with intelligent and system wide encryption, perhaps you still view the IBM Z Mainframe as a proprietary server with its own form factor?

As IBM has already demonstrated with its OpenPOWER offering, collaborating with Google and Rackspace, its 24 Inch rack approach can be evolved, becoming just another CPU chip in a Cloud (e.g. IaaS, PaaS) service provider environment.  Maybe the final evolution step for the IBM Z Mainframe is evolving its form factor to a ubiquitous 19 Inch rack format?  The intelligent and clearly defined approach of the Open Rack Standard makes sense, and if IBM could deliver an IBM Z Server in such a format, it just becomes another CPU chip in the ubiquitous Cloud (e.g. IaaS, PaaS) service provider environment.  This might be the final piece of the jigsaw for today’s CIO, as their approach to procuring compute power might be based solely upon the uptime and data security metrics.  For those organizations requiring in excess of 99.99% uptime and fully compliant security, there only seems to be one choice, the IBM Z Mainframe CPU chip technology, which has been running Linux workloads since 2000!

Cloudy With A Chance Of Mainframe?

With the advent of Computer Generated Imagery (CGI) there is seemingly no end to the number of books, especially “children’s” books that can be encapsulated and delivered in animated movie format.  I’m always surprised and arguably never surprised by the messaging in these stories; supposedly written for the younger person, but invariably delivering a message of good morals, ethics and human qualities, typically finding creative solutions to a myriad of problems.  Of course, we’re all human, and typically as human beings, we’re responsible for the majority of our problems, either knowingly, or not.

Cloudy with a Chance of Meatballs is a book based on a town named Chewandswallow characterized by its strange daily meteorological pattern, providing townsfolk with all of their required daily meals by raining food.  Although the residents of the town enjoy a lifestyle devoid of any grocery shopping or cookery, the weather unexpectedly and inexplicably takes a turn for the worse, devastating the local community with destructive and uncontrollable storms of either unpleasant or dangerously oversized foods, resulting in unstoppable catastrophes for the townspeople.  Their lives endangered by the threats of the storms, they relocate to a different community of average meteorological patterns, safe from the hazards that once were presented by raining meals.  However, they are forced to learn how to obtain food the normal way.

So what?  Continuing with the creativity thought, the ethos of this story might be somewhat analogous to the sometimes polarized opinion between Distributed Systems and Mainframe computing.  So depending on your philosophical bent, or which side of the fence you sit on, there is only one choice, even if this seemingly perfect and de facto world is generating significant challenges…

Recently, z/OS 2.1 became Generally Available (GA), and most notable from my viewpoint was its continued and demonstrable ability to participate in cloud computing environments.  So is the IBM Mainframe ready for the cloud?  Wasn’t it always!

The fundamental ethos of the Mainframe environment is virtualization and was forever thus.  The Mainframe has always shared the basic IT architecture components, including CPU, Memory, Storage, Networking and other peripherals, originally in a physical single-image structure, but since the 1990s in a shared (SYSPLEX) complex of interconnected physical servers (CPCs).  So the Mainframe is and always has been ready for “Prime Time Cloud”!

z/OS V2.1 is a platform designed to dynamically respond and scale to workload change with enhancements to scalability and performance that cover operations, I/O, virtual storage constraint relief, memory management, and more.  These enhancements are suitable for organizations that would like to catalyse a journey to highly scalable virtualized solutions like cloud.

IBM delivers improved scalability and performance for outstanding throughput and service within existing Mainframe environments.  Smarter scalability can better prepare the user for growth and spikes in workloads while maintaining the qualities of service and balanced design that customers have come to expect of the IBM mainframe.

As customers consider all the components of downtime, the true costs can be surprising, which is why superior availability continues to remain a key factor in platform selection. With z/OS V2.1, IBM introduces new capabilities designed to improve upon the already legendary z/OS system availability.  The industry-leading resiliency and high availability of System z remain key reasons why organizations keep their most critical processing on System z.  With its attention to outage reduction, the availability of System z and z/OS is well recognized in the industry.  In z/OS V2.1, IBM continues enhancements that improve critical IT systems availability, helping achieve an even higher level of service for customers.

Some of the “cloud friendly” z/OS 2.1 benefits include:

  • Support for Shared Memory Communications-RDMA (SMC-R), for low latency, application transparent communications to help you move data quickly between z/OS images on the same CPC or between CPCs.
  • Flash Express support for certain coupling facility list structures, such as IBM WebSphere MQ for z/OS, V7 (5655-R36), in order to strengthen resiliency for enterprise messaging workload spikes.
  • For zEC12 or zBC12 systems, shared engine coupling facilities can be used in many production environments, for improved economics by offering a high level of performance without requiring the use of dedicated CF engines.
  • EXCP support for System z High-Performance FICON (zHPF) is designed to help improve I/O start rates and improve bandwidth for more workloads on existing hardware and fabric.
  • Usability and performance improvements for z/OS FICON Discovery and Auto Configuration (zDAC), including discovery of directly attached devices.
  • Serial Coupling Facility structure rebuild processing, designed to help improve performance and availability by rebuilding coupling facility structures more quickly and in priority order.
  • 100-way symmetric multiprocessing (SMP) support in a single LPAR on IBM zEC12 or zBC12 systems.  Support for an architectural limit of 4 TB of real memory per LPAR.
  • Support for 2 GB pages is provided on zEC12 and zBC12 systems.  This feature is designed to reduce memory management overhead and improve overall system performance by enabling middleware to use 2 GB pages.  These improvements are expected due to improved effective translation lookaside buffer (TLB) coverage and a reduction in the number of steps the system must perform to translate a 2 GB page virtual address.
  • Capacity Provisioning is designed to provide support for manual and policy-based management of Defined Capacity and Group Capacity.  This function broadens the range of automatic, policy-based responses available to help manage capacity shortage conditions when WLM cannot meet your workload policy goals.
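
To make the 2 GB page point above more concrete, here is a rough Python sketch of why larger pages improve effective TLB coverage.  It counts how many pages are needed to map the 4 TB architectural limit mentioned in the list, and how much memory a fixed number of TLB entries can cover at each page size.  The 512-entry TLB is a purely hypothetical figure chosen for illustration, not a zEC12 or zBC12 specification, and the helper names are invented for this example.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB
TIB = 1024 * GIB

# Page sizes discussed for z/OS: 4 KB base pages, 1 MB large pages, 2 GB pages.
PAGE_SIZES = {"4 KB": 4 * KIB, "1 MB": 1 * MIB, "2 GB": 2 * GIB}

HYPOTHETICAL_TLB_ENTRIES = 512  # assumption for illustration only


def pages_needed(memory_bytes, page_bytes):
    """Number of pages required to map memory_bytes (rounded up)."""
    return -(-memory_bytes // page_bytes)


def tlb_coverage_bytes(entries, page_bytes):
    """Memory addressable without a TLB miss, given one page per TLB entry."""
    return entries * page_bytes


if __name__ == "__main__":
    memory = 4 * TIB  # architectural real memory limit per LPAR quoted above
    for name, size in PAGE_SIZES.items():
        count = pages_needed(memory, size)
        coverage_gib = tlb_coverage_bytes(HYPOTHETICAL_TLB_ENTRIES, size) / GIB
        print(f"{name} pages: {count:,} pages to map 4 TB; "
              f"{HYPOTHETICAL_TLB_ENTRIES} TLB entries cover {coverage_gib:,.3f} GiB")
```

The fewer pages it takes to map a large memory region, the more of that region a fixed-size TLB can hold at once, which is the effect the 2 GB page support is designed to exploit.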

There are numerous new and enhanced functions delivered with z/OS 2.1, too numerous to mention, but categorised as Quality Of Service, Availability, Networking, Security, Data Usability, Integrity, Systems Management, Application Development, Simplification & Usability, International Standards Compliance, and more.

So let’s not forget, this foundation and support for an IT infrastructure and its supporting eco (software) system is in one scalable, secure and “zero” downtime environment!

So maybe those of us in the open-minded and enlightened generation of parents (oops, I forgot, Grandparents for us Dinosaur Mainframe folk!) who can now “access” children’s stories, even if it’s in the form of a CGI animated movie, can be dispassionate enough to consider all platforms, Distributed and Mainframe, for our evolving business and associated IT requirements.

So you decide, can it be Cloudy With A Chance Of Mainframe?  To overlook such an option might be an oversight, just like overlooking the abundance of human stories, classified as children’s books or not…